Faculty of ICT, Mahidol University
 


 

AUTOMATIC QUERY EXPANSION FOR INFORMATION RETRIEVAL USING LOCAL CONTEXT ANALYSIS WITH LOCATION-WEIGHTED LEVEL OF RELATED TERMS IN DOCUMENTS

 

TITLE AUTOMATIC QUERY EXPANSION FOR INFORMATION RETRIEVAL USING LOCAL CONTEXT ANALYSIS WITH LOCATION-WEIGHTED LEVEL OF RELATED TERMS IN DOCUMENTS.
AUTHOR TIPPARAT SOOKSRI
DEGREE MASTER OF SCIENCE PROGRAMME IN COMPUTER SCIENCE
FACULTY FACULTY OF SCIENCE
ADVISOR DAMRAS WONGSAWANG
CO-ADVISOR CHOMTIP PORNPANOMCHAI
 
ABSTRACT
A central problem in information retrieval is that some relevant documents are not retrieved while non-relevant documents are. This happens because users' query terms often differ from the terms indexed in the collection, and because users do not know how to formulate complex Boolean queries. Automatic query expansion is proposed to address this problem: it helps users formulate a more suitable and complex query from a simple one. Many approaches to query expansion have been proposed. This thesis proposes a query expansion model called Local Context Analysis with Location-Weighted Level (LCALWL), which extends the original local context analysis model by using the location-weighted level of related terms in documents and by taking the co-occurrence between document terms and query terms into consideration. Documents in a collection are divided into three main location types: title, heading, and details. The more frequently document terms co-occur with query terms in a highly ranked document location, the more significant the resulting expansion terms. To verify the model, a prototype of LCALWL was developed and four parameters were investigated. The first parameter is the weight for each location level. The second is the threshold value used to judge whether a concept should be weighted by the location level of its co-occurrences. The third is the number of top-ranked relevant documents. The last is the number of top-ranked expansion terms. The experimental results showed that various weight values for each location level give significant improvements in recall and precision. Increasing the number of top-ranked expansion terms increases recall. The LCALWL model also provides a greater precision improvement than the original local context analysis model, whereas the recall improvement of the two models is almost the same.
Further research should explore the practical implementation of LCALWL, such as speeding up processing with a multithreading technique and applying more than one retrieval model to a variety of test collections.
KEYWORD INFORMATION RETRIEVAL / RETRIEVAL / LOCAL CONTEXT ANALYSIS / LOCATION / QUERY EXPANSION.
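The location-weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the LCALWL scoring formula, which the abstract does not state. It is a simplified stand-in that assumes each document is a dictionary with `title`, `heading`, and `details` text, that the documents are already ranked, and that a candidate term's score is its frequency in a location multiplied by the number of query-term occurrences in that location and by a location weight. The weight values, the function name `expansion_terms`, and the parameters `top_docs` and `top_terms` are hypothetical, and the threshold parameter mentioned in the abstract is omitted.

```python
from collections import Counter

# Hypothetical location weights; the abstract does not give the values used.
LOCATION_WEIGHTS = {"title": 3.0, "heading": 2.0, "details": 1.0}


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; no stemming or stop words)."""
    return text.lower().split()


def expansion_terms(query, ranked_docs, top_docs=2, top_terms=3,
                    weights=LOCATION_WEIGHTS):
    """Rank candidate expansion terms by location-weighted co-occurrence.

    Simplified illustration only: a term scores higher when it appears in
    the same location as query terms, and more so in heavier locations.
    """
    query_terms = set(tokenize(query))
    scores = Counter()
    # Only the top-ranked documents contribute candidate terms.
    for doc in ranked_docs[:top_docs]:
        for location, text in doc.items():
            tokens = tokenize(text)
            # Number of query-term occurrences in this location.
            query_hits = sum(1 for t in tokens if t in query_terms)
            if query_hits == 0:
                continue  # no co-occurrence possible in this location
            for term in set(tokens) - query_terms:
                weight = weights.get(location, 0.0)
                scores[term] += weight * tokens.count(term) * query_hits
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_terms]]
```

Calling `expansion_terms("query expansion", docs)` on a list of such dictionaries returns the highest-scoring candidate terms. Raising `top_terms` adds more expansion terms to the query, which corresponds to the fourth parameter investigated in the thesis.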

 


 
