Faculty of ICT, Mahidol University
 

 

AUTOMATIC HYPERTEXT GENERATION BY APPLICATION OF LEXICAL CHAIN

 

TITLE AUTOMATIC HYPERTEXT GENERATION BY APPLICATION OF LEXICAL CHAIN.
AUTHOR SUWIMOL WAHAKIT
DEGREE MASTER OF SCIENCE PROGRAMME IN COMPUTER SCIENCE
FACULTY FACULTY OF SCIENCE
ADVISOR DAMRAS WONSAWANG
CO-ADVISOR SUPACHAI TANGWONGSAN
 
ABSTRACT
The dramatic growth of the World Wide Web illustrates the importance of hypertext as a method for organizing the rapidly expanding amount of on-line text. At present, hypertexts are commonly authored by hand. If the initial collection of documents is large and also contains multimedia documents, completely manual authoring can be impossible. Building and maintaining a large Web site requires large amounts of time and money. Aside from these concerns, there is evidence that humans construct hypertext links inconsistently: different people tend to place different links in the same document. This inconsistency makes the links less useful to a user searching for specific information.
Most of the proposed methods for automatic hypertext construction rely on term repetition. The underlying philosophy of these systems is that related texts will tend to use the same terms. While this approach has proven quite successful, it suffers from two problems related to the meanings of words. The first is polysemy, where a single word has several meanings. The second is synonymy, where different words have the same meaning. Both phenomena disrupt the simple identity relationship used by traditional information retrieval (IR) systems.
In this research, lexical chains were applied to build hypertext links between articles, on the premise that two articles about the same thing will tend to use similar words. By using the lexical chains extracted from the articles, rather than just the words, the problems of synonymy and polysemy can be accounted for. The experiments were run on a collection of related computer science articles. Combining keyphrase extraction with lexical chaining improved retrieval performance. The factors that affect this improvement were also studied: the value of W (weight) in the ranking model and the threshold value of similarity between a pair of documents. The appropriate value of W in the ranking model was 0.5, and the similarity threshold could be used to enhance precision, although it decreased the recall improvement. The results suggest that further research involving dynamic hypertext links should be conducted to help WWW users access desired information more effectively.
KEYWORD HYPERTEXT LINK / LEXICAL CHAINING / INFORMATION RETRIEVAL
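
The idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and everything named in it is an assumption made for illustration: the toy `CONCEPTS` table stands in for a real lexical resource, a per-concept word count stands in for a lexical chain, plain term overlap stands in for the keyphrase component, and the linear combination controlled by `w` is only one plausible reading of "the value of W in the ranking model", whose actual form the abstract does not give. The sketch shows how synonyms can link two documents that share no terms, and how the similarity threshold decides which links are kept. It does not attempt word-sense disambiguation, so it does not address polysemy.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy thesaurus mapping words to a concept label.
# A real system would consult a lexical resource instead.
CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "network": "network", "internet": "network", "web": "network",
    "link": "hyperlink", "hyperlink": "hyperlink",
}


def chain_vector(text):
    """Count the words of a text per concept (a crude stand-in for lexical chains)."""
    words = text.lower().split()
    return Counter(CONCEPTS[w] for w in words if w in CONCEPTS)


def term_vector(text):
    """Count raw terms (a stand-in for the keyphrase component)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def score(doc_a, doc_b, w=0.5):
    """Assumed ranking model: w * chain similarity + (1 - w) * term similarity."""
    chain_sim = cosine(chain_vector(doc_a), chain_vector(doc_b))
    term_sim = cosine(term_vector(doc_a), term_vector(doc_b))
    return w * chain_sim + (1 - w) * term_sim


def build_links(docs, w=0.5, threshold=0.3):
    """Return (name_a, name_b, score) for every pair scoring at or above the threshold."""
    names = sorted(docs)
    links = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            s = score(docs[a], docs[b], w)
            if s >= threshold:
                links.append((a, b, round(s, 3)))
    return links


docs = {
    "a": "automobile hyperlink",
    "b": "car link",
    "c": "banana bread",
}
print(build_links(docs, w=0.5, threshold=0.3))
```

In this toy collection, documents "a" and "b" share no terms, so a purely term-based score (`w=0`) produces no link between them, while the concept-based component does. Raising `threshold` removes weaker links, which is the precision versus recall trade-off the abstract mentions.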

 


 
