Mahidol University Logo
Faculty of ICT, Mahidol University
 

Admissions

Printable Version

 

IMPROVEMENT ON STRING MATCHING ALGORITHM USING PARTITIONING AND HASHING

 

TITLE IMPROVEMENT ON STRING MATCHING ALGORITHM USING PARTITIONING AND HASHING.
AUTHOR SIRIWAN CHAIWITOOANUKUL
DEGREE MASTER OF SCIENCE PROGRAMME IN COMPUTER SCIENCE
FACULTY FACULTY OF SCIENCE
ADVISOR DAMRUS WONGSAWANG
CO-ADVISOR CHOMTIP PORNPANOMCHAI
 
ABSTRACT
String matching or exact string matching, is a fundamental problem in computer science and plays the key role in the information retrieval (IR) system. It is an essential function in many applications such as text editor, data retrieval , word processing etc. Although data are memorized in various ways text remains the main form in which information is exchanged. This is particularly evident in literature or linguistics where data are composed of a huge corpus and dictionaries. These are applied to computer science as well where a large amount of data are stored in linear files. The problem with string matching is to find all occurrences of a pattern in a given text. Normally, most string matching algorithms consist of two essential procedures. One involves the comparison between each character of a pattern and a substring of a text. The other involves to determining the position of the next comparison. These are called the checking step and the jumping step, respectively. The performance of the algorithm depends on these two procedures. Hence, the number of character comparisons in the checking step, and the distance of jumping are the main factors affecting matching performance. Many well-known string matching algorithms have been optimized but the researchers are still trying to decrease the cost of space and the time spent on the process. Some researchers are interested in decreasing the number of character comparisons while others are interested in increasing the jumping distances. This thesis has improved the string matching algorithm, as recently proposed by Sun Kim, by applying the hashing technique and partitioning technique. The Sunday Quick Search technique is also applied to enhance the efficiency of the algorithm. It is called the Improvement of the String Matching Algorithm with Partitioning and Hashing(ISMPH). The testing environments are simulated and the results are compared with those of the original one as well as with those of other traditional algorithms. Many types of text files including HTML files used in the experiments here. From the experimental results, the ISMPH outperformed Sun Kim’s algorithm as well as other string matching algorithms currently in use in all cases of text files that are tested in this research.
KEYWORD STRING MATCHING / PATTERN MATCHING / EXACT STRING MATCHING ALGORITHM

 

Go to Top

 

ICT Building, Mahidol University, 999 Phuttamonthon 4 Road, Salaya, Nakhonpathom 73170 Tel. +66 02 441-0909 Fax. +66 02 849-6099
Mahidol University Computing Center, The Faculty of ICT, Mahidol University , Rama 6 Road, Rajathevi, Bangkok 10400 Tel. +66 02 354-4333 Fax. +66 02 354-7333