Faculty of ICT, Mahidol University
 


 

XPACK: A GRAMMAR-BASED XML DOCUMENT COMPRESSION

 

TITLE: XPACK: A GRAMMAR-BASED XML DOCUMENT COMPRESSION
AUTHOR: KESMANAS MAIRIANG
DEGREE: MASTER OF SCIENCE PROGRAMME IN COMPUTER SCIENCE
FACULTY: FACULTY OF SCIENCE
ADVISOR: CHARNYOTE PLUEMPITIWIRIYAWEJ
CO-ADVISORS: DAMRAS WONGSAWANG, THANWADEE SUNETNANTA
 
ABSTRACT
Most data stored and interchanged on the Web are represented as XML documents. XML documents are typically large relative to the information they carry, because tags are replicated: the same tag is repeated for every data item of the same type. Methods for compressing XML documents have been developed to reduce the effect of this replication. In this thesis, we introduce a grammar-based technique for semantically lossless compression of XML documents. The technique is developed in the context of the XPACK system, which supports both compression and decompression of XML documents. The XPACK system consists of three main steps: 1) derivation of grammar rules from an analysis of document structure, 2) document compression using the grammar rules, and 3) document decompression. In experimental testing, our technique reduced the size of an XML document by 74% to 96% relative to its original size, a better compression performance than that of GZIP or XMILL.
KEYWORDS: XML COMPRESSION / XML / DATA COMPRESSION
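
The three steps named in the abstract can be illustrated with a small Python sketch. This is not the XPACK algorithm, which this page does not describe in detail; it is a toy scheme built on assumptions. A "rule" here is assumed to be a distinct element shape (a tag plus the ordered tags of its children), attributes and mixed content are ignored, and the function names `derive_rules`, `compress`, and `decompress` are hypothetical.

```python
import xml.etree.ElementTree as ET


def shape_of(elem):
    # Assumed notion of a "rule": the tag plus the ordered tags of its children.
    return (elem.tag, tuple(child.tag for child in elem))


def derive_rules(root):
    """Step 1 (toy): assign one rule id to each distinct element shape."""
    rules = {}
    for elem in root.iter():  # document (preorder) traversal
        rules.setdefault(shape_of(elem), len(rules))
    return rules


def compress(root, rules):
    """Step 2 (toy): emit rule ids in preorder and keep leaf text separately."""
    ids, texts = [], []
    for elem in root.iter():
        ids.append(rules[shape_of(elem)])
        if len(elem) == 0:  # leaf element: its text is the data item
            texts.append(elem.text or "")
    return ids, texts


def decompress(ids, texts, rules):
    """Step 3 (toy): rebuild the tree by expanding rule ids in preorder."""
    shapes = {rule_id: shape for shape, rule_id in rules.items()}
    ids_iter, texts_iter = iter(ids), iter(texts)

    def build():
        tag, child_tags = shapes[next(ids_iter)]
        elem = ET.Element(tag)
        if child_tags:
            for _ in child_tags:
                elem.append(build())
        else:
            elem.text = next(texts_iter)
        return elem

    return build()


doc = (
    "<lib>"
    "<book><title>A</title><year>1999</year></book>"
    "<book><title>B</title><year>2001</year></book>"
    "</lib>"
)
root = ET.fromstring(doc)
rules = derive_rules(root)
ids, texts = compress(root, rules)
restored = decompress(ids, texts, rules)
```

In this toy scheme each repeated `book`, `title`, and `year` tag is written once in the rule table and afterwards referenced by a small integer, which is the kind of tag replication the abstract targets. A practical system would also have to handle attributes, mixed content, and a compact binary encoding of the ids and text.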

 


 
