The University of Sheffield
School of Computer Science

Manh Tri Nguyen Undergraduate Dissertation 2017/18

The XML Olympics: Benchmarking Java XML Toolkits

Supervised by A. Simons

Abstract

XML is a hierarchical data format that is widely used to exchange data between web services. This project aims to benchmark different Java toolkits for processing XML and to produce a research paper from the results. The project is styled like the Olympic Games: each candidate toolkit competes in one or more events, depending on the functionality it offers.

The Java tools to be compared include the JAXP package built into Java, which comprises DOM, SAX and StAX; a number of tree-based, streaming and marshalling/unmarshalling tools; and Sheffield's own contender, JAST (Java Syntax Trees), which supports all of these processing styles. The tools will compete in events such as DOM reading and writing speed, the volume of data held in memory, marshalling and unmarshalling speed, XPath coverage and search speed, and, for usability, the degree of configuration required and ease of use.

All software will be developed in Java, using the Java Microbenchmark Harness (JMH), a high-level micro-benchmarking tool.
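
As an illustration of the three JAXP processing styles named above, the following minimal sketch counts the elements of one small document using DOM (tree-based), SAX (push streaming) and StAX (pull streaming). It uses only standard JDK classes; the class name JaxpStyles, the method names and the sample document are assumptions made for this example and are not taken from the project's benchmark code. No timing is performed here, since the measurements themselves are meant to be taken with JMH.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: one task solved in three JAXP styles.
class JaxpStyles {

    // DOM: build the whole tree in memory, then query it.
    static int countWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // "*" matches every element in the document, including the root.
        return doc.getElementsByTagName("*").getLength();
    }

    // SAX: the parser pushes events to a handler; no tree is kept.
    static int countWithSax(String xml) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attrs) {
                        count[0]++;
                    }
                });
        return count[0];
    }

    // StAX: the application pulls events from the parser one at a time.
    static int countWithStax(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int count = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                count++;
            }
        }
        reader.close();
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch: one root, two children.
        String xml = "<games><event name=\"reading\"/>"
                   + "<event name=\"writing\"/></games>";
        System.out.println("DOM:  " + countWithDom(xml));
        System.out.println("SAX:  " + countWithSax(xml));
        System.out.println("StAX: " + countWithStax(xml));
    }
}
```

All three methods return the same count for the same input, which is the property that lets one event compare toolkits fairly: the task is fixed and only the processing style varies.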