Parsing XML with SAX (in J2ME)
On the previous pages, we described XML parsing using XPath.
XPath, and the Document Object Model (DOM) on which the XPath parser relies,
provide random access to elements of the document.
XPath also takes a lot of
the work out of retrieving data from the document: since a key feature of XML is
its hierarchical structure, it makes sense to use an API that gives us a simple means
of traversing that hierarchy. Random access is possible because an
object representation of the entire document is first created in memory. Even if this
representation takes up several megabytes (or even tens or hundreds of
megabytes for a moderately large document), that doesn't usually matter on modern server and desktop machines.
An alternative means of parsing XML is via SAX, the Simple API for XML.
Unlike DOM/XPath, this is a serial parse, in which the document is traversed
exactly once, and our code is notified of events during the parse (e.g. the start
of a particular element, text within an element, etc.). SAX is a fundamentally inconvenient
way of parsing XML and is rarely worth bothering with on machines with plentiful
memory and processor resources. But it has two redeeming features:
- it is efficient and has little memory overhead, which is handy on devices with limited
resources, such as many mobile devices;
- it is available as standard on some recent J2ME devices (specifically
those that implement JSR-172 or say they implement web services APIs).
If you're feeling masochistic, SAX is also available in the standard JDK,
but our example below will focus on using SAX with J2ME. (As mentioned, if you have
the standard JDK available, there are far more convenient ways of parsing XML!)
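To make the idea of a serial, event-driven parse concrete, here is a minimal sketch that simply records each event in the order the parser reports it. The class name SaxEventTrace, the TraceHandler class and the sample document are made up for illustration. The code sticks to calls that exist in both JSR-172 and the standard JDK, and the main() method is only there so that it can be tried on a desktop JVM.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventTrace {
    // Hypothetical handler: appends one entry to the log per parser callback.
    static class TraceHandler extends DefaultHandler {
        final StringBuffer log = new StringBuffer();

        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            log.append("start:").append(qName).append(' ');
        }

        public void characters(char[] ch, int start, int length) {
            // Text may arrive in more than one chunk; whitespace-only chunks are skipped here.
            String text = new String(ch, start, length).trim();
            if (text.length() > 0) {
                log.append("text:").append(text).append(' ');
            }
        }

        public void endElement(String uri, String localName, String qName) {
            log.append("end:").append(qName).append(' ');
        }
    }

    // Parses the given XML once, front to back, and returns the recorded events.
    static String trace(String xml) {
        try {
            TraceHandler handler = new TraceHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")), handler);
            return handler.log.toString();
        } catch (Exception e) {
            throw new RuntimeException(e.toString());
        }
    }

    public static void main(String[] args) {
        System.out.println(trace("<msg><to>Ann</to></msg>"));
    }
}
```

The point to notice is that the handler never sees the document as a whole: start events arrive before the text they enclose, end events arrive in reverse order of nesting, and anything we want to remember between events has to be stored by our own code.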
How to use SAX: overview
The first step is to make sure that you have included relevant libraries in
your build path. If you're compiling with Sun's Wireless Toolkit, you'll need to
make sure that your Project Settings have JAXP XML Parser (JSR 172) ticked.
In practice, many applications pull the XML document down from
the web, so you'll probably need CLDC 1.1 as well (this provides the
Connector class for opening an HTTP or HTTPS connection). If you're
compiling/editing in an external IDE, you'll need to include either j2me-xmlrpc.jar
or (naughtily) the standard JDK libraries (which also include the SAX libraries).
Then, the steps for programming with SAX are generally as follows:
- import org.xml.sax.* and appropriate subpackages, plus javax.xml.parsers.* for the parser classes;
- create a subclass of org.xml.sax.helpers.DefaultHandler, overriding
methods such as startElement() and endElement() to be notified
when the parser reaches interesting points in the document;
- get an instance of SAXParser from SAXParserFactory;
- call parse() on that parser, passing in an input stream to
the document and your handler.
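The steps above can be sketched roughly as follows. This is an illustrative outline rather than a finished MIDlet: the class name SaxSteps, the ItemCounter handler and the <item> element it counts are all invented for the example, and the document is read from a byte array so that the code can be run on a desktop JVM. On a device, the InputStream would more likely come from an HTTP connection.

```java
// Step 1: import the SAX classes and the JAXP parser classes.
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxSteps {
    // Step 2: subclass DefaultHandler, overriding only the callbacks we care about.
    // This hypothetical handler counts how many <item> elements start.
    static class ItemCounter extends DefaultHandler {
        int items;

        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if ("item".equals(qName)) {
                items++;
            }
        }
    }

    static int countItems(InputStream in) {
        try {
            // Step 3: get an instance of SAXParser from the factory.
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // Step 4: parse, passing in the input stream and our handler.
            ItemCounter handler = new ItemCounter();
            parser.parse(in, handler);
            return handler.items;
        } catch (Exception e) {
            throw new RuntimeException(e.toString());
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<feed><item/><item/><other/></feed>";
        // Prints 2: one for each <item> element in the sample document.
        System.out.println(countItems(new ByteArrayInputStream(xml.getBytes("UTF-8"))));
    }
}
```

Once parse() returns, the whole document has been read, so any results we need must already have been collected in the handler's fields.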
On the next page, we look specifically at how to create a
SAX handler by overriding