XML parsing in Java with XPath

In case you're not too familiar with XML, we'll start with a brief overview of a typical, simple XML document. We won't get bogged down in the less common features of XML, but will concentrate on the ones you'll meet most often. Then, as a basis for parsing the document, we'll use XPath, which is essentially a scheme for referring to parts of an XML document as though it were a file system.

XML overview

Here's a typical, simple, small XML document:

<?xml version="1.0" encoding="iso-8859-1"?>
<configuration>
  <maxConnections>100</maxConnections>
  <minConnections ignorable="true">10</minConnections>
  <extraParams/>
</configuration>

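If you're familiar with HTML, various features of XML will appear familiar. The document starts with an XML declaration giving the version and character encoding. The rest consists of nested elements (or nodes) delimited by opening and closing tags, such as <maxConnections> and </maxConnections>. An element can contain text, as with the value 100, or other elements, as configuration does. An opening tag can carry attributes, as with ignorable="true" on minConnections, and an element with no content can be written in the shorthand form <extraParams/>.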

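Compared to typical HTML, XML has a slightly "tighter" format: every opening tag must have a matching closing tag (or use the empty-element shorthand), elements must be properly nested, tag names are case-sensitive, attribute values must always be quoted, and the document must have a single root element.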
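One way to see this strictness in practice is to hand a few fragments to the XML parser that ships with the JDK and check which ones it accepts. The following is only a minimal sketch: the class name WellFormedCheck and the helper isWellFormed are made up for this illustration, and the fragments are invented examples rather than part of the configuration document above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    /** Hypothetical helper: true if the string parses as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input as not well-formed
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected here
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Properly closed and nested: accepted
        System.out.println(isWellFormed("<p>closed <b>properly</b></p>"));
        // Missing closing tag, which a browser would tolerate in HTML: rejected
        System.out.println(isWellFormed("<p>never closed"));
        // Unquoted attribute value: rejected
        System.out.println(isWellFormed("<p class=note>unquoted</p>"));
        // Closing tag differs in case from the opening tag: rejected
        System.out.println(isWellFormed("<P>mixed case</p>"));
    }
}
```

Run as written, this should print true for the first fragment and false for the other three.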

XML documents can have other features that we won't get bogged down in here, such as namespaces and document type definitions (essentially a means for a document to be validated when it is read).

XPath

As mentioned, XPath is a scheme for accessing parts of an XML document as though it were a file system. For relatively short documents where you need "random access" to the contents, it's usually the most practical means of parsing the document. (Unfortunately, the XPath implementation in current releases of Java has a serious weakness: it performs catastrophically on large documents.)

As an example, the following XPath expression refers to the text of the maxConnections node (with the value "100" in this case):

/configuration/maxConnections/text()

while the following refers to the value of the ignorable attribute of the minConnections node:

/configuration/minConnections/@ignorable
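As a brief preview, here is a minimal sketch of how these two expressions could be evaluated with the standard javax.xml.xpath API. The class name XPathPreview, the constant CONFIG_XML and the helper evaluate are names invented for this illustration, and the document is embedded as a string purely to keep the example self-contained.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathPreview {

    // The sample configuration document from the overview, as a string
    static final String CONFIG_XML =
            "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>\n"
            + "<configuration>\n"
            + "  <maxConnections>100</maxConnections>\n"
            + "  <minConnections ignorable=\"true\">10</minConnections>\n"
            + "  <extraParams/>\n"
            + "</configuration>\n";

    /** Hypothetical helper: evaluates an XPath expression to a string. */
    static String evaluate(String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // Passing an InputSource parses the document afresh on every call
            return xpath.evaluate(expression,
                    new InputSource(new StringReader(CONFIG_XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Text content of the maxConnections node
        System.out.println(evaluate("/configuration/maxConnections/text()"));    // prints 100
        // Value of the ignorable attribute on minConnections
        System.out.println(evaluate("/configuration/minConnections/@ignorable")); // prints true
    }
}
```

Because this form re-parses the XML for each expression, it is only sensible for one-off lookups; when evaluating several expressions, you would normally parse the document once and evaluate each expression against the resulting tree.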

Next: evaluating XPath expressions in Java

Now that we've seen the overall principles of XML and XPath, the next page looks at the actual code for evaluating XPath expressions in Java.


Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.