Yet another Java XML API: xmlpull (2/2) - exploring XML

Yet another Java XML API: xmlpull

The next funny thing is that XmlPullParser is responsible for everything in the parsing process except providing information about a parsing problem; that is kindly left to XmlPullParserException:

XmlPullParser provides the functions next() and nextToken() to iterate over the XML data. next() only returns tags and text and skips everything else, whereas nextToken() also faithfully reports processing instructions, comments, entity references, and the document type declaration.
XmlPullParser provides access to the current XML token through various getter functions: for a token's name, namespace, prefix, text, and information on attributes. getEventType() returns one of a number of constants detailing which kind of XML token was found. However, not every getter is meaningful for every event type: attribute information, for instance, is only available while the parser is positioned on a start tag.

The reason that the design of xmlpull ignores many principles used in the Java Standard and Enterprise Editions is simple: this API is designed for J2ME, the Micro Edition of Java. This means that:

It's important to pool resources through the use of factories. XmlPullParserFactory can do clever things behind the scenes to manage multiple XmlPullParsers simultaneously.
It's also important to instantiate as few objects as possible. A hypothetical XMLToken class would have many instances created and waiting for garbage collection in one parsing process.
Borrowing from well-known Java concepts like Readers and Tokenizers is less important because they are not part of J2ME.
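
To make the second point concrete, here is a minimal sketch that contrasts the two designs on a toy problem (splitting a string into words) rather than on real XML. Every name in it (CursorSketch, Token, WordCursor, tokenize, countWords) is invented for this illustration and is not part of xmlpull; it also uses Java SE collections for brevity, which a J2ME device would not offer.

```java
import java.util.ArrayList;
import java.util.List;

class CursorSketch {
    static final int END = 0;
    static final int WORD = 1;

    /** Hypothetical token-object design: each event becomes a new object. */
    static final class Token {
        final int type;
        final String text;

        Token(int type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    /** Cursor design: a single object whose getters describe the current event. */
    static final class WordCursor {
        private final String input;
        private int start = 0; // start of the current word
        private int end = 0;   // end of the current word, also the scan position

        WordCursor(String input) {
            this.input = input;
        }

        /** Advances to the next word and reports the event type as an int constant. */
        int next() {
            start = end;
            while (start < input.length() && input.charAt(start) == ' ') {
                start++;
            }
            end = start;
            while (end < input.length() && input.charAt(end) != ' ') {
                end++;
            }
            return start == end ? END : WORD;
        }

        /** Only meaningful while the cursor is positioned on a WORD. */
        String getText() {
            return input.substring(start, end);
        }
    }

    /** Token style: the caller ends up holding one Token object per word. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        WordCursor cursor = new WordCursor(input);
        while (cursor.next() == WORD) {
            tokens.add(new Token(WORD, cursor.getText()));
        }
        return tokens;
    }

    /** Cursor style: no per-event objects, only the cursor itself. */
    static int countWords(String input) {
        WordCursor cursor = new WordCursor(input);
        int words = 0;
        while (cursor.next() == WORD) {
            words++;
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("pull beats push").size()); // prints 3
        System.out.println(countWords("pull beats push"));      // prints 3
    }
}
```

Both methods see the same three words, but tokenize() leaves three Token objects behind for the garbage collector, while countWords() only ever allocates the cursor. On a memory-constrained device that difference is presumably what tipped the design towards int constants and getters.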

So this madness has a scheme behind it.

Pulling an Example

After so much destructive criticism let's focus on a constructive example. We want to determine the maximum nesting depth of elements in an XML document. To set up the parser, enter:

XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
XmlPullParser parser = factory.newPullParser();
InputStream in = new FileInputStream(new File("blah.xml"));
parser.setInput(in, "ISO-8859-1"); // encoding can be set manually, use null for auto-detect

Ready, set, pull:

int maxdepth = 0;
int depth = 0;
for (int type = parser.next(); type != XmlPullParser.END_DOCUMENT; type = parser.next()) {
  if (type == XmlPullParser.START_TAG) {
    depth++;
    if (depth > maxdepth) {
      maxdepth = depth;
    }
  }
  else if (type == XmlPullParser.END_TAG) {
    depth--;
  }
}

After the parser has signalled END_DOCUMENT, maxdepth contains the maximal nesting depth of XML nodes. Much less hassle than registering a callback interface with SAX's push model, or traversing document trees in DOM. In case you forgot the details, you can easily review SAX and DOM.
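
For comparison, here is a rough sketch of the same depth calculation using the SAX and DOM parsers that ship with Java SE. The class and method names (DepthComparison, DepthHandler, saxDepth, domDepth) are made up for this illustration, and the XML is passed in as a string instead of a file to keep the sketch self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DepthComparison {

    /** SAX: the parser pushes events into a handler that must keep its own state. */
    static class DepthHandler extends DefaultHandler {
        int depth = 0;
        int maxDepth = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            depth++;
            maxDepth = Math.max(maxDepth, depth);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            depth--;
        }
    }

    static int saxDepth(String xml) {
        try {
            DepthHandler handler = new DepthHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.maxDepth;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** DOM: build the whole tree in memory first, then walk it recursively. */
    static int domDepth(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return elementDepth(root);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    private static int elementDepth(Node element) {
        int deepestChild = 0;
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                deepestChild = Math.max(deepestChild, elementDepth(child));
            }
        }
        return 1 + deepestChild;
    }

    public static void main(String[] args) {
        String xml = "<a><b/><c><d/></c></a>";
        System.out.println(saxDepth(xml)); // prints 3
        System.out.println(domDepth(xml)); // prints 3
    }
}
```

The counting logic is the same in all three styles. What changes is where it lives: in a separate handler class for SAX, in a recursive walk over a fully built tree for DOM, and in a plain loop for the pull parser.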

Conclusion

An Application Programming Interface is only as good as its implementations. Two free packages exist:

kXML from the Enhydra folks
XPP3/MXP1 by Aleksander Slominski

Whether developers choose these products over established SAX and DOM parsers in Micro-Java projects will decide the fate of this approach to XML parsing.

Produced by Michael Claßen

URL: https://www.webreference.com/xml/column76/2.html
Created: Mar 03, 2003
Revised: Mar 03, 2003
