dtddoc step 4: Generating HTML (1/2) - exploring XML

dtddoc step 4: Generating HTML

In this last installment of our mini-series on documenting DTDs, we finish our dtddoc tool by generating the HTML pages from the parsed DTD and its annotations. This is the fourth and final step in its creation:

1. Parsing a DTD
2. Creating the internal data structure
3. Adding element and attribute descriptions
4. Generating HTML

Generating HTML

Now that we have collected all the available information from a DTD and found a suitable format for annotating it, our last task is to pour that content into a nice format. While registering my project on SourceForge, I realized that a DTDDoc project already exists. Shame on me for not looking there in the first place, but the output of that software reveals weaknesses similar to those of dtd2html, which was introduced earlier:

dtd2html spreads information across too many pages.
DTDDoc packs all data on very few pages.
Hyperlinking is not used to its maximum benefit.
Neither of them follows the proven javadoc page structure.

After comparing the strengths and weaknesses of the existing solutions, I ended up with the following page structure:

One overview page for the introduction of the documented DTD.
One tree page detailing the hierarchical structure of a DTD's elements.
One index page listing all DTD tokens in alphabetical order.
One page per DTD element denoting the content model, all attributes, and parent elements.
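
To make the page structure above concrete, here is a minimal sketch of how the set of output files for one DTD could be planned. The class PagePlan and the file names are assumptions for illustration only: the three fixed names are borrowed from javadoc's conventions, and the element-NAME.html scheme is hypothetical. The actual dtddoc sources may name their files differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: the file names are illustrative, not necessarily those used by dtddoc.
class PagePlan {

    // Returns the output files for one DTD: three fixed pages plus one page per element.
    static List<String> fileNames(List<String> elementNames) {
        List<String> pages = new ArrayList<>();
        pages.add("overview-summary.html"); // introduction of the documented DTD
        pages.add("overview-tree.html");    // hierarchical structure of the elements
        pages.add("index-all.html");        // all DTD tokens in alphabetical order
        for (String name : elementNames) {
            pages.add(elementPage(name));   // content model, attributes, parent elements
        }
        return pages;
    }

    // Assumed naming scheme for an element's detail page.
    static String elementPage(String elementName) {
        return "element-" + elementName + ".html";
    }
}
```

With this layout the number of pages grows linearly with the number of elements, which sits between the many small pages of dtd2html and the few large pages of DTDDoc.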

Unlike javadoc, dtddoc does not use frames: a DTD is roughly equivalent to one Java package, and frame navigation is only really useful for navigating across many packages at the same time. DTDs are not interrelated in the same way, although that might change with XML Schema and RELAX NG.

I want to spare you the gory coding details; they are available as source code listings in the various language sections below. The whole package for all three languages is also available.

The Java implementation

The Java solution delegates the generation of each page to a specialized method, passing along a java.io.Writer object for the respective output file. Navigating the in-memory data structure for DTD elements proved cumbersome for our purposes, so I fell back on ORO regular expression matching to process an element's content model. Every element name is augmented with a hyperlink to its detail page. You can inspect the source code.
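
The linking step can be sketched roughly as follows. This is not the original listing: it swaps java.util.regex in for the ORO library, uses a simplified pattern for XML names, and assumes a hypothetical element-NAME.html naming scheme for the detail pages. The class and method names are also invented for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: java.util.regex stands in for ORO, names are illustrative.
class ContentModelLinker {

    // Simplified XML name; an optional leading '#' lets us recognize #PCDATA as a keyword.
    private static final Pattern NAME = Pattern.compile("#?[A-Za-z_:][A-Za-z0-9_:.-]*");

    // Wraps every element name in the content model in a link to its detail page.
    static String link(String contentModel) {
        Matcher m = NAME.matcher(contentModel);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String token = m.group();
            boolean keyword = token.startsWith("#")
                    || token.equals("EMPTY") || token.equals("ANY");
            String replacement = keyword
                    ? token
                    : "<a href=\"element-" + token + ".html\">" + token + "</a>";
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copy the text after the last match, e.g. "+)"
        return out.toString();
    }

    // A page method receives the Writer for its output file, as described above.
    static void writeContentModel(Writer out, String element, String model)
            throws IOException {
        out.write("<p>Content model of " + element + ": <code>"
                + link(model) + "</code></p>\n");
    }

    public static void main(String[] args) throws IOException {
        StringWriter page = new StringWriter(); // stands in for a FileWriter
        writeContentModel(page, "book", "(title, chapter+)");
        System.out.print(page);
    }
}
```

Occurrence indicators and separators such as +, *, | and the parentheses are left untouched, so the linked content model still reads like the DTD declaration it came from.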

On to PHP and Perl...

Produced by Michael Claßen

URL: https://www.webreference.com/xml/column68/index.html
Created: Nov 11, 2002
Revised: Nov 11, 2002
