dtddoc step 3: Element and attribute descriptions (1/3) - exploring XML

dtddoc step 3: Element and attribute descriptions

In this installment we look at ways to add extra information to the elements and attributes of a DTD. Our new tool dtddoc will then use this information to generate HTML documentation. This is step 3 of the 4 steps in its creation:

Parsing a DTD
Creating the internal data structure
Adding element and attribute descriptions
Generating HTML

Adding element and attribute descriptions

There are a number of ways to add extra information to the elements and attributes of a DTD. dtd2html reads a separate description file alongside the DTD, parses both, and combines the information they contain. This approach has a number of drawbacks:

A new file format needs to be invented and learned for the descriptions. dtd2html uses an XML file with processing instructions denoting the element that is being documented, and interspersed HTML for the documentation.
The DTD and its description file have to be versioned together, otherwise the two quickly drift out of sync.
Inconsistencies between the two files are hard to spot by eye.

The single advantage is that existing DTDs can be documented without touching their original content.

If we prefer an inline approach to documentation, two possibilities arise:

Define entities for documentation that are not used in the DTD itself.
Make use of comments.

Defining entities has the advantage that the DTD parser does some of the work for us: it parses them and presents them as part of its API. However, this pollutes the DTD with documentation-only content, and every parse of the DTD, including those that have nothing to do with generating documentation, incurs the overhead of processing these entity definitions.
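To make the entity approach concrete, here is a minimal Python sketch. It assumes a hypothetical naming convention in which an entity called doc.NAME carries the description of element NAME; neither this convention nor the sample DTD comes from dtddoc itself. Python's standard xml.parsers.expat module stands in for the DTD parser and reports each entity declaration through a handler.

```python
import xml.parsers.expat

# Sample document with an internal DTD subset. The "doc." prefix is a
# hypothetical convention for documentation-only entities.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY doc.note "A short message from one person to another.">
  <!ELEMENT note (body)>
  <!ENTITY doc.body "The text of the message.">
  <!ELEMENT body (#PCDATA)>
  <!ENTITY copy "(c) 2002">
]>
<note><body>hello</body></note>
"""

DOC_PREFIX = "doc."


def collect_doc_entities(xml_text):
    """Return {element name: description} taken from doc.* entities."""
    descriptions = {}

    def on_entity(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal general entities carry their replacement text in `value`.
        # Ordinary entities such as "copy" have to be filtered out by name.
        if value is not None and not is_parameter and name.startswith(DOC_PREFIX):
            descriptions[name[len(DOC_PREFIX):]] = value

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity
    parser.Parse(xml_text, True)
    return descriptions


descriptions = collect_doc_entities(SAMPLE)
for element, text in descriptions.items():
    print(element, "-", text)
```

The sketch also shows the drawback: the documentation entities sit next to real ones such as copy, and any other consumer of this DTD has to process them as well.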

Using specially formatted comments, as our inspiration javadoc does for Java code, seems like the cleaner solution. The DTD parser usually ignores comments; they only need to be examined for the documentation run. On the other hand, this means that the DTD parser must be able to optionally give access to the comments included in a DTD, which can be accomplished in several ways.

Reading DTD comments

Let's examine various DTD parsers for their suitability for our project. Bourret's DTD parser would need to be extended to hand back comments, whereas Wutka's parser returns DTDComment objects as part of the parsed data structure. Traversing the DTD tree once and collecting all the comments would give us all the necessary information.
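Such a traversal could look roughly like the following Python sketch. The Comment and ElementDecl classes are hypothetical stand-ins for the items a parser such as Wutka's would return, not its actual API. The rule that a comment documents the declaration immediately following it is an assumption borrowed from javadoc.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    """Hypothetical stand-in for a parsed DTD comment."""
    text: str


@dataclass
class ElementDecl:
    """Hypothetical stand-in for a parsed element declaration."""
    name: str


def collect_descriptions(items):
    """Walk the parsed items once, pairing each comment with the next element."""
    descriptions = {}
    pending = None  # most recent comment not yet attached to a declaration
    for item in items:
        if isinstance(item, Comment):
            pending = item.text.strip()
        elif isinstance(item, ElementDecl):
            if pending is not None:
                descriptions[item.name] = pending
            pending = None  # a comment documents at most one declaration
    return descriptions


# Hypothetical parse result for a small DTD, in document order.
items = [
    Comment(" A short message from one person to another. "),
    ElementDecl("note"),
    ElementDecl("to"),
    Comment(" The text of the message. "),
    ElementDecl("body"),
]

descriptions = collect_descriptions(items)
```

Here note and body receive a description, while to, which has no preceding comment, is left undocumented.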

The Perl SGML package takes a callback approach instead. DTDset_comment_callback($callback) registers the function $callback, which is then called whenever a comment declaration is read during DTDread_dtd. It is invoked as &$callback(*comment_text); where *comment_text is a pointer to the string containing all the text within the SGML comment declaration, excluding the open and close delimiters. In the callback function we could extract all the relevant data from the comments.
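For comparison, the same callback pattern can be sketched in Python. This is not the Perl SGML API: the standard xml.parsers.expat module is used instead, with its CommentHandler playing the role of the comment callback. The sample DTD and the rule that a comment describes the next element declaration are assumptions made for illustration.

```python
import xml.parsers.expat

# Sample document whose internal DTD subset carries descriptive comments.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!-- A short message from one person to another. -->
  <!ELEMENT note (body)>
  <!-- The text of the message. -->
  <!ELEMENT body (#PCDATA)>
]>
<note><body>hello</body></note>
"""


def read_comment_descriptions(xml_text):
    """Return {element name: description} gathered through parser callbacks."""
    descriptions = {}
    pending = None  # last comment seen, waiting for an element declaration

    def on_comment(text):
        # Called with the comment text, without the <!-- and --> delimiters.
        nonlocal pending
        pending = text.strip()

    def on_element_decl(name, model):
        # Attach the waiting comment, if any, to this declaration.
        nonlocal pending
        if pending is not None:
            descriptions[name] = pending
            pending = None

    parser = xml.parsers.expat.ParserCreate()
    parser.CommentHandler = on_comment
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(xml_text, True)
    return descriptions


descriptions = read_comment_descriptions(SAMPLE)
```

Unlike the tree traversal, nothing is stored beyond the last comment seen; the descriptions are assembled while the parser is still reading the DTD.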

The PHP parser for DTDs, like Wutka's Java parser, provides a DTDComment object as part of its internal data structure. The same method of traversing the tree and collecting the information from the comments should work here as well.


Produced by Michael Claßen

URL: https://www.webreference.com/xml/column67/index.html
Created: Oct 28, 2002
Revised: Oct 28, 2002
