dtddoc step 3: Element and attribute descriptions (1/3) - exploring XML
In this installment we look at ways to add extra information to the elements
and attributes of a DTD. Our new tool dtddoc will then use this extra information
to generate HTML documentation. This is step 3 of 4 in its creation:
- Parsing a DTD
- Creating the internal data structure
- Adding element and attribute descriptions
- Generating HTML
Adding element and attribute descriptions
There are a number of ways to add extra information to elements and attributes of a DTD.
dtd2html reads a separate description file alongside the DTD and then combines the information contained in both. This approach has a number of drawbacks:
- A new file format needs to be invented and learned for the descriptions. dtd2html uses an XML file with processing instructions denoting the element that is being documented, and interspersed HTML for the documentation.
- Versioning has to be applied to both the DTD and its description file, otherwise the two quickly get out of sync.
- Inconsistencies between the two files are hard to spot for the human eye.
The single advantage is that existing DTDs can be documented without touching their original content.
If we prefer an inline approach to documentation, two possibilities arise:
- Define entities for documentation that are not used in the DTD itself.
- Make use of comments.
Defining entities would have the advantage that the DTD parser does some work for us by parsing them and presenting them as part of its API, but this pollutes the DTD's definition with documentation-only content. Every parse of the DTD that is not meant to produce documentation would still pay the overhead of processing these entity definitions.
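To make the entity idea concrete, here is a minimal sketch in Python. The `doc.` prefix on entity names is a hypothetical convention invented for this illustration, and a regular expression stands in for a real DTD parser, which would normally expose the entities through its own API.

```python
import re

# A small DTD in which documentation lives in otherwise unused general entities.
# The "doc." naming prefix is an assumption made for this example only.
DTD_TEXT = '''
<!ENTITY doc.recipe "A single recipe with a title and ingredients.">
<!ENTITY doc.title "The human-readable name of the recipe.">
<!ELEMENT recipe (title, ingredient+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT ingredient (#PCDATA)>
'''

# Matches general entity declarations with a double-quoted literal value.
# Parameter entities ("<!ENTITY % name ...>") are deliberately not matched.
ENTITY_RE = re.compile(r'<!ENTITY\s+([^\s%]+)\s+"([^"]*)"\s*>')


def documentation_entities(dtd_text, prefix="doc."):
    """Return {element name: description} for entities whose name starts with prefix."""
    docs = {}
    for name, value in ENTITY_RE.findall(dtd_text):
        if name.startswith(prefix):
            docs[name[len(prefix):]] = value
    return docs


docs = documentation_entities(DTD_TEXT)
```

Note that the two documentation entities are never referenced anywhere in the DTD, which is exactly the pollution described above: every consumer of the DTD has to read past them.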
Using specially formatted comments, as our inspiration javadoc does for Java code, seems like the cleaner solution. Comments are usually ignored by the DTD parser; they only need to be examined during the documentation run. On the other hand, this means that a DTD parser must be able to optionally give access to the comments included in a DTD. This can be accomplished in multiple ways.
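What a "specially formatted" comment might look like can be sketched as follows. The format is purely an assumption for illustration: free text forms the description, and a hypothetical javadoc-like `@attr name text` tag documents one attribute. Nothing here is a fixed dtddoc syntax.

```python
def parse_doc_comment(comment_text):
    """Split a comment body into (description, {attribute name: description}).

    Assumed format: free text is the description, and each line of the form
    "@attr name text" documents one attribute. The "@attr" tag is hypothetical.
    """
    description_lines = []
    attributes = {}
    for raw_line in comment_text.splitlines():
        parts = raw_line.split(None, 2)
        if not parts:
            continue  # skip blank lines
        if parts[0] == "@attr":
            if len(parts) >= 2:
                attributes[parts[1]] = parts[2].strip() if len(parts) > 2 else ""
            # a bare "@attr" without a name is ignored
        else:
            description_lines.append(raw_line.strip())
    return " ".join(description_lines), attributes


# The text between "<!--" and "-->" of a comment preceding an element declaration.
COMMENT = """
  A single recipe with a title
  and a list of ingredients.
  @attr servings Number of people the recipe serves.
  @attr id Unique identifier of the recipe.
"""

description, attributes = parse_doc_comment(COMMENT)
```

Keeping the tag syntax this small means an ordinary DTD parser still sees nothing but a comment, so the DTD stays valid for every other consumer.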
Reading DTD comments
Let's examine various DTD parsers for suitability in our project.
While Bourret's DTD parser would need to be extended to hand back comments, Wutka's parser returns DTDComment objects as part of the parsed data structure. Traversing the DTD tree once and collecting all the comments would give us all the necessary information.
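The single traversal could look roughly like the sketch below. The `Comment` and `Element` classes are hypothetical stand-ins for whatever item objects a parser hands back; they are not the classes of Wutka's library. The rule that a comment documents the element declaration following it is likewise an assumption.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for parsed DTD items, in document order.
@dataclass
class Comment:
    text: str


@dataclass
class Element:
    name: str
    content_model: str


def collect_descriptions(items):
    """Walk the parsed items once, attaching each comment to the next element."""
    descriptions = {}
    pending = None  # most recent comment not yet assigned to an element
    for item in items:
        if isinstance(item, Comment):
            pending = item.text.strip()
        elif isinstance(item, Element):
            if pending is not None:
                descriptions[item.name] = pending
            pending = None  # a comment documents at most one element
    return descriptions


items = [
    Comment(" A single recipe. "),
    Element("recipe", "(title, ingredient+)"),
    Element("ingredient", "(#PCDATA)"),
    Comment(" The name of the recipe. "),
    Element("title", "(#PCDATA)"),
]
descriptions = collect_descriptions(items)
```

Elements without a preceding comment, such as `ingredient` here, simply end up without a description, which a documentation generator can report or render as empty.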
The Perl SGML package offers a callback function for comments. DTDset_comment_callback($callback) sets the function, $callback, to be called when a comment declaration is read during DTDread_dtd. $callback is called as &$callback(*comment_text), where *comment_text is a pointer to the string containing all the text within the SGML comment declaration (excluding the open and close delimiters). In the callback function we could extract all the relevant data from the comments.
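The callback style can be illustrated with a small Python analogue. This is not the Perl SGML package's API: `read_dtd` and `remember_comment` are hypothetical names, and a regular expression takes the place of a real DTD reader. As in the Perl interface, the callback receives the comment text without the open and close delimiters.

```python
import re

# Non-greedy match of everything between the comment delimiters.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def read_dtd(dtd_text, comment_callback=None):
    """Scan dtd_text and call comment_callback(text) for every comment found.

    Hypothetical reader for illustration; a real one would also parse the
    declarations. Without a callback, comments are skipped silently.
    """
    for match in COMMENT_RE.finditer(dtd_text):
        if comment_callback is not None:
            comment_callback(match.group(1))


collected = []


def remember_comment(comment_text):
    """Callback that stores each comment for the later documentation run."""
    collected.append(comment_text.strip())


read_dtd('''
<!-- A single recipe. -->
<!ELEMENT recipe (title)>
<!-- The name of the recipe. -->
<!ELEMENT title (#PCDATA)>
''', remember_comment)
```

Because the callback is optional, a normal parse pays nothing for the documentation feature; only the documentation run registers a function.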
The PHP parser for DTDs, like Wutka's Java parser, has a DTDComment object as part of its internal data structure. The same method of traversing the tree and collecting all the information from the comments should also succeed here.
Produced by Michael Claßen
URL: https://www.webreference.com/xml/column67/index.html
Created: Oct 28, 2002
Revised: Oct 28, 2002