dtddoc step 1: Parsing a DTD (1/2) - exploring XML
Now that we have looked at what is, to my knowledge, the only tool for documenting DTDs, let's get started on our own development effort. It breaks down into the following steps:
- Parsing a DTD
- Creating the internal data structure
- Adding element and attribute descriptions
- Generating HTML
We will look at the first step today and leave the others to later installments of this column.
Parsing a DTD
A DTD parser reads the textual representation of a DTD and transforms it into an in-memory representation that can be manipulated further in the programming language at hand. Rather than writing a DTD parser ourselves, we can use one of these software packages available on the Web:
- Programmed in Java:
- Developed in Perl:
  - DTD Parser in the perlSGML library
- Written in PHP:
  - DTD Parser in phpxmlclasses
All of these parsers can be called on an input stream in a straightforward way and return an internal data structure.
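To make "text in, data structure out" concrete, here is a minimal sketch of the idea in Java. It is not one of the parsers listed above: the class name ToyDtdParser and its method parseElements are hypothetical, and it only recognizes simple element declarations with a regular expression. A real DTD parser also has to handle attribute lists, comments, conditional sections and parameter entities, which is exactly why we prefer an existing package.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: not one of the parsers listed above.
class ToyDtdParser {
    // Matches simple declarations such as <!ELEMENT note (to, body)>
    private static final Pattern ELEMENT_DECL =
        Pattern.compile("<!ELEMENT\\s+(\\S+)\\s+([^>]+?)\\s*>");

    // Returns a map from element name to content model, in declaration order.
    static Map<String, String> parseElements(String dtdText) {
        Map<String, String> elements = new LinkedHashMap<>();
        Matcher m = ELEMENT_DECL.matcher(dtdText);
        while (m.find()) {
            elements.put(m.group(1), m.group(2));
        }
        return elements;
    }

    public static void main(String[] args) {
        String dtd = "<!ELEMENT note (to, body)>\n"
                   + "<!ELEMENT to (#PCDATA)>\n"
                   + "<!ELEMENT body (#PCDATA)>\n";
        // Print each element name with its content model.
        parseElements(dtd).forEach(
            (name, model) -> System.out.println(name + " -> " + model));
    }
}
```

The map plays the role of the internal data structure here. The packages listed above return much richer objects, but the calling pattern is the same.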
Bourret's DTD parser
With Bourret's parser you simply create a DTDParser and call either parseXMLDocument() or parseExternalSubset(), depending on whether the DTD is embedded in the XML document or external to it:
    import org.xml.sax.InputSource;
    import org.xmlmiddleware.schemas.dtds.DTD;
    import org.xmlmiddleware.schemas.dtds.DTDParser;

    // Parse a standalone DTD file, i.e. an external subset
    DTDParser parser = new DTDParser();
    InputSource is = new InputSource("html4.dtd");
    DTD dtd = parser.parseExternalSubset(is, null);
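For readers unsure what an "embedded" DTD looks like, the following sketch may help. It does not use Bourret's parser. Instead it relies on the SAX DeclHandler interface that ships with the JDK, which reports declarations as the XML parser encounters them. The class name InternalSubsetDemo, the method declaredElements and the sample note document are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DeclHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration using the JDK's SAX API, not Bourret's DTDParser.
class InternalSubsetDemo {
    // Collects the names of the element types declared in the document's DTD.
    static List<String> declaredElements(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        DeclHandler decls = new DeclHandler() {
            public void elementDecl(String name, String model) {
                names.add(name);
            }
            // The remaining callbacks are not needed for this sketch.
            public void attributeDecl(String eName, String aName, String type,
                                      String mode, String value) { }
            public void internalEntityDecl(String name, String value) { }
            public void externalEntityDecl(String name, String publicId,
                                           String systemId) { }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // Standard SAX2 property for registering a DeclHandler.
        parser.setProperty(
            "http://xml.org/sax/properties/declaration-handler", decls);
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
        return names;
    }

    public static void main(String[] args) throws Exception {
        // The DTD sits inside the DOCTYPE declaration (the internal subset).
        String xml = "<!DOCTYPE note [\n"
                   + "  <!ELEMENT note (to, body)>\n"
                   + "  <!ELEMENT to (#PCDATA)>\n"
                   + "  <!ELEMENT body (#PCDATA)>\n"
                   + "]>\n"
                   + "<note><to>Reader</to><body>Hello</body></note>";
        System.out.println(declaredElements(xml));
    }
}
```

An external DTD, by contrast, lives in its own file such as html4.dtd and contains only the declarations, without a surrounding DOCTYPE or document element.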
Parsing DTDs with the other tools...
Produced by Michael Claßen
URL: https://www.webreference.com/xml/column65/index.html
Created: Sep 30, 2002
Revised: Sep 30, 2002