dtddoc step 2: Internal data structures (1/2) - exploring XML
After parsing XML DTDs with four different software packages in our last installment, we will now look at the data structures these parsers produce in Java, Perl, and PHP. This is step two of four in our creation of dtddoc:
- Parsing a DTD
- Creating the internal data structure
- Adding element and attribute descriptions
- Generating HTML
Creating the internal data structure
We used the following parsers:
- Bourret's DTD Parser in Java
- Wutka's DTD Parser, also in Java
- Perl DTD Parser in the perlSGML library
- PHP DTD Parser in phpxmlclasses
As demonstrated before, each of these parsers can be called on an input stream in a straightforward way and returns an internal data structure. Let's examine those data structures in turn.
Bourret's DTD parser
Calling DTDParser's parseXMLDocument() or parseExternalSubset() method returns a DTD object for further inspection. The DTD object holds hashtables of the entities and element types of the parsed DTD. An ElementType has various members that carry interesting information:
member | value |
---|---|
attributes | A Hashtable of attributes for the respective element type. |
children | A Hashtable of child ElementTypes. |
content | A Group representing the content model of the element type. |
contentType | The type of content, one of: CONTENT_ANY (any content), CONTENT_ELEMENT (element content), CONTENT_EMPTY (empty content), CONTENT_MIXED ("mixed" content), CONTENT_PCDATA (PCDATA-only content), CONTENT_UNKNOWN (unknown content). |
name | The XMLName of the element type. |
parents | A Hashtable of parent ElementTypes. |
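To make the shape of this structure concrete, here is a minimal sketch that mirrors the members listed in the table. The classes DtdModelSketch, Dtd, and ElementType below are hypothetical stand-ins written for illustration and are not the classes shipped with Bourret's parser. A plain String replaces XMLName and Group, LinkedHashMap replaces Hashtable to keep the output order predictable, and the sample book DTD is invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins that mirror the members in the table above.
// They are NOT the classes that ship with Bourret's DTD parser.
class DtdModelSketch {

    enum ContentType { ANY, ELEMENT, EMPTY, MIXED, PCDATA, UNKNOWN }

    static class ElementType {
        final String name;                                   // stands in for XMLName
        ContentType contentType = ContentType.UNKNOWN;
        String content = "";                                 // stands in for the Group content model
        final Map<String, String> attributes = new LinkedHashMap<>();      // attribute name -> declared type
        final Map<String, ElementType> children = new LinkedHashMap<>();
        final Map<String, ElementType> parents = new LinkedHashMap<>();

        ElementType(String name) { this.name = name; }
    }

    static class Dtd {
        final Map<String, ElementType> elementTypes = new LinkedHashMap<>();

        // Returns the element type with this name, creating it on first use.
        ElementType elementType(String name) {
            return elementTypes.computeIfAbsent(name, ElementType::new);
        }

        // Records a parent/child relationship in both directions.
        void link(String parent, String child) {
            ElementType p = elementType(parent);
            ElementType c = elementType(child);
            p.children.put(c.name, c);
            c.parents.put(p.name, p);
        }
    }

    // Builds the model for an invented DTD:
    //   <!ELEMENT book (title, chapter+)>   <!ATTLIST book isbn CDATA #IMPLIED>
    //   <!ELEMENT chapter (title, para+)>
    //   <!ELEMENT title (#PCDATA)>          <!ELEMENT para (#PCDATA)>
    static Dtd sampleDtd() {
        Dtd dtd = new Dtd();
        ElementType book = dtd.elementType("book");
        book.contentType = ContentType.ELEMENT;
        book.content = "(title, chapter+)";
        book.attributes.put("isbn", "CDATA");
        dtd.link("book", "title");
        dtd.link("book", "chapter");

        ElementType chapter = dtd.elementType("chapter");
        chapter.contentType = ContentType.ELEMENT;
        chapter.content = "(title, para+)";
        dtd.link("chapter", "title");
        dtd.link("chapter", "para");

        dtd.elementType("title").contentType = ContentType.PCDATA;
        dtd.elementType("para").contentType = ContentType.PCDATA;
        return dtd;
    }

    public static void main(String[] args) {
        // Print one summary line per element type.
        for (ElementType type : sampleDtd().elementTypes.values()) {
            System.out.println(type.name + " [" + type.contentType + "]"
                    + " parents=" + type.parents.keySet()
                    + " children=" + type.children.keySet());
        }
    }
}
```

Note that title ends up with two parents, book and chapter, which is why parents is a collection rather than a single reference.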
With this information, enumerating all element types and outputting the above information in HTML should be a snap. The content model's hierarchy can be easily navigated through the parents and children members of each element type.
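The enumeration and navigation just described could look roughly like the following sketch. It assumes a simplified, hypothetical ElementType stand-in with only name, parents, and children, rather than the parser's real class, and an invented sample hierarchy. The method names ancestors() and toHtml() are likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DtdHtmlSketch {

    // Hypothetical stand-in carrying only the members this walk needs.
    static class ElementType {
        final String name;
        final Map<String, ElementType> children = new LinkedHashMap<>();
        final Map<String, ElementType> parents = new LinkedHashMap<>();

        ElementType(String name) { this.name = name; }

        // Links both directions so parents and children stay consistent.
        void addChild(ElementType child) {
            children.put(child.name, child);
            child.parents.put(name, this);
        }
    }

    // Collects the names of all direct and indirect parents of a type.
    static Set<String> ancestors(ElementType type) {
        Set<String> seen = new LinkedHashSet<>();
        collect(type, seen);
        return seen;
    }

    // The seen set doubles as a guard against recursive content models.
    private static void collect(ElementType type, Set<String> seen) {
        for (ElementType parent : type.parents.values()) {
            if (seen.add(parent.name)) {
                collect(parent, seen);
            }
        }
    }

    // Emits one HTML table row per element type.
    static String toHtml(Iterable<ElementType> types) {
        StringBuilder html = new StringBuilder("<table>\n");
        html.append("<tr><th>Element</th><th>Parents</th><th>Children</th></tr>\n");
        for (ElementType t : types) {
            html.append("<tr><td>").append(t.name).append("</td><td>")
                .append(String.join(", ", t.parents.keySet())).append("</td><td>")
                .append(String.join(", ", t.children.keySet())).append("</td></tr>\n");
        }
        return html.append("</table>\n").toString();
    }

    // Invented hierarchy: book > chapter > section > para, with nested sections.
    static List<ElementType> sampleTypes() {
        ElementType book = new ElementType("book");
        ElementType chapter = new ElementType("chapter");
        ElementType section = new ElementType("section");
        ElementType para = new ElementType("para");
        book.addChild(chapter);
        chapter.addChild(section);
        section.addChild(section);   // a section may contain sections
        section.addChild(para);

        List<ElementType> types = new ArrayList<>();
        types.add(book);
        types.add(chapter);
        types.add(section);
        types.add(para);
        return types;
    }

    public static void main(String[] args) {
        List<ElementType> types = sampleTypes();
        System.out.print(toHtml(types));
        System.out.println("Ancestors of para: " + ancestors(types.get(3)));
    }
}
```

The visited set matters because DTDs often allow an element to contain itself, directly or indirectly, and a naive recursive walk over parents would then never terminate.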
On to inspecting the other tools' data structures...
Produced by Michael Claßen
URL: https://www.webreference.com/xml/column66/index.html
Created: Oct 14, 2002
Revised: Oct 14, 2002