dtddoc step 2: Internal data structures (2/2) - exploring XML
Wutka's DTD parser
Wutka's DTD class looks similar to Bourret's, with the addition of a distinguished member identifying the root element. The individual DTDElements, however, have only three members:
member | value |
---|---|
attributes | The element's attributes. |
content | The element's content. |
name | The element's name. |
This restricts navigation to a top-down approach, descending from the root element. For our documentation generation project we would have to derive the parents of a particular element ourselves.
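The parent derivation mentioned above can be sketched in a few lines. The example is written in Python rather than Java, and the `Element` class is a hypothetical stand-in for a parsed element with `name`, `attributes`, and `content` members, not Wutka's actual API. As a simplifying assumption, the content model is flattened to a plain list of child element names.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for a parsed DTD element (not Wutka's API)."""
    name: str
    attributes: dict = field(default_factory=dict)
    # Assumption: the content model is flattened to child element names.
    content: list = field(default_factory=list)


def derive_parents(elements):
    """Invert the child lists: map each element name to its possible parents."""
    parents = {name: set() for name in elements}
    for parent in elements.values():
        for child in parent.content:
            # setdefault also covers children that are referenced but undeclared.
            parents.setdefault(child, set()).add(parent.name)
    return parents


# Illustrative DTD:
#   <!ELEMENT book (title, chapter+)>
#   <!ELEMENT chapter (title, para*)>
elements = {
    "book": Element("book", content=["title", "chapter"]),
    "chapter": Element("chapter", content=["title", "para"]),
    "title": Element("title"),
    "para": Element("para"),
}
parents = derive_parents(elements)
```

A single pass over all elements is enough, because every parent/child edge appears exactly once in some element's content.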
perlSGML's DTDParser
As mentioned earlier, Perl programs make less use of classes, in favor of a more functional programming style around the built-in hash and array data types. The following functions can be invoked on a DTD object:
function | value |
---|---|
get_base_children($element) | An element's children (content elements). |
get_elem_attr($element) | An element's attributes. |
get_elements($nosort) | Returns all elements in a DTD, either in sorted order or in order of appearance. |
get_elements_of_attr($attr_name) | Retrieve all elements that have an attribute $attr_name defined in the DTD. |
get_parents($element) | Get all elements that may be a parent of $element. |
get_top_elements() | Get the top-most elements defined in the DTD. Top-most elements are those elements that cannot be contained within another element or can only be contained within itself. |
is_child($element, $child) | Returns 1 if $child can be a legal child of $element. Otherwise, 0 is returned. |
is_element($element) | Returns 1 if $element is defined in the DTD. Otherwise, 0 is returned. |
There are more SGML-specific functions, which have been omitted for brevity. The functions listed above allow for both a top-down navigation of the DTD and a combination of parent and child traversal. All data structures are returned as Perl hashes or arrays; DTD is the only class in this package.
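To make the semantics of the parent-related functions concrete, here is a minimal sketch in Python. It is not the Perl module: the function names merely mirror those in the table, and the `children` dictionary (element name mapped to the set of allowed child names) is an assumed, simplified representation of a parsed DTD.

```python
def get_parents(children, element):
    """All elements whose content may contain `element`."""
    return sorted(p for p, kids in children.items() if element in kids)


def is_child(children, element, child):
    """1 if `child` may appear inside `element`, otherwise 0."""
    return 1 if child in children.get(element, ()) else 0


def get_top_elements(children):
    """Elements that have no parent, or whose only parent is themselves."""
    return sorted(
        e for e in children
        if all(p == e for p in get_parents(children, e))
    )


# Illustrative data: `section` and `list` may nest inside themselves.
children = {
    "doc": {"section"},
    "section": {"section", "para"},
    "para": set(),
    "list": {"list", "item"},
    "item": set(),
}

top = get_top_elements(children)
section_parents = get_parents(children, "section")
```

In this data, `list` counts as top-most even though it is recursive, because no other element can contain it, while `section` does not, because `doc` can.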
DTD Parser in phpxmlclasses
The PHP DTD parser is designed like Wutka's Java version, with the DTD object having a root element and an array of all elements. Those elements contain:
member | value |
---|---|
attributes | The element's attributes. |
content | The element's content. |
name | The element's name. |
So here too, we would have to construct the parent relationships ourselves. Apart from that, everything needed for our project is present.
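A root element plus an array of all elements is enough for a top-down walk, which is the traversal a documentation generator would most likely start with. The sketch below is in Python, not PHP, and the dictionary layout (`root`, `elements`, and a flat `content` list of child names) is a hypothetical model of such a structure rather than the actual phpxmlclasses API.

```python
def outline(dtd):
    """Walk the DTD top-down from its root, collecting (depth, name) pairs."""
    elements = dtd["elements"]
    result = []

    def visit(name, depth, path):
        result.append((depth, name))
        if name in path:
            # Recursive content model: list the element, but do not descend again.
            return
        for child in elements.get(name, {}).get("content", []):
            visit(child, depth + 1, path | {name})

    visit(dtd["root"], 0, frozenset())
    return result


# Illustrative DTD object: `section` may contain further sections.
dtd = {
    "root": "book",
    "elements": {
        "book": {"name": "book", "attributes": {}, "content": ["title", "section"]},
        "title": {"name": "title", "attributes": {}, "content": []},
        "section": {
            "name": "section",
            "attributes": {"id": "ID"},
            "content": ["title", "section"],
        },
    },
}

for depth, name in outline(dtd):
    print("  " * depth + name)
```

Tracking the current path rather than a global visited set lets shared elements such as `title` appear under every parent, while still stopping at self-referencing content models.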
Conclusion
The DTD parsers fall into two categories: those that simply parse the DTD into a hierarchical structure of arrays or custom objects, and those that additionally provide full parent/child references. All parsers are suitable candidates for dtddoc, so we can choose the implementation language based on our development needs going forward.
Produced by Michael Claßen
URL: https://www.webreference.com/xml/column66/2.html
Created: Oct 14, 2002
Revised: Oct 14, 2002