Converting DTDs to XML Schemas (3/3) - exploring XML
Manual work
We immediately run into a bug here: both the channel and item definitions contain elements named ? and +, instead of having these occurrence indicators translated into minOccurs and maxOccurs constraints. The bug is provoked in part by the unusual DTD declarations of these elements, which should rather read:
<!ELEMENT channel (title, description, link, language, item+, rating?, image?, textinput?, copyright?, pubDate?, lastBuildDate?, docs?, managingEditor?, webMaster?, skipHours?, skipDays?)>
This stricter element definition would be more accurately converted into:
<element name="channel">
  <complexType>
    <sequence>
      <element ref="title" minOccurs="1" maxOccurs="1"/>
      <element ref="description" minOccurs="1" maxOccurs="1"/>
      <element ref="link" minOccurs="1" maxOccurs="1"/>
      <element ref="language" minOccurs="1" maxOccurs="1"/>
      <element ref="item" minOccurs="1" maxOccurs="unbounded"/>
      <element ref="rating" minOccurs="0" maxOccurs="1"/>
      <element ref="image" minOccurs="0" maxOccurs="1"/>
      <element ref="textinput" minOccurs="0" maxOccurs="1"/>
      <element ref="copyright" minOccurs="0" maxOccurs="1"/>
      <element ref="pubDate" minOccurs="0" maxOccurs="1"/>
      <element ref="lastBuildDate" minOccurs="0" maxOccurs="1"/>
      <element ref="docs" minOccurs="0" maxOccurs="1"/>
      <element ref="managingEditor" minOccurs="0" maxOccurs="1"/>
      <element ref="webMaster" minOccurs="0" maxOccurs="1"/>
      <element ref="skipHours" minOccurs="0" maxOccurs="1"/>
      <element ref="skipDays" minOccurs="0" maxOccurs="1"/>
    </sequence>
  </complexType>
</element>
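The translation of DTD occurrence indicators into minOccurs and maxOccurs can be sketched in a few lines of Python. This is only an illustration of the mapping, not the conversion tool's actual code: the names OCCURS, particle_to_ref and sequence_to_refs are hypothetical, and the sketch assumes a flat, comma-separated content model without nested groups or choices.

```python
# Hypothetical helpers illustrating the mapping; not the conversion tool's code.
OCCURS = {
    "":  ("1", "1"),          # no indicator: exactly once
    "?": ("0", "1"),          # optional
    "+": ("1", "unbounded"),  # one or more
    "*": ("0", "unbounded"),  # zero or more
}

def particle_to_ref(particle):
    """Turn one DTD particle such as 'item+' into an <element ref=.../> string."""
    particle = particle.strip()
    if not particle:
        raise ValueError("empty particle")
    # The occurrence indicator, if any, is the last character of the particle.
    suffix = particle[-1] if particle[-1] in "?+*" else ""
    name = particle[:-1] if suffix else particle
    min_occurs, max_occurs = OCCURS[suffix]
    return '<element ref="%s" minOccurs="%s" maxOccurs="%s"/>' % (
        name, min_occurs, max_occurs)

def sequence_to_refs(content_model):
    """Convert a flat sequence like '(title, item+, rating?)' (assumed: no nesting)."""
    inner = content_model.strip()
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]
    return [particle_to_ref(p) for p in inner.split(",")]

for line in sequence_to_refs("(title, item+, rating?)"):
    print(line)
# <element ref="title" minOccurs="1" maxOccurs="1"/>
# <element ref="item" minOccurs="1" maxOccurs="unbounded"/>
# <element ref="rating" minOccurs="0" maxOccurs="1"/>
```

Note that + has to become maxOccurs="unbounded" explicitly, because maxOccurs defaults to 1 when it is omitted.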
Increased Precision
In the last article we emphasized that XML Schema lets us specify constraints on elements and attributes with greater precision. The RSS version attribute, for example, must be "0.91". In XML Schema we can express this with:
<element name="rss">
  <complexType>
    <sequence>
      <element ref="channel"/>
    </sequence>
    <attribute name="version" type="string" use="required" fixed="0.91"/>
  </complexType>
</element>
One of the conventions of Netscape's former RSS syndication system was that a channel could contain at most 15 items. This limit can be expressed as:
<element ref="item" minOccurs="1" maxOccurs="15"/>
More constraints on other elements of the RSS definition can be enforced in similar ways.
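To see what these two constraints mean for an actual document, here is a minimal Python sketch that checks them by hand with the standard library's ElementTree. It is a stand-in for a real schema validator, not a replacement for one; the function name check_rss and the wording of its messages are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

FIXED_VERSION = "0.91"  # fixed value of the rss version attribute
MAX_ITEMS = 15          # Netscape convention on items per channel

def check_rss(xml_text):
    """Return a list of constraint violations (empty if none were found)."""
    root = ET.fromstring(xml_text)
    problems = []
    # Emulates the fixed "version" attribute on the rss element.
    if root.get("version") != FIXED_VERSION:
        problems.append("version must be %r" % FIXED_VERSION)
    channel = root.find("channel")
    if channel is None:
        problems.append("missing channel")
        return problems
    # Emulates <element ref="item" minOccurs="1" maxOccurs="15"/>.
    count = len(channel.findall("item"))
    if not 1 <= count <= MAX_ITEMS:
        problems.append("channel has %d items, expected 1 to %d" % (count, MAX_ITEMS))
    return problems

sample = '<rss version="0.91"><channel>' + "<item/>" * 16 + "</channel></rss>"
print(check_rss(sample))
# ['channel has 16 items, expected 1 to 15']
```

With a schema in place, a validating parser performs these checks itself, which is the practical benefit of moving such rules out of application code.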
Conclusion
The conversion tool cannot work magic if the underlying DTD is already fairly lax. The RSS 0.91 definition is not very restrictive, so a lot of manual work is required, including working around bugs. Nevertheless, the W3C site hosts successful conversions of P3P, SMIL, XHTML, and MathML, so automatic conversion seems feasible for better-behaved DTDs.
Produced by Michael Claßen
URL: https://www.webreference.com/xml/column35/3.html
Created: Jul 18, 2001
Revised: Jul 18, 2001