Converting DTDs to XML Schemas (1/3) - exploring XML
Converting DTDs to XML Schemas
In the last installment of this column I described a neat tool, dtd2xsd, to convert Document Type Definitions to XML Schema. Let's put dtd2xsd to work with the RSS 0.91 DTD as an example.
RSS 0.91
Netscape invented RSS in order to advertise news channels for their Web service My Netscape Network, one of the first personalized portals. Any site could write up a summary file in RSS and submit the URL to My Netscape Network, which enabled users to construct their personal start page on that service from any of these registered news sources. Unfortunately, this service silently disappeared from Netscape's site, leaving behind a slightly confused RSS community.
The DTD can be found at the My Netscape site.
<!ELEMENT rss (channel)>
<!ATTLIST rss version CDATA #REQUIRED> <!-- must be "0.91"> -->
<!ELEMENT channel (title | description | link | language | item+ | rating? | image? | textinput? | copyright? | pubDate? | lastBuildDate? | docs? | managingEditor? | webMaster? | skipHours? | skipDays?)*>
<!ELEMENT title (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT link (#PCDATA)>
<!ELEMENT image (title | url | link | width? | height? | description?)*>
<!ELEMENT url (#PCDATA)>
<!ELEMENT item (title | link | description)*>
<!ELEMENT textinput (title | description | name | link)*>
<!ELEMENT name (#PCDATA)>
<!ELEMENT rating (#PCDATA)>
<!ELEMENT language (#PCDATA)>
<!ELEMENT width (#PCDATA)>
<!ELEMENT height (#PCDATA)>
<!ELEMENT copyright (#PCDATA)>
<!ELEMENT pubDate (#PCDATA)>
<!ELEMENT lastBuildDate (#PCDATA)>
<!ELEMENT docs (#PCDATA)>
<!ELEMENT managingEditor (#PCDATA)>
<!ELEMENT webMaster (#PCDATA)>
<!ELEMENT hour (#PCDATA)>
<!ELEMENT day (#PCDATA)>
<!ELEMENT skipHours (hour+)>
<!ELEMENT skipDays (day+)>
In a nutshell, an RSS 0.91 channel definition consists of a title, a description, a Web link, a language description and a number of news items. Elements like a PICS rating, an image, author and date information can optionally be added. I omitted the long list of entity definitions for HTML characters, such as &nbsp; for a non-breaking space.
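To make this description concrete, here is a minimal sketch in Python that parses a small hand-written RSS 0.91 document and reports which of the channel elements named above are missing. The sample feed, its titles and URLs, and the helper name missing_channel_elements are made up for illustration. The check follows the prose description rather than the DTD itself: because the channel content model is a repeated choice, a DTD validator would not insist on any particular child element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; titles and URLs are invented for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example News</title>
    <description>Headlines from an example site</description>
    <link>http://www.example.com/</link>
    <language>en-us</language>
    <item>
      <title>First story</title>
      <link>http://www.example.com/story1.html</link>
      <description>A short summary of the first story.</description>
    </item>
  </channel>
</rss>
"""

# Children the article describes as making up a channel (an assumption
# taken from the prose, not something the DTD content model enforces).
EXPECTED_CHANNEL_ELEMENTS = ("title", "description", "link", "language", "item")


def missing_channel_elements(xml_text):
    """Return the expected channel children that are absent from the feed."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    if root.tag != "rss" or channel is None:
        raise ValueError("not an RSS document with a channel element")
    return [name for name in EXPECTED_CHANNEL_ELEMENTS
            if channel.find(name) is None]


if __name__ == "__main__":
    print(missing_channel_elements(SAMPLE_FEED))
```

An empty list means every element from the description is present in the channel.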
The Conversion Tool
The DTD to XML Schema Conversion Tool takes a DTD and translates it into its equivalent XML schema definition. Usage of the tool is as follows:
perl dtd2xsd.pl [-alias] [-prefix p] [-ns n] [file]
  -alias
      enables special aliases (default off)
  -prefix t
      specify namespace prefix
  -ns https://www.w3.org/namespace/
      specify namespace URI
  -simpletype pattern base
      treat parameter entities whose name match this pattern as simple datatypes derived from this base type
  -attrgroup pattern
      treat parameter entities whose name match this pattern as attribute groups
  -modelgroup pattern
      treat parameter entities whose name match this pattern as model groups
So let's call the tool like this:
perl dtd2xsd.pl -prefix rss -ns https://purl.org/rss/0.91 rss091.dtd
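If the conversion needs to be scripted, the call above could be wrapped roughly as follows. This is only a sketch in Python: it assumes that perl is on the PATH, that dtd2xsd.pl and rss091.dtd sit in the current directory, and that the tool writes the schema to standard output. The function names build_command and convert are hypothetical.

```python
import subprocess


def build_command(dtd_file, prefix, namespace, script="dtd2xsd.pl"):
    """Assemble the argument list for the command line shown above."""
    return ["perl", script, "-prefix", prefix, "-ns", namespace, dtd_file]


def convert(dtd_file, prefix, namespace, out_file):
    """Run the converter and save whatever it prints to out_file.

    Assumption: the tool emits the generated schema on standard output.
    """
    cmd = build_command(dtd_file, prefix, namespace)
    # check=True raises CalledProcessError if perl exits with an error.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    with open(out_file, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)
```

Calling convert("rss091.dtd", "rss", "https://purl.org/rss/0.91", "rss091.xsd") would then run the same command and store the output in a file named rss091.xsd, a file name chosen here purely for illustration.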
Let's look at the result of the conversion.
Produced by Michael Claßen
Created: Jul 18, 2001
Revised: Jul 18, 2001