Introduction to RELAX NG (1/2) - exploring XML
Introduction to RELAX NG
In the last installment we discussed the different approaches to schema definition put forward by the W3C and OASIS. More specifically, we followed the criticism surrounding XML Schema, and looked at some improvements offered in the alternative, RELAX NG. Today we'll explain the basics of RELAX NG by example.
A simple RELAX NG example
Let's assume we want to specify a person's name in XML, consisting of first name, last name and middle initial, such as:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE name SYSTEM "name.dtd">
<name bidi="fml">
  <first>George</first>
  <mi>W</mi>
  <last>Bush</last>
</name>
The bidirectional attribute bidi indicates the order in which the name parts are usually written in the person's culture, so that "last name first" is also supported.
This document is a valid instance of the following DTD:
<!ELEMENT name (first, mi, last)>
<!ATTLIST name bidi CDATA #IMPLIED>
<!ELEMENT first (#PCDATA)>
<!ELEMENT mi (#PCDATA)>
<!ELEMENT last (#PCDATA)>
This DTD declares name to consist of first, mi, and last, in that order. An optional attribute bidi can be attached to the name element. Since DTDs do not support data types, the remaining elements have to be declared as parsed character data (#PCDATA).
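To make the role of the optional bidi attribute more concrete, here is a minimal Python sketch of how a consuming application might use it. The article does not define the attribute's values, so the reading of "fml" as one letter per name part (f = first, m = mi, l = last), the helper name display_name and the fallback order are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed mapping (not defined in the article): one letter per name part.
PART_FOR_LETTER = {"f": "first", "m": "mi", "l": "last"}


def display_name(xml_text, default_order="fml"):
    """Join the name parts in the order given by the bidi attribute."""
    name = ET.fromstring(xml_text)
    # bidi is optional (#IMPLIED), so fall back to an assumed default order.
    order = name.get("bidi", default_order)
    parts = [name.findtext(PART_FOR_LETTER[letter]) for letter in order]
    return " ".join(parts)


western = '<name bidi="fml"><first>George</first><mi>W</mi><last>Bush</last></name>'
last_first = '<name bidi="lfm"><first>George</first><mi>W</mi><last>Bush</last></name>'

print(display_name(western))     # George W Bush
print(display_name(last_first))  # Bush George W
```

The element order inside the document stays fixed by the content model; only the presentation changes with the attribute value.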
The aforementioned data structure also matches the following RELAX NG schema in XML syntax:
<?xml version="1.0" encoding="UTF-8"?>
<element name="name" xmlns="http://relaxng.org/ns/structure/1.0">
  <optional>
    <attribute name="bidi"/>
  </optional>
  <element name="first"><text/></element>
  <element name="mi"><text/></element>
  <element name="last"><text/></element>
</element>
Elements are defined using the element tag with the mandatory attribute name. The top-level element also defines the root element for all XML documents following this schema. The attribute element declares that an attribute with the specified name occurs on the enclosing element. In our case, the bidi attribute is flagged as optional by putting it inside an optional element.
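To illustrate what these patterns constrain, the following Python sketch checks a document against the schema using only the standard library. It is a toy interpreter for the four patterns used here (element, attribute, optional, text), not a RELAX NG validator: the function name matches is hypothetical, optional is only handled around attributes, and text content is not checked at all.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"  # RELAX NG namespace URI

SCHEMA = """<element name="name" xmlns="http://relaxng.org/ns/structure/1.0">
  <optional><attribute name="bidi"/></optional>
  <element name="first"><text/></element>
  <element name="mi"><text/></element>
  <element name="last"><text/></element>
</element>"""


def matches(pattern, node):
    """Check node against an <element> pattern (toy subset only)."""
    if node.tag != pattern.get("name"):
        return False
    required, allowed, child_patterns = set(), set(), []
    for item in pattern:
        if item.tag == RNG + "optional":
            # Simplification: only attributes inside <optional> are handled.
            allowed.update(a.get("name") for a in item.iter(RNG + "attribute"))
        elif item.tag == RNG + "attribute":
            required.add(item.get("name"))
        elif item.tag == RNG + "element":
            child_patterns.append(item)
        # <text/> needs no bookkeeping: this sketch never inspects text.
    attrs = set(node.attrib)
    if not (required <= attrs <= (required | allowed)):
        return False
    children = list(node)
    if len(children) != len(child_patterns):
        return False
    # Child elements must appear in exactly the declared sequence.
    return all(matches(p, c) for p, c in zip(child_patterns, children))


schema = ET.fromstring(SCHEMA)
valid = ET.fromstring(
    '<name bidi="fml"><first>George</first><mi>W</mi><last>Bush</last></name>')
swapped = ET.fromstring(
    '<name><last>Bush</last><mi>W</mi><first>George</first></name>')

print(matches(schema, valid))    # True
print(matches(schema, swapped))  # False
```

A document without the bidi attribute is still accepted, while an undeclared attribute or a different element order is rejected, which mirrors the behaviour of the DTD shown earlier.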
A non-XML syntax has also been put forward, which resembles the EBNF notation used in grammar definitions. The same schema in this compact syntax is:
element name {
  attribute bidi { text }?,
  element first { text },
  element mi { text },
  element last { text }
}
Here the start and end tags have been replaced with braces, sequence is expressed with commas, and the optional element is now symbolized by a question mark. This syntax needs far fewer characters and is easier for humans to read, while it can still be processed by machines.
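The correspondence between the two syntaxes is mechanical enough to sketch in a few lines of Python. The function to_compact below is a hypothetical helper that covers only the four patterns used in this article and assumes each optional wraps exactly one pattern. Conversion between the two syntaxes is normally left to dedicated tools.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"  # RELAX NG namespace URI

SCHEMA = """<element name="name" xmlns="http://relaxng.org/ns/structure/1.0">
  <optional><attribute name="bidi"/></optional>
  <element name="first"><text/></element>
  <element name="mi"><text/></element>
  <element name="last"><text/></element>
</element>"""


def to_compact(pattern):
    """Render a pattern from the XML syntax in compact syntax (toy subset)."""
    kind = pattern.tag[len(RNG):]  # strip the namespace prefix
    if kind == "text":
        return "text"
    if kind == "optional":
        # Assumption: exactly one pattern inside <optional>.
        return to_compact(pattern[0]) + "?"
    if kind in ("element", "attribute"):
        children = [to_compact(child) for child in pattern]
        if kind == "attribute" and not children:
            # An attribute pattern without content defaults to text.
            children = ["text"]
        # Sequence becomes a comma-separated list, tags become braces.
        return f"{kind} {pattern.get('name')} {{ {', '.join(children)} }}"
    raise ValueError(f"unsupported pattern: {kind}")


compact = to_compact(ET.fromstring(SCHEMA))
print(compact)
print(len(SCHEMA), "characters in XML syntax,", len(compact), "in compact syntax")
```

The first line printed is the compact schema shown above, written on a single line; the second line gives a rough impression of the difference in size.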
Produced by Michael Claßen
URL: https://www.webreference.com/xml/column60/index.html
Created: Jul 22, 2002
Revised: Jul 22, 2002