WebReference.com - Part 1 of Chapter 10 from Professional PHP4 XML, from Wrox Press Ltd (7/7)
Professional PHP4 XML
Streaming
A useful property of SAX filters is that events are propagated as soon as the parser has processed each chunk of XML. The last filter in the chain can therefore emit the result of the composite transformation piece by piece, without concatenating it or waiting for the whole result to arrive. The result can be streamed to disk, a database, or any other destination without buffering the entire document first.
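The chunk-by-chunk propagation can be seen in a minimal sketch that uses PHP's expat functions directly. It is not the chapter's filter code: the function name streamXmlInChunks() and the $sink callback are hypothetical, and the sketch uses closures from current PHP rather than the PHP 4 style of the listings in this chapter.

```php
<?php
// Sketch only: feed XML to expat in small chunks and forward output as each
// SAX event fires. streamXmlInChunks() and $sink are illustrative names.
function streamXmlInChunks($xml, $chunkSize, callable $sink)
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Start tag: rebuild it and pass it on immediately.
        function ($p, $name, $attribs) use ($sink) {
            $out = "<$name";
            foreach ($attribs as $key => $val) {
                $out .= " $key='" . htmlspecialchars($val, ENT_QUOTES) . "'";
            }
            $sink($out . ">");
        },
        // End tag.
        function ($p, $name) use ($sink) {
            $sink("</$name>");
        }
    );
    xml_set_character_data_handler(
        $parser,
        // Text arrives decoded, so it is escaped again before output.
        function ($p, $data) use ($sink) {
            $sink(htmlspecialchars($data, ENT_NOQUOTES));
        }
    );

    $length = strlen($xml);
    for ($offset = 0; $offset < $length; $offset += $chunkSize) {
        $chunk = substr($xml, $offset, $chunkSize);
        // The third argument tells expat whether this is the last chunk.
        $isFinal = ($offset + $chunkSize >= $length);
        if (!xml_parse($parser, $chunk, $isFinal)) {
            throw new RuntimeException(
                xml_error_string(xml_get_error_code($parser))
            );
        }
    }
}
```

Each call to $sink happens while later chunks are still unread, which is what lets the final link in a chain write its output without holding the whole document in memory.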
To stream the result, the last link in the chain must be a streaming filter that, instead of calling print() as in our earlier example, calls a Stream() method on another object. That object can save the results chunk by chunk to a file or database, or send them to the browser:
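class FilterStreamer extends AbstractFilter
{
    var $stream;

    // $obj must provide a Stream($data) method
    function FilterStreamer($obj)
    {
        $this->stream = $obj;
    }

    function StartElementHandler($name, $attribs)
    {
        $this->stream->Stream("<$name");
        foreach ($attribs as $key => $val) {
            // Escape the value so quotes or markup characters cannot break the attribute
            $this->stream->Stream(" $key='" . htmlspecialchars($val, ENT_QUOTES) . "'");
        }
        $this->stream->Stream(">");
    }

    function EndElementHandler($name)
    {
        $this->stream->Stream("</$name>");
    }

    function CharacterDataHandler($data)
    {
        // The parser delivers decoded text, so re-escape it for output
        $this->stream->Stream(htmlspecialchars($data, ENT_NOQUOTES));
    }
}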
This filter expects to receive an object in its constructor, and that object must implement a Stream() method capable of receiving the result of the filter chain in chunks. For example:
class SimpleStreamer extends AbstractStreamer
{
    function Stream($data)
    {
        print ($data);
    }
}
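A streamer that writes to disk instead of the browser might look like the following sketch. The class name FileStreamer, its constructor argument, and the Close() method are hypothetical and not part of the chapter's code. The sketch also does not extend AbstractStreamer, so that it stays self-contained, and it uses current PHP constructor syntax rather than the PHP 4 style shown above.

```php
<?php
// Hypothetical streamer: writes each chunk to a file as it arrives.
// FileStreamer, $path, and Close() are illustrative names.
class FileStreamer
{
    private $handle;

    public function __construct($path)
    {
        // Open once; every later chunk goes through this handle.
        $this->handle = fopen($path, 'w');
        if ($this->handle === false) {
            throw new RuntimeException("Cannot open $path for writing");
        }
    }

    // Same method name the streaming filter expects.
    public function Stream($data)
    {
        fwrite($this->handle, $data);
    }

    // Call after parsing has finished to flush and release the file.
    public function Close()
    {
        fclose($this->handle);
    }
}
```

Assuming the filter only ever calls Stream() on the object it is given, an instance of this class could be passed to the FilterStreamer constructor in place of SimpleStreamer.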
We can build the filter chain as follows:
$ss = new SimpleStreamer();
$f1 = new ExpatParser("applications.xml");
$f1->ParserSetOption(XML_OPTION_CASE_FOLDING, 0);
$f2 = new FilterNameBold();
$f3 = new FilterStreamer($ss);
$f2b = new FilterName();
$f2b->SetListener($f3);
$f2->SetListener($f2b);
$f1->SetListener($f2);
$f1->Parse();
The following table shows some of the advantages and disadvantages of using SAX:
Advantages | Disadvantages |
Scales very well for huge documents | Transformations can be complex |
Resource consumption is constant for documents of any size | Harder to use if the result is not XML |
It is fast | Requires a lot of coding for even simple transformations |
Allows streaming | |
Choosing a Transformation Strategy
So far in this chapter we've seen how DOM, XSLT, and SAX can be used to perform transformations. The question is when to use each approach, and as we can imagine there's no definite answer. The best recommendation, however, seems to be to use XSLT whenever we can and SAX only when XSLT can't be used. DOM could be an alternative only for very simple transformations.
This table describes the best possible approach in each situation:
Document Size | Simple Transformations | Medium Complexity Transformations | Complex Transformations |
Small | DOM or XSLT | XSLT | XSLT |
Mid-sized | XSLT or DOM (if the transformation is very simple) | XSLT | XSLT |
Large | XSLT or SAX | XSLT (preferred) or SAX | XSLT |
Very Large | SAX | SAX or XSLT | Try to use XSLT or SAX (if XSLT is impossible) |
If the same transformations are applied repeatedly and their results rarely change, we may want to use XSLT with caching no matter how big the documents are, since the cost in time and resources is paid only when the cache has to be refreshed.
If the result of the transformation is not XML, XSLT is the best alternative, and we should avoid DOM or SAX unless there is no other option.
Abstracting XML Transformations
As we can see, the choice between SAX and XSLT can be difficult, and at times we may start with XSLT and later need a SAX alternative for huge documents. With this in mind, we recommend abstracting transformations by defining classes that each implement a given transformation, for example RSS to HTML. We can then implement such a class with XSLT or SAX without changing the application. The class is also a good place to put a caching mechanism if necessary. The abstraction is usually worth the effort.
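One way to picture this abstraction is the sketch below, written in current PHP rather than PHP 4. All names in it (Transformation, SaxTitleList, CachedTransformation) are hypothetical, the "RSS to HTML" step is reduced to listing <title> elements, and the cache is a simple in-memory array keyed by a hash of the input. An XSLT-backed class could implement the same interface without the caller noticing.

```php
<?php
// Sketch under assumptions: every name here is illustrative.
interface Transformation
{
    public function transform($xml);
}

// SAX-backed implementation: turns every <title> element into an <li>.
class SaxTitleList implements Transformation
{
    private $inTitle = false;
    private $html = '';

    public function transform($xml)
    {
        $this->inTitle = false;
        $this->html = '<ul>';
        $parser = xml_parser_create();
        xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
        xml_set_element_handler(
            $parser,
            function ($p, $name, $attribs) {
                if ($name === 'title') {
                    $this->inTitle = true;
                    $this->html .= '<li>';
                }
            },
            function ($p, $name) {
                if ($name === 'title') {
                    $this->inTitle = false;
                    $this->html .= '</li>';
                }
            }
        );
        xml_set_character_data_handler($parser, function ($p, $data) {
            if ($this->inTitle) {
                $this->html .= htmlspecialchars($data);
            }
        });
        if (!xml_parse($parser, $xml, true)) {
            throw new RuntimeException(
                xml_error_string(xml_get_error_code($parser))
            );
        }
        return $this->html . '</ul>';
    }
}

// Wraps any Transformation and reuses results for identical input.
class CachedTransformation implements Transformation
{
    private $inner;
    private $cache = array();

    public function __construct(Transformation $inner)
    {
        $this->inner = $inner;
    }

    public function transform($xml)
    {
        $key = md5($xml);
        if (!isset($this->cache[$key])) {
            $this->cache[$key] = $this->inner->transform($xml);
        }
        return $this->cache[$key];
    }
}
```

The application would depend only on the transform() method, so swapping the SAX class for an XSLT one, or adding and removing the cache wrapper, stays a local change.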
Created: August 12, 2002
Revised: August 12, 2002
URL: https://webreference.com/programming/php/php4xml/chap10/1/7.html