XIP - XML Iterative Parsing

Fetch the software.

XIP is an XML parsing method even simpler than SAX. A header file and sample implementation in C++ are provided. Bindings to other languages are certainly possible.

There are two main types of API for parsing XML documents. DOM parsers read in the whole document and build a parse tree. If the document is large, this takes a lot of cycles and memory before you even get started handling semantics. SAX parsers, on the other hand, interpret the document on the fly, handing the pieces to callback routines that the application has registered. However, this requires the application to keep track of which elements are currently in effect, typically via a stack.

The idea here is to parse the document on the fly, like SAX, but the only type of data returned is text, i.e. the stuff between all the markup elements. The trick is that each text string comes with a list of elements attached, and each element has a (name,value) map of attributes. The parser keeps track of the stack of elements currently in effect, instead of making every application do it. Self-closing elements (such as <foo bar="bletch"/>) return an empty string with the element list attached.

Because only one type of data is getting returned, we can dispense with SAX's callbacks and just return the data from a function call.
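To make the calling style concrete, here is a minimal sketch of what such an interface could look like. It is not the actual XIP header: the names Element, Chunk, ToyParser, next and describe are all hypothetical, and the toy parser assumes well-formed input with quoted attribute values and no comments, entities, CDATA or processing instructions.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not XIP's real API.
struct Element {
    std::string name;
    std::map<std::string, std::string> attrs;  // (name,value) attribute map
};

struct Chunk {
    std::string text;               // text between markup; empty for <x/>
    std::vector<Element> elements;  // elements in effect, outermost first
};

class ToyParser {
public:
    explicit ToyParser(std::string doc) : doc_(std::move(doc)), pos_(0) {}

    // Fill `out` with the next piece of text; return false at end of input.
    bool next(Chunk& out) {
        while (pos_ < doc_.size()) {
            if (doc_[pos_] != '<') {                    // run of text
                std::size_t end = doc_.find('<', pos_);
                if (end == std::string::npos) end = doc_.size();
                out.text = doc_.substr(pos_, end - pos_);
                out.elements = stack_;
                pos_ = end;
                return true;
            }
            std::size_t close = doc_.find('>', pos_);
            if (close == std::string::npos) {           // unterminated tag
                pos_ = doc_.size();
                break;
            }
            std::string tag = doc_.substr(pos_ + 1, close - pos_ - 1);
            pos_ = close + 1;
            if (!tag.empty() && tag[0] == '/') {        // closing tag
                if (!stack_.empty()) stack_.pop_back();
            } else if (!tag.empty() && tag[tag.size() - 1] == '/') {
                tag.erase(tag.size() - 1);              // self-closing tag:
                out.text.clear();                       // empty string with
                out.elements = stack_;                  // the element list
                out.elements.push_back(parseTag(tag));
                return true;
            } else {                                    // opening tag
                stack_.push_back(parseTag(tag));
            }
        }
        return false;
    }

private:
    // Split `name a="v" b="w"` into a name and an attribute map.
    static Element parseTag(const std::string& tag) {
        Element e;
        std::size_t i = 0;
        while (i < tag.size() && tag[i] != ' ') ++i;
        e.name = tag.substr(0, i);
        while (i < tag.size()) {
            while (i < tag.size() && tag[i] == ' ') ++i;
            std::size_t eq = tag.find('=', i);
            if (eq == std::string::npos || eq + 1 >= tag.size()) break;
            char quote = tag[eq + 1];                   // assumed ' or "
            std::size_t end = tag.find(quote, eq + 2);
            if (end == std::string::npos) break;
            e.attrs[tag.substr(i, eq - i)] = tag.substr(eq + 2, end - eq - 2);
            i = end + 1;
        }
        return e;
    }

    std::string doc_;
    std::size_t pos_;
    std::vector<Element> stack_;  // the stack the application no longer keeps
};

// The application side: a plain loop, no callbacks and no stack of its own.
// Each chunk is formatted as "/outer/inner: [text]".
std::vector<std::string> describe(const std::string& doc) {
    ToyParser parser(doc);
    std::vector<std::string> lines;
    Chunk chunk;
    while (parser.next(chunk)) {
        std::string path;
        for (std::size_t i = 0; i < chunk.elements.size(); ++i)
            path += "/" + chunk.elements[i].name;
        lines.push_back(path + ": [" + chunk.text + "]");
    }
    return lines;
}
```

In this sketch, the input <doc><p class="x">hi</p><br id="b"/></doc> yields two chunks: the text "hi" with the element list doc, p, and an empty string with the element list doc, br. The application never sees open or close events; it only inspects the list that arrives with each string.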

