XML APIs

November 18, 2006

How to design APIs for XML is debated daily, and has been for a long time. For too long. Ages ago, companies gathered at the W3C to design the DOM, with language neutrality and document editing with load/save persistence as goals (so it seems, and so some say). But some needed other things, such as a streamed, less verbose approach, and hence SAX came into use. Others found SAX cumbersome, and StAX was introduced. And so on, and so on.

One urge I have is to cry out: why can we never design a sensible API? But that reaction wouldn’t be justified. Software is the implementation of ideas. When the software has to change, it is because the ideas (the requirements) changed.

After all, SAX works splendidly for some scenarios. I don’t expect one tool for all scenarios, because XML is used in too many different ways. But still, even though one can expect tools to become obsolete and one size not to fit all, the current situation is worse than what is reasonable.

In Qt, the XML community’s dilemma is present as well, painfully. The QtXml module provides an (in my opinion poor) implementation of DOM, and SAX. Something needs to be added in order to make XML practical to work with in Qt. Some of the ideas I’ve heard are by the book: add StAX as a streamed-but-easy-to-use API, and a XOM-like API for in-memory representation. The latter would be an API that doesn’t carry the legacy of XML’s evolution (the addition of namespaces, for instance) and in general does what an XML API is supposed to do: be an interface to the XML and therefore take care of all the pesky spec details, which XOM does in an excellent way.

If Trolltech added StAX and a XOM-like API to Qt, no one could blame them. Others do it, and it is the politically correct alternative at this point in our civilization (just as DOM once was). But I’m starting to suspect that it’s the wrong direction: that the step of learning a lesson by adding yet another API could be skipped, in favour of jumping directly to what would follow.

Let’s look at what XML is:

  • It is a medium, a text format for exchanging data, specified in XML 1.0 and XML namespaces. XML is absolutely terrific at this. The history of IT is tormented by interoperability problems such as encoding issues. XML solves all that in one go. It abstracts away from primitive details, and provides a platform. This is why XML is popular.
  • A set of concepts for expressing ideas. This is all that talk about elements and nodes forming a hierarchical structure (which from a reader’s standpoint can be difficult to distinguish from the text representation, since we humans instantly see the logical structure when looking at an XML document). Exactly what that structure is, is not so obvious. The different approaches are often referred to collectively as data models, and there are plenty of them: the XPath 2.0 Data Model, the XML Information Set, the PSVI infoset extension, the DOM (that it stands for Document Object Model is a hint), and the list goes on. These are all different ideas of what a sequence of characters arranged to be valid XML actually means.

That one can view XML as consisting of these two parts reveals a bit about how XML has evolved. First XML 1.0 arrived, taking care of syntax details. Later on, this plethora of data models arrived to formally define what XML 1.0 informally specified. Understandably, many want the XML specification to also specify the data model. The question is of course which one to choose, and what the effects of that would be.

But the list of data models doesn’t stop with the above. Those are just examples of standardized models. I believe that one data model exists for each XML scenario.

When a word processor reads in a document with the DOM, the actual data model consists of words, paragraphs, titles, sections and so on. The DOM represents that poorly, but apparently well enough. Similarly, when a chemistry program reads in a molecule, its data model consists of atoms.

That XML is used for different things can be seen in the APIs being created. SAX is popular because it easily allows a specialized data model to be created: the programmer receives the XML at a high level and from that builds the perfect data structure. DOM allows sub-classing of node classes through factories, and attaching user data to nodes, in order to bring the DOM instance closer to the user’s data model.
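To make the SAX case concrete, here is a minimal sketch in Python using the standard xml.sax module. The molecule vocabulary (a molecule element with atom children) and the Atom, Molecule and MoleculeHandler names are made up for illustration; they are not any real chemistry format. The point is only that the handler goes from parse events straight to the program's own data structure, with no generic tree in between.

```python
import xml.sax
from dataclasses import dataclass, field


@dataclass
class Atom:
    element: str
    x: float
    y: float


@dataclass
class Molecule:
    name: str = ""
    atoms: list = field(default_factory=list)


class MoleculeHandler(xml.sax.ContentHandler):
    """Turns SAX events directly into a Molecule (hypothetical vocabulary)."""

    def __init__(self):
        super().__init__()
        self.molecule = Molecule()

    def startElement(self, name, attrs):
        if name == "molecule":
            self.molecule.name = attrs.get("name", "")
        elif name == "atom":
            # Each atom element becomes one Atom in the program's own model.
            self.molecule.atoms.append(
                Atom(attrs["element"], float(attrs["x"]), float(attrs["y"]))
            )


def parse_molecule(xml_text: str) -> Molecule:
    handler = MoleculeHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.molecule


WATER = """<molecule name="water">
  <atom element="O" x="0.0" y="0.0"/>
  <atom element="H" x="0.76" y="0.59"/>
  <atom element="H" x="-0.76" y="0.59"/>
</molecule>"""

molecule = parse_molecule(WATER)
print(molecule.name, [atom.element for atom in molecule.atoms])
```

The rest of the program only ever sees Molecule and Atom; that the data once was XML is confined to the handler.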

XML is not wanted. Communication is a necessary evil, and therefore XML is as well. If programs could just mind-warp their ideas, molecules and word processor documents to another program they would, instead of delving into the perils of communicating through XML.

I believe this is a good background when tackling the big topic of providing tools for working with XML. The question isn’t “How do we design an API that avoids the namespace problems the DOM exposes?” It starts at a higher level:

How do we allow the user to go from XML to the data model of choice in the easiest and most efficient way?

Ideally, the user shouldn’t care about details such as namespaces and parent/child relationships. If the API has to push that onto the user, it’s a necessary evil. It’s again about not straying far from the ideal data model. The idea is in general already practiced when it comes to the most primitive part: serialization. It’s widely agreed that a specialized mechanism (a class) should take care of the serialization step.

Let’s try to apply this buzzword talk to Patternist and Qt. A QAbstractItemModel is typically used to represent the data, since with the model/view concept the data is kept separate from its presentation. The user wants to read an XML file and produce a QAbstractItemModel instance.

Patternist, just like Saxon, is designed to be able to send its output to different destinations. It’s not constrained to producing text (XML), SAX events or a DOM tree; it just uses a callback. And that callback could just as well build an item model. It should be possible to write that glue code such that it works for arbitrary models.[1] With such a mechanism, one would only have to write an XQuery query or XSL-T stylesheet that defines a mapping between the XML and the item model, in order to do up and down conversions.
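The callback idea can be sketched roughly as follows, in Python for brevity. Everything here is hypothetical: the receiver's method names are not Patternist's actual interface, Item is a plain stand-in rather than a QAbstractItemModel, and the send function merely replays an already parsed document where a real engine would deliver the result of a query or stylesheet.

```python
import xml.etree.ElementTree as ET


class Item:
    """One node of a generic, made-up item model: label, value, children."""

    def __init__(self, label, value="", parent=None):
        self.label = label
        self.value = value
        self.parent = parent
        self.children = []


class ItemModelReceiver:
    """Hypothetical output receiver: each callback grows the item tree."""

    def __init__(self):
        self.root = Item("root")
        self._current = self.root

    def start_element(self, name):
        item = Item(name, parent=self._current)
        self._current.children.append(item)
        self._current = item

    def attribute(self, name, value):
        # Attributes become child items, marked with a leading "@".
        self._current.children.append(Item("@" + name, value, self._current))

    def characters(self, text):
        self._current.value += text

    def end_element(self):
        self._current = self._current.parent


def send(element, receiver):
    """Stand-in for the query engine: replays a parsed tree as callbacks."""
    receiver.start_element(element.tag)
    for name, value in element.attrib.items():
        receiver.attribute(name, value)
    if element.text and element.text.strip():
        receiver.characters(element.text.strip())
    for child in element:
        send(child, receiver)
    receiver.end_element()


receiver = ItemModelReceiver()
send(ET.fromstring('<todo owner="me"><task>Write glue code</task></todo>'), receiver)
todo = receiver.root.children[0]
print(todo.label, [(child.label, child.value) for child in todo.children])
```

Since the receiver knows nothing about where the events come from, the same glue would work for any source that can call it, which is what would make it usable for arbitrary models.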

Using Patternist to directly create item models might not be the way to go. But I do think one should concentrate on what the user wants to achieve instead of trying to fix the current tools (perhaps it doesn’t matter that the hammer is broken, because in either case a screwdriver should be used). And among the things the user wants to do, I believe converting between XML and the data model of choice is a very common scenario.

[1] In general, it all seems interesting to write “interactive” output receivers and trees with Qt. One would be able to write queries/stylesheets that generate widgets, write queries over the file system or QObject tree, etc. But that’s another topic.

XPath 2.0/XQuery has, like many other languages, a set of built-in functions that the implementation somehow needs to provide. One way is to implement them in the host language. Another approach is to use XPath constructs directly, to some degree. Which approach is best?


XQuery’s Error Codes

October 27, 2006

That XQuery and the second generations of XSL-T and XPath require error codes to be reported generates opinions. And work for the working groups. However, I think it’s possible to discuss these error codes with nuance.


Blue Room

October 19, 2006

You are in an empty square room painted blue. You perceive it as sitting in a meadow, where a girl in front of you says: “Dear, you’re not sitting in a meadow here with me, you’re in a blue room.”


XProc is Interesting

October 18, 2006

If you’ve built solutions with XML technologies you know it. You have your nice set of DocBook sources that XInclude pulls together and that a set of schemata subsequently validates, followed by an XSL-T transform that writes out PDF and XHTML output. This was all done in a platform- and implementation-independent, safe way, except for the hacky script that glued these steps together.

Waldo’s announcement on a documentation framework for open source projects caught my attention for being both needed and challenging. While I haven’t participated in meetings concerning this, I’ll try to take a stab at it here.

KDE’s URIs

October 13, 2006

Significant for KDE’s IO handling is its use of proprietary URI schemes. As recently mentioned on the kde-usability list, there’s for example the home:/ scheme, which when entered into Konqueror displays the content of the /home folder, more or less.

If “proprietary URI schemes” sounds all evil and wrong, it does so because it is. Registered schemes have an agreed meaning in a wide community. Proprietary URI schemes are like any other kind of interoperability breakage, like when Microsoft decides to add its own modifications and extensions to JavaScript and CSS. Any piece of software understands file:///home, while it’s only KDE applications using KIO that capiche home:/. That’s the best-case scenario. Perhaps someone decides to behave and standardize a scheme named home:/, which happens to have different semantics from KDE’s.

This is not a theoretical problem; see report #126847 for an example of where these homemade schemes cause trouble.

While I think I am correct in my bashing so far, one surely can add nuance to it. Many would probably say that KDE’s proprietary schemes are a necessary evil for making the usability of URIs reasonable. For example, they would claim that man:/cmake is better than tag:kde.org,2006:man:cmake (a use of the tag scheme to produce an interoperable URI).
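The difference between the two spellings is easy to see from the viewpoint of generic software. Below is a small sketch using Python's standard urllib.parse. The KNOWN_SCHEMES set and the is_interoperable and tag_uri helpers are my own illustrative assumptions; in particular the set is a tiny hand-picked list, not a copy of the IANA scheme registry.

```python
from urllib.parse import urlsplit

# Illustrative allow-list only; NOT a copy of the IANA scheme registry.
KNOWN_SCHEMES = {"file", "http", "tag"}


def is_interoperable(uri: str) -> bool:
    """True if the URI's scheme is in the illustrative allow-list above."""
    return urlsplit(uri).scheme in KNOWN_SCHEMES


def tag_uri(authority: str, date: str, specific: str) -> str:
    """Builds a tag URI of the shape used in the text: tag:authority,date:specific."""
    return f"tag:{authority},{date}:{specific}"


candidates = [
    "file:///home",
    "home:/",
    "man:/cmake",
    tag_uri("kde.org", "2006", "man:cmake"),
]

for uri in candidates:
    parts = urlsplit(uri)
    print(f"{uri!r}: scheme={parts.scheme!r}, interoperable={is_interoperable(uri)}")
```

A generic parser splits all four without complaint; what it cannot do is attach any meaning to home or man, which is exactly the interoperability problem.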

I think arguing about URIs’ usability is in the wrong ballpark, because URIs have bad usability no matter what. When URIs are advertised in commercials on TV, it’s never http://www.tv2.dk; they elegantly let it read tv2 dk.

KDE’s widespread belief in URIs perhaps stems from developers failing to abstract from themselves and see it from the user’s perspective. That one can enter “gg: wikipedia” in Konqueror to automatically google for the term “wikipedia” is touted as the best invention since sliced bread. But how should users (except for the geeks) know how to use gg:, let alone know it exists?

The same could be seen in the discussion on the kde-usability list: some didn’t know the home:/ scheme exists, which to me is fully understandable. I don’t see how users should be able to learn the pesky syntax details of using home:/, let alone why they should.

Therefore, I don’t believe the right way to solve URIs’ poor usability is to use proprietary schemes. For example, googling quickly should be done with one of the well-known approaches browsers take, and navigating man pages should not require having memorized the man:/ scheme; anyone should be able to do it via a friendly interface.

Dropping down to the command line for interfacing with the user is about a two-decade step backwards in usability.

PS. If you don’t want to use a proprietary URI scheme for your app, check out the KDE URI Guidelines.

Information Overload

October 5, 2006

As for others, my days consist of an astonishing amount of information: ideas, ideas and ideas on mailing lists and blog feeds. For some reason it is only now that I’ve realized I must prioritize. Perhaps I’m getting old.


Improving Mail Reading

October 1, 2006

I read a lot of email. Most of us do. Most of us have also, in one way or another, learnt how to correctly snip mails, whom to actually reply to, how to properly ask questions, and other matters of email etiquette. Those who haven’t create work for those who have to read their mails.

I spend too much effort on non-snipped mails. How many times hasn’t one started by scrolling through a large mail in order to find what the author actually has written, or scrolled down to the end of a mail only to find that it was merely the remnants of what the author replied to? I think one can extend email clients, such as KMail, to make the absence of snipping unnoticeable to the user.

The solution consists of a mechanism I’ve dubbed paragraph folding.


Not so Old Fashioned Anymore

September 28, 2006

I’m usually quite an old-fashioned guy. Do I use one of those fancy new media players for my music? Nooo, I use the mplayer command line utility, and the shell’s auto-completion for searching (so I call it). Do I use Kontact’s fancy todo-plugin? Nah, I literally have a stack of papers for “organizing” my daily work.

However, some days ago, don’t ask me why, I thought “let’s try out all those modern thingies”. And so I did.
