Query Your Toaster
November 15, 2007
People have asked for Qt’s XQuery & XPath support to not be locked to a particular tree backend such as QDom, but to be able to work on arbitrary backends.
Any decent implementation (such as XQilla or Saxon) provides that nowadays in some way or another, but I’d say Patternist’s approach is novel, with its own share of advantages. So let me introduce what Qt’s snapshot carries. Consider this query:
<ul>
{
for $file in $exampleDirectory//file[@suffix = "cpp"]
order by xs:integer($file/@size)
return <li>
{string($file/@fileName)}, size: {string($file/@size)}
</li>
}
</ul>
and the query itself was set up with:
QXmlQuery query;
FileTree fileTree(query.namePool());
query.setQuery(&file, QUrl::fromLocalFile(file.fileName()));
query.bindVariable("exampleDirectory", fileTree.nodeFor(QLibraryInfo::location(QLibraryInfo::ExamplesPath)));

if(!query.isValid())
    return InvalidQuery;

QFile out;
out.open(stdout, QIODevice::WriteOnly);
query.serialize(&out);
These two snippets are taken from the example found in examples/xmlpatterns/filetree/, which, in about 250 lines of code, virtualizes the file system into an XML document.
In other words, with the tree backend FileTree that the example has, it’s possible to query the file system, without converting it to a textual XML document or anything like that.
And that’s what the query does: it finds all the .cpp files at any level of Qt’s examples directory and generates an HTML list, ordered by file size. Maybe generating a view for image files in a folder would have been a tad more useful.
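For comparison, here is roughly what that query spares you from writing by hand. This is only a sketch in plain C++, not code from the example: FileNode, collectBySuffix and renderCppFileList are hypothetical names, the tree is assumed to already sit in memory, and no HTML escaping is done.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for the file system tree.
struct FileNode {
    std::string fileName;
    std::string suffix;              // empty for directories
    long size;
    std::vector<FileNode> children;  // empty for plain files
};

// Counterpart of "//file[@suffix = ...]": visit the node and all descendants.
void collectBySuffix(const FileNode &node, const std::string &suffix,
                     std::vector<const FileNode *> &out)
{
    if (node.suffix == suffix)
        out.push_back(&node);
    for (const FileNode &child : node.children)
        collectBySuffix(child, suffix, out);
}

// Counterpart of the whole query: filter, order by size, emit a list.
std::string renderCppFileList(const FileNode &root)
{
    std::vector<const FileNode *> matches;
    collectBySuffix(root, "cpp", matches);

    std::stable_sort(matches.begin(), matches.end(),
                     [](const FileNode *a, const FileNode *b) {
                         return a->size < b->size;
                     });

    std::string html = "<ul>";
    for (const FileNode *file : matches)
        html += "<li>" + file->fileName + ", size: "
                + std::to_string(file->size) + "</li>";
    html += "</ul>";
    return html;
}
```

The query says the same thing in eight lines, and it does so without first copying the file system into a structure like FileNode.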
The usual approach to this is an abstract interface or class for dealing with nodes. That brings disadvantages: nodes have to be heap allocated, and since one has to create such node structures, the approach can end up dictating the implementation of whatever one is going to query.
But a long time ago Patternist was rewritten to use Qt’s items & models pattern, which means any existing structure can be queried without touching it. That’s what the FileTree class does: it subclasses QSimpleXmlNodeModel and hands out QXmlNodeModelIndex instances, which are light, stack-allocated values.
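To make the pattern concrete, here is a minimal sketch in standard C++. It deliberately does not use the Qt classes: Entry, NodeIndex and EntryModel are hypothetical names, and the real QSimpleXmlNodeModel interface looks different. The point is only that the handle is a small value and the model answers all questions about it, so the underlying data is left as it is.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// An existing structure we do not want to modify: a flat table where
// each entry records the position of its parent.
struct Entry {
    std::string name;
    int parent;  // row of the parent entry, or -1 for the root
};

// A lightweight handle, passed around by value. It plays the role that
// QXmlNodeModelIndex plays in Qt, but is not the Qt type.
struct NodeIndex {
    explicit NodeIndex(int r = -1) : row(r) {}
    bool isNull() const { return row < 0; }
    int row;
};

// The model interprets handles; the entries never learn that they are
// being viewed as a tree.
class EntryModel {
public:
    explicit EntryModel(const std::vector<Entry> &entries) : m_entries(entries) {}

    NodeIndex root() const { return nextWithParent(-1, 0); }
    const std::string &name(NodeIndex n) const { return entry(n).name; }
    NodeIndex parent(NodeIndex n) const { return NodeIndex(entry(n).parent); }

    NodeIndex firstChild(NodeIndex n) const
    {
        assert(!n.isNull());
        return nextWithParent(n.row, 0);
    }

    NodeIndex nextSibling(NodeIndex n) const
    {
        return nextWithParent(entry(n).parent,
                              static_cast<std::size_t>(n.row) + 1);
    }

private:
    const Entry &entry(NodeIndex n) const
    {
        assert(!n.isNull() && static_cast<std::size_t>(n.row) < m_entries.size());
        return m_entries[static_cast<std::size_t>(n.row)];
    }

    // Linear scan for the next row with the given parent, starting at `from`.
    NodeIndex nextWithParent(int parent, std::size_t from) const
    {
        for (std::size_t i = from; i < m_entries.size(); ++i)
            if (m_entries[i].parent == parent)
                return NodeIndex(static_cast<int>(i));
        return NodeIndex();
    }

    const std::vector<Entry> &m_entries;  // must outlive the model
};
```

Walking the tree only copies NodeIndex values around; nothing is allocated per node, and Entry needs no base class.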
Combined with the fact that the engine tries to evaluate in a streamed and lazy manner, to the degree that it thinks it can, this means fairly efficient solutions should be doable.
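What streamed and lazy buys you can be shown with a small pull-based pipeline. This is a generic illustration in standard C++, not how Patternist is implemented: IntPuller, countingRange and keepIf are hypothetical names. Each stage produces an item only when the stage after it asks for one.

```cpp
#include <functional>

// A pull-based sequence: writes the next item to `out` and returns true,
// or returns false once the sequence is exhausted.
using IntPuller = std::function<bool(int &)>;

// Source stage: yields first..last and counts how many items were
// actually produced, so the laziness is observable.
IntPuller countingRange(int first, int last, int &produced)
{
    int current = first;
    return [current, last, &produced](int &out) mutable -> bool {
        if (current > last)
            return false;
        out = current++;
        ++produced;
        return true;
    };
}

// Filter stage: pulls from `source` until an item passes `keep`.
IntPuller keepIf(IntPuller source, std::function<bool(int)> keep)
{
    return [source, keep](int &out) mutable -> bool {
        while (source(out))
            if (keep(out))
                return true;
        return false;
    };
}
```

Asking keepIf(countingRange(1, 1000000, produced), isEven) for its first item pulls only two numbers from the source; the remaining 999,998 are never generated. A query that needs just the first match over a large tree benefits in the same way.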
So what does this mean? It means that, if you would like to, you can relatively cheaply use the XQuery language on top of your custom data structure, as long as it is somewhat hierarchical.
For instance, a backend could bridge the QObject tree, such that the XQuery language could be used to find Human Interface Guideline violations within widgets; molecular patterns in a chemistry application can be identified concisely with a two- or three-line XPath expression; and the documentation carries on with a couple of other examples. No need to convert QWidgets to nodes, or force a compact representation to subclass an abstract interface.
An intriguing case, to me, would be a web robot that models the links between different pages as a graph, and finds invalid documents & broken links using the doc-available() function, or reports URIs that a website shouldn’t be linking to (such as a public site referencing intranet pages).
Our API freeze is approaching. If something is needed but missing, let me know.
November 15, 2007 at 11:46 am
Hi Frans,
I’m very glad, you made it: Patternist works with a “generic” backend 🙂
Congrat’s.
November 15, 2007 at 12:49 pm
Hi Frans, sounds cool. Perhaps you should s/Xml/Tree/g all class names then? XML is after all only a datastream format AFAIU. In memory you make your own QAbstractTreeNodeModel or such. XML is a buzzword, yes, but a classic name for the data structure from graph theory might be better, no? It is not a pure tree, sure. The purist in me just sees the extensible markup language in the name, which does not match the data model (language, huh?) in memory, and is not happy 😉
November 15, 2007 at 4:40 pm
One thing that would be neat would be letting you use XPath queries against the QObject hierarchy from QtScript. I shouldn’t think it would be too hard to do this either.
November 19, 2007 at 2:09 pm
That’s really interesting! What about supporting any custom file engine based on QAbstractFileEngine? Or simply supporting QDir/QFile?
November 19, 2007 at 2:44 pm
Hi Eric,
I think the example supports that by default, since it uses QFileInfo, and QFileInfo uses QAbstractFileEngine. Feel free to have a look at the example; you’ll find it in the Qt snapshot, under examples/xmlpatterns/filetree/.
Cheers,
Frans
December 2, 2007 at 12:49 am
Is Trolltech’s XML API based on the libXML project?
December 2, 2007 at 12:00 pm
Hi Jeff,
No, it isn’t. Patternist is written from the ground up to be clean C++ code, and to be based on the XPath Data Model, such that XQuery and XPath/XSL-T 2.0, as opposed to 1.0, can be (and are) implemented. libxml2 & libxslt are of course awesome, but they will never reach the things built on top of the XPath Data Model, as Daniel Veillard has stated.
December 11, 2007 at 6:54 pm
Hi, I am trying to develop a client-server application: the server connects to the database and the clients connect to the server. My problem is how to serialize database queries (including blobs) and pass them over the network from the server to the client.
Can I use XQuery to implement a solution?
PS: Sorry for my bad English.
Thank you
December 13, 2007 at 9:26 pm
Hi,
awesome stuff you have there. I just checked out the 4.4 snapshot and tried it.
I’m looking into replacing Berkeley xmldb with something less of a PITA. Unfortunately, what I see in the snapshot is still not powerful enough to drive a database of several thousand documents. The XQuery implementation lacks an option to define indexes, like bdxml has.
Also, QAbstractXmlNodeModel assumes you have all the data in one big file, which is not the case for what I have here with bdxml (it uses fn:collection()). Of course I could map all the small files to a big one, but seeing that the iterators are not quite efficient, that would mean loading several terabytes into RAM…
I really hope there will be some work on Patternist before the release. I want to get rid of that darn xmldb and finally use something as lovely as Qt.
December 14, 2007 at 12:19 pm
Hi again aep,
We want it to be as powerful as one needs, so if you have any suggestions for what API changes to make, or what needs to be added in order to do these more database-centric things, you’re welcome to mail them to me.
I’m in particular interested in:
* How QAbstractXmlForwardIterator (I guess it’s that one you’re referring to) can be made more efficient
* In what way QAbstractXmlNodeModel requires all data to be held in memory (and possibly what changes would help)
* Apart from Patternist shipping with a database backend, what kind of hooks/features related to fn:collection() would you like?