Category Archives: SPARQL

http://www.w3.org/TR/rdf-sparql-query/

SPARQL and SIMILE Timeline

Danny Ayers has been working on getting the SIMILE Timeline to eat SPARQL through the use of its JSON interface and some XSLT, he has notes on the ESW wiki.

While trying to get his work running here, I realized that the trip through XSLT to create JSON output really wasn’t necessary.

Instead, I’ve created a custom SPARQL event source parser, to load SPARQL results directly into the timeline. This way, the SPARQL results format generated by running the query doesn’t need a round trip into either JSON or the custom Timeline XML format.

The SPARQL Timeline demo works with any RSS 1.0 feed (try it with the one from Planet RDF) .

Update: Now also works with “raw” SPARQL results, try it with photos of laptops from The Gargonza Experiment (scroll to April of 2005). Expected variable bindings are date, title, description, and link, although the latter is optional and the first can be replaced by start.

Update: Now really works with “raw” SPARQL results. Due to javascript’s security model, only files on this server worked — until now. Also, a buglet regarding empty literal elements have been fixed.

Describing Source Content for Redland/MySQL

I mentioned SADDLE (which used to be a part of the SPARQL Protocol draft, but is no longer) in passing the other day, when describing OWL-S Maker and talking about service description in general.

Service description in this context — and in the context of Dion Hinchcliffe’s OWL-S-less overview of SDLs — is mostly about the interface, the inputs and outputs, not what’s in between.

In contrast, SADDLE originally entered that territory with its properties like saddle:vocabulary, and the other day on dev@gargonza Damian Steer announced a nice little javascript hack for using source content descriptions — this is not about I/O, but about what a “service” contains information about.

Central to Damian’s hack is a source content description, containing OWL statements about which classes and properties are present in the SPARQL source. For example, his description shows that all objects of foaf:name statements (in this particular store) are literals.

While the above example was handmade, I realized this was getting close to what I’ve been meaning to do for generating simpler and cleaner UIs for triplestores (asking for a foaf:Person? It’s likely you’d also want a foaf:name then…), so I figured I should try to generate such an SCD — Source Content Description — automagically, as Damian hints to himself: Ideally this information would mined from the store.

I’ve managed to come up with a single query that returns all the information necessary to construct an SCD, but since it’s quite complex, I’ll explain the steps I took on the way there.

Continue reading Describing Source Content for Redland/MySQL

Redland Hacking

During the last few days, I’ve been hacking a bit on — and with — Redland.

First off, I verified that a bug and associated patch from Simon Cross regarding portability of the hash calculations in the MySQL storage engine was indeed working. When originally writing the code for this I hadn’t thought of the use case of accessing a storage on a different architecture, but that is of course an important one. The issue is now closed, Dave Beckett has applied the patch to CVS.

I also created an issue regarding the design decision to not look for hash collisions, 28: Hash collisions possible in MySQL storage engine. I don’t have a solution ready for this, but I thought it would be a good idea to get it out in the open, so people are aware of the problem.

Another minor issue with the MySQL storage was its excessive use of connections, especially visible when using Rasqal. I wrote a patch to make it use persistent connections, and Christopher Schmidt was kind enough to help me test it. It seems to be working fine — it does here as well, so I sent a message to redland-dev asking for comments, hopefully this will get into CSV soon as well.

Then came a bit of work on the long-running issue with the PHP interface to Redland. PHP has its own unique NULL-value, so when the Redland Bindings blindly returned a C NULL wrapped in a regular PHP object (in the case of an error), Redland would crash Apache/PHP upon trying to use that object. In the past, Dave has been kind enough to hack a bit here and there when I ran into problems, but I decided to try to close the issue more pemanently. Thus, as explained in 15: PHP binding functions should return a PHP null, I patched the pointer return function to always return a PHP NULL instead a C one. My first version of the patch seems to have been faulty, as Dave couldn’t apply it to CVS, but I created a new one that I hope will do a better job. Also, as a side effect of this change, it is now no longer possible to pass a C NULL into some of the Redland functions where needed, so it seems we have to create a few PHP helper functions to return a C NULL wrapped in a PHP object…

I’ve got more ideas for improvement to Redland, but they really can’t be considered as anything other than feature requests to be coded on a day (and night) with nothing else to do, so I haven’t created issues for these:

  • An option for the MySQL storage to prefix table names with a constant string, to make it possible to have more than one storage in the same database, inspired by the way WordPress does it, and to help out with Dan Brickley’s SparqlPress project.
  • Some builtin “reasoning” functions, to — among other things — make my Redland Smusher obsolete. I’ve discussed this a bit with Dave, but we still haven’t figured out the “right” or best way to implement it.
  • It seems the new version of 3store will store simple datatyped literals like integers in separate columns, to make it easier for the database enginge to work with the values and to better support SPARQL. I think I’d like to do the same for the Redland MySQL storage, but still have to figure out the implications.
  • A new MySQL storage enginge that reads — later on maybe writes as well — the Jena schema layout. This could perhaps be an option to the MySQL storage enginge, in which case it would be almost trivial to also add an option for storing in a simpler, denormalized layout, where all the information is in a single table instead of spread out over four.

Last, and in some sense also least, I hacked a little conversion service, CSV-SPARQLer, that simply takes a URI to a CSV file and turns it into SPARQL Variable Bindings Results format (example, show query, extra example, show extra query).
As the extra example shows, I wanted to be able to subscribe to the action that goes on in the Redland Issue Tracker, but all it made available was a CSV file, so there: A CSV file converted into SPARQL result format, then converted into RSS through SPARQL Conversions XSLT. The resulting RSS is not perfect, notably the titles are a bit generic, but it’s good enough.

OWL-S Maker

There’s lots of talk about service descriptions these days, most notably around Tim Bray’s WSDL discussions (which includes SMEX-D), Norm Walsh’s NSDL for WITW, and the DAWG’s Saddle, to be a part of the SPARQL protocol. Phew, acronym overload…

Also in that space is OWL-S, an OWL-based Web service ontology that Danny Ayers has looked at. I though it would be fitting to describe semantic web services with RDF, so I tried it out but ended up feeling quite like Norm after his WSDL endeavour.

I did however manage to get something working, OWL-S Maker, which only uses a subset (and possibly in a wrong way), but is self-powered through its own service description, a PHP class (which currently has too many local dependencies to be worth anything to anyone but myself), and some XSLT. The descriptions could use some more work regarding the “typing” of inputs, but I thank Saddle may eventually provide some insights there. As for the general modelling of services, input, and output, I think I may have to look elsewhere — OWL-S seems too complicated.

As a usage example, see the Sparqlette service description input and the description of nearestAirport for a partially handcrafted example with two profiles.

SPARQL Conversions XSLT

To help develop and test my new Sparqlette service, I hacked a couple of XSLTs that might come in handy here and there…

SPARQL to RSS (lastest version: 0.3)
As its name implies, this XSLT turns a SPARQL Query Results XML Format document (Variable Binding Results) into an RSS channel, making it possible to subscribe to the results of an (almost) standard SPARQL query without using CONSTRUCT. As can be expected, not all query results work, as the RSS specification mandates certain elements. Thus, the value of the channel’s rss:link property is taken from an XSLT parameter named _uri, the variable bindings to use for rss:link and rss:title in each item is determined via some crude heuristics, and only items that have a URI for the chosen rss:link binding are created.

Variable selection heuristics for rss:link / rss:title:

  1. If there’s a variable named rsslink or rsstitle respectively, the bindings for that variable is used for all items.
  2. Otherwise, the first variable that has a binding to a URI is used for rss:link, and the first variable that has a binding to a literal is used for rss:title

For rss:link I wanted to add another option between the two, that would locate a variable that only has bindings to URIs, but I couldn’t get it working with a single XPath expression, so I gave up.
Example RSS (view Sparqlette input parameters).

SPARQL to SPARQL (latest version: 0.1)
This XSLT simply converts documents in the syntax of any of the currently two SPARQL Query Results XML Format draft specifications, W3C Working Draft 21 December 2004 and $Revision: 1.29 $ of $Date: 2005/05/03 09:58:04 $, into the syntax of the latest version, currently the latter. I promise to do my best to stay up to date…

Note: This entry — as all entries in the Release category — will serve as a changelog (you can subscribe to its RSS feed if you want to make sure you don’t miss out on any updates).