Category Archives: Redland

http://librdf.org/ Redland RDF Application Framework
skos:related http://crschmidt.net/blog/archives/author/crschmidt/skos.rdf#c4 Redland RDF Application Framework

Named Graph Exchange

Following up on Exchange of Named RDF Graphs and the rapidly developing ARC2 RDF system, I have written a PHP/ARC2 version of my script for parsing and serialising a graph archive, and repackaged the original version into a single script for Redland.

I will be using this to test ARC2’s performance with my photo database, to see whether I can manage a simpler interface without sacrificing the excellent performance of Redland. So far, parsing looks like it might be a bottleneck, but that isn’t really important if the query handling is good, since I can do batch processing offline. And the query handling looks great so far: I can implement a SPARQL endpoint in 6 lines of PHP.

You can find the scripts and some example archives in their bzr repository, named-graph-exchange, and download the whole package in .zip or .tgz format.

Next in the pipeline is an implementation that talks to a SPARQL endpoint, only downstream for now, but possibly using SPARQL+ or SPARUL for remote updates in the future.

The scripts are licensed under the Eiffel Forum License, version 2, per sbp’s considerations.

SPARQL and SIMILE Timeline

Danny Ayers has been working on getting the SIMILE Timeline to eat SPARQL through the use of its JSON interface and some XSLT; he has notes on the ESW wiki.

While trying to get his work running here, I realized that the trip through XSLT to create JSON output really wasn’t necessary.

Instead, I’ve created a custom SPARQL event source parser, to load SPARQL results directly into the timeline. This way, the SPARQL results format generated by running the query doesn’t need a round trip into either JSON or the custom Timeline XML format.

The SPARQL Timeline demo works with any RSS 1.0 feed (try it with the one from Planet RDF).

Update: Now also works with “raw” SPARQL results, try it with photos of laptops from The Gargonza Experiment (scroll to April of 2005). Expected variable bindings are date, title, description, and link, although the latter is optional and the first can be replaced by start.

Update: Now really works with “raw” SPARQL results. Due to JavaScript’s security model, only files on this server worked until now. Also, a buglet regarding empty literal elements has been fixed.
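To make the expected variable bindings concrete, here is a minimal sketch in Python of the same mapping. It is not the JavaScript event source parser used in the demo. The function name, the event dictionary keys, and the use of the W3C Recommendation namespace for the results format are my assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: results use the namespace from the W3C Recommendation.
SPARQL_NS = "{http://www.w3.org/2005/sparql-results#}"


def bindings_to_events(xml_text):
    """Turn SPARQL XML results into a list of timeline-style event dicts."""
    root = ET.fromstring(xml_text)
    events = []
    for result in root.iter(SPARQL_NS + "result"):
        row = {}
        for binding in result.findall(SPARQL_NS + "binding"):
            # A binding wraps one child element: <uri>, <literal> or <bnode>.
            value = binding[0] if len(binding) else None
            # An empty literal has no text; treat it as an empty string.
            text = value.text if value is not None else None
            row[binding.get("name")] = text or ""
        # "date" is expected, but "start" may be used in its place.
        start = row.get("date") or row.get("start")
        if not start:
            continue  # nothing to place on the timeline
        event = {
            "start": start,
            "title": row.get("title", ""),
            "description": row.get("description", ""),
        }
        if row.get("link"):  # the link binding is optional
            event["link"] = row["link"]
        events.append(event)
    return events
```

Rows without a usable date are skipped here; that is a choice made for the sketch, not a statement about how the demo behaves.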

Describing Source Content for Redland/MySQL

I mentioned SADDLE (which used to be a part of the SPARQL Protocol draft, but is no longer) in passing the other day, when describing OWL-S Maker and talking about service description in general.

Service description in this context — and in the context of Dion Hinchcliffe’s OWL-S-less overview of SDLs — is mostly about the interface, the inputs and outputs, not what’s in between.

In contrast, SADDLE originally entered that territory with properties like saddle:vocabulary, and the other day on dev@gargonza Damian Steer announced a nice little JavaScript hack for using source content descriptions. This is not about I/O, but about what a “service” contains information about.

Central to Damian’s hack is a source content description, containing OWL statements about which classes and properties are present in the SPARQL source. For example, his description shows that all objects of foaf:name statements (in this particular store) are literals.

While the above example was handmade, I realized this was getting close to what I’ve been meaning to do for generating simpler and cleaner UIs for triplestores (asking for a foaf:Person? It’s likely you’d also want a foaf:name then…), so I figured I should try to generate such an SCD (Source Content Description) automagically, as Damian hints to himself: “Ideally this information would be mined from the store.”
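As a rough illustration of what mining such a description could mean, here is a minimal sketch in Python that works on an in-memory list of triples. It is not the SPARQL query from this post and it does not produce OWL. The function name, the tuple layout, and the "literal"/"resource"/"mixed" labels are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def describe_source(triples):
    """Summarise which classes and properties occur in a set of triples.

    triples: iterable of (subject, predicate, object, object_is_literal).
    """
    classes = set()
    object_kinds = defaultdict(set)
    for _subject, predicate, obj, is_literal in triples:
        if predicate == RDF_TYPE and not is_literal:
            classes.add(obj)  # the object of rdf:type names a class
        else:
            kind = "literal" if is_literal else "resource"
            object_kinds[predicate].add(kind)
    properties = {}
    for predicate, kinds in object_kinds.items():
        # One kind seen: all objects are literals, or all are resources.
        properties[predicate] = next(iter(kinds)) if len(kinds) == 1 else "mixed"
    return {"classes": classes, "properties": properties}
```

For a store where every foaf:name object is a literal, `describe_source(...)["properties"]["foaf:name"]` comes out as `"literal"`, which is the kind of fact the handmade description states.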

I’ve managed to come up with a single query that returns all the information necessary to construct an SCD, but since it’s quite complex, I’ll explain the steps I took on the way there.


Redland Hacking

During the last few days, I’ve been hacking a bit on — and with — Redland.

First off, I verified that the patch from Simon Cross for a bug regarding portability of the hash calculations in the MySQL storage engine does indeed work. When originally writing the code for this I hadn’t thought of the use case of accessing a storage on a different architecture, but that is of course an important one. The issue is now closed; Dave Beckett has applied the patch to CVS.

I also created an issue regarding the design decision to not look for hash collisions, 28: Hash collisions possible in MySQL storage engine. I don’t have a solution ready for this, but I thought it would be a good idea to get it out in the open, so people are aware of the problem.
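Both hash issues can be sketched in a few lines of Python. This is only an illustration under assumptions, not the storage engine’s C code: I assume here that a node ID is a fixed-width integer taken from an MD5 digest, and `node_id`, `NodeTable` and `CollisionError` are hypothetical names. The `bits` parameter exists only so that collisions are easy to provoke in a test.

```python
import hashlib


class CollisionError(Exception):
    """Two different values hashed to the same ID."""


def node_id(value, bits=64):
    """Derive an integer ID from the MD5 digest of a node value."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    # An explicit byte order gives the same ID on every architecture,
    # instead of depending on the native layout of the host.
    full = int.from_bytes(digest[:8], "little")
    return full >> (64 - bits)


class NodeTable:
    """Stores values by hashed ID and refuses silent collisions."""

    def __init__(self, bits=64):
        self.bits = bits
        self.rows = {}

    def insert(self, value):
        key = node_id(value, self.bits)
        # Keep the first value seen for this ID, then compare.
        existing = self.rows.setdefault(key, value)
        if existing != value:
            raise CollisionError(
                "%r and %r share ID %d" % (existing, value, key)
            )
        return key
```

Checking on insert costs one extra comparison per node. Whether that trade-off is acceptable for the real storage is exactly the open question in the issue.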

Another minor issue with the MySQL storage was its excessive use of connections, especially visible when using Rasqal. I wrote a patch to make it use persistent connections, and Christopher Schmidt was kind enough to help me test it. It seems to be working fine for him, and it does here as well, so I sent a message to redland-dev asking for comments. Hopefully this will get into CVS soon as well.

Then came a bit of work on the long-running issue with the PHP interface to Redland. PHP has its own unique NULL-value, so when the Redland Bindings blindly returned a C NULL wrapped in a regular PHP object (in the case of an error), Redland would crash Apache/PHP upon trying to use that object. In the past, Dave has been kind enough to hack a bit here and there when I ran into problems, but I decided to try to close the issue more permanently. Thus, as explained in 15: PHP binding functions should return a PHP null, I patched the pointer return function to always return a PHP NULL instead of a C one. My first version of the patch seems to have been faulty, as Dave couldn’t apply it to CVS, but I created a new one that I hope will do a better job. Also, as a side effect of this change, it is now no longer possible to pass a C NULL into some of the Redland functions where needed, so it seems we have to create a few PHP helper functions to return a C NULL wrapped in a PHP object…

I’ve got more ideas for improvement to Redland, but they really can’t be considered as anything other than feature requests to be coded on a day (and night) with nothing else to do, so I haven’t created issues for these:

  • An option for the MySQL storage to prefix table names with a constant string, to make it possible to have more than one storage in the same database, inspired by the way WordPress does it, and to help out with Dan Brickley’s SparqlPress project.
  • Some builtin “reasoning” functions, to — among other things — make my Redland Smusher obsolete. I’ve discussed this a bit with Dave, but we still haven’t figured out the “right” or best way to implement it.
  • It seems the new version of 3store will store simple datatyped literals like integers in separate columns, to make it easier for the database engine to work with the values and to better support SPARQL. I think I’d like to do the same for the Redland MySQL storage, but still have to figure out the implications.
  • A new MySQL storage engine that reads (and later on maybe writes as well) the Jena schema layout. This could perhaps be an option to the MySQL storage engine, in which case it would be almost trivial to also add an option for storing in a simpler, denormalized layout, where all the information is in a single table instead of spread out over four.
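The first idea in the list, a table name prefix, is simple enough to sketch in Python. This is an illustration only and not part of Redland. The table names in `BASE_TABLES` and the function name are hypothetical stand-ins, and restricting the prefix to letters, digits and underscores is my own assumption, made to keep the result safe to use as an SQL identifier.

```python
import re

# Hypothetical stand-ins for the tables a storage would create.
BASE_TABLES = ("Statements", "Resources", "Literals", "Bnodes")


def prefixed_tables(prefix):
    """Map each base table name to its prefixed name."""
    # Reject anything that could not safely appear in a table name.
    if not re.fullmatch(r"[A-Za-z0-9_]*", prefix):
        raise ValueError("unsafe table prefix: %r" % prefix)
    return {name: prefix + name for name in BASE_TABLES}
```

With two different prefixes, say `photos_` and `blog_`, two storages get disjoint table names and can share a single database.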

Last, and in some sense also least, I hacked a little conversion service, CSV-SPARQLer, that simply takes a URI to a CSV file and turns it into SPARQL Variable Bindings Results format (example, show query, extra example, show extra query).
As the extra example shows, I wanted to be able to subscribe to the action that goes on in the Redland Issue Tracker, but all it made available was a CSV file, so there: A CSV file converted into SPARQL result format, then converted into RSS through SPARQL Conversions XSLT. The resulting RSS is not perfect, notably the titles are a bit generic, but it’s good enough.
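The core of such a conversion is small. Here is a minimal sketch in Python, not the CSV-SPARQLer code itself. I assume that the first CSV row holds the variable names, that every cell becomes a plain literal, and that empty cells are left unbound. The function name is hypothetical, the namespace is the one from the W3C Recommendation, and the sketch takes the CSV text directly instead of fetching it from a URI.

```python
import csv
import io
import xml.etree.ElementTree as ET

SPARQL_NS = "http://www.w3.org/2005/sparql-results#"


def csv_to_sparql_results(csv_text):
    """Convert CSV text into a SPARQL XML results document (as a string)."""
    rows = csv.reader(io.StringIO(csv_text))
    header = next(rows, [])  # assumption: first row names the variables
    root = ET.Element("sparql", {"xmlns": SPARQL_NS})
    head = ET.SubElement(root, "head")
    for name in header:
        ET.SubElement(head, "variable", {"name": name})
    results = ET.SubElement(root, "results")
    for row in rows:
        if not row:
            continue  # skip blank lines
        result = ET.SubElement(results, "result")
        for name, cell in zip(header, row):
            if cell == "":
                continue  # empty cell: leave the variable unbound
            binding = ET.SubElement(result, "binding", {"name": name})
            ET.SubElement(binding, "literal").text = cell
    return ET.tostring(root, encoding="unicode")
```

Calling `csv_to_sparql_results("id,title\n1,Crash in parser\n")` yields a document with two variables and one result, ready to be fed to an XSLT such as the one used for the RSS conversion.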