Category Archives: ARC

Introducing SPO(G)

SPO(G) is not a new syntax, a format, or a protocol. It is, however, a syntactic profile and a convention.

<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="g"/>
    <variable name="s"/>
    <variable name="p"/>
    <variable name="o"/>
  </head>
  <results>
    ...
  </results>
</sparql>

I have previously written about exchange of named RDF graphs through the use of quads-in-zips, and while that approach works just fine, it needs to be implemented on both sides of the exchange.

This is also true for SPO(G), but with SPARQL implementations being widespread, the export side is already in place all around the web, and the import side is quite easily implemented. I have sent an implementation of SPARQLXMLResultLoader for ARC to bengee, and while he is also busy working on a streaming serialiser, it seems likely it'll be a part of a coming release of ARC.

As can be seen from the example, SPO(G) is simply a constrained SPARQL Query Results XML Format: it must have three or four variables; s, p, and o must be present, with g being optional (making it YARS). In every result, all variables must be bound.
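To make the profile concrete, here is a minimal reader sketch in Python. The function and names are mine, not ARC's SPARQLXMLResultLoader: it validates the variable set and yields one quad per result, with None for g when the fourth variable is absent.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2005/sparql-results#}"

def load_spog(xml_text):
    """Parse an SPO(G) document, yielding one (s, p, o, g) tuple
    per result; g is None when the result set only has s, p and o.

    Term typing (URI vs. literal vs. bnode) is ignored for
    brevity; a real loader would keep it.
    """
    root = ET.fromstring(xml_text)
    # The head lists the variables; enforce the SPO(G) profile.
    names = {v.get("name") for v in root.find(NS + "head")}
    if names not in ({"s", "p", "o"}, {"s", "p", "o", "g"}):
        raise ValueError("not an SPO(G) result set: %r" % sorted(names))
    for result in root.find(NS + "results"):
        bound = {b.get("name"): b[0].text for b in result}
        # Every variable must be bound in every result.
        if set(bound) != names:
            raise ValueError("SPO(G) requires all variables bound")
        yield bound["s"], bound["p"], bound["o"], bound.get("g")
```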

SPO(G) isn’t as compact as quads-in-zips, but there’s no reason for it not to be compressed during exchange, either through a manual process or via the usual gzip-encoding on-the-fly.

I should perhaps write it up properly, but I think I’d rather go off and implement it for Redland.

SPARQL Endpoint


As I hinted at in my post about Named Graph Exchange, and as Danny later picked up, it is possible to create a SPARQL endpoint using ARC with only six lines of PHP. Or rather, it was, as Benjamin Nowack has now announced a suggested update that makes it possible in only three lines:

include_once('path/to/arc/ARC2.php');
$ep = ARC2::getStoreEndpoint(array(...));
$ep->go();

The same announcement, the ARC Release Notes and Change Log, and Benjamin's post about his cool data wiki also mention the new ARC plugin system, which I helped get off the ground.

My plugin submission is essentially a SPARQL client, making it possible to use ARC for accessing remote SPARQL endpoints: ARC2::RemoteEndpointPlugin

The plugin works, in that it supports the read-only query types SELECT, CONSTRUCT, DESCRIBE, and ASK, but not yet those that write, since the ARC class that handles HTTP only speaks GET at the moment.

The plugin homepage doubles as its documentation: it contains an example of how to use the plugin. Basically, it works just like a regular ARC2::Store, only with a simpler configuration, as only a SPARQL endpoint URL is needed.
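Under the hood, a read-only client like this only needs the GET binding of the SPARQL protocol: url-encode the query into the endpoint URL and ask for a results format. A sketch in Python, with a placeholder endpoint URL (the network call itself is split out, since it obviously depends on a live endpoint):

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

RESULTS_XML = "application/sparql-results+xml"

def query_url(endpoint, query):
    """Build the GET request URL the SPARQL protocol defines for
    the read-only query forms (SELECT, CONSTRUCT, DESCRIBE, ASK)."""
    return endpoint + "?" + urlencode({"query": query})

def run_query(endpoint, query):
    """Fetch the result set over HTTP; needs a live endpoint."""
    req = Request(query_url(endpoint, query),
                  headers={"Accept": RESULTS_XML})
    with urlopen(req) as resp:
        return resp.read()
```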

The homepage is also a Bazaar repository, and as such provides an Atom feed with updates and, of course, a DOAP file, which I hope will one day serve as the authoritative source for the information shown on the ARC plugins page, with automatic updates.

Fitting, also, that this surfaces on the very day that SPARQL becomes a W3C Recommendation. It has been quite a journey, and fun to be a (small) part of, starting way back when Squish and RDQL were state of the art, and later with The Gargonza Experiment. Perhaps it is now time to retire my partial SPARQL Rewriter and resurrect Sparqlette.

Named Graph Exchange

Following up on Exchange of Named RDF Graphs and the rapidly developing ARC2 RDF system, I have written a PHP/ARC2 version of my script for parsing and serialising a graph archive, and repackaged the original version into a single script for Redland.

I will be using this for testing ARC2 (performance) with my photo database, to see if I can manage a simpler interface without sacrificing the excellent performance of Redland. So far, it seems parsing might be a bottleneck, but that isn't really important if the query handling is good (so far it looks great; I can implement a SPARQL endpoint in six lines of PHP), since I can do batch processing offline.

You can find the scripts and some example archives in the bzr repository: named-graph-exchange, and download the whole package in .zip or .tgz format.

Next in the pipeline is an implementation that talks to a SPARQL endpoint, only downstream for now, but possibly using SPARQL+ or SPARUL for remote updates in the future.

The scripts are licensed under the Eiffel Forum License, version 2, per sbp’s considerations.