Multi-lingual Literals in RDF

As a non-native English speaker, I’m glad to see that both XML and RDF support language “tagging” of literals, to avoid the blind assumption that everything will be in English. Apparently the concept doesn’t get much use though: I have yet to see any tools that support multiple languages at the application level (with the possible exception of foaf-a-matic, which I translated into Danish, but it doesn’t do it at the vocabulary level).

Since I sometimes do development in Danish, but also want to integrate with the rest of the world, I have begun creating vocabularies with labels in at least Danish and English. I have also set up partial HTML presentation in multiple languages – the syntax parts are completed, and the content negotiation setup should be working: When dereferencing e.g. the namespace URI for the label vocabulary, http://purl.org/net/vocab/2004/03/label, you should get the English HTML version, unless you have your browser set up to accept Danish [da] as I have, in which case you should get the Danish version. If your user agent sends an Accept: header containing application/rdf+xml, you should get the RDF/XML version. This method of operation isn’t completely defined yet, but the W3C TAG is working hard on the issue, known as httpRange-14.
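To make the negotiation rules above concrete, here is a minimal sketch in PHP. It is not the actual server setup: the function name and the three file names are made up for illustration, and quality values (q=) in the headers are ignored for brevity.

```php
<?php
// Hypothetical helper: choose a representation from the Accept and
// Accept-Language request headers. The file names are assumptions.
function chooseVariant(string $accept, string $acceptLanguage): string
{
    // RDF-aware agents ask for application/rdf+xml explicitly.
    if (stripos($accept, 'application/rdf+xml') !== false) {
        return 'label.rdf';
    }
    // Otherwise serve HTML, in Danish if the browser lists "da".
    foreach (explode(',', $acceptLanguage) as $range) {
        $tag = strtolower(trim(explode(';', $range)[0]));
        if ($tag === 'da' || strpos($tag, 'da-') === 0) {
            return 'label.da.html';
        }
    }
    // English is the default.
    return 'label.en.html';
}

echo chooseVariant('text/html', 'da, en;q=0.7'), "\n";
```

A real setup would more likely leave this to the web server's own content negotiation, which also honours the q values.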

Now, how to decide which literal to use when multiple are present?
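One possible answer, sketched in PHP under a few assumptions: the literals for a property are held in an array keyed by language tag (with '' for untagged literals), and the reader's preferences are an ordered list of tags. The function name and the sample labels are made up for illustration.

```php
<?php
// Hypothetical helper: $literals maps a language tag ('' for untagged)
// to a literal value; $preferred lists language tags in order of preference.
function pickLiteral(array $literals, array $preferred): ?string
{
    foreach ($preferred as $lang) {
        if (isset($literals[$lang])) {
            return $literals[$lang];
        }
    }
    // Fall back to an untagged literal, then to whatever comes first.
    if (isset($literals[''])) {
        return $literals[''];
    }
    return $literals ? reset($literals) : null;
}

// Sample values only, not taken from a real vocabulary.
$labels = ['en' => 'people', 'da' => 'personer'];
echo pickLiteral($labels, ['da', 'en']), "\n";
```

The fallback order (untagged first, then anything at all) is a design choice; an application could just as well show nothing when no preferred language matches.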

Continue reading Multi-lingual Literals in RDF

Easy RDF-parsing with PHP

A common complaint about the RDF/XML syntax in the XML-literate communities is the lack of a simple PHP parser. While Redland with Raptor does the job perfectly, it almost demands root access to install, and doesn’t run on the Windows platform without Cygwin.

The best alternative for PHP is RAP, but it is often claimed to be too slow, or its API is found hard to understand and use.

In trying to help out, I won’t be writing an RDF/XML parser from scratch (perhaps someone else will port Sean B. Palmer’s rdfxml.py to PHP), but I have created a little wrapper class for RAP, SimpleRdfParser, that only gives access to the RDF/XML parser, and thus doesn’t need the entire library. Also, the exposed API is simply an array of triples (indexed by subject), and together these simplifications help the parsing speed. There’s still room for improvement though: RAP was started a while ago and is based on previous syntax specifications, so it contains support for a number of constructs that aren’t legal anymore.

In addition to the parse method, string2triples, the class also contains a serialiser, triples2string, which turns the graph into a simple subset of RDF/XML, suitable for handling with a regular XML parser or XSLT, should anyone have those desires…
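To illustrate what “an array of triples (indexed by subject)” can look like, here is a small PHP sketch. The layout and the function name are assumptions made for illustration, not necessarily the structure SimpleRdfParser returns, and the sample data is invented.

```php
<?php
// Hypothetical layout: group flat (subject, predicate, object) triples
// by subject, so each subject maps to a list of [predicate, object] pairs.
function indexBySubject(array $triples): array
{
    $index = [];
    foreach ($triples as [$s, $p, $o]) {
        $index[$s][] = [$p, $o];
    }
    return $index;
}

// Invented sample data.
$triples = [
    ['http://example.org/#me', 'http://xmlns.com/foaf/0.1/name', 'Example Person'],
    ['http://example.org/#me', 'http://xmlns.com/foaf/0.1/nick', 'example'],
];
foreach (indexBySubject($triples) as $subject => $statements) {
    echo $subject, ' has ', count($statements), " statement(s)\n";
}
```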

Examples:

The careful reader will notice that there is something missing in the output: The literal “Morten Frederiksen” should have a language of “en”, but it doesn’t. This is a bug in RAP, which has been reported and will likely be fixed in the next version.

Update: A small benchmark for parsing and reserialising approximately 800 statements (source) 100 times with Redland/Raptor, SimpleRdfParser, and RAP:

It turns out Redland/Raptor is about 3 times as fast as SimpleRdfParser, which is about twice as fast as RAP.

Update 2: A more realistic benchmark, doing only the parsing, no serialising:

Transforming RDF/XML with XSLT

A couple of months ago, while working on a project that will hopefully see the light of day soon, I realised I needed terms for singular and plural labels for properties and classes. Even with the help of SchemaWeb I couldn’t find existing terms, so I decided to cook my own, resulting in the label vocabulary with two properties:

  • plural: A relation between a term and its label in literal plural form.
  • singular: A relation between a term and its label in literal singular form.

This was not the only vocabulary I was working on at that moment, and I needed to be able to get an overview, a human-readable version. Last year I did the RDFS Explorer for basically the same purpose, but since I was entering OWL territory, it wasn’t really up to the task. Back to square one.

Continue reading Transforming RDF/XML with XSLT

WordPress Plugin: Linkifier

To scratch an itch, I’ve hacked together a simple plugin that makes it easier to link to friends and category topics.

When enclosing the name of a friend, a friend’s blog name, or a category name, in {}, the plugin takes care of turning it into a link, including XFN link relations for friends. If a category or link isn’t found for a particular name, a warning will be shown above the post preview in the administration interface, but the text will be left alone.

Now, whenever I link to Danny Ayers, all I have to do is write his name within curly brackets: {Danny Ayers}. Same goes for my {RSS} category: RSS.
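The replacement step described above could be sketched roughly as follows. This is not the plugin's source: the function name, the lookup table and its sample entry are hypothetical, and a real plugin would read names and URIs from the WordPress links and categories.

```php
<?php
// Hypothetical lookup table: name => [URI, XFN rel value or ''].
function linkify(string $text, array $targets, array &$warnings): string
{
    return preg_replace_callback(
        '/\{([^{}]+)\}/',
        function (array $m) use ($targets, &$warnings) {
            $name = $m[1];
            if (!isset($targets[$name])) {
                // Unknown name: record a warning and leave the text alone.
                $warnings[] = "No link or category found for \"$name\"";
                return $m[0];
            }
            [$uri, $rel] = $targets[$name];
            $relAttr = $rel !== '' ? ' rel="' . htmlspecialchars($rel) . '"' : '';
            return '<a href="' . htmlspecialchars($uri) . '"' . $relAttr . '>'
                . htmlspecialchars($name) . '</a>';
        },
        $text
    );
}

// Invented sample entry.
$targets = ['Example Friend' => ['http://example.org/blog/', 'friend met']];
$warnings = [];
echo linkify('Hello {Example Friend} and {Nobody}.', $targets, $warnings), "\n";
```

Collecting the warnings in a separate array keeps the text untouched for unknown names, matching the behaviour described for the post preview.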

Of course, it’s not quite as simple as that: a few prerequisites need to be in order. Luckily, they coincide with the requirements of my FOAF output hack:

  • Categories must have only a URI in their description.
  • Link URIs must be to the weblog of the person.
  • Link name must be the name of the weblog.
  • Link description must be the name of the person.

Download the plugin: Linkifier (rename to linkifier.php and place it in the /wp-content/plugins/ directory)
View source: Linkifier Source

FOAF Explorer update

It has bothered me for a while that the FOAF Explorer wasn’t able to handle duplicate statements. It would either repeat the entire property/value pair, or in some situations “just” show the values next to each other without whitespace or other separators in between.

That last issue really isn’t fixed yet, but at least it now only happens with different values – I managed to remove duplicate statements with some crude PHP hacking.

I was already doing a parse and custom (re-)serialise with Redland/Raptor (the PHP source is available) before passing it on to the XSLT, so it was “just” a matter of making sure the same statement wasn’t serialised twice.

Since part of the point of reserialising was to group statements by subject, I had an index in the form of an array of statements per subject. Even though it can be optimized, I simply added a loop to check for the presence of the current predicate/object pair:

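  // Check whether this predicate/object pair is already stored for the subject.
  $found = false;
  $os = librdf_node_to_string($object);
  foreach ($Nodes[$node] as $pso) {
    // Each entry maps a predicate to a (subject, object) pair of nodes.
    $p = key($pso);
    list($s, $o) = current($pso);
    if ($p == $predicate && librdf_node_to_string($o) == $os) {
      $found = true;
      break;
    }
  }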

It works (try it!), and while the entire FE processing should now take longer, it actually helps somewhat that the XSLT doesn’t have to cope with too much…
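As for the optimisation mentioned above, one option is to replace the linear scan with a keyed lookup. The sketch below is a simplification, not the FOAF Explorer code: the function name is hypothetical, and predicates and objects are plain strings rather than Redland nodes compared via librdf_node_to_string().

```php
<?php
// Hypothetical helper: store a statement unless the same predicate/object
// pair has already been seen for that subject.
function addStatement(array &$seen, array &$bySubject, string $s, string $p, string $o): bool
{
    // A NUL byte is assumed not to occur in the strings, so it separates them.
    $key = $p . "\0" . $o;
    if (isset($seen[$s][$key])) {
        return false; // duplicate, skip it
    }
    $seen[$s][$key] = true;
    $bySubject[$s][] = [$p, $o];
    return true;
}

$seen = [];
$bySubject = [];
addStatement($seen, $bySubject, '_:a', 'foaf:name', 'Example');
addStatement($seen, $bySubject, '_:a', 'foaf:name', 'Example'); // ignored
echo count($bySubject['_:a']), "\n";
```

The extra $seen array costs some memory, but each check becomes a single hash lookup instead of a loop over all statements for the subject.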

While I was at it, I added support for the use of XFN as an RDF vocabulary, with the namespace http://gmpg.org/xfn/1#. It is now treated the same way as the Relationship vocabulary and the Trust vocabulary, which means that it’s handled as if all the properties are rdfs:subPropertyOf foaf:knows. It’s not perfect (the display could use some collapsing), but it works (try it!).
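The “handled as if” part amounts to a namespace check rather than real RDFS inference. A minimal PHP sketch of that idea, with hypothetical names and only the XFN namespace from the text filled in:

```php
<?php
const FOAF_KNOWS = 'http://xmlns.com/foaf/0.1/knows';

// Namespaces whose properties are treated as sub-properties of foaf:knows.
// Only the XFN namespace is taken from the text; others would be added here.
const KNOWS_NAMESPACES = ['http://gmpg.org/xfn/1#'];

// Hypothetical helper: true for foaf:knows itself and for any property
// in one of the listed namespaces.
function isKnowsLike(string $predicate): bool
{
    if ($predicate === FOAF_KNOWS) {
        return true;
    }
    foreach (KNOWS_NAMESPACES as $ns) {
        if (strpos($predicate, $ns) === 0) {
            return true;
        }
    }
    return false;
}

var_dump(isKnowsLike('http://gmpg.org/xfn/1#friend'));
```

A check like this is cheap, but it treats every property in the namespace alike, whether or not the vocabulary itself declares it a sub-property.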

Oh, almost forgot: Also added support for the Quaffing vocabulary by Leigh Dodds.

If only FE really knew about rdfs:subPropertyOf