Category Archives: XML

http://www.w3.org/XML/

Label Vocabulary in Spanish

The Label vocabulary now also contains labels in Spanish.

Leandro Mariano López, the master behind inkelog and the Speaks, Reads and Writes Schema, stepped up to the plate last night and sent me a translation of the terms and comments – thanks!

That of course meant that I had to make it possible to navigate between the different language versions, and tweak the content negotiation a bit.

The owl2html XSLT is now up to version 0.2, and a new small tool has seen the light of day: rdf-path.

It’s a simple Perl script with an XPath interface to an RDF/XML document, with a bunch of prefixes and namespaces predeclared. It’s simple to the point of triviality, but does its job well, in this case extracting a list of available languages for an ontology, to automate the generation of HTML pages:

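#!/bin/bash
# $1 is the vocabulary name; $1.rdf is expected in the current directory.
# The namespace base is derived from the part of the working directory after "web".
base="http://purl.org/net/vocab$(/bin/pwd | sed -e 's/^.*web//')/"
# Generate one HTML page per xml:lang value found on the ontology's rdfs:comment literals.
for lang in $(rdf-path "/*/*[rdf:type[@rdf:resource='http://www.w3.org/2002/07/owl#Ontology']]/rdfs:comment/@xml:lang" "$1.rdf"); do
  owl2html "$1.rdf" uri "$base$1#" lang "$lang" css "http://www.wasab.dk/morten/2004/06/owl2html.css" > "$1.$lang.html"
done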
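
For readers without rdf-path at hand, the language extraction step can be approximated in a few lines of Python. This is only a rough sketch of the same idea, not the Perl tool itself: the function name ontology_languages and the sample document are made up for illustration, and unlike the XPath expression it drops duplicate language tags.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL_ONTOLOGY = "http://www.w3.org/2002/07/owl#Ontology"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def ontology_languages(rdfxml_text):
    """Return the xml:lang values of rdfs:comment literals on owl:Ontology nodes."""
    root = ET.fromstring(rdfxml_text)
    languages = []
    for node in root:  # children of the document element, like /*/* in the XPath
        # Keep only nodes with an rdf:type child pointing at owl:Ontology.
        is_ontology = any(
            t.get("{%s}resource" % RDF) == OWL_ONTOLOGY
            for t in node.findall("{%s}type" % RDF)
        )
        if not is_ontology:
            continue
        for comment in node.findall("{%s}comment" % RDFS):
            lang = comment.get(XML_LANG)
            if lang and lang not in languages:
                languages.append(lang)
    return languages


# Hypothetical sample document, not the real label vocabulary.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://example.org/vocab#">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Ontology"/>
    <rdfs:comment xml:lang="en">Example.</rdfs:comment>
    <rdfs:comment xml:lang="da">Eksempel.</rdfs:comment>
    <rdfs:comment xml:lang="es">Ejemplo.</rdfs:comment>
  </rdf:Description>
</rdf:RDF>"""

if __name__ == "__main__":
    print(ontology_languages(SAMPLE))
```

Like the XPath expression, the sketch only recognises an ontology that is typed with an explicit rdf:type element, not one written as a typed owl:Ontology node.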

Additional translations are of course more than welcome.

Multi-lingual Literals in RDF

As a non-native English speaker, I find it good to see that both XML and RDF support language “tagging” of literals, to avoid the blind assumption that everything will be in English. Apparently the concept doesn’t get much use though: I have yet to see any tools that support multiple languages at the application level (with the possible exception of foaf-a-matic, which I translated into Danish, but it doesn’t do it at the vocabulary level).

Since I sometimes do development in Danish, but also want to integrate with the rest of the world, I have begun creating vocabularies with labels in at least Danish and English. I have also set up partial HTML presentation in multiple languages – the syntax parts are completed, and the content negotiation setup should be working: When dereferencing e.g. the namespace URI for the label vocabulary, http://purl.org/net/vocab/2004/03/label, you should get the English HTML version, unless you have your browser set up to accept Danish [da] as I have, in which case you should get the Danish version. If your user agent sends an Accept: header containing application/rdf+xml, you should get the RDF/XML version. This method of operation isn’t completely defined yet, but the W3C TAG is working hard on the issue, httpRange-14.
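
The negotiation rules described above can be sketched roughly as follows. This is not the actual server configuration, just a minimal Python illustration of the decision order; the function names, the variant labels ("rdf", "html.en", "html.da") and the simplified q-value handling are all assumptions made for the example.

```python
def parse_header(value):
    """Split an Accept-style header into names, highest q-value first."""
    items = []
    for part in value.split(","):
        fields = [f.strip() for f in part.split(";")]
        name = fields[0].lower()
        if not name:
            continue
        q = 1.0  # a missing q parameter means full preference
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if q > 0:
            items.append((q, name))
    # sort() is stable, so entries with equal q keep their header order.
    items.sort(key=lambda item: -item[0])
    return [name for _, name in items]


def choose_variant(accept="", accept_language="", available=("en", "da"), default="en"):
    """Pick a representation: RDF/XML if asked for, else HTML in the best language."""
    if "application/rdf+xml" in parse_header(accept):
        return "rdf"
    for tag in parse_header(accept_language):
        primary = tag.split("-")[0]  # "en-US" counts as "en"
        if primary in available:
            return "html." + primary
    return "html." + default


if __name__ == "__main__":
    print(choose_variant("text/html", "da, en;q=0.5"))
    print(choose_variant("application/rdf+xml", "da"))
```

A browser set up to prefer Danish would be served the Danish HTML page here, while any agent whose Accept: header lists application/rdf+xml gets the RDF/XML, regardless of language.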

Now, how to decide which literal to use when multiple are present?


Easy RDF-parsing with PHP

A common complaint about the RDF/XML syntax in the XML-literate communities is the lack of a simple PHP parser. While Redland with Raptor does the job perfectly, it almost demands root access to install, and doesn’t run on the Windows platform without cygwin.

The best alternative for PHP is RAP, but it is often claimed to be too slow, or its API to be hard to understand and use.

In trying to help out, I won’t be writing an RDF/XML parser from scratch (perhaps someone else will port Sean B. Palmer’s rdfxml.py to PHP), but I have created a little wrapper class for RAP, SimpleRdfParser, that only gives access to the RDF/XML parser, and thus doesn’t need the entire library. Also, the exposed API is simply an array of triples (indexed by subject), and together these simplifications help the parsing speed. There’s still room for improvement though: RAP was started a while ago and is based on previous syntax specifications, so it contains support for a number of constructs that aren’t legal anymore.

In addition to the parse method, string2triples, the class also contains a serialiser, triples2string, which turns the graph into a simple subset of RDF/XML, suitable for handling with a regular XML parser or XSLT, should anyone have those desires…
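
To give an idea of what “triples indexed by subject” can look like, here is a small Python sketch of the general idea. It is not the layout SimpleRdfParser actually returns (that is a PHP array); the function name index_by_subject and the example.org data are purely hypothetical.

```python
from collections import defaultdict


def index_by_subject(triples):
    """Group (subject, predicate, object) triples by their subject."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return dict(index)


# Hypothetical statements, for illustration only.
TRIPLES = [
    ("http://example.org/#me", "http://xmlns.com/foaf/0.1/name", "Example Person"),
    ("http://example.org/#me", "http://xmlns.com/foaf/0.1/homepage", "http://example.org/"),
    ("http://example.org/#doc", "http://purl.org/dc/elements/1.1/title", "Example Document"),
]

if __name__ == "__main__":
    for subject, statements in index_by_subject(TRIPLES).items():
        print(subject, len(statements))
```

With a structure like this, looking up everything said about one resource is a single key lookup instead of a scan over all statements.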

Examples:

The careful reader will notice that there is something missing in the output: The literal “Morten Frederiksen” should have a language of “en”, but it doesn’t. This is a bug in RAP, which has been reported and will likely be fixed in the next version.

Update: A small benchmark for parsing and reserializing approx. 800 statements (source) 100 times with Redland/Raptor, SimpleRdfParser, and RAP:

It turns out Redland/Raptor is about 3 times as fast as SimpleRdfParser, which is about twice as fast as RAP.

Update 2: A more realistic benchmark, doing only the parsing, no serialising:

Transforming RDF/XML with XSLT

A couple of months ago, while working on a project that will hopefully see the light of day soon, I realised I needed terms for singular and plural labels for properties and classes. Even with the help of SchemaWeb I couldn’t find existing terms, so I decided to cook my own, resulting in the label vocabulary with two properties:

plural
A relation between a term and its label in literal plural form.
singular
A relation between a term and its label in literal singular form.

This was not the only vocabulary I was working on at that moment, and I needed to be able to get an overview, a human-readable version. Last year I did the RDFS Explorer for basically the same purpose, but since I was entering OWL territory, it wasn’t really up to the task. Back to square one.


Garmin Geko 201 and RDF

Note: This post originated outside of the weblog, but I figured it really belongs here, and it makes it easier to find.

Following in the footsteps of Matt Biddulph, I acquired a Garmin Geko 201 GPS unit, wanting to annotate my digital photographs with location information.

Matt also wrote a Python script to extract the tracklogs and waypoints from the unit and turn them into RDF statements.

I tried it out, and found I had to overcome some dependency problems, as well as do a little tweaking to get all the information I wanted.

Dependencies

The following dependencies applied to the script on my almost clean Redhat 9 laptop installation with Python 2.2.2:

  1. PyGarmin, which itself doesn’t have any dependencies. It did however need a little patching (diff for garmin.py) to keep Python from complaining.
  2. Redland, with Python interface.

The libraries should be installed in reverse order…

Waypoints tweaking

To keep my version of Python (2.2.2) from crashing, I had to fix some import statements in the original version of Matt’s garmin2rdf.py. Also, a variable name clash was resolved.

Now being able to output an RDF model, I noticed that each waypoint only had a property (dc:title) for the name given to it in the unit, but no indication of the symbol used (such as a house, an airport or a building icon). I figured it would be nice to be able to use that information as well, and decided to map each symbol to a Wordnet term, through the use of the Wordnet 1.6 vocabulary namespace.

The Garmin Protocol Specification defines a number of symbols, but the Geko 201 only has a subset of these (and two additional symbols not defined in the specification).

Each numeric symbol ID is mapped to a string identifier such as sym_airport, which in turn (only for the Geko 201 symbols, as it’s not easy to create a sensible mapping) is mapped to a Wordnet noun, indicating which type of place is marked. This also fits in nicely with the spacenamespace effort.

When outputting the RDF model, each waypoint is assigned an rdf:type of the Wordnet term if found. If no term is found, the string identifier is output as the literal object of a http://hackdiary.com/ns/gps#symbol property, and if no string identifier is found, the symbol ID is output as a literal object of a http://hackdiary.com/ns/gps#symbolid property.
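
The fallback order just described can be sketched like this. It is not the code from garmin2rdf.py, only a minimal illustration: the function name symbol_statement, the entry 9001/sym_unmapped_example, and the Wordnet namespace URI are assumptions, and only a few rows of the mapping are included.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
GPS_NS = "http://hackdiary.com/ns/gps#"
WORDNET_NS = "http://xmlns.com/wordnet/1.6/"  # assumed Wordnet 1.6 namespace URI

# Numeric symbol ID -> string identifier (small excerpt).
SYMBOL_NAMES = {
    10: "sym_house",
    157: "sym_info",
    16384: "sym_airport",
    9001: "sym_unmapped_example",  # hypothetical symbol without a Wordnet term
}

# String identifier -> Wordnet noun (small excerpt).
WORDNET_TERMS = {
    "sym_house": "Home-1",
    "sym_info": "Information-2",
    "sym_airport": "Airport-1",
}


def symbol_statement(symbol_id):
    """Return (predicate, object, object_is_literal) for a waypoint symbol."""
    name = SYMBOL_NAMES.get(symbol_id)
    if name is None:
        # No string identifier known: fall back to the raw numeric ID.
        return (GPS_NS + "symbolid", str(symbol_id), True)
    term = WORDNET_TERMS.get(name)
    if term is None:
        # Identifier known, but no Wordnet term: output the identifier.
        return (GPS_NS + "symbol", name, True)
    # Best case: type the waypoint with the Wordnet class.
    return (RDF_TYPE, WORDNET_NS + term, False)


if __name__ == "__main__":
    for symbol_id in (16384, 9001, 4242):
        print(symbol_id, symbol_statement(symbol_id))
```

Keeping the two lookups separate means a symbol can gain a Wordnet term later without touching the ID table.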

Waypoint symbol mapping

Some of the symbols used relate to verbs, but places need to be identified by nouns. The mapping below doesn’t seem perfect, so comments and suggestions are welcome, especially regarding what to call a place with information (as well as the two special Geko 201 symbols, 8255 and 8256, a closed and an open box)…

Symbol ID  String identifier  Wordnet term
0          sym_anchor         Harbour-1
6          sym_dollar         Bank-4
7          sym_fish           Fishery-1
8          sym_fuel           Gas_station-1
10         sym_house          Home-1
11         sym_knife          Restaurant-1
14         sym_skull          Danger_zone-1
18         sym_wpt_dot        Train_station-1
19         sym_wreck          Wreck-4
150        sym_boat_ramp      Lake-1
151        sym_camp           Campground-1
152        sym_restrooms      Restroom-1
155        sym_phone          Telephone-1
156        sym_1st_aid        Hospital-1
157        sym_info           Information-2
159        sym_park           Park-1
160        sym_picnic         Park-2
161        sym_scenic         Sight-2
162        sym_skiing         Mountain-1
163        sym_swimming       Beach-1
170        sym_car            Parking_lot-1
171        sym_deer           Zoo-1
173        sym_lodging        Lodging-1
175        sym_trail_head     Spot-1
178        sym_flag           Place-1
8197       sym_golf           Golf_course-1
8234       sym_building       Building-1
8255       sym_8255           Place-1
8256       sym_8256           Place-1
16384      sym_airport        Airport-1
16395      sym_parachute      Amusement_park-1

Files

Thanks

… to mattb and danbri for making this possible.