A common complaint about the RDF/XML syntax in the XML-literate communities is the lack of a simple PHP parser. While Redland with Raptor does the job perfectly, it almost demands root access to install, and it doesn’t run on Windows without Cygwin.
The best alternative for PHP is RAP, but it is often said to be too slow, and some people find its API hard to understand and use.
In trying to help out, I won’t be writing an RDF/XML parser from scratch (perhaps someone else will port Sean B. Palmer’s rdfxml.py to PHP), but I have created a little wrapper class for RAP, SimpleRdfParser, that only gives access to the RDF/XML parser, and thus doesn’t need the entire library. Also, the exposed API is simply an array of triples (indexed by subject). Together, these simplifications improve parsing speed. There’s still room for improvement, though: RAP was started a while ago and is based on earlier versions of the syntax specification, so it contains support for a number of constructs that aren’t legal anymore.
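To give a feel for what an array of triples indexed by subject can look like, here is a minimal sketch. The layout (subject, then predicate, then a list of object arrays with `type`, `value` and `lang` keys), the helper functions `add_triple` and `count_triples`, and the example.org subject URI are all assumptions made for illustration; the array SimpleRdfParser actually returns may be shaped differently.

```php
<?php
// Hypothetical layout, not taken from SimpleRdfParser itself:
// $triples[subject][predicate] = list of object arrays.

// Append one statement to the subject-indexed array.
function add_triple(array &$triples, string $s, string $p, array $o): void
{
    $triples[$s][$p][] = $o;
}

// Count all statements by walking subject -> predicate -> objects.
function count_triples(array $triples): int
{
    $n = 0;
    foreach ($triples as $predicates) {
        foreach ($predicates as $objects) {
            $n += count($objects);
        }
    }
    return $n;
}

$me = 'http://example.org/#me'; // illustrative subject URI

$triples = [];
add_triple($triples, $me, 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
    ['type' => 'uri', 'value' => 'http://xmlns.com/foaf/0.1/Person']);
add_triple($triples, $me, 'http://xmlns.com/foaf/0.1/name',
    ['type' => 'literal', 'value' => 'Morten Frederiksen', 'lang' => 'en']);

// Everything known about one subject is a single array lookup.
foreach ($triples[$me] as $predicate => $objects) {
    foreach ($objects as $object) {
        echo $predicate, ' -> ', $object['value'], "\n";
    }
}
```

Indexing by subject means that collecting all statements about a resource needs no scan over the whole graph, which suits the typical "describe this thing" use in a template or script.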
In addition to the parse method, string2triples, the class also contains a serialiser, triples2string, which turns the graph into a simple subset of RDF/XML, suitable for handling with a regular XML parser or XSLT, should anyone have those desires…
The careful reader will notice that something is missing in the output: the literal “Morten Frederiksen” should carry the language tag “en”, but it doesn’t. This is a bug in RAP, which has been reported and will likely be fixed in the next version.
Update: A small benchmark for parsing and reserialising approximately 800 statements (source) 100 times with Redland/Raptor, SimpleRdfParser, and RAP:
It turns out Redland/Raptor is about 3 times as fast as SimpleRdfParser, which is about twice as fast as RAP.
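For anyone who wants to repeat this kind of measurement, a timing harness can be as simple as the sketch below. It is not the script used for the numbers above: the `benchmark` function is a hypothetical helper, and the workload is a stand-in closure, to be replaced with a call that parses (and optionally reserialises) the test document with the library being measured.

```php
<?php
// Run $task $runs times and return the elapsed wall-clock time in seconds.
function benchmark(callable $task, int $runs): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $task();
    }
    return microtime(true) - $start;
}

// Stand-in workload (an assumption): swap in the real parse call here.
$calls = 0;
$workload = function () use (&$calls) {
    $calls++;
    return md5(str_repeat('x', 1000));
};

$runs = 100;
$elapsed = benchmark($workload, $runs);
printf("%d runs: %.4f s (%.6f s per run)\n", $runs, $elapsed, $elapsed / $runs);
```

Running each library through the same harness, on the same input and in the same number of iterations, keeps the comparison fair; reading the input file once outside the timed closure avoids measuring disk access instead of parsing.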
Update 2: A more realistic benchmark, doing only the parsing, no serialising: