Redesign, Retrospective and Resolutions

At some point this fall, I promised myself I’d refactor my web pages, to give them all a similar look, while making it easy to update that look in the future, and drive most of the content with RDF — after all, web pages are resources.

I’m not quite done with all the corners, most notably my homepage, but at least now the weblog and the photo albums share a common stylesheet, with everything in place for tweaking the rest, including a Planet Morten feed!

For the coming year, I intend to continue my switch of focus from producing RDF to consuming it. I have started out by generating a faceted interface for my photos (which could use an additional interface like libby’s calendar view), and with Leigh Dodds releasing Slug: A Simple Semantic Web Crawler, I’m reminded to get back to work with my scutter, Scutter Strategies and the Scutter Vocabulary. Also, Bob DuCharme has created rdfdata.org, which means that it’s now easier than ever to find data to play around with. Integral to most of this is me getting around to writing/porting the RDQL/SPARQL rewriting code to the Redland/MySQL storage backend.

To see what it’s like, I also intend to start a “real” (Danish) weblog, one that is updated on an (almost) daily basis. I think it’ll be good for me to get into the habit of writing more often than I do now, where most of the stuff I do sits quietly behind the scenes, waiting for that elusive moment when there’s time to refactor and document it properly. In short: moving to a state of mind where (seemingly!) perfect is an option, not a requirement. It’s a state I’m finding it hard to get to, but also one I have learned a lot from by watching others in the Open Source community.

So much to do, so little time, but I think it’s important to showcase how RDF can actually be used, not just produced, all the while making interesting stuff simpler.

WordPress Plugin: Weighted Interests

I recently ran into Matt Kingston’s Weighted Categories plugin. It was inspired by flickr’s tag list, which shows tag usage by giving each tag a font size proportional to the number of photos tagged with it.
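To make the weighting idea concrete, here is a minimal sketch in Python. It is not the plugin's actual code (the plugin is PHP), and the function name, the percentage range and the sample counts are all assumptions made up for illustration. It uses one common variant of the idea: scaling linearly between a smallest and a largest font size.

```python
def weighted_font_sizes(counts, min_pct=100, max_pct=250):
    """Map each tag to a font size (in percent), scaled linearly by usage count.

    min_pct and max_pct are hypothetical defaults, not values from the plugin.
    """
    if not counts:
        return {}  # nothing to weight, e.g. when no interests were found
    lo, hi = min(counts.values()), max(counts.values())
    spread = hi - lo
    sizes = {}
    for tag, n in counts.items():
        # When every tag is used equally often, fall back to the smallest size.
        share = (n - lo) / spread if spread else 0.0
        sizes[tag] = round(min_pct + share * (max_pct - min_pct))
    return sizes


# Hypothetical usage counts per category.
sizes = weighted_font_sizes({"rdf": 12, "foaf": 6, "xslt": 2})
for tag, pct in sizes.items():
    print(f'<li><a style="font-size: {pct}%">{tag}</a></li>')
```

The empty-input guard is there because an empty list would otherwise make min() and max() fail, and the equal-counts fallback avoids a division by zero.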

I decided to clean it up a bit and make it work on pages that didn’t include all posts, and then turned it into an example of how to extend the FOAF Output Plugin profile page (example, download, source).

The current version is 1.1 (released 2005-01-02).

Changes since 1.0:
  • Fixed problems when no interests were found.

This plugin requires the FOAF Output Plugin. Note that only categories that are classified as interests by the FOAF Output Plugin (that is, the ones that have a URI in their description) are included in the list. If you wish to show all categories, simply change the line that gets the list of interests:

$cats=get_foaf_output_interests(true); // was: get_foaf_output_interests(false)

In case you are interested, here’s the CSS I added to my global stylesheet to make it look like it does:

.profile dl dd.weighted-interests { 
  text-align: center; 
  padding: 0.5em; }
.profile dd.weighted-interests li { 
  list-style: none; 
  display: inline; 
  margin: 0.2em; }
.profile dd.weighted-interests a { 
  text-decoration: none; } 

Italian XML

You know you might be hungry when you are greeted by the following error message:

xsltStylePreCompute: unknown xsl:parma
xsltApplyOneTemplate: parma was not compiled

The offending part:

<xsl:template mode="navigation" priority="0.2" match="*[@rdf:resource]"> 
  <xsl:parma name="head" select="false()"/> 
  <div class="navigation"> 
    <xsl:variable name="this-p" select="concat(namespace-uri(),local-name())"/>

To keep this sort of on-topic: there was a workshop in Italy last week, SWAP 2004; Danny Ayers has more.

Aggregating and Archiving RSS Items

One of the better arguments for RSS 1.0 over other syndication formats is the claim that the (meta) data plugs directly into the greater Semantic Web, thus making it possible to go back and forth between the two, making them one. Unfortunately, most aggregators don’t really aggregate; at most they present a cached version of what’s currently offered. The result is a disconnect, as Bob DuCharme recently pointed out on rdf-interest (eventually leading to rdfdata.org).

However, archiving “items” from RSS feeds over time presents a few issues.

Not all RSS items have their own globally unique identifier
Some RSS feeds are “linkrolls” more than a list of recently created or updated resources. A linkroll references other resources directly, sometimes making incorrect statements about e.g. the creator or time of publication (example: del.icio.us/mortenf). Reliable identification is needed to be able to recognise items that are new, old or updated.
The ambiguous definition of a channel
The RSS 1.0 spec says the following about the rdf:about attribute on the channel element: “Most commonly, this is either the URL of the homepage being described or a URL where the RSS file can be found.” The right choice always seems to be the channel URI, the source of the statements, as that is what is commonly referred to by rdfs:seeAlso in e.g. blogrolls and personal FOAF files, and it is most often the identifier used for provenance in a triple store.
The rss:items/rdf:Seq construct
Each item is associated with one or more channels through the rss:items property, referencing a sequence of the “current” items. The sequence of items is determined through the use of the RDF/XML syntactic construct rdf:li, which is expanded to rdf:_1, rdf:_2, and so on, in the RDF model. When a new item is added to a channel, it is added at the first position, rdf:_1, the existing items shift towards the end of the sequence, and the last item disappears from the sequence. In a naïve implementation, archiving a channel over time would lead to a “sequence” with each item being referenced more than once, and to loss of actual temporal information: it’d be impossible to determine the actual order in which the items appeared. Note also that in an even more naïve implementation (one that doesn’t recognise that the two sequences should be seen as one), the result wouldn’t be an “invalid” sequence, but instead a channel with multiple rss:items properties, each with a perfectly fine sequence.
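The sequence problem is easy to reproduce with a small Python sketch. It is only an illustration under stated assumptions: the two fetches, the single-letter item identifiers and both function names are hypothetical, plain lists stand in for rdf:Seq, and new items are assumed to arrive at position 1 as described above. The first function unions the rdf:_n statements the way a naïve store would; the second shows one possible alternative, recording each item once in the order it was first seen.

```python
def naive_archive(snapshots):
    """Union the rdf:_n membership statements of every snapshot."""
    positions = {}  # item identifier -> set of positions it has been seen at
    for seq in snapshots:
        for index, item in enumerate(seq, start=1):
            positions.setdefault(item, set()).add(index)
    return positions


def first_seen_archive(snapshots):
    """Keep each item once, in first-seen order (oldest first)."""
    seen = []
    for seq in snapshots:
        # Within a snapshot the newest item sits at rdf:_1, so walk it backwards.
        for item in reversed(seq):
            if item not in seen:
                seen.append(item)
    return seen


# Two fetches of a hypothetical three-item channel; "d" was added in between.
fetch_1 = ["c", "b", "a"]
fetch_2 = ["d", "c", "b"]

naive = naive_archive([fetch_1, fetch_2])
ordered = first_seen_archive([fetch_1, fetch_2])
print(naive)    # "c" and "b" each end up at two positions
print(ordered)
```

In the naïve result both "c" and "d" claim rdf:_1, so the merged sequence no longer says which of them came first. The first-seen list keeps that order, but only if every item can be identified reliably, which is the first issue listed above.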
