Category Archives: Python

DOAP from the Bazaar

Last year, when I first released bzr-feed for generating an Atom feed for a Bazaar repository, I added an item to the TODO almost while writing the first lines:

* Add RDF/XML output with DOAP support

Just now, I removed that item, not because I updated bzr-feed, but because I have created a new Python script for generating DOAP using the same technique: bzr-doap.

Usage is quite simple: add something along the following lines to your .htaccess file and you’re good to go (presuming the usual cgi-bin stuff is in order):

RewriteCond %{REQUEST_FILENAME} !-s
RewriteRule (.*)\.rdf$ bzr-doap.cgi?dir=$1

Output is generated based on the information present in the bzr branch, but it can be augmented/overridden through the use of a .doaprc in the current and/or parent directory, like this:

[Project]
short_desc: DOAP generator for a Bazaar repository branch.
description: A CGI script for automatically generating DOAP for a Bazaar repository branch.
programming-language: Python
license: http://usefulinc.com/doap/licenses/python
[Maintainer]
foaf_homepage: http://www.wasab.dk/morten/
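A minimal sketch of how such a file could be read in Python, using the standard configparser module. This is not the code from bzr-doap: the function name load_doaprc is made up for the example, and it assumes that a .doaprc in the branch directory overrides one in the parent directory.

```python
import configparser
import os


def load_doaprc(branch_dir, filename=".doaprc"):
    """Merge .doaprc settings from the parent directory and branch_dir.

    Assumption: files are read parent first, so values found in
    branch_dir override the parent's. Missing files are skipped.
    """
    # interpolation=None keeps "%" characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    branch_dir = os.path.abspath(branch_dir)
    candidates = [
        os.path.join(os.path.dirname(branch_dir), filename),
        os.path.join(branch_dir, filename),
    ]
    parser.read(candidates)  # later files override earlier ones
    return {section: dict(parser[section]) for section in parser.sections()}
```

Since configparser splits each line at the first ":" or "=", a URL value such as the license line above is kept intact.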

There is of course a DOAP for bzr-doap, and as usual an Atom feed for you to follow its development in its Bazaar repository.

Recursitivity Galore

Sam Ruby: Of course, I would create the consolidated feed using Venus.

Ditto.

It’s really quite simple:

Through my use of Venus for e.g. Planet SF, I started using Bazaar, for which I created an Atom feed generator, the code for which is also stored in a Bazaar repository, which of course provides a feed and is being picked up by e.g. Sam, who in turn maintains another Bazaar repository that provides another feed, that gets picked up by my Venus installation, that then generates a global feed with all the changes — once.

Did I mention that I think Bazaar hits a sweet spot?

Bazaar Development

Inspired — again — by Sam Ruby, I have begun using Bazaar for source control. My first use case was creating a branch of Venus to implement a cache expunge mechanism. Also, I think Bazaar hits a sweet spot regarding ease of use for personal as well as distributed development, and once the prerequisites are in place, it’s easy to set up.

While doing that I learned some more about Python, and found out I wanted to be able to subscribe to the changes in a Bazaar branch.

Starting out with Sam’s tarify.cgi and Joe Gregorio’s sparklines as working examples, I have managed to create a simple Python script for generating an Atom feed: bzr-feed. You can of course subscribe to the changes!

On the TODO is creating RDF output with DOAP, but I think I might need to figure out a way to store and report more information than is currently available in the Bazaar repository.

To use bzr-feed, you will need something like the following in the .htaccess file in the directory containing the branches:

<FilesMatch ".*\\.cgi">
Options ExecCGI
AddHandler cgi-script .cgi
</FilesMatch>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-s
RewriteRule (.*)\.atom$ bzr-feed.cgi?dir=$1
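With a rule like this, a request for trunk.atom reaches the script with dir=trunk in the query string. The following is a sketch of how a CGI script might read that parameter safely. It is not taken from bzr-feed: requested_branch and base_dir are hypothetical names used for illustration.

```python
import os
from urllib.parse import parse_qs


def requested_branch(query_string, base_dir):
    """Return the absolute path named by the `dir` parameter, or None."""
    values = parse_qs(query_string).get("dir")
    if not values:
        return None
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, values[0]))
    # Reject anything that escapes base_dir (e.g. "../../etc").
    if os.path.commonpath([base, target]) != base:
        return None
    return target
```

In a real CGI script the query string would come from os.environ.get("QUERY_STRING", "").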

As a bonus, while working on bzr-feed, I realized that Apache apparently supports If-Modified-Since out of the box for CGI scripts as long as the Last-Modified header is sent (though ETag support still needs to be implemented separately). Nice.
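To illustrate, here is a rough sketch of the headers a CGI script could emit, plus a hand-rolled ETag check. It is not the bzr-feed implementation: the helper names are invented, and using a SHA-1 of the body as the ETag is just one possible choice.

```python
import hashlib
from email.utils import formatdate


def cache_headers(body, mtime):
    """Build Last-Modified and ETag header lines for a CGI response.

    body is the response as bytes, mtime a Unix timestamp.
    """
    etag = '"%s"' % hashlib.sha1(body).hexdigest()
    return [
        "Last-Modified: " + formatdate(mtime, usegmt=True),
        "ETag: " + etag,
    ]


def etag_matches(body, if_none_match):
    """Return True if the client's If-None-Match value names this body."""
    if not if_none_match:
        return False
    etag = '"%s"' % hashlib.sha1(body).hexdigest()
    candidates = [value.strip() for value in if_none_match.split(",")]
    return "*" in candidates or etag in candidates
```

When etag_matches returns True, the script would answer with a "Status: 304 Not Modified" line and no body. Weak validators (W/"...") are not handled in this sketch.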

Planet Changes

Recently, a new solar system was discovered, one with a planet that just might contain liquid water.

This is not about that.

Rather, this is about Planet Planet, a flexible feed aggregator that Sam Ruby and Danny Ayers (among others) have been hacking on recently.

I have created a personal planet for myself, one of the introverted ones that gather what I produce rather than what I consume: Planet Morten (styling yet to be perfected).

While setting it up and getting it to run the way I wanted, I noticed that it updated the generated files on every run, even when no new entries had been included. On a web that knows about Last-Modified and ETag (as Planet Planet itself does), it seemed like a waste of bandwidth to save on the incoming bytes but not on the outgoing ones.

My limited Python skills to the rescue.

Two patches against the latest nightly — the one with a Last-Modified header of Mon, 22 May 2006 16:02:22 GMT (even though it contains files that were changed in the future when I GOT it):

planet-filecmp.diff
This patch makes Planet Planet write its output to a temporary file and compare it to the previous version, which is only overwritten if the contents differ. This precludes the use of <TMPL_VAR date> in templates, as that will surely make the files differ, but the patch has the added bonus of not trashing the previous version of the generated file in case something goes wrong during the write process.
planet-conditional-output.diff
This patch contains the above patch and additional logic to prevent output files from being generated if no channels were updated. Thus, the original files will be left untouched if no new entries were found, logic that also somewhat invalidates <TMPL_VAR date> in templates, since it can’t be trusted anymore.
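The idea behind the first patch can be sketched roughly as follows. This is not the patch itself: write_if_changed is a hypothetical helper, and the use of filecmp is only a guess based on the patch's file name.

```python
import filecmp
import os
import tempfile


def write_if_changed(path, content):
    """Write content (a str) to path only if it differs from what is there.

    Returns True if path was created or replaced, False if left untouched.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
        if os.path.exists(path) and filecmp.cmp(tmp_path, path, shallow=False):
            return False  # identical: keep the old file and its mtime
        os.replace(tmp_path, path)  # atomic when on the same filesystem
        return True
    finally:
        # Clean up the temporary file if it was not moved into place.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

Because an unchanged file keeps its modification time, the web server can keep answering conditional requests with 304. Note that mkstemp creates the file readable by its owner only, so a real version would probably want to adjust the permissions before moving it into place.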

The Planet Planet development list has been notified.

Update: Sam Ruby was kind enough to point out some shortcomings in my solution and prompt me for a test case. Thus:

Garmin Geko 201 and RDF

Note: This post originated outside of the weblog, but I figured it really belongs here, and it makes it easier to find.

Following in the footsteps of Matt Biddulph, I acquired a Garmin Geko 201 GPS unit, wanting to annotate my digital photographs with location information.

Matt also wrote a Python script to extract the tracklogs and waypoints from the unit and turn them into RDF statements.

I tried it out, and found I had to overcome some dependency problems, as well as do a little tweaking to get all the information I wanted.

Dependencies

The following dependencies applied to the script on my almost clean Redhat 9 laptop installation with Python 2.2.2:

  1. PyGarmin, which itself doesn’t have any dependencies. It did however need a little patching (diff for garmin.py) to keep Python from complaining.
  2. Redland, with Python interface.

The libraries should be installed in reverse order…

Waypoints tweaking

To keep my version of Python (2.2.2) from crashing, I had to fix some import statements in the original version of Matt’s garmin2rdf.py. Also, a variable name clash was resolved.

Now being able to output an RDF model, I noticed that each waypoint only had a property (dc:title) for the name given to it in the unit, but no indication of the symbol used (such as a house, an airport or a building icon). I figured it would be nice to be able to use that information as well, and decided to map each symbol to a Wordnet term, through the use of the Wordnet 1.6 vocabulary namespace.

The Garmin Protocol Specification defines a number of symbols, but the Geko 201 only has a subset of these (and two additional symbols not defined in the specification).

Each numeric symbol ID is mapped to a string identifier such as sym_airport, which in turn is mapped to a Wordnet noun indicating which type of place is marked (for the Geko 201 symbols only, since it’s not easy to create a sensible mapping for the rest). This also fits in nicely with the spacenamespace effort.

When outputting the RDF model, each waypoint is assigned an rdf:type of the Wordnet term if one is found. If no term is found, the string identifier is output as the literal object of a http://hackdiary.com/ns/gps#symbol property, and if no string identifier is found, the symbol ID is output as the literal object of a http://hackdiary.com/ns/gps#symbolid property.
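The fallback described above can be sketched in plain Python, without Redland. This is not the code from garmin2rdf.py: the function name, the tiny lookup tables, the sym_bell entry, the Wordnet namespace URI and the way a term is appended to it are all assumptions made for the example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
GPS_NS = "http://hackdiary.com/ns/gps#"
WORDNET_NS = "http://xmlns.com/wordnet/1.6/"  # assumed namespace URI

# Small excerpts only; sym_bell is an illustrative entry without a term.
SYMBOL_NAMES = {1: "sym_bell", 10: "sym_house", 16384: "sym_airport"}
WORDNET_TERMS = {"sym_house": "Home-1", "sym_airport": "Airport-1"}


def symbol_statement(symbol_id):
    """Return (predicate, object, object_is_literal) for a waypoint symbol."""
    name = SYMBOL_NAMES.get(symbol_id)
    if name is None:
        # No string identifier known: fall back to the raw symbol ID.
        return (GPS_NS + "symbolid", str(symbol_id), True)
    term = WORDNET_TERMS.get(name)
    if term is None:
        # Identifier known, but no Wordnet mapping for it.
        return (GPS_NS + "symbol", name, True)
    # Assumption: the term is appended to the namespace as listed.
    return (RDF_TYPE, WORDNET_NS + term, False)
```

Each returned tuple would become one statement about the waypoint resource, with the object either a resource (the Wordnet class) or a literal.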

Waypoint symbol mapping

Some of the symbols used relate to verbs, but places need to be identified by nouns. The mapping below isn’t perfect, so comments and suggestions are welcome, especially regarding what to call a place with information (as well as the two special Geko 201 symbols, 8255 and 8256, a closed and an open box)…

Symbol ID  String identifier  Wordnet term
0          sym_anchor         Harbour-1
6          sym_dollar         Bank-4
7          sym_fish           Fishery-1
8          sym_fuel           Gas_station-1
10         sym_house          Home-1
11         sym_knife          Restaurant-1
14         sym_skull          Danger_zone-1
18         sym_wpt_dot        Train_station-1
19         sym_wreck          Wreck-4
150        sym_boat_ramp      Lake-1
151        sym_camp           Campground-1
152        sym_restrooms      Restroom-1
155        sym_phone          Telephone-1
156        sym_1st_aid        Hospital-1
157        sym_info           Information-2
159        sym_park           Park-1
160        sym_picnic         Park-2
161        sym_scenic         Sight-2
162        sym_skiing         Mountain-1
163        sym_swimming       Beach-1
170        sym_car            Parking_lot-1
171        sym_deer           Zoo-1
173        sym_lodging        Lodging-1
175        sym_trail_head     Spot-1
178        sym_flag           Place-1
8197       sym_golf           Golf_course-1
8234       sym_building       Building-1
8255       sym_8255           Place-1
8256       sym_8256           Place-1
16384      sym_airport        Airport-1
16395      sym_parachute      Amusement_park-1

Files

Thanks

… to mattb and danbri for making this possible.