Pipelining SPARQL queries in memory with the rdflib Python library

Using retrieved data to make more queries.

Last month in Dividing and conquering SPARQL endpoint retrieval I described how you can avoid timeouts for certain kinds of SPARQL endpoint queries by first querying for the resources that you want to know about and then querying for more data about those resources a subset at a time using the VALUES keyword. (The example query retrieved data, including the latitude and longitude, about points within a specified city.) I built my demo with some shell scripts, some Perl scripts, and a bit of spit…
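As a rough illustration of the batching idea described above, here is a minimal sketch that splits a list of resource URIs into subsets and builds one VALUES query per subset. The helper names (`batches`, `values_query`), the example.org URIs, and the batch size are all assumptions made for illustration, not the scripts from the original demo.

```python
from typing import Iterator, List


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def values_query(uris: List[str]) -> str:
    """Build a SPARQL query asking for all triples about the given resources."""
    values = " ".join(f"<{uri}>" for uri in uris)
    return (
        "SELECT ?s ?p ?o WHERE {\n"
        f"  VALUES ?s {{ {values} }}\n"
        "  ?s ?p ?o .\n"
        "}"
    )


# Hypothetical URIs standing in for the results of the first query.
resource_uris = [f"http://example.org/place/{n}" for n in range(1, 6)]

# One follow-up query per subset of two resources.
queries = [values_query(batch) for batch in batches(resource_uris, 2)]

for query in queries:
    print(query)
    print()
```

Each generated string could then be sent to an endpoint or run against an in-memory graph; keeping the subsets small is what keeps any single request from running long enough to time out.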

Note: I wrote this blog entry to accompany the IBM Data Magazine piece mentioned in the first paragraph, so for people following the link from there this goes into a little more detail on what RDF, triples, and SPARQL are than I normally would on this blog. I hope that readers already familiar with these standards will find the parts about doing the inferencing on a Hadoop cluster interesting.

Expand those shortened URLs before archiving Twitter messages

What if a shortening service goes down?

People love to talk about the implications of going down, but what if a URL-shortening service goes down? When I had trouble getting to recently, I realized that when they’re down, tweets referencing URLs are worthless, and that it wouldn’t be too difficult to do something about it before this happens. (I have wondered, though: why doesn’t Twitter grab some short domain name and offer their own shortening service?) After all, if you’re saving any…

My own little Twitter client

No AJAX, Flash, or AIR; just HTML, but arranged the way I want it.

I’ve tried various Twitter clients, but usually just went back to the web-based interface that people hate so much. My main complaint with it—and I saw no other clients that did any better—was that it showed tweets in reverse chronological order. Conversations and the multi-tweet mini-essays that some people write are difficult to read that way, so I decided to write my own little client.