publishing

Scraping and linked data

Wired Magazine gives scraping the buzzword treatment but remains clueless about the semantic web and linked data.

The latest issue of Wired has an article with the provocative title “The Data Wars,” about web sites built around data retrieved by “bots” doing “scraping”. I quote these terms because the article twists them a bit to make them and their subjects seem more dramatic, more cutting edge, and (you guessed it) more “Web 2.0”.

Checking Out Yahoo Pipes

Easy, quick, and useful.

You’ve probably heard that Yahoo has a new drag-and-drop tool for easily combining and manipulating RSS and Atom feeds. (Forgive me for omitting the exclamation point from their name; speaking of which, shouldn’t the logo for yahoo.es be “¡Yahoo!”?) Tim O’Reilly called Yahoo Pipes no less than “a milestone in the history of the internet.” Early reports mentioned load problems, and I was extra busy with work, so I waited a bit before trying it.

2007-01-19 update: It looks like Bitpass is going under. While there’s no mention of it on their website, I just got email from them saying that “due to circumstances beyond our control, we are discontinuing our operations.” If anyone knows of a comparable service or a service with different ideas about enabling small vendors to sell content on the internet, please let me know.

Finding free content

People who should know better often think it’s easy.

A few weeks ago I wrote about free personal data that was really just randomly generated names and contact information created for some tests. Coherent prose by knowledgeable people is something that you can’t generate with a Python script, and it’s interesting to see the schemes that some people have devised to find such content.