I’ve been thinking about which machine learning tools can contribute the most to the field of digital humanities, and an obvious candidate is document embeddings. I’ll describe what these are below but I’ll start with the fun part: after using some document embedding Python scripts to compare the roughly 560 Wikibooks recipes to each other, I created an If you liked… web page that shows, for each recipe, what other recipes were calculated to be most similar to that…
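A minimal sketch of the "most similar recipes" idea, assuming each recipe has already been turned into an embedding vector. The tiny vectors and recipe names below are invented for illustration, and cosine similarity is one common choice of measure, not necessarily what the scripts mentioned above use.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar(name, vectors, k=3):
    """Return the k (other_name, score) pairs most similar to `name`, best first."""
    target = vectors[name]
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != name]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Made-up two-dimensional "embeddings"; real ones have hundreds of dimensions.
toy_vectors = {
    "pancakes": [1.0, 0.0],
    "waffles": [0.9, 0.1],
    "chili": [0.0, 1.0],
}

for other, score in most_similar("pancakes", toy_vectors, k=2):
    print(f"{other}: {score:.3f}")
```

Running `most_similar` once per recipe would give the lists that an "If you liked…" page needs.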

Custom HTML form front end, SPARQL endpoint back end

Your website's users sending SPARQL queries, even if they haven't heard of SPARQL.

In a recent Twitter exchange, Dr Joanne Paul asked “Does/can this exist? A website where I enter a title (eg. ‘earl of pembroke’) and a year (eg. 1553) and it spits out who held that title in that year (in this case, William Herbert).” Michelle Watson replied “I bet you could probably write SPARQL query to Wikipedia that would come close to doing that. Not sure how you’d embed that into a webpage though.” I replied to that: “Have an HTML form that…

Converting SQLite browser cookies to Turtle and querying them with SPARQL

Because you have more SQLite data than you realized.

There is a reasonable chance that you’ve never heard of SQLite and are unaware that this database management program, along with many database files in its format, may be stored on all of your computing devices. Firefox and Chrome in particular use it to keep track of your cookies and, as I’ve recently learned, many other things. Of course I want to query all that data with SPARQL, so I wrote some short, simple scripts to convert these tables of data to Turtle.
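As a rough illustration of that kind of conversion (a sketch, not the scripts themselves), the following turns every row of a SQLite table into a Turtle resource with one triple per non-null column. The `http://example.org/` namespace, the `table_to_turtle` helper, and the simplified `cookies` table are all assumptions made for the example; real browser cookie tables have different names and more columns.

```python
import sqlite3

def table_to_turtle(conn, table, base="http://example.org/"):
    """Return Turtle text: one subject per row, one triple per non-null column.

    Assumes table and column names are simple identifiers that are safe to
    use as Turtle local names.
    """
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    lines = [f"@prefix t: <{base}{table}#> .", ""]
    for i, row in enumerate(cur, start=1):
        pairs = []
        for col, value in zip(columns, row):
            if value is None:
                continue  # skip NULLs rather than invent a value
            if isinstance(value, (int, float)):
                literal = str(value)  # numeric literals need no quotes
            else:
                escaped = (str(value).replace("\\", "\\\\")
                                     .replace('"', '\\"')
                                     .replace("\n", "\\n"))
                literal = f'"{escaped}"'
            pairs.append(f"t:{col} {literal}")
        if pairs:
            lines.append(f"t:row{i} " + " ;\n    ".join(pairs) + " .")
    return "\n".join(lines)

# Hypothetical, simplified cookie table held in memory for the demo.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE cookies (host TEXT, name TEXT, expiry INTEGER)")
demo.execute("INSERT INTO cookies VALUES ('example.com', 'sid', 1700000000)")
print(table_to_turtle(demo, "cookies"))
```

To try it on a real browser database, work on a copy of the file, since the browser usually keeps the original locked while it is running.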

OpenStreetMap, or “OSM” to geospatial folk, is a crowd-sourced online map that has made tremendous achievements in its role as the Wikipedia of geospatial data. (The Wikipedia page for OpenStreetMap is really worth a skim to learn more about its impressive history.) OSM offers a free alternative to the commercial mapping systems out there, and you’d better believe that those commercial systems are reading that great free data into their own databases.

Converting JSON-LD schema.org RDF to other vocabularies

So that we can use tools designed around those vocabularies.

Last month I wrote about how we can treat the growing amount of JSON-LD in the world as RDF. By “treat” I mean “query it with SPARQL and use it with the wide choice of RDF application development tools out there”. While I did demonstrate that JSON-LD does just fine with URIs from outside of the schema.org vocabulary, the vast majority of JSON-LD out there uses schema.org.
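To make the idea of converting between vocabularies concrete, here is a small sketch that rewrites schema.org terms in a list of triples to FOAF equivalents. It uses a plain Python dictionary rather than SPARQL, and the particular pairings in `TERM_MAP` are illustrative assumptions, not a mapping taken from the post.

```python
SCHEMA = "http://schema.org/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumed pairings for the example only; a real mapping needs more care.
TERM_MAP = {
    SCHEMA + "Person": FOAF + "Person",
    SCHEMA + "name": FOAF + "name",
    SCHEMA + "url": FOAF + "homepage",
}

def convert(triples, term_map=TERM_MAP):
    """Rewrite predicates, and the objects of rdf:type triples, via term_map.

    Terms without a mapping pass through unchanged, and other objects are
    left alone so that literal values are never rewritten by accident.
    """
    out = []
    for s, p, o in triples:
        new_p = term_map.get(p, p)
        new_o = term_map.get(o, o) if p == RDF_TYPE else o
        out.append((s, new_p, new_o))
    return out

sample = [
    ("http://example.org/jane", RDF_TYPE, SCHEMA + "Person"),
    ("http://example.org/jane", SCHEMA + "name", "Jane Doe"),
    ("http://example.org/jane", SCHEMA + "jobTitle", "Editor"),
]

for triple in convert(sample):
    print(triple)
```

In SPARQL the same kind of rewrite is typically expressed as a `CONSTRUCT` query, which keeps the mapping in a form that RDF tools can run directly.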

Exploring JSON-LD

And of course, querying it with SPARQL.

I paid little attention to JSON-LD until recently. I just thought of it as another RDF serialization format that, because it’s valid JSON, had more appeal to people normally uninterested in RDF. Dan Brickley’s December tweet that “JSON-LD is much more widely used than Turtle” inspired me to look a little harder at the JSON-LD ecosystem, and I found a lot of great things. To summarize: the amount of JSON-LD data out there is exploding, and we can query it with SPARQL, so…

Changing my blog's domain name and platform

New look, new domain name.

For too long I’ve postponed the migration of my blog to something more phone-friendly. I accumulated many notes about doing this, and I also wanted to move more of my online life from the snee.com domain to bobdc.com. When someone recently asked me about changing the stylesheet (I have dug and dug in the aforementioned notes but can’t remember who and will add their name here if I ever find it) I thought I’d take a deep breath and follow through with this. This is the last new…