Queries to explore a dataset

Even a schemaless one.

I recently worked on a project where we had a huge amount of RDF and no clue what was in there apart from what we saw by looking at random triples. I developed a few SPARQL queries to give us a better idea of the dataset’s content and structure, and they are generic enough that I thought they could be useful to other people.
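To give a rough idea of what this kind of exploration produces, here is a minimal sketch in plain Python rather than SPARQL. It is not one of the queries from the post: it only imitates the sort of "which classes and predicates are used, and how often" summary that a grouping query with counts would return. The sample triples and the `ex:` names are made up for illustration, and the `summarize` function is a hypothetical helper.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical sample data as (subject, predicate, object) tuples.
# These triples are invented for illustration only.
triples = [
    ("ex:book1", RDF_TYPE, "ex:Book"),
    ("ex:book1", "ex:title", "Moby Dick"),
    ("ex:book2", RDF_TYPE, "ex:Book"),
    ("ex:book2", "ex:title", "Emma"),
    ("ex:book2", "ex:author", "ex:austen"),
    ("ex:austen", RDF_TYPE, "ex:Person"),
]


def summarize(triples):
    """Return (predicate usage counts, instance counts per class)."""
    # How often each predicate appears anywhere in the data.
    predicate_counts = Counter(p for _, p, _ in triples)
    # How many rdf:type statements point at each class.
    class_counts = Counter(o for _, p, o in triples if p == RDF_TYPE)
    return predicate_counts, class_counts


predicate_counts, class_counts = summarize(triples)

# List classes and predicates, most frequently used first.
for name, count in class_counts.most_common():
    print("class", name, count)
for name, count in predicate_counts.most_common():
    print("predicate", name, count)
```

Even without a schema, a summary like this shows which classes have instances and which properties carry most of the data, which is usually enough to decide where to look next.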

In my last posting I described Carnegie Mellon University’s Index of Digital Humanities Conferences project, which makes over 60 years of Digital Humanities research abstracts and relevant metadata available both on the project’s website and as a file of zipped CSV that they update often. I also described how I developed scripts to convert all that CSV to some pretty nice RDF and made the scripts available on GitHub. I finished with a promise to follow up by showing some of the…

I think that RDF has been very helpful in the field of Digital Humanities for two reasons: first, because so much of that work involves gaining insight from adding new data sources to a given collection, and second, because a large part of this data is metadata about manuscripts and other artifacts. RDF’s flexibility supports both of these very well, and several standard schemas and ontologies have matured in the Digital Humanities community to help coordinate the different data sets.

17 years of my web bookmarks, with metadata

Featuring "75 Bleeding-Edge Search Engines To Beat Google", and more!

Much of the original point of the web was not just linking from one page to another but also saving and managing links, ideally with some metadata. Because of this, all browsers give you some way to save a link to a web page as a bookmark, and they typically let you sort these into a hierarchical arrangement of folders.

My command line OWL processor

With most of the credit going to Ivan Herman.

I recently asked on Twitter about the availability of command line OWL processors. I got some leads, but most would have required a little coding or integration work on my part. I decided that a small project that I did with the OWL-RL Python library a few years ago gave me a head start on just creating my own OWL command line processor in Python. It was pretty easy.

You probably don't need OWL

And if you do, there's a simple way to prove it.

During the course of my recent blog posts What is RDF?, What is RDFS?, What else can I do with RDFS?, and Taxonomy management with SKOS, some readers wondered if I would do a “What is OWL?” follow-up. I recommended to one inquirer that he read pages 39-41 and 263-269 of Learning SPARQL; I think that provides a pretty good introduction to OWL’s history and how to do some of the set-based logic that was an important part of its original intent.