I originally planned to title this “Partial schemas!” but as I assembled the example I realized that, in addition to demonstrating the value of partial, incrementally built schemas, the steps below also show how inferencing with schemas can implement transformations that are very useful in data integration. In the right situations this can be even better than SPARQL, because instead of using code (whether procedural or declarative) the transformation is driven by the data model…
I’ve been hearing more about the Blazegraph triplestore (well, “graph database with RDF support”), especially its support for running on GPUs, and because they also advertise some degree of RDFS and OWL support, I wanted to see how quickly I could try that after downloading the community edition. It was pretty quick.
Note: I wrote this blog entry to accompany the IBM Data Magazine piece mentioned in the first paragraph, so for people following the link from there this goes into a little more detail on what RDF, triples, and SPARQL are than I normally would on this blog. I hope that readers already familiar with these standards will find the parts about doing the inferencing on a Hadoop cluster interesting.
Once, at an XML Summer School session, I was giving a talk about semantic web technology to a group that included several presenters from other sessions. This included Henry Thompson, who I’ve known since the SGML days. He was still a bit skeptical about RDF, and said that RDF was in the same situation as XML—that if he and I stored similar information using different vocabularies, we’d still have to convert his to use the same vocabulary as mine or vice versa before we could use our…
OK, it’s a rhetorical question. I know the answer: we can attach metadata to class and property declarations, so when we know that a given instance is a member of a particular class and has certain properties, if those are declared, we know more about the instance and can do more with it, not least of all aggregate it more easily with other data that uses the same or related classes and properties.
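To make that concrete, here is a minimal sketch in plain Python, with triples stored as tuples. Everything in the `ex:` namespace and the `infer` function itself are made up for illustration, and the function applies only a small subset of the RDFS entailment rules (domain, subproperty, and subclass), so treat it as a toy rather than a real inference engine.

```python
# Triples are (subject, predicate, object) tuples; all ex: names are hypothetical.
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"


def infer(triples):
    """Apply a small subset of RDFS entailment rules until nothing new appears."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            for s2, p2, o2 in known:
                # Property with a declared domain: the subject is in that class.
                if p2 == DOMAIN and p == s2:
                    new.add((s, RDF_TYPE, o2))
                # Subproperty: the statement also holds for the superproperty.
                if p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))
                # Subclass: an instance of the subclass is in the superclass.
                if p == RDF_TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, RDF_TYPE, o2))
        if new <= known:
            return known
        known |= new


# Metadata attached to the class and property declarations.
schema = [
    ("ex:worksFor", DOMAIN, "ex:Employee"),
    ("ex:Employee", SUBCLASS, "ex:Person"),
    ("ex:employedBy", SUBPROP, "ex:worksFor"),
]
# A single instance-level fact.
data = [("ex:jane", "ex:employedBy", "ex:acme")]

result = infer(schema + data)
inferred = sorted(result - set(schema + data))
for triple in inferred:
    print(triple)
```

From the one fact about `ex:jane`, the declarations let the sketch derive that she `ex:worksFor` `ex:acme`, that she is an `ex:Employee`, and that she is an `ex:Person`. None of that was stated in the data; it all comes from the schema.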
About two years ago I wondered if RDF Schema had become merely a layer of OWL or if anyone used RDFS by itself without OWL. My theory was that because tools such as TopBraid Composer, Protégé, and SWOOP that let you design RDFS vocabularies also let you assign OWL properties to your classes, people used those because they were there, and we ended up with few pure RDFS vocabularies.