Triples about existing triples

The easy way and the hard way.

Several years ago in the blog post RDF* and SPARQL* I described how I had played with implementations of the new reification syntax that Olaf Hartig and Bryan Thompson proposed in their paper Foundations of an Alternative Approach to Reification in RDF. I found the new syntax to be straightforward and useful. As you can see from the recent W3C Community Group Report RDF-star and SPARQL-star, this syntax has progressed—with a more search-engine-friendly spelling of the spec’s name—closer to W3C standardization. (You’ll also see me listed as an author of that specification; I merely did a pull request that revised the tutorial from an earlier draft, so I was honored to be co-credited on that document.)

Because of the advancing specification, the wider implementation, and some potential syntax trickiness for situations that I would consider to be edge cases, I wanted to first review the current syntax that I feel will be the most popular and then review the potentially tricky part that I think most people can ignore. (I realized that the second part of my subtitle of “The easy way and the hard way” could imply the original reification syntax from years ago, but I think we can all put that behind us.)

The simple way: annotation syntax

The simple way is called annotation syntax, which as far as I know did not exist yet when I did my earlier experiments with RDF-Star and SPARQL-Star. Using the Turtle-Star syntax, if you have a triple that expresses a statement and you want to record other triples about that triple in annotation syntax, you put them after it inside of {| and |} delimiters.

Here is the example from that earlier blog entry expressed in annotation syntax. It has three triples that I got from Olaf’s slides that the blog entry linked to:

  1. One triple saying that (Stanley) Kubrick was influenced by (Orson) Welles.
  2. Another saying that triple 1 has a significance of 0.8.
  3. A third one saying that triple 1 has its source at a URL at nofilmschool.com.

@prefix d: <http://www.learningsparql.com/ns/data/> .

d:Kubrick d:influencedBy d:Welles {| 
   d:significance 0.8 ;
   d:source <https://nofilmschool.com/2013/08/films-directors-that-influenced-stanley-kubrick>
|} .

Using Apache Jena arq or the free version of Ontotext’s GraphDB, a SELECT * WHERE {?s ?p ?o} query to get all the triples in that block of Turtle-Star retrieves this:

?s                                      ?p             ?o
--------------------------------------- -------------- ------------------
d:Kubrick                               d:influencedBy d:Welles .
<< d:Kubrick d:influencedBy d:Welles >> d:significance "0.8"^^xsd:decimal . 
<< d:Kubrick d:influencedBy d:Welles >> d:source https://nofilmschool.com/2013/08/films-directors-that-influenced-stanley-kubrick . 

It’s the three triples from the numbered list above.

To understand better what this syntax adds, here is the sample data from my earlier blog entry on this topic:

@prefix d: <http://www.learningsparql.com/ns/data/> .
<<d:Kubrick d:influencedBy d:Welles>> d:significance 0.8 ;
      d:source <https://nofilmschool.com/2013/08/films-directors-that-influenced-stanley-kubrick> .

The same query on this data will show the second and third result rows above but not the first one. In other words, this data doesn’t actually say that Kubrick was influenced by Welles; it only has metadata about this statement.
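
To make the difference between those two data files concrete, here is a minimal Python sketch. It is not how Jena or GraphDB store anything; it just models a triple as a named tuple and a quoted triple as a triple sitting in the subject slot of another triple. The names Triple, expand_annotation, and quoted_only are made up for this illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    s: object  # a resource name, or another Triple when the subject is quoted
    p: str
    o: object


def expand_annotation(s, p, o, annotations):
    """Model `s p o {| ... |}`: the base triple is asserted, and each
    annotation becomes a triple whose subject is the quoted base triple."""
    base = Triple(s, p, o)
    triples = [base]
    for ann_p, ann_o in annotations:
        triples.append(Triple(base, ann_p, ann_o))
    return triples


def quoted_only(s, p, o, annotations):
    """Model `<< s p o >> p2 o2`: metadata only, the base triple is not asserted."""
    base = Triple(s, p, o)
    return [Triple(base, ann_p, ann_o) for ann_p, ann_o in annotations]


SOURCE = "https://nofilmschool.com/2013/08/films-directors-that-influenced-stanley-kubrick"
annotations = [("d:significance", 0.8), ("d:source", SOURCE)]

annotated = expand_annotation("d:Kubrick", "d:influencedBy", "d:Welles", annotations)
metadata_only = quoted_only("d:Kubrick", "d:influencedBy", "d:Welles", annotations)

print(len(annotated))      # 3
print(len(metadata_only))  # 2
```

The only difference between the two lists is the first entry of annotated: the plain statement that Kubrick was influenced by Welles.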

When you use annotation syntax in a SPARQL query, you’re using SPARQL-Star. To let me make my next SPARQL-Star query a little more interesting, I added the following data to the triples above:


d:Scorsese d:influencedBy d:Rosselini {| 
   d:significance 0.9 ;
   d:source <https://en.wikipedia.org/wiki/Martin_Scorsese>
|} .

d:Tarantino d:influencedBy d:Scorsese .

For which director influence triples do we have annotations about the significance of that influence?

PREFIX d: <http://www.learningsparql.com/ns/data/> 

SELECT ?director
WHERE {
  
  ?director d:influencedBy ?o {|
      d:significance ?significanceScore
  |} .

}

The result:


?director
-----------
d:Scorsese
d:Kubrick
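
As I understand the Community Group Report, the annotation pattern in that query is shorthand for two patterns: one that matches the asserted influencedBy triple and one that matches a d:significance triple whose subject is the quoted version of it. Here is a rough Python sketch of that matching logic over the same data. The tuple-based model and the function names annotate and directors_with_significance are assumptions for illustration, not anything a real SPARQL engine exposes.

```python
def annotate(s, p, o, annotations):
    """Return the asserted base triple plus one triple per annotation,
    each with the base triple (a nested tuple) as its subject."""
    base = (s, p, o)
    return [base] + [(base, ann_p, ann_o) for ann_p, ann_o in annotations.items()]


graph = (
    annotate("d:Kubrick", "d:influencedBy", "d:Welles", {
        "d:significance": 0.8,
        "d:source": "https://nofilmschool.com/2013/08/films-directors-that-influenced-stanley-kubrick",
    })
    + annotate("d:Scorsese", "d:influencedBy", "d:Rosselini", {
        "d:significance": 0.9,
        "d:source": "https://en.wikipedia.org/wiki/Martin_Scorsese",
    })
    + [("d:Tarantino", "d:influencedBy", "d:Scorsese")]  # no annotations
)


def directors_with_significance(triples):
    """Find directors whose influencedBy triple is asserted and also
    carries a d:significance annotation."""
    asserted = set(triples)
    found = []
    for s, p, o in triples:
        is_quoted_influence = isinstance(s, tuple) and s[1] == "d:influencedBy"
        if p == "d:significance" and is_quoted_influence and s in asserted:
            found.append(s[0])
    return found


print(directors_with_significance(graph))  # ['d:Kubrick', 'd:Scorsese']
```

Tarantino is left out because his influencedBy triple has no significance annotation. The order of the results differs from the table above, which is fine because the query has no ORDER BY.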

It’s all pretty simple, until we get to…

Quoted and asserted triples

The original proposal that I mentioned in the first paragraph above did not mention the concepts of quoted or asserted triples until its authors later added a “This document has become obsolete” paragraph at the top. In the latest version of the specification, the first subsection of the Concepts and Abstract Syntax section is titled Quoted and Asserted Triples and includes this:

A quoted triple is a triple used as the subject or object of another triple. Quoted triples can also be called “embedded triples”.

In RDF 1.1, an asserted triple is an element of the set of triples that make up an RDF graph. RDF-star does not change this except that an RDF-star triple can contain quoted triples. A triple can be used as an asserted triple, a quoted triple, or both, in a given graph.

This tells me that the regular triples that we’ve been using all along are now known as asserted triples, and the new kind—the kind that can be used as the subject or object of another triple—are known as quoted or embedded triples. (I did enjoy this quote from the Community Group Report after it used a Lisp analogy to explain the difference between asserted and quoted triples: “Obviously this way of thinking is helpful only if you understand how Lisp works”.)

Here is an example. The following Turtle translated to plain English says “Sam said that the earth is flat”.

@prefix d: <http://learningsparql.com/ns/data#> .

d:sam d:said << d:earth d:shape "flat" >> .

It does this with:

  • An asserted triple that tells us that Sam said something.
  • A quoted triple that tells us what he said: that the earth is flat.

That second one is a quoted triple because it’s used as the object of the first triple, and the << >> delimiters show us that it’s a quoted triple.

If I do a SELECT * WHERE {?s ?p ?o} query on this data to get all of that example's triples, this is all I will see:

--------------------------------------------------------
| s       | p        | o                                |
========================================================
| d:sam   | d:said   | << d:earth d:shape "flat" >>     |
--------------------------------------------------------

What if I do a query asking for triples about the earth’s shape, like this?

PREFIX d: <http://learningsparql.com/ns/data#>

SELECT *
WHERE {
  d:earth d:shape ?earthShape
}

I won’t get any response. That data has no asserted triples about the earth’s shape.

If I want this earth-is-flat triple to be both an asserted triple and a quoted triple, I can record it as both:

@prefix d: <http://learningsparql.com/ns/data#> .

d:sam d:said << d:earth d:shape "flat" >> .
d:earth d:shape "flat" . 

Example 7 in the Community Group Report also demonstrates this.
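
Here is a small Python sketch of why the earth's shape query comes back empty for the first version of the data and finds a result for the second. It assumes a toy model where a graph is a list of (s, p, o) tuples, a quoted triple is a tuple nested inside another one, and a hypothetical match function only compares top-level (asserted) triples without descending into quoted ones.

```python
def match(graph, s=None, p=None, o=None):
    """Return the asserted triples matching the pattern; None acts as a variable.
    Quoted triples nested inside other triples are never matched on their own."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


earth_is_flat = ("d:earth", "d:shape", "flat")

# Sam said it, but the graph itself does not assert it.
quoted_only = [("d:sam", "d:said", earth_is_flat)]

# Sam said it, and the graph asserts it as well.
quoted_and_asserted = [("d:sam", "d:said", earth_is_flat), earth_is_flat]

print(match(quoted_only, s="d:earth", p="d:shape"))          # []
print(match(quoted_and_asserted, s="d:earth", p="d:shape"))  # [('d:earth', 'd:shape', 'flat')]
```

In both versions a pattern with three variables still finds the d:sam triple, with the quoted triple as its object.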

I don’t like this redundancy because maintaining a thing and a separate copy of the thing is usually a bad idea. If you edit one, then maybe you do or don’t need to make the same edit to the other, and maintenance gets messy. That’s why it was nice to see that the first section of the Community Group Report’s Concrete Syntaxes section is Annotation Syntax, which I described above as the simpler way to just have triples about triples without some of those triples having a special status that prevents them from showing up as the result of an ?s ?p ?o query.

I’m sure that having this separate status be a part of the architecture will enable some finer-grained modeling. To just have triples about triples (especially to express data about edges between graph nodes, which was a key inspiration for all of this), I’m happy with the annotation syntax for now.

