Slash with natural Les Paul

I’ve understood SPARQL’s property path features well enough to demo them in the “Searching Further in the Data” section of my book Learning SPARQL. (See example files ex074 - ex085.) To be honest, I have very rarely used them in actual queries that I’ve written. I’ve only just realized how the property path slash operator can help with a pattern that I have used in a large percentage of my queries. It makes these queries more concise and removes at least one variable that would not have been in my SELECT statement anyway.

As an example, here is some very simple data about three people and who follows whom on social media:

@prefix schema: <http://schema.org/> .
@prefix d:      <http://learningsparql.com/ns/data#> .

d:i0432 d:name "Richard Mutt" .
d:i9771 d:name "Cindy Marshall" .
d:i8301 d:name "Craig Ellis" .

d:i0432 schema:follows d:i9771, d:i8301 .

If I want to list who Richard follows, I want to list their actual names, not their URIs. This would be an obvious query to do that:

PREFIX d:      <http://learningsparql.com/ns/data#>
PREFIX schema: <http://schema.org/> 

SELECT ?name WHERE {
  
  ?follower d:name "Richard Mutt" ;
            schema:follows ?person .
  
  ?person d:name ?name .
}

It finds the URIs of the people that Richard follows, stores them in the ?person variable, and then finds the d:name value of each of those people. This pattern is extremely common in SPARQL: a query finds resources that meet a certain condition, uses another triple pattern to get the human-readable names of those resources, and then uses those names in the SELECT statement.

The property path slash character lets me do the same thing with no need for the ?person variable in the previous query. This next query asks, for each resource that Richard follows, what their name is:

PREFIX d:      <http://learningsparql.com/ns/data#>
PREFIX schema: <http://schema.org/> 

SELECT ?name WHERE {
  ?follower d:name "Richard Mutt" ;
            # For each followed resource, what is its name?
            schema:follows/d:name ?name . 
}

In graph terms, we store the URI of Richard Mutt’s node in the ?follower variable, then traverse schema:follows graph edges to any nodes that then have a d:name edge, and then we store each value that the d:name edge leads to in the ?name variable.
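For comparison, here is a sketch of another way to write the same traversal without naming the intermediate node: SPARQL's square-bracket blank node syntax. It is offered only as an equivalent formulation of the query above, not as something the rest of this entry depends on. Against the sample data it should match the same followed resources as the slash version.

```sparql
PREFIX d:      <http://learningsparql.com/ns/data#>
PREFIX schema: <http://schema.org/>

SELECT ?name WHERE {
  ?follower d:name "Richard Mutt" ;
            # The brackets stand for an unnamed node that has a d:name value.
            schema:follows [ d:name ?name ] .
}
```

Both forms leave the followed resource unnamed. The slash version keeps the whole traversal in the predicate position, which may read more easily when a path has more than two steps.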

I don’t think that it’s intuitively very readable, which is why I added the comment in the query, but perhaps as I use this more I will get used to it. (Note also that the comment doesn’t ask “What is the name of each followed resource?”; I wanted it to reflect the syntax it describes a little more closely.)

This is such a common pattern that I wanted to show some examples from more real-life contexts. The following query asks Wikidata for the names of the members of Daft Punk. It does this by storing the URI representing each member of the group in the ?member variable, and it then asks for the rdfs:label value of each, filtered to only show the English representation. (You can execute this query with the Wikidata Query Service yourself.)

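PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>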

SELECT ?name WHERE {
  wd:Q185828 wdt:P527 ?member . 
  ?member rdfs:label ?name . 
  FILTER(lang(?name) = "en")
}

But we don’t need that ?member variable or the second triple pattern! We can just do this:

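PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?name WHERE {
  # For each member of Daft Punk, what is their name?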
  wd:Q185828 wdt:P527/rdfs:label ?name . 
  FILTER(lang(?name) = "en")
}

Run this second query and you will see the same results as the query before it.

I could do this with something besides names, such as their birth dates, but a list of dates with no context about what resources they describe isn’t very helpful. (Using it for names also just happens to build on a theme of recent entries in my blog, Human-readable names in RDF and Querying for labels.)
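To make that trade-off concrete, here is a sketch of a query that wants both the name and the birth date of each member. It assumes that Wikidata's P569 (date of birth) property is populated for the group's members, and the ?birthDate variable name is just an illustrative choice. Because two properties of the same node are needed, the intermediate node gets its variable back.

```sparql
PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?name ?birthDate WHERE {
  # A bare path such as wdt:P527/wdt:P569 would return dates with no names,
  # so the intermediate node is bound to a variable again.
  wd:Q185828 wdt:P527 ?member .
  ?member rdfs:label ?name ;
          wdt:P569 ?birthDate .   # P569: date of birth (assumed to be present)
  FILTER(lang(?name) = "en")
}
```

The slash operator seems to pay off most when only one value at the far end of the path is wanted. Once the query needs a second fact about the node in the middle, the two-pattern form is the simpler choice.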

As another example, I was going to create a query for the Rhizome Artbase SPARQL endpoint that I wrote about in Generating websites with SPARQL and Snowman, part 1. Then, I realized that I could use a query that was already in that blog entry, which you can run yourself:

PREFIX rt:   <https://artbase.rhizome.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?artistName WHERE {
  ?artwork rt:P29 ?artist . 
  ?artist rdfs:label ?artistName .
}
ORDER BY (?artistName)
LIMIT 250

This time, we’ll remove the ?artist variable from the end of the first triple pattern and the beginning of the second and create a property path out of rt:P29 and rdfs:label:

PREFIX rt:   <https://artbase.rhizome.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?artistName WHERE {
  ?artwork rt:P29/rdfs:label ?artistName .
}
ORDER BY (?artistName)
LIMIT 250

Run this one and you’ll see the same result as the previous query.

PREFIX rt:   <https://artbase.rhizome.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT * WHERE {
  ?artist rdfs:label "Jessica Gomula"@en . 
  ?artwork rt:P29 ?artist .
  ?artwork rdfs:label ?name . 
}
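The query just above, which lists artworks by a named artist, can be shortened in the same spirit. As a sketch, assuming the Artbase data stores artist names as English-tagged rdfs:label values the way that query expects, the path can end at a literal instead of a variable, which removes ?artist:

```sparql
PREFIX rt:   <https://artbase.rhizome.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT * WHERE {
  # For each artwork whose artist is named "Jessica Gomula", what is its name?
  ?artwork rt:P29/rdfs:label "Jessica Gomula"@en ;
           rdfs:label ?name .
}
```

With SELECT *, this version returns only ?artwork and ?name, because ?artist no longer exists as a variable.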

Has anyone else found a particular property path pattern to be worth using in a high percentage of their SPARQL queries?


Comments? Reply to my tweet (or even better, my Mastodon message) announcing this blog entry.

CC BY-SA 2.0 photo by Dineshraj Goomany