Human-readable names in RDF

Sometimes simple, sometimes not.

labeled tomato plants

rdfs:label

First, reviewing some basics before I discuss the edge cases: resources in RDF are represented by URIs, and the spelling of a given URI often provides no clues about what the URI represents. For example, you wouldn’t know from looking at http://www.wikidata.org/entity/Q144 that it represents “dog” as a Wikidata topic. (We’ll see below that this is for a good reason.)

Subject-predicate-object triples use predicate-object pairs to describe the resources represented as URIs by each subject. (We sometimes forget that RDF stands for “Resource Description Framework”.) The most popular predicate is the one that gives us a human-readable name to tell us what resource the URI represents: rdfs:label. People typically use it to assign an identifying name to a resource.

You can optionally add a language tag to indicate the natural language of the label value. Assigning labels in different languages to the same resource makes it easier to build multilingual applications.

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix wd:   <http://www.wikidata.org/entity/> .

wd:Q144 rdfs:label "dog"@en .
wd:Q144 rdfs:label "perro"@es .

This also reminds us why it’s a bad practice to include descriptive text as part of the URI: including “dog” in the URI http://www.wikidata.org/entity/Q144 would only help people who know English, and including “perro” would only help people who know Spanish.
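To make the multilingual point concrete, here is a minimal Python sketch of how an application might choose among language-tagged labels. It models the labels as plain tuples rather than using an RDF library, and the `pick_label` helper and its fall-back-to-the-URI policy are hypothetical choices made for illustration, not part of any standard.

```python
# Labels modeled as (subject, predicate, value, language tag) tuples.
# This is a plain-Python stand-in for an RDF graph, used only for illustration.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
Q144 = "http://www.wikidata.org/entity/Q144"

LABELS = [
    (Q144, RDFS_LABEL, "dog", "en"),
    (Q144, RDFS_LABEL, "perro", "es"),
]


def pick_label(labels, subject, preferred_langs):
    """Return the first rdfs:label of `subject` whose tag matches the
    preferred languages, tried in order.

    Falls back to the URI itself when nothing matches (a hypothetical policy).
    """
    for lang in preferred_langs:
        for s, p, value, tag in labels:
            if s == subject and p == RDFS_LABEL and tag == lang:
                return value
    return subject


print(pick_label(LABELS, Q144, ["es", "en"]))  # perro
print(pick_label(LABELS, Q144, ["fr", "en"]))  # dog
```

Because the URI itself carries no English or Spanish text, the same identifier serves both audiences and only the label lookup changes.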

schema:name

While most schemas and ontologies are built around a specific domain such as a business sector or an academic discipline, the very successful schema.org is much broader, covering many aspects of ordinary life and commerce. Unlike most other vocabularies, schema.org does not use rdfs:label for names, but its own schema:name property instead. The discussion “What is the difference between schema:name and rdfs:label?” on a schema.org development issue page explains why: many processors that can read schema.org data from a web page won’t know about RDF and won’t recognize rdfs:label. As part of that discussion, Dan Brickley mentions adding a subPropertyOf assertion to the definition of schema:name, which we see right in the property’s definition in the RDFS schema that you can download from the schema.org Developers page:

schema:name a rdf:Property ;
    rdfs:label "name" ;
    rdfs:comment "The name of the item." ;
    rdfs:subPropertyOf rdfs:label ;
    owl:equivalentProperty dcterms:title ;
    schema:domainIncludes schema:Thing ;
    schema:rangeIncludes schema:Text .

This is a perfect response to RDF geeks who complain that schema.org should have used rdfs:label instead of making up its own schema:name property—for a system that can parse full RDF and do even minimal inferencing, a schema:name value counts as an rdfs:label value. It says so right on the fourth line of the above excerpt.
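The “minimal inferencing” involved is small enough to sketch in a few lines of Python. The following is only an illustration under assumptions: triples are plain tuples rather than objects from an RDF library, the `infer_subproperties` helper is hypothetical, and the example.org resource named “Fido” is invented. It applies the RDFS subproperty rule: if one property is a subproperty of another, every statement made with the first also holds for the second.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SCHEMA = "https://schema.org/"
SUBPROPERTY_OF = RDFS + "subPropertyOf"

# One schema assertion plus one invented data triple.
triples = {
    (SCHEMA + "name", SUBPROPERTY_OF, RDFS + "label"),
    ("http://example.org/fido", SCHEMA + "name", "Fido"),
}


def infer_subproperties(graph):
    """Apply the RDFS subproperty rule repeatedly until nothing new is added."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        # Collect every (subproperty, superproperty) pair currently known.
        sub_props = {(s, o) for s, p, o in inferred if p == SUBPROPERTY_OF}
        for s, p, o in list(inferred):
            for sub, sup in sub_props:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


closure = infer_subproperties(triples)
print(("http://example.org/fido", RDFS + "label", "Fido") in closure)  # True
```

A query for rdfs:label values over the inferred triples would find “Fido” even though the data only ever used schema:name.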

A brief detour: dc:title and skos:prefLabel

The schema excerpt above also includes an assertion that schema:name is an owl:equivalentProperty to the Dublin Core dcterms:title property. The Dublin Core vocabulary is almost as old as the web itself, predating schema.org by sixteen years. That vocabulary’s specification describes both dcterms:title and the property of which it is a subproperty, dc:title, as “A name given to the resource”, which supports Dan’s note that the schema:name property means the same thing as dc:title.

I think of the Dublin Core terms as slightly narrower than that. The Wikipedia page for Dublin Core describes it as a set of “metadata items for describing digital or physical resources”, which aligns it with rdfs:label, but Dublin Core was first developed in response to the rapidly expanding ideas of what constituted “publishing” in the early days of the web, so I’ve always thought of it as by and for the publishing industry. (I once took part in a standards group that developed standards more specifically for the magazine industry, and when they needed separate properties for a given issue’s publication date and newsstand date, making each a subproperty of dcterms:date was a perfect use case for RDFS subproperties.) I suppose the word “title” also makes me think of a label for a book, musical album, or other published work.

The SKOS skos:prefLabel property, which names something’s preferred label (as opposed to alternative or hidden labels, which are additional SKOS properties), may seem equivalent to rdfs:label. I don’t think of it as suitable for just any existing or imaginary resource, the way rdfs:label is, but instead for naming concepts within the taxonomies and thesauruses that SKOS was designed to help manage. The SKOS specification does say that it’s a subproperty of rdfs:label, which supports the idea that it’s a specialized version of that property. However, the actual SKOS schema shows that skos:prefLabel does not have an rdfs:domain of skos:Concept (that is, it’s not defined as being used only for describing labels of Concepts), as I had expected. Still, it was defined as part of SKOS, and SKOS is about managing vocabulary terms and their relationships and other metadata, with concepts being the central organizing unit for managing these terms and their metadata.

One person’s SKOS taxonomy might be another person’s hierarchical class structure; converting one to the other with a SPARQL CONSTRUCT query has helped many people take advantage of available data that otherwise wasn’t a perfect fit for their system. This typically means converting between concept skos:prefLabel values and class rdfs:label values.
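The shape of such a conversion can be sketched in plain Python standing in for the CONSTRUCT query. This is only an illustration under assumptions: triples are tuples, the `skos_to_classes` helper is hypothetical, the example.org concepts are invented, and mapping skos:broader to rdfs:subClassOf is one possible choice that will not suit every taxonomy.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/"  # invented namespace for the sample data

# Assumed predicate mapping; adjust to fit the taxonomy being converted.
PREDICATE_MAP = {
    SKOS + "prefLabel": RDFS + "label",
    SKOS + "broader": RDFS + "subClassOf",
}

taxonomy = [
    (EX + "Mammal", RDF_TYPE, SKOS + "Concept"),
    (EX + "Mammal", SKOS + "prefLabel", "mammal"),
    (EX + "Dog", RDF_TYPE, SKOS + "Concept"),
    (EX + "Dog", SKOS + "prefLabel", "dog"),
    (EX + "Dog", SKOS + "broader", EX + "Mammal"),
]


def skos_to_classes(triples):
    """Rewrite concept triples as class triples; drop anything unmapped."""
    classes = []
    for s, p, o in triples:
        if p == RDF_TYPE and o == SKOS + "Concept":
            classes.append((s, RDF_TYPE, RDFS + "Class"))
        elif p in PREDICATE_MAP:
            classes.append((s, PREDICATE_MAP[p], o))
    return classes


for triple in skos_to_classes(taxonomy):
    print(triple)
```

Reversing the mapping table would sketch the opposite direction, from a class hierarchy to a SKOS taxonomy.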

How do we query for all these types of labels? Generally, the same way we query for any other RDF values, but in my next blog entry I’ll talk about a built-in special service in Wikidata that lets you replace several lines of label-retrieving SPARQL code with a single line.



CC BY 2.0 photo by F Delventhal