# What's wrong with undeclared classes and properties?

It's not like the RDF spec requires them.

OK, it’s a rhetorical question. I know the answer: we can attach metadata to class and property declarations. When we know that a given instance is a member of a declared class and has declared properties, we know more about the instance and can do more with it, not least aggregate it more easily with other data that uses the same or related classes and properties.

I learned from tweets by Paula Gearon and Tom Heath that section 2.3.2 of the “Weaving the Pedantic Web” paper (pdf), presented at the Linked Data on the Web conference in Raleigh, bemoans the existence of undeclared classes and properties. I agree that this is not a good thing, but we should be careful about attacking it.

The Pedantic Web paper does point out that “such practice is not prohibited”, which many people seem to forget. This reminds me of the decision to qualify merely well-formed XML as legal, parsable markup, which was one of the big breaks that XML made from SGML, or Tim Berners-Lee’s decision to accept the possibility of broken links in his hypertext system, unlike those of his predecessors. Serious XML-based applications still use DTDs or schemas, and well-maintained web sites use some kind of link management, but the simpler, grassroots efforts don’t necessarily, and that turned out to be a great thing. It let these technologies grow to a point where millions of people can see their benefits.

If I have a triple that says

<http://www.snee.com/d/r/s3/l9d> <http://www.snee.com/8r/xa/32e>  "true"


and my subject and predicate aren’t declared anywhere, it doesn’t tell you much. If I have one that says this with an undeclared subject and predicate,

<http://www.snee.com/d/r/invoice#l9d> <http://www.snee.com/8r/xa/paid>  "true"

you can get a general idea of what’s going on even with no declarations, as you often can from element and attribute names in XML documents that have no corresponding schemas. Unlike the XML example, though, we can see a domain name associated with “invoice#l9d” and “paid” here, which gives some context and therefore a bit of semantics about them.
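To make that point concrete, here is a small Python sketch of how much a program can pull out of the second triple without any declarations at all. The `describe` helper is a made-up name for illustration, and the rule of treating the last path segment plus fragment as the “local name” is my assumption, not anything the RDF specs define.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Return (host, local_name) for a URI.

    Hypothetical helper: the "local name" here is just the last path
    segment, plus the fragment if there is one. That is a convention
    assumed for this sketch, not a rule from the RDF specs.
    """
    parts = urlsplit(uri)
    local = parts.path.rsplit("/", 1)[-1]
    if parts.fragment:
        local += "#" + parts.fragment
    return parts.netloc, local


# The second example triple, as a plain (subject, predicate, object) tuple.
triple = (
    "http://www.snee.com/d/r/invoice#l9d",
    "http://www.snee.com/8r/xa/paid",
    "true",
)

subject_hint = describe(triple[0])
predicate_hint = describe(triple[1])
print(subject_hint)
print(predicate_hint)
```

The host tells you who minted the names, and the local names hint at what they mean. That is nowhere near what a declaration gives you, but it is more than nothing.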

One great thing about RDF is that you can add on metadata after the fact, as Jim Hendler’s group at RPI is doing with a lot of the US government data. Third parties certainly can’t fix broken web links, and while James Clark’s wonderful trang can generate schemas from documents, that’s more useful as a content analysis tool than as something that you’d use to create production schemas. Adding metadata such as declarations to triples after the fact is a perfectly normal thing to do, and it helps connect those triples to each other to form a, you know, web.
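As a rough illustration of adding declarations after the fact, the Python sketch below treats a graph as a plain set of (subject, predicate, object) tuples, so that merging is just set union. The `undeclared_predicates` function and the `rdfs:label` value are hypothetical, invented for this example. Only the `rdf:type`, `rdf:Property`, and `rdfs:label` URIs come from the W3C vocabularies.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"
RDF_TYPE = RDF_NS + "type"
RDF_PROPERTY = RDF_NS + "Property"
RDFS_LABEL = RDFS_NS + "label"

PAID = "http://www.snee.com/8r/xa/paid"

# The original data: one triple whose predicate is declared nowhere.
original = {("http://www.snee.com/d/r/invoice#l9d", PAID, "true")}

# Declarations supplied later, possibly by a third party.
# The label text is made up for illustration.
declarations = {
    (PAID, RDF_TYPE, RDF_PROPERTY),
    (PAID, RDFS_LABEL, "paid"),
}


def undeclared_predicates(triples):
    """Return predicates used in the graph but never typed as rdf:Property.

    Predicates from the RDF and RDFS namespaces are skipped, on the
    assumption that their definitions live in the W3C vocabularies.
    """
    declared = {s for (s, p, o) in triples
                if p == RDF_TYPE and o == RDF_PROPERTY}
    used = {p for (_, p, _) in triples
            if not p.startswith((RDF_NS, RDFS_NS))}
    return used - declared


before = undeclared_predicates(original)

# Aggregating triples is just set union; nothing in the original changes.
merged = original | declarations
after = undeclared_predicates(merged)

print(before)
print(after)
```

The original triple is untouched. The declarations simply arrive as more triples in the same graph, which is what makes this kind of after-the-fact repair so cheap compared with fixing someone else’s broken links.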

I certainly don’t want to imply that the Pedantic Web effort is doing anything wrong; their efforts to educate people about the value of doing these things with more rigor are very valuable. In the name-calling that most discussions of new technology seem to devolve into these days (pedant! fanboy! standardista!), I worry that I fall into the standardista class because I think that using the word “semantic” in your marketing literature isn’t enough to qualify your work as part of the semantic web. I want to see support for relevant W3C standards involved, a position that apparently can get me lumped into the class of unreasonably demanding geeks who don’t appreciate the big picture, so I wanted to point out that the (spec-compliant) optional nature of class and property declarations can be a huge contributor to the growth of the semantic web.

XML and Tim Berners-Lee’s hypertext system scaled up to the point that they did because of both carefully engineered efforts and the fast growth of unrigorous ones. Careful engineering of a system using semantic web technology can get a lot of value from class and property declarations, but we should remember that the other great thing about RDF, besides the ease of adding metadata to existing data, is that triples are simple and easy to aggregate and therefore share. Let’s not discourage people from doing so if they don’t happen to be doing it the way that we would.