In an October 14th article in the New Yorker about the use of Artificial Intelligence to generate prose, John Seabrook wrote: “A recent exhibition on the written word at the British Library dates the emergence of cuneiform writing to the fourth millennium B.C.E., in Mesopotamia”. That got me thinking about some notes I once took on the early history of metadata, and I wondered if there was any scholarship to show that the earliest metadata is as old as the earliest writing. Not quite, but cuneiform tablets of metadata from the early second millennium B.C.E. are still some pretty old metadata.
First, how do I define “metadata”? The classic definition “data about data” is a bit vague; a movie review is data about data, but it's not metadata. I would define metadata as data—ideally, structured data—recorded to aid in the navigation of other data. I was going to say “navigation and retrieval and maintenance”, but you can't efficiently retrieve or maintain data that you have difficulty finding, so it all builds from navigation. As a working definition I think this covers most uses of metadata.
I followed a footnote from the 2000 book The Great Libraries: From Antiquity to the Renaissance to the article “Archive and Library Technique in Ancient Mesopotamia”, published by Danish researcher Mogens Weitemeyer in the International Library Review journal Libri in 1956. The article's main aim is to explore how the ideas of a “library” and an “archive” may apply to a particular archaeological site. Describing one set of cuneiform tablets that led to a library vs. archive debate among scholars, Weitemeyer wrote:
Some small tablets from the III Dynasty of Ur (a few somewhat older) found in Lagash, Umma (Djoha), Puzurish-Dagan (Drehem), and Ur tell us about the way in which the archive tablets were stored. At the left edge of the small tablets there are two holes comparatively near each other. From one hole to the other extended a strand of reed (thin like bast), the impression of which is still clearly visible in the clay (Fig. 4b). By means of this reed-strand the small tablet was fastened to a container of tablets. This appears from the first line of the small tablet, which reads, in Sumerian, gá-dub-ba (dub=tablet, gá=container), i.e. tablet container. Hence, the small tablets were no doubt labels, attached to the receptacles and indicating their contents.
“Figure 4b” refers to the label tablet on the right in the picture above. Weitemeyer went on to point out that the pattern of the basketwork is visible in the tablet on the left of the picture. He continued:
The labels first stated that the receptacle was a tablet basket; then followed information about the contents of the tablets, e.g. legal verdicts, accounts, receipts and expenses. At the end was an indication of the period covered; in most cases the period was one year, in some cases the beginning year (or month) and the finishing year (or month) were indicated.
Each small tablet had information about a larger dataset (the content of the container it was attached to) to help people determine whether the information they needed was in that container. Not only is this clearly metadata, but with the apparently regular practice of indicating the period covered by the referenced data at the end of the small tablet's description, this metadata even has some structure to it. Recording the date range covered by a set of described data has continued to be a pretty classic piece of metadata, and with the Third Dynasty of Ur being 4,000 years ago, that's some pretty old structured metadata.
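To make the "structure" point concrete, here is a minimal Python sketch of how one of these labels might be modeled as a record today. The class name, field names, and the use of plain integers for years are all my own assumptions for illustration; nothing here comes from Weitemeyer's article beyond the three kinds of information he describes (container type, contents, period covered).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasketLabel:
    """Hypothetical model of a label tablet tied to a tablet basket."""
    container: str   # first line of the label, e.g. "gá-dub-ba" (tablet container)
    contents: str    # kind of tablets inside, e.g. "legal verdicts"
    start_year: int  # first year covered (illustrative integer, not a real regnal date)
    end_year: int    # last year covered; equals start_year for a one-year basket

    def covers(self, year: int) -> bool:
        """Return True if the basket's period includes the given year."""
        return self.start_year <= year <= self.end_year


def find_baskets(labels, contents, year):
    """Use only the labels to pick the baskets worth opening."""
    return [
        label for label in labels
        if label.contents == contents and label.covers(year)
    ]


# Invented sample labels, purely for illustration.
labels = [
    BasketLabel("gá-dub-ba", "legal verdicts", 3, 3),
    BasketLabel("gá-dub-ba", "receipts", 1, 4),
]
matches = find_baskets(labels, "receipts", 2)
```

The lookup never touches the tablets in the baskets themselves, only the labels, which is the navigation role the definition above assigns to metadata.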
I have been researching the history of metadata on and off for a few years and may write up some more of what I found in future blog entries. (The next stop would be Mycenaean Greece.) It has been fun to find that the idea of metadata, which we consider to be so modern today, has actually been around for literally thousands of years.