"Readings in Database Systems": wisdom from Michael Stonebraker

and two other guys--updated and free online.

Michael Stonebraker

As I tweeted last July, I always learn so much about both the past and future of database computing from recent Turing Award winner Michael Stonebraker. I recently learned that the latest edition of Readings in Database Systems, also known as the “Red Book,” is available for free online under a Creative Commons license—or at least the introductions to the readings are. With most of these being by Stonebraker, and quite up-to-date, I consider these 43 pages required reading for anyone interested in database technology.

The serious student should find and read the actual papers, but I learned plenty from the introductions by Stonebraker and his co-editors Peter Bailis and Joe Hellerstein. (Ben Lorica’s recent podcast interview with Hellerstein is also worth a listen.) For example, after reading the introduction to chapter 4 I now have a much better understanding of the advantages of column stores over more traditional row stores, and chapter 12 gave me a clearer picture of the history of data warehouses and the role of ETL.

This is the fifth edition of the book, published in 2015, so it is very current, as you can see from the way it treats MapReduce as history. They published the first edition in 1988, so this has clearly been a long-term project, and it’s interesting to see which twentieth-century papers appear in the new fifth edition, such as Sergey Brin and Larry Page’s 1998 classic “The Anatomy of a Large-scale Hypertextual Web Search Engine.”

Several of Stonebraker’s more opinionated assertions were so much fun to read that they tempted me to start a fake Twitter account, modeled on the hilarious @boredElonMusk, that I would call @crankyMikeStonebraker. It would feature real quotes from the Red Book such as these:

  • “SQL will be the COBOL of 2020, a language we are stuck with that everybody will complain about.”

  • “[JSON] is a disaster in the making as a general hierarchical data format.”

  • “I consider ODBC among the worst interfaces on the planet.”

  • “The rest of the world is seeing what Google figured out earlier; Map-Reduce is not an architecture with any broad scale applicability.”

  • “The MapReduce crowd has turned into a SQL crowd and Map-Reduce, as an interface, is history.”

  • “Just because Google thinks something is a good idea does not mean you should adopt it.”

  • “We begin with a sad truth. Most data science platforms are file-based and have nothing to do with DBMSs.”

  • “the new buzzword is master data management (MDM)… MDM is the opposite of business agility.”

While the very title of “Readings in Database Systems” will make some people’s eyes glaze over, bits like these make it much more fun to read than many would expect, especially if you care at all about the role that database systems play in modern applications.

Photo of Michael Stonebraker by D Coetzee via flickr (CC0)