More advice about software documentation

Especially documenting APIs.

May 14, 2023

Early last year, in the blog entry “Doing a podcast interview about technical writing,” I described an interview I did for the IEEE Software Engineering Radio podcast. Listening to it again this week, I found that I had covered a lot of good ground. Since then I have thought of a few other points I wish I’d mentioned, so here they are in another bulleted list. Because of some recent experience, I had enough thoughts about documenting APIs that I gave that discussion its own section below.

- Documentation (in particular, the User Guide) should reflect a company’s official vision for the product, which is something that a team of people at the company typically worked pretty hard on. Coordinate with them and their work. For each thing that the vision promises about the product, the documentation should make it easy to find out how to do that thing.
- Good documentation is a form of marketing literature, and good marketing literature is a form of documentation. Documentation should convince the reader that the product will help them get useful work done, and marketing literature should educate the reader about how the product gets used.
- Jupyter notebooks are good for documentation, but only for tutorials, because they walk through a series of specific steps for a particular scenario and show the results. When you have tutorials, you still need a User Guide to explain big-picture topics and a Reference Guide to explain every detail of the product. Notebooks are so focused on specific scenarios that they’re not good for those purposes.
- Something I hate, and I’ve seen entire books of it: developers who think that putting their long programs with lots of comments in the documentation is the best way for others to learn. (Ooh, a “complete application!”) Examples should be short and self-contained so that they are easier to apply to other contexts. In a book, explanations of code samples should be in a readable proportionally spaced font such as Times Roman or Helvetica. This is part of the appeal of Jupyter notebooks, which let you mix executable code with nicely formatted prose text.
- While I stand by my description of the basic categories of documentation, the term “User Guide” seems to have gone out of fashion. Instead of a five-section User Guide as a top-level document in a product’s documentation collection, nowadays a software company is more likely to present each of those five sections as a top-level document in its own right. The good news: what was formerly a third-level heading in that content becomes a second-level heading, so it’s easier to display descriptions of more sections in an expanded table of contents. The bad news: the list of top-level documents can more easily get too long and therefore difficult to scan quickly when looking for something.

Documenting APIs

The core of a software product may be an API, or Application Programming Interface. Customers doing their work in a particular programming language are given libraries or access to a server where they can call the functions that make up the API. If you buy a robot with a Python API, the product includes Python libraries so that you might make calls like head.turn(left,30) if you want to turn the robot’s head 30 degrees to the left. You don’t have to worry about the robot’s internal electronics because the vendor has provided you with a Python-based interface at a higher level of abstraction that lets you simply tell the robot what to do.
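As a minimal sketch of what such a library might look like from the caller's side, the following Python defines a stand-in Head class. Everything here is hypothetical: the LEFT and RIGHT constants, the class, and the sign convention are assumptions for illustration, not any real robot vendor's API.

```python
# Hypothetical stand-in for a vendor's robot library. None of these
# names come from a real product.
LEFT = "left"
RIGHT = "right"


class Head:
    """Tracks the head's angle; a real library would drive motors."""

    def __init__(self):
        self.angle = 0  # degrees; 0 is straight ahead, negative is left

    def turn(self, direction, degrees):
        if direction not in (LEFT, RIGHT):
            raise ValueError(f"unknown direction: {direction!r}")
        self.angle += -degrees if direction == LEFT else degrees
        return self.angle


head = Head()
head.turn(LEFT, 30)   # the caller never touches the electronics
print(head.angle)     # -30
```

The caller works only with directions and degrees; whatever happens inside turn() stays behind the interface.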

Usually, the developers who actually implemented the API functions like head.turn(direction,degrees) included comments with their code that describe more about these functions and what you can do with them. If their comments follow the right formatting conventions, an automated program such as Swagger, Sphinx, PyDoc, or Doxygen can extract those comments (known in Python as docstrings) and package them in HTML documents that make it easy for API users to look up the information they need. The conversion package will include CSS stylesheets and other means to customize the content’s look and feel.
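To make the extraction step concrete, here is a small sketch using Python's standard inspect module, which reads a function's docstring at run time. This is similar in spirit to the first step a documentation generator performs before formatting the text as HTML. The turn function and its docstring wording are made up for illustration.

```python
import inspect


def turn(direction, degrees):
    """Turn the robot's head.

    Args:
        direction: Which way to turn.
        degrees: How far to turn, in degrees.
    """
    # Body omitted; only the docstring matters for this sketch.


# inspect.getdoc returns the docstring with its indentation cleaned up.
doc = inspect.getdoc(turn)
summary = doc.splitlines()[0]
print(summary)  # Turn the robot's head.
```

The tool can only publish what the docstring contains, so the quality of the generated page depends on what was written there.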

From the user’s perspective, the API documentation is the formatted list of the functions or methods that they can call along with a description of each one’s purpose, the role and types of parameters to pass to it, what it returns, and maybe an example.

I’ve heard people refer to this as automated creation of API documentation, but the automated generation of the HTML doesn’t mean that the actual writing of the documentation is automated. For example, the developer who coded the head.turn() function might just give the two parameter names of direction and degrees and leave it at that. This can lead to many questions: how do you specify the direction? What are the choices? Are they constants or quoted strings? Does the number of degrees have to be a whole number?
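Here is a sketch of a docstring that answers those questions up front. The answers themselves (an enum rather than quoted strings, fractional degrees allowed, a 0 to 90 range) are assumptions invented for this example; a real API would have its own answers, which is exactly what the writer has to find out.

```python
from enum import Enum


class Direction(Enum):
    LEFT = "left"
    RIGHT = "right"


def turn(direction: Direction, degrees: float) -> float:
    """Turn the robot's head relative to its current position.

    Args:
        direction: Direction.LEFT or Direction.RIGHT. Pass the enum
            member, not a quoted string such as "left".
        degrees: How far to turn. Fractional values such as 12.5 are
            accepted. Must be between 0 and 90 inclusive.

    Returns:
        The signed change in angle, negative for a left turn.

    Raises:
        TypeError: If direction is not a Direction member.
        ValueError: If degrees is outside the range 0 to 90.
    """
    if not isinstance(direction, Direction):
        raise TypeError("direction must be Direction.LEFT or Direction.RIGHT")
    if not 0 <= degrees <= 90:
        raise ValueError("degrees must be between 0 and 90")
    return -degrees if direction is Direction.LEFT else degrees


print(turn(Direction.LEFT, 30))  # -30
```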

To improve these descriptions, a tech writer working on API documentation functions as an editor, but much more than a copy editor correcting spelling and punctuation. These tech writers are also reporters, interviewing the developers about who would use a given function, for what purpose, in what situations. They often ask those same questions about each parameter passed to the function: why is each one there? What powers does it give to the developer calling the function? API documentation seems very different from marketing literature, but to return to my point above about aligning them, even API documentation should make it clear that each feature provides useful value to the user that fits in with the bigger picture of what the marketing department promises about the product.

Just providing the type of each parameter might not be enough. For example, for a number, the documentation should indicate what would be a typical high value and a typical low value and what effect these would have. I was once revising the description of a parameter whose docstring merely said that it was a decimal number and then found out from the responsible developer that it represented a percentage, so that a value of 0.5 meant 50%. The fact that 1 was the highest possible value that you could pass was a surprise to me, and I made sure to include that in its documentation.
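A sketch of how such a parameter might be documented once its meaning is known. The function name set_speed, the motor scenario, and the "typical" values are all hypothetical; the point is the shape of the description, not the particular numbers.

```python
def set_speed(fraction: float) -> float:
    """Set the motor speed as a fraction of its maximum.

    Args:
        fraction: A decimal from 0.0 to 1.0, read as a percentage:
            0.5 means 50% of maximum speed. 0.0 stops the motor, and
            1.0, the highest value allowed, runs it at full speed.
            Typical values (assumed for this sketch): 0.1 for careful
            positioning, 0.8 for ordinary movement.

    Returns:
        The requested speed expressed as a percentage.

    Raises:
        ValueError: If fraction is outside the range 0.0 to 1.0.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    return fraction * 100


print(set_speed(0.5))  # 50.0
```

A caller who passes 50 expecting "50 percent" now gets an error that points back to the documented range instead of a silent surprise.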

It’s not unusual for tech writers to ask the developer these questions and then go and fix the docstrings in the source code themselves, coordinating, of course, with the developers in charge of maintaining that code and often going through the same GitHub pull request steps that modifications to the executable code go through. These tech writers must be familiar with the syntax of the programming language being used, because you don’t want to break anything; accidentally deleting the wrong comma can prevent the code from compiling at product build time.
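One way to guard against that kind of accident in Python is to check that the edited source still parses before opening the pull request. The following is a minimal sketch using the standard ast module; the helper name parses_cleanly and the two sample sources are made up for illustration.

```python
import ast


def parses_cleanly(source: str) -> bool:
    """Return True if the source text is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


good = 'def turn(direction, degrees):\n    """Turn the head."""\n'
# Same code after a docstring edit that accidentally deleted a comma.
broken = 'def turn(direction degrees):\n    """Turn the head."""\n'

print(parses_cleanly(good))    # True
print(parses_cleanly(broken))  # False
```

For a whole file, running `python -m py_compile` on it from the command line performs a similar check. Neither catches logic errors, only syntax that no longer parses.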

If the API is a significant part of the product, tech writers should know the language that calls the API well enough to write programs that try out these functions themselves. This can often help them to answer their questions about the functions that need more documentation, thereby reducing the need to pester the developers who wrote these functions in the first place.
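As a sketch of what such an exploratory program might look like, the following calls a function with a range of inputs and records what happens to each. The turn_left function is a hypothetical stand-in for a thinly documented API function, and probe is a helper invented for this example.

```python
def turn_left(degrees):
    """Turn the head left."""  # deliberately thin docstring
    if degrees < 0 or degrees > 180:
        raise ValueError("degrees out of range")
    return -degrees


def probe(func, candidates):
    """Call func with each candidate and record what happened."""
    results = []
    for value in candidates:
        try:
            results.append((value, "ok", func(value)))
        except Exception as exc:
            results.append((value, "error", type(exc).__name__))
    return results


for row in probe(turn_left, [30, 12.5, -10, 400, "30"]):
    print(row)
```

The output answers several questions the docstring left open: fractional degrees are accepted, negative and very large values are rejected, and a quoted string fails with a type error. Each of those findings is a candidate sentence for the improved docstring, to be confirmed with the developers.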

Ideally, tech writers will enjoy writing programs that use the API to make the product do interesting things. If so, they are in the right job!

API diagram picture by Ben Smith, Creative Commons CC BY-NC 2.0

