Something I’m looking at right now is generation of (in my case, HTML) API documentation from some simple markup format. The usual way to do this is by writing documentation markup inline in the source code, using specially formatted comments in header files.
Some people argue that well-written source code should be its own documentation. Well, that’s true, it is: but it’s documentation with limited utility. Source code provides the following documentation:
- Document, for the compiler’s benefit, the machine instructions that the compiler should generate
- Document, for the programmer’s benefit, the machine instructions that the compiler should generate
Developing a high-level model of how software works from its source code is possible, but mentally taxing. It’s not designed for that. It would be like asking an ant to map the coastline of Africa: it can be done, but the information available is at entirely the wrong scale.
Several other approaches for high-level documentation of software systems exist. Of course, each of them is not actually the source code, but a model: a map of Africa is not actually Africa, but if you want to know what the coastline of Africa looks like then the map is a very useful model.
Comment documentation is one such model of software. Well-written comment docs explain why you might use a method or class, and how its properties or parameters help you to use it. They give you a sense of how the classes fit together, and how you can exploit them. They’re called comment docs because they go into comments right alongside the source they document, usually marked up with particular tokens. As an example of a token, many documentation comment systems let me use a line like this:
@author Graham J. Lee
to indicate that I wrote a particular part of the project.
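To put that token in context, here is a rough sketch of a whole documentation comment on a C function. The function gl_clamp and its parameters are invented for this example; the @param, @return and @author tokens are ones that Javadoc-style tools such as Doxygen understand.

```c
/**
 * Restrict a value to a closed range.
 *
 * Use this when a caller-supplied number must fall within known bounds,
 * for example before using it as an index.
 *
 * @param value The number to restrict.
 * @param lower The smallest acceptable result.
 * @param upper The largest acceptable result; assumed to be >= lower.
 * @return value if it lies within [lower, upper], otherwise the nearer bound.
 * @author Graham J. Lee
 */
int gl_clamp(int value, int lower, int upper)
{
    /* Hypothetical example function: the comment above is the point. */
    if (value < lower) {
        return lower;
    }
    if (value > upper) {
        return upper;
    }
    return value;
}
```

Notice that the comment says why you would call the function and what the parameters mean, and leaves the types to the signature.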
Of course, this documentation could go anywhere, so why put it into the header files? For a start, you already have the header files, so you’re not having to maintain two parallel hierarchies of content. Also, the proximity of the documentation to the source code means that there’s a higher probability (still not unity, but higher) that a developer who changes the intent or usage of a method will remember to update the documentation. Additionally, it means that the documentation can be no more verbose than required: anything that is obvious from the source (the names of methods and their parameters, for example) can be discovered from the source. This is not inconsistent with my earlier statement about source-as-documentation: you can easily find out a method signature without having to grub through the actual program instructions.
Finally, it means that when you’re working with the source, you have the documentation right there. This is one very common way to interact with comment documentation. The other way is to use a tool to create some friendly formatted output (for me, the goal is HTML) by reading the source files and extracting the useful information from the comments.
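The extraction step is less magical than it sounds. As a minimal sketch, and certainly not how any of the real tools do it, this C function copies out the body of the first /** comment it finds in a string of source text. The name gl_first_doc_comment is made up for the example, and it makes no attempt to cope with comment markers inside string literals or with degenerate comments like /**/.

```c
#include <stddef.h>
#include <string.h>

/**
 * Copy the body of the first doc comment in source into out.
 *
 * @param source   NUL-terminated source text to scan.
 * @param out      Buffer that receives the comment body.
 * @param out_size Size of out in bytes, including room for the terminator.
 * @return 1 if a complete comment was found and fitted in out, otherwise 0.
 */
int gl_first_doc_comment(const char *source, char *out, size_t out_size)
{
    const char *start = strstr(source, "/**");
    if (start == NULL) {
        return 0; /* no doc comment at all */
    }
    start += 3; /* skip the opening marker */

    const char *end = strstr(start, "*/");
    if (end == NULL) {
        return 0; /* unterminated comment */
    }

    size_t length = (size_t)(end - start);
    if (length >= out_size) {
        return 0; /* body would not fit in the caller's buffer */
    }
    memcpy(out, start, length);
    out[length] = '\0';
    return 1;
}
```

A real tool then has to parse the tokens inside that body and tie them to the declaration that follows, which is where most of the work is.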
I have for the last forever (alright, three years) used Doxygen, for a couple of reasons:
- The other people on my team at the time I adopted it were already using it
- Since then, it has gained support for generating Xcode documentation sets, which can be viewed inside the Xcode organizer.
It’s not great, though. It’s a very complicated tool that tries to do all things for all people, so the configuration is huge and needs a lot of consideration. You can customise the look of the output using a CSS file, but you pretty much need to, as the default output looks like arse. Making it actually output different stuff is trickier, as it’s a C++ project and I’ve only ever learned enough C++ to customise Clang.
So I decided to fish around for alternatives. Of course, we know that Apple uses (and ships) HeaderDoc, but by default it looks for its own comment opener, /*!. Luckily, the rest of its comment format is very close to Javadoc, which is the style I already use with Doxygen. You can get headerdoc2html to look for the /** trigger by passing the -j flag.
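Here is a sketch of the two styles side by side in C. The functions gl_min and gl_max are invented for the example. The first comment uses HeaderDoc's /*! opener with its @abstract and @result tags; the second uses the Javadoc-style /** opener that the -j flag tells headerdoc2html to look for. I'm assuming that both tools accept the tags shown, so check the tag lists for the versions you have installed.

```c
/*!
 * @abstract Return the smaller of two integers.
 * @param a The first candidate.
 * @param b The second candidate.
 * @result Whichever of a and b is smaller.
 */
int gl_min(int a, int b)
{
    return (a < b) ? a : b;
}

/**
 * Return the larger of two integers.
 *
 * @param a The first candidate.
 * @param b The second candidate.
 * @return Whichever of a and b is larger.
 */
int gl_max(int a, int b)
{
    return (a > b) ? a : b;
}
```

Apart from the opener and the name of the tag for the return value, there is very little to change when moving a comment from one style to the other.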
Speaking of headerdoc2html, this is one of two programs in the HeaderDoc distribution. The other is gatherheaderdoc, which generates a table of contents from a collection of output files. Each of these is a well-written and well-documented Perl script, which makes extension and modification super-easy.
Neither should be immediately required, in fact. The tool understands all of the Objective-C language features, and some extra bonus things like availability macros and groups of related methods. The default output basically looks like someone forgot to apply the style sheet to developer.apple.com. It’s trivial to configure HeaderDoc to use an external style sheet, and you can even create a custom template HTML file so that things appear in whatever fashion you want. Another useful configuration point is the C preprocessor, which you can get HeaderDoc to run the headers through before interpreting the documentation.
The GNUstep project uses its own comment documentation tool called Autogsdoc. The fact that the autogsdoc documentation has buggy HTML does not fill one with confidence.
In fact, Autogsdoc’s output is unstyled XHTML 1.0, so it would be easy to style it to look more useful. Some projects that I’ve seen appear to have frames-based HTML, which is unfortunate.
Autogsdoc uses inline comment documentation, the same as the other options we’ve looked at. However, its markup is significantly different, as it uses SGML-style tags inside the documentation. It has a (well-documented, of course) Objective-C implementation with a good split of responsibilities between different classes. Hacking about on its innards shouldn’t be too hard for any motivated Cocoa coder.
I know this has been annoying you all since the first paragraph in the section on HeaderDoc: */.