On documentation

Over at the daily WTF, Alex Papadimoulis writes about Documentation Done Right. His conclusion is spot on:

The immediate answer to what’s the right way to do documentation is clear: produce the least amount of documentation needed to facilitate the most understanding, and be very explicit about which documentation is to be maintained and which is to be archived (i.e., read-only and left to rot).

The amount of documentation appropriate to any project depends very much on the project, and the people working on it. However I’ve found that there are some useful rough guides that can generally be applied. Like Alex, I’m ignoring user documentation here and talking about internal project artifacts.

Enterprise project documentation exists to give managers someone to blame.

The reason large projects seem so documentation-heavy is twofold: managers don’t like to think they don’t know what’s going on; and managers like to be able to come down like a ton of shit on someone when they find out that they don’t know what’s going on. That’s where the waterfall model comes into its own. The big up-front planning means that everyone on the project has signed off on the ridiculous unreadable project documentation of doom, so it must describe the best software possible. Whoever does something that isn’t the same as the documentation has fucked up.

You never need the amount of documentation produced by most large-company software projects. Never. It’s usually inaccurate even at the moment the it’s signed off, because it took so long to get there the requirements changed, and because the level of precision required leads authors to make assumptions about how the OS, APIs etc. work. It’s often very hard to work from, because there are multiple documents, poorly cross-referenced using support “tools” like Word, each with its own glossary depending heavily on terminology relevant to the author’s domain. And they weren’t written by the same author.

In order to code up a feature from one of these projects, you need to check the product requirements document to see what it’s supposed to do (and whether it’s the highest priority outstanding work – you also get to find out who asked for the feature, how much money it’s worth, and plenty of other information that’s of no use to a software engineer). You check the functional specification to see how it’s supposed to do it. You check the architecture document to see what classes go in which packages. You raise a change control because the APIs don’t support part of the functional specification. Two weeks later, a fix has been approved. You write the code, ensuring that you use the test plan to find out what the acceptance criteria are, and when they’ll be tested (in fact, I’ve worked on projects where the functional and performance test plans were delivered at different times by different people). Oh, and make sure you’re keeping up to date with the issue tracker, otherwise you might do work that’s been assigned to someone else. Your systems engineer can point to the process documentation to let you know how you do that.

Good documentation doesn’t answer all the questions, but leaves you capable of asking smart questions of the right people.

It’s no secret that I’m a fan of user stories as a form of requirements documentation. A well-written user story tells you what the user expects to be able to do after a feature has been added, and precious little else. There might be questions over terminology, or specifics such as failure cases, but you know who to ask – the person who wrote the user story. There will be nothing about architecture or the GUI layout, because those things aren’t up to the user, customer or product manager (or at least they shouldn’t be).

Similarly, a good architecture diagram is going to tell you something about how the classes in an application fit together, and precious little else. If you can’t work out how to fit your feature in, you can ask the architect, but you’ll both be able to use the diagram as a good place to start. As Alex says in his post, precise documentation will go out of date quickly, so the documentation needs to be good enough to hang discussions on, and no “better”.

UML diagrams are great to describe code before it’s been written; they’re not so great at writing code.

If you’re trying to explain to another engineer how you think a feature should work in code, what design pattern to follow, or what steps will be needed to talk to a server, then a UML diagram is a great way to do it. Many engineers understand UML, and those who don’t can work out what it means (with your help) very quickly. It’s a consistent language for talking about software.

That said, when I draw UML diagrams I tend to prefer whiteboards and agnostic diagramming software like Omnigraffle or dia over syntax-checking UML tools like Enterprise Architect or ArgoUML. The reason is that I gave in the last section: the diagram needs to be good enough to explain to someone else what we’re going to do, not a gold-plated example of conformance to the UML specification. I don’t care if my swim-lane is in the wrong place if everybody involved knows what I mean (or can ask).

Code generated from UML tools tends to have readability issues. Whereas you or I might put related methods together, the tool might sort them by visibility or name. If you’re diagram is rough enough to be useful, then the tool will only generate a few method stubs – and you’ll be tempted to fill them in rather than creating small private methods to do logical units of work. These and other problems (I’ve seen EA generate code that uses UUIDs as class identifiers) can be fixed, but you shouldn’t need to fix them. You should write your application and sell it.

After code has been written, the only accurate documentation is the code itself (or generated from it).

There are other useful forms of documentation – for instance, a whiteboard diagram explains a developer’s understanding of the code. It doesn’t document the code itself – an important distinction.[*] All of those shiny enterprise documents your project manager got you to write went out of date as soon as the first customer reported a bug or feature request. The code, however, documents exactly what the code does – just not necessarily in the most useful way. Javadoc/doxygen comments are more likely than any thing else to stay in sync with the code due to their proximity, but even those can be outdated or unhelpful.

This is where UML tools can come in very handy. Those with the ability to generate diagrams based on code (even Xcode does this) can automatically give you a correct view of the application’s behaviour, at a more appropriate level of abstraction than the code itself. If what you need is a package dependency diagram, it’s better to get it from a UML tool than to try and read all of the source.

Unfortunately, Xcode (and Doxygen’s very limited capabilities) are the only games in town for Objective-C. Tool support for Java and C# is way ahead, but for Cocoa developers there’s only MacTranslator (which I’ve never tried). Not that UML maps particularly well onto Objective-C anyway.

[*]Though documenting a developer’s assumptions and comparing them with reality can often explain where some subtle bugs come from, and is of course useful.

The better your prototypes, the worse the feedback.

Back in the 1990s there was an explosion of Rapid Application Development tools, after the rest of the software industry saw Interface Builder and decided it was good. The RAD way (which is, of course, eminently achievable using IB) is to produce an executable prototype for users to give feedback on. Of course, the problem is that you end up shipping the prototype. I’ve actually ended up doing that on a couple of Mac applications.

One problem is that because the app looks complete, users assume it is complete and that they’re being asked to provide polishing details, or spot spelling mistakes and misplaced buttons. The other is that because it looks complete, managers assume it is complete and will tell you to ship it.

Don’t make that mistake. Do UI prototypes as paper-based wireframes, Keynote presentations or Cappuccino apps. Whatever you do, make it look like it’s just a crappy sketch you’re willing to have ripped to shreds. That way, people will rip it to shreds.

There are few document “artifacts” that need to hang around.

If you think about what you’re going to do with an application after you’ve written it, there’s selling it, supporting it, maintaining it, and extending it. Support people might need some high-level architecture knowledge so that they can work out what component a problem is in, or how to diagnose a particular failure.

Such an architecture document can be a good aid for planning new features or bug fixes, because you can quickly see where you need to modify the app to get the desired behaviour (clearly you then need to dive into the code to get a better idea, but you know where to jump). Similarly, architecture rationale documentation (including the threat model, API/library use justification) can be handy so that you don’t need to go through the same debates/research when fixing a bug or adding a feature. Threat models particularly can take a lot of time and expertise to construct from scratch.

Sales people will need to know which features have been delivered, which features are on the roadmap, and can probably find out specific questions from engineers if they get a grilling from an awkward customer. Only in a very limited set of circumstances will sales staff need to give customers security documentation, test coverage information, or anything other than the user manual.

Notice that in each of these cases, one of the main aspects of keeping the document around is so that you can keep it up to date. There’s no point having the 1.0 feature list when you’re selling version 2.5 – in fact it would be positively detrimental. So obviously the fewer documents you keep around, the fewer you need to keep up to date. There’s some trade-off involved (natürlich) – if you need something you didn’t hang on to, you’re going to have to regenerate it.

Enterprise project documentation exists to give managers someone to blame.

Good documentation doesn’t answer all the questions, but leaves you capable of asking smart questions of the right people.

UML diagrams are great to describe code before it’s been written; they’re not so great at writing code.

After code has been written, the only accurate documentation is the code itself (or generated from it).

The better your prototypes, the worse the feedback.

There are few document “artifacts” that need to hang around.

About Graham

OOP the Easy Way

APPropriate Behaviour

APPosite Concerns

FSF