Let’s talk about self-documenting code

You think your code is self-documenting. That it doesn’t need comments or Doxygen or little diagrams, because it’s clear from the code what it does.

I do not think that that is true.

Even if your reader has at least as much knowledge of the programming language you’ve used as you have, and at least as much knowledge of the libraries you’ve used as you have, there is still no way that your code is self-documenting.

How long have you been doing your job? How long have you been talking to experts in the problem domain, solving similar problems, creating software in this region? The likelihood is, whoever you are, that the new person on your team has never done that, and that your code contains all of the jargon terms and assumptions that go with however-much-experience-you-have experience at solving those problems.

How long were you working on that story, or fixing that bug? How long have you spent researching that specific change that you made? However long it is, everybody else on your team has not spent that long. You are the world expert at that chunk of code, and it’s self-documenting to you as the world expert. But not to anybody else.

We were told about “working software over comprehensive documentation”, and that’s true, but nobody said anything about avoiding sufficient documentation. And nobody else has invested the time that you did in understanding the code you just wrote, so the only person for whom your code is self-documenting is you.

Help us other programmer folks out: think about us when you’re avoiding documentation.

No True Humpty-Dumpty

Words change meaning.

Technical words change meaning.

Sometimes, you need to check out a specific commit of a word’s meaning from version control, to add context to a statement.

“I’m talking about Open Source in its early meaning of Free Software without the confusion over Free, not its later meaning as an ethically empty publication of source code.”

“I mean Object-Oriented Programming as the loosely-defined bucket in which I can put all the ills of software that I’m claiming are solved by Haskell, not the earlier sense of modelling business processes in software with loosely-coupled active programs communicating by sending messages.”

“The word Agile here refers to the later sense of Agile where I run a waterfall with frequent checkpoints and get a certification from a project management institute.”

The problem is that doing so acts as a thought-terminating cliché to people who are not open to hearing a potentially valuable statement about Open Source, Object-Oriented Programming, or Agile development.

If your commit is too early in history, then you can easily be dismissed as etymologically fallacious, or as somebody who won’t accept progress and the glorious devaluation of the word you’re trying to use.

If your commit is too recent in history, then you can easily be dismissed as a Humpty-Dumptyist who’s trying to hide behind a highfalutin term that you have no right to use.

What I’ve come to realise is that my technique for dealing with people who use these rhetorical devices can be as simple as this: ignore them. If they do not want to hear, then I do not need to speak.

To become a beginner, first become an expert

We have a whole load of practices in programming that only really work well if you’re already good at whatever the process is supposed to help with.

Scrum is a process improvement framework, but only if you already know how to do process improvement. If you don’t, then Scrum is just the baseline mini-waterfall process with a chance to air your dirty laundry every fortnight.

Agile is good at helping you embrace change, but only if you’re already good enough at managing change to understand which changes should be embraced.

#NoEstimates helps you avoid the overhead of estimates, but only if you’re already good enough at estimates to know that you always write user stories that take 0.5-2 days to implement.

TDD helps you design your APIs, but only if you’re already good enough at API design to understand things like dependency injection and loose coupling.

Microservices help you isolate modules, but only if you’re already good enough at modularity not to get swamped in HTTP calls.

This is all very well for selling consultancy (“if your [agile] isn’t working, then you aren’t [agiling] hard enough, let me [agile] you some more”) but where’s the on-ramp?

Discovery

I think the last technical conference I attended was FOSDEM last year, and now I’m sat in the lobby of the Royal Library of Brussels working on a project that I want to take to some folks at this year’s FOSDEM, and checking the mailing lists of some projects I’m interested in to find meetups and relevant talks.

It’s been even longer since I spoke at a conference; I believe it would have been App Builders in 2016. When I left Facebook in 2015 with a view to taking a year out of software, I also ended up taking myself out of community membership and interaction. I felt like I didn’t know who I was, and wasn’t about to find out by continually defining myself in terms of other people.

I would have spoken at dotSwift in 2016, but personal life issues, probably also related to not knowing who I was, stopped me from doing that. In April, I went to Zurich and gave what would be my last talk as a programmer to anyone except immediate colleagues for over 18 months.

In the meantime, I explored bits of the software world outside of my immediate experience. I worked on a Qt app in high-performance computing, and I’m working on “full-stack” Javascript now.

[Full-stack Javascript is, of course, a lie, but I’d rather it keeps its current meaning as “the application bit of the stack in Javascript” than took a more correct sense.]

During this time, I found that I don’t mind so much what I’m working on: I have positive and negative opinions of all of it. I have lots of strong opinions, and lots of syntheses of other ideas, and need to be a member of a community that can share and critique these opinions, and extend and develop these syntheses.

Which is why, after a bit of social wilderness, I’m reflecting on this week’s first conference and planning my approach to the second.

All the things

It’s been a long time since I had a side project, or one that didn’t get abandoned very early on. I tend to get sidetracked by other thoughts about computing, or think “while I’m doing this, I’m leaving that unsolved” so nothing gets very far.

In an attempt to address that, to clear all of the different thoughts I have about the matter of computing out of my head, organise them, identify conflicts, and prioritise what I work on, I spent this evening jotting down the big points and a brief abstract about each one. I’m hoping this will cut the Gordian knot by letting me see it all in one place and start to make choices.

The format I chose to represent this braindump is this personal Technology Radar, based on the Thoughtworks build-your-own tool. It seemed like a good place to see everything at once, and look for clusters or trends.

You’ll notice that almost everything in this radar is fairly old tech! That’s mostly a matter of taste, as I enjoy learning about things that were tried, what succeeded or failed, and what can be learnt from that to put to use today. I’m not good at novelty for novelty’s sake.

I expect to get some mileage (for my own benefit, you might like it too) out of expanding on some of the entries in this radar over a few more posts, so I’ve created a techradar category in this blog that you can filter on/out.

Considered harmless

Don’t like a new way of working? Just point out the absurdity of suggesting that the old way was broken:

Somehow, the microservices folks have failed to notice all that software that was in fact delivered as monoliths.

What the Rust Evangelism Strike Force doesn’t realise is that we’ve spent decades successfully building C programs that don’t dereference the NULL pointer.

This is a sort of “[C|Monoliths] considered harmless” statement. Yes, it’s possible to do it that way, but that doesn’t mean that there aren’t problems, or at least trade-offs. “C considered harmless” is as untrue and unhelpful as “C considered harmful”; what we want is “C considered alongside alternatives”.

On books

I’d say that if there’s one easy way to summarise how I work, it’s as an information focus. I’m not great at following a solution all the way to the bitter end so you should never let me be a programmer (ahem): when all that’s left is the second 90% of the effort in fixing the bugs, tidying up edge cases and iterating on the interaction, I’m already bored and looking for the next thing. Where I’m good is where there’s a big problem to solve, and I can draw analogies with things I’ve seen before and come up with the “maybe we should try this” suggestions.

Part of the input for that is the experience of working in lots of different contexts, and studying for a few different subjects. A lot of it comes from reading: my goodreads account lists 870 books and audiobooks that I’ve read and I know it to be an incomplete record. Here are a few that I think have been particularly helpful (professionally speaking, anyway).

  • Douglas Adams, The Hitch-Hikers’ Guide to the Galaxy. Adams is someone who reminds us not to take the trappings of society too seriously, and to question whether what we’re doing is really necessary. Are digital watches really a neat idea? Also an honourable mention to the Dirk Gently novels for introducing the fundamental interconnectedness of all things.
  • Steve Jackson and Ian Livingstone, The Warlock of Firetop Mountain. I can think of at least three software projects that I’ve been able to implement and describe as analogies to the choose your own adventure style of book.
  • David Allen, Getting Things Done, because quite often it feels like there’s too much to do.
  • Douglas Hofstadter, Gödel, Escher, Bach: An Eternal Golden Braid is a book about looking for the patterns and connections in things.
  • Victor Papanek, Design for the Real World, for reminding us of the people who are going to have to put up with the consequences of the things we create.
  • Donald Broadbent, Perception and Communication, for being the first person to systematically explore that topic.
  • Stephen Hawking, A Brief History of Time, showing us how to make complex topics accessible.
  • Roger Penrose, The Road to Reality, showing us how to make complex topics comprehensively presentable.
  • Douglas Coupland, Microserfs, for poking fun at things I took seriously.
  • Janet Abbate, Recoding Gender, because computering is more accessible to me than to others for no good reason.
  • Joshua Bloch, Effective Java, Second Edition, for showing that part of the inaccessibility is a house of cards of unsuitable models with complex workarounds, and that programmers are people who delight in knowing, not addressing, the workarounds.
  • Michael Feathers, Working Effectively with Legacy Code, the one book every programmer should read.
  • Steve Krug, Don’t make me think!, a book about the necessity of removing exploration and uncertainty from computer interaction.
  • Seymour Papert, Mindstorms, a book about the necessity of introducing exploration and uncertainty into computer interaction.
  • Richard Stallman, Free as in Freedom 2.0, for suggesting that we should let other people choose between the previous two options.
  • Brad Cox, Object-Oriented Programming: An Evolutionary Approach, for succinctly and effortlessly explaining objects a whole decade before everybody else got confused by whether a dog is an animal or a square is a rectangle.
  • Gregor Kiczales, Jim des Rivieres, and Daniel G. Bobrow, The Art of the Metaobject Protocol showed me that OOP is just one way to do OOP, and that functional programming is the same thing.
  • Simson Garfinkel and Michael Mahoney, NEXTSTEP Programming: Step One was where I learnt to create software more worthwhile than a page of BASIC instructions.
  • Gil Amelio, On the Firing Line: My 500 Days at Apple shows that the successful business wouldn’t be here if someone hadn’t managed the unsuccessful business first.

There were probably others.

Working Effectively with Legacy Code

I gave a talk to my team at ARM today on Working Effectively with Legacy Code by Michael Feathers. Here are some notes I made in preparation, which are somewhat related to the talk I gave.

This may be the most important book a software developer can read. Why? Because if you don’t, then you’re part of the problem.

It’s obviously a lot easier and a lot more enjoyable to work on greenfield projects all the time. You get to choose this week’s favourite technologies and tools, put things together in the ways that suit you now, and make progress because, well, anything is progress when there’s nothing there already. But throwing away an existing system and starting from scratch makes it easy to throw away the lessons learned in developing that system. It may be ugly, and patched up all over the place, but that’s because each of those patches was needed. They each represent something we learned about the product after we thought we were done.

The new system is much more likely to look good from the developer’s perspective, but what about the users’? Do they want to pay again for development of a new system when they already have one that mostly works? Do they want to learn again how to use the software? We have this strange introspective notion that professionalism in software development means things that make code look good to other coders: Clean Code, “well-crafted” code. But we should also have some responsibility to those people who depend on us and who pay our way, and that might mean taking the decision to fix the mostly-working thing.

A digression: Lehman’s Laws

Manny Lehman identified three different categories of software system: those that are exactly specified, those that implement well-understood procedures, and those that are influenced by the environment in which they run. Most software (including ours) comes into that last category, and as the environment changes so must the software, even if there were no (known) problems with it at an earlier point in its evolution.

He expressed Laws governing the evolution of software systems, which describe how the requirements for new development are in conflict with the forces that slow down maintenance of existing systems. I’ll not reproduce the full list here, but for example, on the one hand the functionality of the system must grow over time to provide user satisfaction, while at the same time the complexity will increase and perceived quality will decline unless it is actively maintained.

Legacy Code

Michael Feathers’s definition of legacy code is code without tests. I’m going to be a bit picky here: rather than saying that legacy code is code with no tests, I’m going to say that it’s code with insufficient tests. If I make a change, can I be confident that I’ll discover the ramifications of that change?

If not, then it’ll slow me down. I even sometimes discard changes entirely, because I decide the cost of working out whether my change has broken anything outweighs the interest I have in seeing the change make it into the codebase.

Feathers refers to the tests as a “software vice”. They clamp the software into place, so that you can have more control when you’re working on it. Tests aren’t the only tools that do this: assertions (and particularly Design by Contract) also help pin down the software.
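
A trivial sketch of what that clamping looks like, in C++ with a plain assertion (the function and its contract are invented for illustration, not taken from the book): the precondition records an assumption about how the legacy code has always been called, so a change that violates it fails loudly instead of silently producing nonsense.

    #include <cassert>
    #include <vector>

    // Hypothetical legacy routine: as far as we know, callers have always
    // passed a non-empty list of prices. The assertion writes that assumption
    // down, clamping the behaviour in place much as a test would.
    int averagePence(const std::vector<int>& prices) {
        assert(!prices.empty() && "averagePence requires at least one price");
        int total = 0;
        for (int p : prices) total += p;
        return total / static_cast<int>(prices.size());
    }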

How do I test untested code?

The apparent way forward, then, when dealing with legacy code is to understand its behaviour and encapsulate that in a collection of unit tests. Unfortunately, it’s likely to be difficult to write unit tests for legacy code, because it’s all tightly coupled, has weird and unexpected dependencies, and is hard to understand. So there’s a catch-22: I need to make tests before I make changes, but I need to make changes before I can make tests.

Seams

Almost the entire book is about resolving that dilemma, and contains a collection of patterns and techniques to help you make low-risk changes to make the code more testable, so you can introduce the tests that will help you make the high-risk changes. His algorithm is:

  1. identify the “change points”, the things that need modifying to make the change you have to make.
  2. find the “test points”, the places around the change points where you need to add tests.
  3. break dependencies.
  4. write the tests.
  5. make the changes.

The overarching model for breaking dependencies is the “seam”. It’s a place where you can change the behaviour of some code you want to test, without having to change the code under test itself. Some examples:

  • you could introduce a constructor argument to inject an object rather than using a global variable
  • you could add a layer of indirection between a method and a framework class it uses, to replace that framework class with a test double
  • you could use the C preprocessor to redefine a function call to use a different function
  • you could break an uncohesive class into two classes that collaborate over an interface, to replace one of the classes in your tests
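
As a sketch of the first of those, in C++ (the class names are invented for illustration; the book’s own examples differ): the constructor argument is the seam. Production code hands the class its real collaborator, and a test hands it a fake, without the method under test changing at all.

    #include <cassert>

    // The seam: an interface extracted from the concrete collaborator.
    struct PaymentGateway {
        virtual ~PaymentGateway() = default;
        virtual bool charge(int pence) = 0;
    };

    class Billing {
    public:
        // Constructor injection instead of reaching for a global or a concrete class.
        explicit Billing(PaymentGateway& gateway) : gateway_(gateway) {}

        bool bill(int pence) {
            if (pence <= 0) return false;   // existing behaviour we want to clamp in the vice
            return gateway_.charge(pence);
        }

    private:
        PaymentGateway& gateway_;
    };

    // Test double that stands in for the real gateway at the seam.
    struct FakeGateway : PaymentGateway {
        int lastCharge = 0;
        bool charge(int pence) override { lastCharge = pence; return true; }
    };

    // A characterisation test pinning down what the code does today.
    int main() {
        FakeGateway fake;
        Billing billing(fake);
        assert(billing.bill(250) && fake.lastCharge == 250);
        assert(!billing.bill(0));
        return 0;
    }

The same shape applies to the other seams in the list: the layer of indirection and the split class both end with a test double standing behind an interface, while the preprocessor trick swaps the function at compile time instead.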

Understanding the code

The important point is that whatever you, or someone else, thinks the behaviour of the code should be, your customers have paid for the behaviour that’s actually there, and so that (modulo bugs) is the thing you should preserve.

The book contains techniques to help you understand the existing code so that you can get those tests written in the first place, and even find the change points. Scratch refactoring is one technique: look at the code, change it, move bits out that you think represent cohesive functions, delete code that’s probably unused, make notes in comments…then just discard all of those changes. This is like Fred Brooks’s recommendation to “plan to throw one away”: you can take what you learned from those notes and refactorings and go in again with a more structured approach.

Sketching is another technique recommended in the book. You can draw diagrams of how different modules or objects collaborate, and particularly draw networks of what parts of the system will be affected by changes in the part you’re looking at.

Tsundoku

I only have the word of the internet to tell me that Tsundoku is the condition of acquiring new books without reading them. My metric for this condition is my list of books I own but have yet to read:

  • the last three parts of Christopher Tolkien’s Histories of Middle-Earth
  • Strategic Information Management: Challenges and Strategies in Managing Information Systems
  • Hume’s Enquiries Concerning the Human Understanding
  • Europe in the Central Middle Ages, 962-1154
  • England in the Later Middle Ages
  • Bertrand Russell’s The Problems of Philosophy
  • John Stuart Mill’s Utilitarianism and On Liberty (two copies, different editions, because I buy and read books at different rates)
  • A Song of Stone by Iain Banks
  • Digital Typography by Knuth
  • Merchant and Craft Guilds: A History of the Aberdeen Incorporated Trades
  • The Indisputable Existence of Santa Claus
  • Margaret Atwood’s The Handmaid’s Tale

And those are only the ones I want to read and own (and I think that list is incomplete – I bought a book on online communities a few weeks ago and currently can’t find it). Never mind the ones I don’t own.

And this is only about books. What about those side projects, businesses, hobbies, blog posts and other interests I “would do if I got around to it” and never do? Thinking clearly about what to do next and keeping expectations consistent with what I can do is an important skill, and one I seem to lack.

Answer: none of them

A question programmers frequently ask when they’re considering career growth or personal learning is “which programming language should I learn next?”

Why would learning another programming language help? If you only know one programming language and it is provided by a single vendor, then learning another will decouple your success from theirs, but that might not be such a common situation. Well, a book like Seven Languages in Seven Weeks makes the point that it’s not about learning the language, but about learning the model and thought process enabled by using that language. OK, so why don’t I learn that model or thought process, using the tools that are already available to me, instead of having to add fighting unfamiliar syntax to the problem?

And if what I’m truly trying to do is to learn to think about problems in a different way, a week-long effort at dabbling in a side project isn’t going to change my way of thinking. Those years of learned processes, visualisations and analyses are going to take more than a couple of hours to dislodge. I’ve worked through Seven Languages, and the fact that I spent a couple of hours solving the Eight Queens Problem in Prolog (or in fact telling Prolog what a solution to Eight Queens looks like and letting it solve it) doesn’t mean I now think about any other software problem as if I’m using a logic programming tool, or even as if I have such a tool available. I’ve spent much longer than that studying and using the relational calculus and SQL, but don’t even think about every problem as if it should be a collection of tables in the third normal form.

It may be that it would be useful to learn something that isn’t a programming language, shock horror! It turns out that programming is an activity embedded in a socio-technical system comprising other activities, and you might need to know something about them: software security, testing (I think I can count on my noses the number of programmers I’ve met who haven’t responded to the phrase “equivalence partitioning” with a blank stare, and I wouldn’t use all of my noses), planning, business, marketing, ethics…I even wrote a whole book on the things programmers should know that aren’t programming.

And then there’s the thing that your customers, clients, colleagues, or victims are trying to do with the software. Learning something about that would make it easier to empathise with them, to evaluate your solutions in context, and to propose better ways of working and better ways for your software to enable their work. Rewriting your code in Elixir would…not do that so much.