Working Effectively with Legacy Code

I gave a talk to my team at ARM today on Working Effectively with Legacy Code by Michael Feathers. Here are some notes I made in preparation, which are somewhat related to the talk I gave.

This may be the most important book a software developer can
read. Why? Because if you don’t, then you’re part of the problem.

It’s obviously a lot easier and a lot more enjoyable to work on
greenfield projects all the time. You get to choose this week’s
favourite technologies and tools, put things together in the ways that
suit you now, and make progress because, well, anything is progress
when there’s nothing there already. But throwing away an existing
system and starting from scratch makes it easy to throw away the
lessons learned in developing that system. It may be ugly, and patched
up all over the place, but that’s because each of those patches was
needed. They each represent something we learned about the product
after we thought we were done.

The new system is much more likely to look good from the developer’s
perspective, but what about the users’? Do they want to pay again
for development of a new system when they already have one that mostly
works? Do they want to learn again how to use the software? We have
this strange introspective notion that professionalism in software
development means things that make code look good to other coders:
Clean Code, “well-crafted” code. But we should also have some
responsibility to those people who depend on us and who pay our way,
and that might mean taking the decision to fix the mostly-working
thing.

A digression: Lehman’s Laws

Manny Lehman identified three different categories of software system:
those that are exactly specified, those that implement
well-understood procedures, and those that are influenced by the
environment in which they run. Most software (including ours) comes
into that last category, and as the environment changes so must the
software, even if there were no (known) problems with it at an earlier
point in its evolution.

He expressed laws governing the evolution of software systems,
which describe how the requirements for new development are in conflict
with the forces that slow down maintenance of existing systems. I’ll
not reproduce the full list here, but for example on the one hand the
functionality of the system must grow over time to provide user
satisfaction, while at the same time the complexity will increase and
perceived quality will decline unless it is actively maintained.

Legacy Code

Michael Feathers’s definition of legacy code is code without tests. I’m
going to be a bit picky here: rather than saying that legacy code is
code with no tests, I’m going to say that it’s code with
insufficient tests. If I make a change, can I be confident that I’ll
discover the ramifications of that change?

If not, then it’ll slow me down. I even sometimes discard changes
entirely, because I decide the cost of working out whether my change
has broken anything outweighs the interest I have in seeing the change
make it into the codebase.

Feathers refers to the tests as a “software vice”. They clamp the
software into place, so that you can have more control when you’re
working on it. Tests aren’t the only tools that do this: assertions
(and particularly Design by Contract) also help pin down the software.
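
As a minimal sketch of assertions doing that clamping, here is a function with its contract written down as preconditions and a postcondition. The withdraw function and its rules are hypothetical, invented for illustration; they are not from the book.

```python
def withdraw(balance: int, amount: int) -> int:
    """Hypothetical example: take `amount` out of `balance`."""
    # Preconditions: what the caller must guarantee.
    assert amount > 0, "amount must be positive"
    assert amount <= balance, "cannot withdraw more than the balance"

    new_balance = balance - amount

    # Postcondition: what this function guarantees in return.
    assert new_balance + amount == balance
    return new_balance
```

Note that Python skips assert statements when run with the -O flag, so this pins the code down during development and testing rather than in an optimised production run.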

How do I test untested code?

The apparent way forward then when dealing with legacy code is to
understand its behaviour and encapsulate that in a collection of unit
tests. Unfortunately, it’s likely to be difficult to write unit tests
for legacy code, because it’s all tightly coupled, has weird and
unexpected dependencies, and is hard to understand. So there’s a
catch-22: I need to make tests before I make changes, but I need to
make changes before I can make tests.

Seams

Almost the entire book is about resolving that dilemma, and contains a
collection of patterns and techniques to help you make low-risk
changes to make the code more testable, so you can introduce the tests
that will help you make the high-risk changes. His algorithm is:

  1. identify the “change points”, the things that need modifying to
    make the change you have to make.
  2. find the “test points”, the places around the change points where
    you need to add tests.
  3. break dependencies.
  4. write the tests.
  5. make the changes.

The overarching model for breaking dependencies is the “seam”. It’s a
place where you can change the behaviour of some code you want to
test, without having to change the code under test itself. Some examples:

  • you could introduce a constructor argument to inject an object
    rather than using a global variable
  • you could add a layer of indirection between a method and a
    framework class it uses, to replace that framework class with a
    test double
  • you could use the C preprocessor to redefine a function call to use
    a different function
  • you can break an uncohesive class into two classes that collaborate
    over an interface, to replace one of the classes in your tests
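
The first of those examples might look something like this sketch. The class names (SessionChecker, SystemClock, FixedClock) are made up for illustration, and I'm assuming the legacy code originally read the time directly from a global.

```python
import time


class SystemClock:
    """Production collaborator: reads the real wall clock."""

    def now(self) -> float:
        return time.time()


class FixedClock:
    """Test double: always reports the same instant."""

    def __init__(self, instant: float) -> None:
        self._instant = instant

    def now(self) -> float:
        return self._instant


class SessionChecker:
    """Code under test. The clock is injected rather than read from a global."""

    def __init__(self, timeout_seconds: float, clock=None) -> None:
        self._timeout = timeout_seconds
        # The seam: callers that pass nothing still get production behaviour.
        self._clock = clock if clock is not None else SystemClock()

    def is_expired(self, started_at: float) -> bool:
        return self._clock.now() - started_at > self._timeout


# In a test, substitute the double at the seam.
checker = SessionChecker(60, clock=FixedClock(1000.0))
expired = checker.is_expired(900.0)      # 100 seconds have "passed"
still_valid = checker.is_expired(950.0)  # only 50 seconds have "passed"
```

Existing callers don't need to change, because the default argument keeps the old behaviour; only the test reaches for the seam.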

Understanding the code

The important point is that whatever you, or someone else, thinks
the behaviour of the code should be, your customers have paid
for the behaviour that’s actually there, and so that (modulo bugs) is
the thing you should preserve.
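
Feathers calls tests that record existing behaviour “characterization tests”. Here is a minimal sketch, assuming a hypothetical legacy function (legacy_format_name is invented for this example) whose quirks we want to preserve rather than judge.

```python
import unittest


def legacy_format_name(first, last):
    # Hypothetical legacy routine, left exactly as found.
    if not first:
        return last.upper()
    return last.upper() + ", " + first[0] + "."


class LegacyFormatNameCharacterization(unittest.TestCase):
    """Each expectation was found by running the code, not by reading a spec."""

    def test_usual_case(self):
        self.assertEqual(legacy_format_name("Ada", "Lovelace"), "LOVELACE, A.")

    def test_missing_first_name_drops_the_comma(self):
        self.assertEqual(legacy_format_name("", "Hopper"), "HOPPER")

    def test_lowercase_initial_is_kept_as_is(self):
        # Perhaps surprising, but it is what the code does today.
        self.assertEqual(legacy_format_name("grace", "Hopper"), "HOPPER, g.")
```

Saved in a module, these run with python -m unittest. If one of them fails after a change, that tells me the behaviour moved, and it's then a separate decision whether that was intended.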

The book contains techniques to help you understand the existing code
so that you can get those tests written in the first place, and even
find the change points. Scratch refactoring is one technique: look
at the code, change it, move bits out that you think represent
cohesive functions, delete code that’s probably unused, make notes in
comments…then just discard all of those changes. This is like Fred
Brooks’s recommendation to “plan to throw one away”: you can take what
you learned from those notes and refactorings and go in again with a
more structured approach.

Sketching is another technique recommended in the book. You can draw
diagrams of how different modules or objects collaborate, and
particularly draw networks of what parts of the system will be
affected by changes in the part you’re looking at.

Tsundoku

I only have the word of the internet to tell me that Tsundoku is the condition of acquiring new books without reading them. My metric for this condition is my list of books I own but have yet to read:

  • the last three parts of Christopher Tolkien’s The History of Middle-earth
  • Strategic Information Management: Challenges and Strategies in Managing Information Systems
  • Hume’s Enquiries Concerning the Human Understanding
  • Europe in the Central Middle Ages, 962-1154
  • England in the Later Middle Ages
  • Bertrand Russell’s The Problems of Philosophy
  • John Stuart Mill’s Utilitarianism and On Liberty (two copies, different editions, because I buy and read books at different rates)
  • A Song of Stone by Iain Banks
  • Digital Typography by Knuth
  • Merchant and Craft Guilds: A History of the Aberdeen Incorporated Trades
  • The Indisputable Existence of Santa Claus
  • Margaret Atwood’s The Handmaid’s Tale

And those are only the ones I want to read and own (and I think that list is incomplete – I bought a book on online communities a few weeks ago and currently can’t find it). Never mind the ones I don’t own.

And this is only about books. What about those side projects, businesses, hobbies, blog posts and other interests I “would do if I got around to it” and never do? Thinking clearly about what to do next and keeping expectations consistent with what I can do is an important skill, and one I seem to lack.

Answer: none of them

A question programmers frequently ask when they’re considering career growth or personal learning is “which programming language should I learn next?”

Why would learning another programming language help? If you only know one programming language and it is provided by a single vendor, then learning another will decouple your success from theirs, but that might not be such a common situation. Well, a book like Seven Languages in Seven Weeks makes the point that it’s not about learning the language, but about learning the model and thought process enabled by using that language. OK, so why don’t I learn that model or thought process, using the tools that are already available to me, instead of having to add fighting unfamiliar syntax to the problem?

And if what I’m truly trying to do is to learn to think about problems in a different way, a week-long effort at dabbling in a side project isn’t going to change my way of thinking. Those years of learned processes, visualisations and analyses are going to take more than a couple of hours to dislodge. I’ve worked through Seven Languages, and the fact that I spent a couple of hours solving the Eight Queens Problem in Prolog (or in fact telling Prolog what a solution to Eight Queens looks like and letting it solve it) doesn’t mean I now think about any other software problem as if I’m using a logic programming tool, or even as if I have such a tool available. I’ve spent much longer than that studying and using the relational calculus and SQL, but don’t even think about every problem as if it should be a collection of tables in the third normal form.

It may be that it would be useful to learn something that isn’t a programming language, shock horror! It turns out that programming is an activity embedded in a socio-technical system comprising other activities, and you might need to know something about them: software security, testing (I think I can count on my noses the number of programmers I’ve met who haven’t responded to the phrase “equivalence partitioning” with a blank stare, and I wouldn’t use all of my noses), planning, business, marketing, ethics…I even wrote a whole book on the things programmers should know that aren’t programming.

And then there’s the thing that your customers, clients, colleagues, or victims are trying to do with the software. Learning something about that would make it easier to empathise with them, to evaluate your solutions in context, and to propose better ways of working and better ways for your software to enable their work. Rewriting your code in Elixir would…not do that so much.

On the extremes of computer science

I didn’t study computer science at school or university, and still manage to work as a programmer.

That is not to say that I don’t need to know some things that are taught on computer science courses. Just this week I’ve had to build a couple of different data structures and understand their running time: very CS.

I’ve also needed to know things that aren’t on a CS degree, too. The acceptance criteria for one of my projects are written in French, and none of the CS courses I’ve seen in UK universities include that in the syllabus.

I’m neither arguing for nor against the validity of a CS background in professional software development. I’m arguing against taking either side. You need to know some CS things to write software, you need to know some other things, multiple backgrounds are appropriate and welcome.

Dogmatic paradigmatism

First, you put all of your faith in structured programming, and you got burned. You found it hard to associate the operations in your software with the data upon which they act, and to make sure that the expectations made on the data in one place are satisfied when that data has been modified in that other place, or over there in yet another place. Clearly structured programming is broken.

Then, you put all of your faith in object-oriented programming, and you got burned. You found it hard to follow the flow of a program when it jumps in and out of different classes, and to see which parts were coupled to what. Clearly object-oriented programming is broken.

Then, you put all of your faith in functional programming, and you got burned. You found it hard to represent real business processes in terms of immutable data structures and pure functions, and to express changes to the operating environment without using side effects. Clearly functional programming is broken.

Or maybe it’s you. Maybe, rather than relying on faith to make these conceptual thought frameworks do what you need from them, you could have thought about the concepts.

In which I interview so you don’t have to

Describing job interviews for technical roles in the software industry to people who have left or have always been outside the software industry requires two things: patience on the part of the one doing the describing, and the ability for the listener to take a joke. Over the last twelve years I have taken countless job interviews so that you don’t have to. Here’s what I’ve found, presented as a guide to running the average software developer interview. As with all descriptions of mediocrity, you should treat this as best practice.

[Be clear on this: not all interviews are like this. But this is an expectable baseline, derived from experience.]

Person Specification

The ideal candidate will be rich. We’re going to put them through hours – maybe even days – of tests, interviews, meetings, and “informal chats” that they’d better be on best behaviour for anyway. They need to be able to afford taking that time away from work, friends, other opportunities, so they’d better be rich.

That multiple-hour interview process means that they’d better be desperate for a job too. As you’ll find out in the section on our process, we pride ourselves on not giving away too much. We’re not selling our company to you, because we know we’re offering the chance to do what you’ve always wanted: sit in our open plan office space next to our own particular loud crisp-eater muttering at Eclipse.

The ability to go without food is desirable too. Even if a stage of the interview is planned to take so long that it would go over lunch, and even though we might put a break for lunch in, we might also forget to do any catering. Computers don’t need food and programmers are sort of like computers, we heard. We actually occasionally do feed our staff, and advertise this as a perk.

Our Process

The first thing we want to check is whether you can solve logical problems. We don’t actually need you to solve logical problems, after all, that’s what the computers are for. But we’ll give you an aptitude/basic reasoning test anyway [yes, although it’s no longer the 1960s and we aren’t IBM, this is still common if not universal].

The reasoning test is there to weed out people who didn’t have the same education as us, or were raised speaking a different language, or in a different culture. Empathy is hard, and to avoid unduly stressing our staff we want to make sure that their colleagues are as similar to them as possible. Additionally the hour you’ll take going through this test is an hour we don’t have to make eye contact or conversation with you: empathy is hard.

To be honest we have no idea what this test means or how to interpret its results. Everybody before you went through this test, and they’d raise merry hell if we “lowered the bar” by removing it now. As a holacracy/meritocracy/hypocrisy/this week’s organisational behaviour buzzword, we empower our employees to not see any changes that might raise a small amount of discomfort.

So after that test, depending on the seniority of the position and the candidate’s experience, we’ll…no, not really. We did nearly keep a straight face through that sentence though. In fact we didn’t read your CV except to find out whether the keywords that describe the problems we have right now and the solutions we have chosen last week appear. We didn’t read your GitHub/Lanyrd/Bitbucket profiles either, except to check that you have them so we know how much free work to expect out of you in addition to the paid stuff. Our project management system works on the Pareto Principle: 80 hours a week on our stuff, 20 hours a week on open source stuff that we can co-opt.

The next stage in the process is actually the same for everybody: a basic programming test to find out whether you even know what a computer is. We don’t care that you’re [glances at CV] Grace Hopper, we still don’t believe that you can reverse a linked list. None of our employees has ever had to reverse a linked list on the job, and we’d fire them if they did reverse a linked list on the job because there are libraries for that.

Now we’ll come onto the technical interview: a cross-examination by a panel of between one and twelve [not joking] people who have, or have had, a word like “engineer” in their job description at some point. These people are tasked with finding out whether you’ve solved the same problems in your career as they have in theirs. If you haven’t, you might not be clever enough. If you have, then what new experiences are you bringing to the table?

By the way, our flexibility on your technical skills will go down as you become more experienced. We appreciate that new grads might not have used our tools/frameworks/technology and are willing to train them, but if you have more than six months’ experience with Java we’re going to call you a Java developer and only consider you for Java roles.

After all of that, it’s still possible that you might have somehow snuck through the system despite not going to the same university or belonging to the same society as the founder. We can’t really quantify the idea of “culture fit” but that’s what we’re examining in the next part of the process and we’ll know it when we don’t see it.

The Offer

You’ll get a phone call from us while you’re in the bath. We’ll outline the position, pay and (unless this is an American company and there isn’t any) holiday provision. You then have two seconds in which to reply, with either “Yes” or whatever the other one is. You may have other irons in the fire but of course you’ll want to drop all of those when we tell you about the parking space we’ve already allocated for you [This has happened. I don’t have a car.].

The Job

You will be working with a team of people who all went through that same interview and decided they wanted to work in our environment. We will leave it to you to decide what that means.

The Alternative

There are some less…scientific…approaches to hiring that involve using the candidate’s stated and visible experience to have a conversation about what they’ve done, how they do and don’t like to work, how they’ve responded to success and failure, and whether the challenges they would like to see in their career match up with the environment we’re able to provide. While that sounds like quite a pleasant experience for everybody involved we fail to see how it could possibly translate into discovering whether we want to work with you or vice versa.

Turn it off and back on again

I’m now six months into what I expected to be about a year out of working in technology, and I’m starting to think about what comes next and trying to make it happen. The difficulty I have is that it’s hard to explain what I’m looking for in a way that makes sense to those that are hiring, or that I can summarise in a search term for job sites. I’m considering running a company again to do all this myself, but that doesn’t obviate the problem, I still need to be able to describe this to potential clients and explain why they would want to buy one.

The difficulty comes from being a people person. I listen to people, I talk to people, I get people to talk to other people, I learn from people, I teach people, I perform for people, I watch people, I read what people have to say, and I write for people. And I happen to want to do that for money in the software industry, but if you tell that to a hiring manager on a software team you’ll get a blank stare followed by “um, but how much have you used MongoDB from Scala in your last job?”. You don’t need to try this yourself, I have done it. This is what happens.

I don’t mind much what technology I use, as long as we’re using it because it helps to address the problems we or our customers have rather than because a developer threw a strop if they weren’t going to be allowed to rewrite everything that already works well in NodeCaml. I care about understanding and solving the problems people have, and about understanding the people who have those problems. “I think we should use this” is not fine. “I think we should use this because” is perfectly fine. “We’re a $VENDOR shop” is probably not fine.

So the problem I have is that the job I know how to apply for and get is “programmer” (these days with some highfalutin prefix that really comes down to “better paid”), but that usually comes with some expectation to focus on the programming, and leave all the gloopy soft stuff like what programming should be done and whether it’s a good idea to do it now to other people. What I want to be doing is (being paid for) the gloopy soft stuff like making programmers into better programmers, working out what programming should be done (if any) and whether it’s a good idea to do it now, helping programmers to understand the people they’re helping, and helping the people being helped by programmers to understand the programmers, with the programming itself being a context not a focus. I have no idea how to explain that succinctly to people who might want to hire one of those, nor how to find people who might want to hire one of those.

Practically, based on what I’ve experienced about my own health and its relationship with my work, I also need to be realistic about where and when I work. That’s from or within cycling distance of home (around Leamington Spa, Warwick, Kenilworth and Coventry), during usual working hours in my own timezone. If your company is in a different timezone and supports remote work, that’s great, but if you need me to work from your timezone then it’s not great. In fact, you don’t support remote work, you just support local employees who don’t always come into the office.

If you are someone who wants one of those, know someone who wants one of those, or know how to describe one of those succinctly, please do help me out. Based on the last time I tried this, here’s a couple of lists:

Things I’ve never done, but would

These aren’t necessarily things my next job must have, and aren’t all even work-related, but are things that I would take the opportunity to do.

  • Work in a field on a farm. Preferably in control of a tractor.
  • Have a job title that begins with the letters ‘C’ or ‘D’ (I managed ‘Q’ a while back).
  • Spend lots of time supporting the Free Software Definition.
  • Include going to lunch with each other employee in the company in my responsibilities.
  • Visit Iceland.

Things I don’t like

These are the things I would try to avoid.

  • I still seriously hate raw celery.
  • Client work, in those cases where we don’t all really think that the client is doing something important.
  • “Rock star” programmers, and people who hire them.

Clown Trousers

An indirect side effect of stopping programming is that none of my trousers fit any more.

People who like to explain things before they have all the facts (or “programmers” as we sometimes call them) will justify this observation by pointing out that I have more time for exercise now. I do, but I don’t use it. While working at Facebook I walked six miles each day as part of my commute and worked at a variable-height desk; I spent a lot of time walking and standing.

When I began my gap year, I put some effort into running every day. That didn’t last long. I still stand a lot to play musical instruments, but I am significantly less active now, and yet I’m 8kg lighter than programmer Graham.

Looking at videos of programmer me, I just see an obese, tired guy surviving on caffeine, sugar snacks and three big meals a day as he lurched between commuting trips, flights abroad, conference talks and infrequent visits to bed. Peak Graham (weight for weight) came in June, as attested by the video of my AltConf talk, “I have no idea what I’m doing”.

It turns out I had no idea what I was doing to myself either. But now that I’m not doing it, the historical record that is my wardrobe tells me I’m healthier than I have been in over five years.

You may not need hipster silicon valley nutritional engineering sludge. You may not need an extra hour in the day to fit in a run and a shower. You may not need to drop a few hundred quid on a watch that also reminds you to stand up. You might just need to discover what you’re doing wrong, and not do it like that.

Week Seven

Having spent a few weeks trying all of the things and letting life happen, this week was about selection and focus. What should I actually concentrate on, and put energy into?

It’s time to add some structure to this situation. Dropping all of the things and taking life as it comes was relaxing, enjoyable, and necessary. Re-moulding things and building something new out of the parts will be necessary to provide a new sense of engagement and purpose.

Week Five

“You look so much happier!”

I get the best compliments. Also, I feel so much happier. I have put people, friendships, connections, and experiences first, and am taking advantage of the rewards.

One such experience was a visit to a couple of further education (16-18 years old) computing classes in my county. I was mostly there to talk about my background in the industry and help them to understand what jobs there are and what employers look for. I was blown away by the things these students could make in their robotics classes though, with Mindstorms kits talking to laptops and mobile apps.

That was a huge step up from the turtle robots we had shuffling around the floor in my school. And it still feels like there’s even more opportunity there, like there’s a huge gap between what a student sees computers being capable of and what they can do with the tools we (the industry) give them.