Gently HURDing the side projects

I find it problematic that even at times when I’m avoiding computing outside of work, I still have ideas about things I would like to try out or improve in computing “if I had the time”. I tend to capture these somehow – usually written notes in paper or Evernote, and my personal technology radar.

Why might this be a problem? Isn’t having ideas good, and fun? Well it is, but with each comes guilt that I could be making progress on it but am not. Even when that’s my choice, when I deliberately put more effort into relationships with friends or musical projects or whatever, there’s still that nagging feeling that I’m leaving behind chances to make positive changes to computing.

My approach to addressing that started by building the radar. Now I don’t have a lot of different projects I could be working on but am not; I have a single related web of issues, and progress on any one thing counts as progress toward the whole.

The second change is to note that any progress is progress; sometimes I spend some time reading and make a sentence or two of notes on dealing with a problem. Sometimes I try a solution, find difficulties with it and write down that I discount that solution. If I’ve made some move forward from where I was before I’ve started, I can be satisfied and don’t need to burn the midnight oil to get a complete solution to a complex problem done before putting it down.

All of that goes toward describing the limited progress I’ve made on my current research topic, which is distributed message-passing. I like the idea from Erlang that objects run in separate contexts, completely decoupled except for passing messages between one another. This seems to be the best implementation of an object-oriented runtime environment, except that it is all done on the Erlang VM and in the couple of relatively esoteric languages that target it.

On the other hand, while Objective-C doesn’t make it easy to do that decoupling, it does have a very simple message-passing interface that can be implemented in any language with a C FFI. If you can wrap objc_msgSend or objc_msgLookup and expose it to your language runtime, you too can pass messages.

Why can’t we have both of these things? Why can’t we have the simple-to-integrate message interface that can work anywhere, along with the distributed and decoupled objects?

My theory is that Mach makes this possible so I’ve been investigating it using GNU Mach and the GNU HURD. Much of the documentation of Mach messaging uses name servers that register named ports for clients to find; this is how macOS, NeXTSTEP, OSF/1 and related systems work. HURD does not use a name server, it uses the filesystem: you attach a server to a file system node as a translator and clients find the ports by looking up their paths on the filesystem.

I found examples of filesystem translators in the HURD documentation, but they typically were examples that implemented the filesystem messages: seek, read, write, and so on. One could build message-sending on top of filesystem operations but it would not be pleasant:

  • marshall the message selector and arguments into some stream format
  • write() your message to the node
  • verify that you wrote as many bytes as you expect
  • read() the length of the reply
  • verify that you read as many bytes as you expect
  • read() the reply
  • verify that you read as many bytes as you expect
  • unmarshall the reply into an object of the correct type

Let’s be clear, all of this needs to be done, but it’s all already being done at the Mach message layer and hidden behind the MIG abstraction, so why should our clients and servers do it again in another abstraction built on top? I wanted to find a way to register a port that accepts non-filesystem messages using the HURD’s filesystem-as-name-server approach.

This morning I decided to look at how login was implemented on the HURD and discovered that the password server does exactly what I need. It is configured as a translator on the filesystem, and uses the trivfs library to check in with the bootstrap server and get its ports, but then it handles its own messages for checking passwords rather than the standard filesystem messages. Discovering that gives me enough new information to feel I’ve made progress, and a clear next step (pardon the pun).

All the things

It’s been a long time since I had a side project, or one that didn’t get abandoned very early on. I tend to get sidetracked by other thoughts about computing, or think “while I’m doing this, I’m leaving that unsolved” so nothing gets very far.

In an attempt to address that, to clear all of the different thoughts I have about the matter of computing out of my head, organise them, identify conflicts, and prioritise what I work on, I spent this evening jotting down the big points and a brief abstract about each one. I’m hoping this will cut the Gordian knot by letting me see it all in one place and start to make choices.

The format I chose to represent this braindump is this personal Technology Radar, based on the Thoughtworks build-your-own tool. It seemed like a good place to see everything at once, and look for clusters or trends.

You’ll notice that almost everything in this radar is fairly old tech! That’s mostly a matter of taste, as I enjoy learning about things that were tried, what succeeded or failed, and what can be learnt from that to put to use today. I’m not good at novelty for novelty’s sake.

I expect to get some mileage (for my own benefit, you might like it too) out of expanding on some of the entries in this radar over a few more posts, so I’ve created a techradar category in this blog that you can filter on/out.

The Atoms of Programming

In the world of physics, there are many different models that can be used, though typically each of them has different applicability to different contexts. At the small scale, quantum physics is a very useful model, Newtonian physics will yield evidently incorrect predictions so is less valuable. Where a Newtonian model gives sufficiently accurate results, it’s a lot easier to work with than quantum or relativistic mechanics.

All of these models are used to describe the same universe – the same underlying collection of observations that can systematically be categorised, modelled and predicted.

Physical science (or experimental philosophy) does not work in the same way as computational philosophy. There are physical realisations of computational systems, typically manifested as electronic systems or pencil-and-paper simulations. But the software, the abstract configurations of ideas that run on those systems, exist in entirely separate space and are merely (though the fact that this is possible is immensely powerful) translated into the electronic or paper medium.

Of course one model for the software system is to abstract the electronic: to consider the movement of electrons as the presence of voltages at terminals; to group terminals as registers or busses; to further abstract this range of voltages as 0 and that range as 1. And indeed that model frequently is useful.

Frequently, that model is not useful. And the great thing is that we get to select from a panoply of other models, at some small or large remove from the physical model. We can use these models separately, or simultaneously. You can think of a software system as a network of messages passed between independent objects, as a flow of data through transformers, as a sequence of state changes, as a graph of single-argument functions, as something else, or as a combination of these things. Each is useful, each is powerful, all are applicable.

Sometimes, I can use these models to make decisions about representing the logical structure of these systems, transforming a concept into a representation that’s valid in the model. If I have a statement in a mathematical formulation of my problem, “for any a drawn from the set of Articles there exists a p drawn from the set of People such that p is the principal author of a” then I can build a function, or a method, or a query, or a predicate, or a procedure, or a subroutine, or a spreadsheet cell, or a process, that given an article will yield exactly one person who is the principal author of that article.

Sometimes, I use the models to avoid the conceptual or logical layers at all and express my problem as if it is a software solution. Object-oriented analysis and design, data flow modelling, and other techniques can be used to represent a logical model, or they can be used to bash the problem straight into a physical model without having thought about the problem in the abstract. “Shut up and code” is an extreme example of this approach, in which the physical model is realised without any attempt to tie it to a logical or conceptual design. I’ll know correct when I see it.

I don’t see a lot of value in collecting programming languages. I can’t count the number of different programming languages I’ve used, and many of them are entirely similar. C and JavaScript both have sequences of expressions that are built into statements that are built into procedures. Both let me build aggregations of data and procedures that either let me organise sequential programs, represent objects, represent functions, or do something else.

But collecting the models, the different representations of systems conceptually that can be expressed as software, sometimes called paradigms: this is very interesting. This is what lets me think about representing problems in different ways, and come up with efficient (conceptually or physically) solutions.

More paradigms, please.

Recommend me some books or articles

I’ve been looking for something to read on these topics, can you help?

  • a history of the Unix wars (the ‘workstation’ period involving Sun, HP, Apollo, DEC, IBM, NeXT and SGI primarily, but really everything starting from AT&T up to Linux and OS X would be interesting)
  • a business case study on Apple’s turnaround 1997-2001. I’ve read plenty of 1990s case studies explaining why they’ll fail, and 2010s interpretations of why they’re dominant, and Gil Amelio’s “On the Firing Line” which explains his view of how he stemmed the bleeding, but would like to fill in the gaps: particularly the changes from Dec 1997 to the iPod.
  • a technical book on Mach (it doesn’t need to still be in print, I’ll try to track it down): I’ve read the source code for xnu, GNU Mach and mkLinux, Tevanien’s papers, and the Mac OS X Internals book, but could still do with more

On books

I’d say that if there’s one easy way to summarise how I work, it’s as an information focus. I’m not great at following a solution all the way to the bitter end so you should never let me be a programmer (ahem): when all that’s left is the second 90% of the effort in fixing the bugs, tidying up edge cases and iterating on the interaction, I’m already bored and looking for the next thing. Where I’m good is where there’s a big problem to solve, and I can draw analogies with things I’ve seen before and come up with the “maybe we should try this” suggestions.

Part of the input for that is the experience of working in lots of different contexts, and studying for a few different subjects. A lot of it comes from reading: my goodreads account lists 870 books and audiobooks that I’ve read and I know it to be an incomplete record. Here are a few that I think have been particularly helpful (professionally speaking, anyway).

  • Douglas Adams, The Hitch-Hikers’ Guide to the Galaxy. Adams is someone who reminds us not to take the trappings of society too seriously, and to question whether what we’re doing is really necessary. Are digital watches really a neat idea? Also an honourable mention to the Dirk Gently novels for introducing the fundamental interconnectedness of all things.
  • Steve Jackson and Ian Livingstone, The Warlock of Firetop Mountain. I can think of at least three software projects that I’ve been able to implement and describe as analogies to the choose your own adventure style of book.
  • David Allen, Getting Things Done, because quite often it feels like there’s too much to do.
  • Douglas Hofstadter, Godel, Escher, Bach: An Eternal Golden Braid is a book about looking for the patterns and connections in things.
  • Victor Papanek, Design for the Real World, for reminding us of the people who are going to have to put up with the consequences of the things we create.
  • Donald Broadbent, Perception and Communication, for being the first person to systematically explore that topic.
  • Steven Hawking, A Brief History of Time, showing us how to make complex topics accessible.
  • Roger Penrose, The Road to Reality, showing us how to make complex topics comprehensively presentable.
  • Douglas Coupland, Microserfs, for poking fun at things I took seriously.
  • Janet Abbate, Recoding Gender, because computering is more accessible to me than to others for no good reason.
  • Joshua Bloch, Effective Java, Second Edition, for showing that part of the inaccessibility is a house of cards of unsuitable models with complex workarounds, and that programmers are people who delight in knowing, not addressing, the workarounds.
  • Michael Feathers, Working Effectively with Legacy Code, the one book every programmer should read.
  • Steve Krug, Don’t make me think!, a book about the necessity of removing exploration and uncertainty from computer interaction.
  • Seymour Papert, Mindstorms, a book about the necessity of introducing exploration and uncertainty into computer interaction.
  • Richard Stallman, Free as in Freedom 2.0, for suggesting that we should let other people choose between ther previous two options.
  • Brad Cox, Object-Oriented Programming: An Evolutionary Approach, for succinctly and effortlessly explaining objects a whole decade before everybody else got confused by whether a dog is an animal or a square is a rectangle.
  • Gregor Kiczales, Jim des Rivieres, and Daniel G. Bobrow, The Art of the Metaobject Protocol showed me that OOP is just one way to do OOP, and that functional programming is the same thing.
  • Simson Garfinkel and Michael Mahoney, NEXTSTEP Programming: Step One was where I learnt to create software more worthwhile than a page of BASIC instructions.
  • Gil Amelio, On the Firing Line: My 500 Days at Apple shows that the successful business wouldn’t be here if someone hadn’t managed the unsuccessful business first.

There were probably others.

Working Effectively with Legacy Code

I gave a talk to my team at ARM today on Working Effectively with Legacy Code by Michael Feathers. Here are some notes I made in preparation, which are somewhat related to the talk I gave.

This may be the most important book a software developer can
read. Why? Because if you don’t, then you’re part of the problem.

It’s obviously a lot easier and a lot more enjoyable to work on
greenfield projects all the time. You get to choose this week’s
favourite technologies and tools, put things together in the ways that
suit you now, and make progress because, well anything is progress
when there’s nothing there already. But throwing away an existing
system and starting from scratch makes it easy to throw away the
lessons learned in developing that system. It may be ugly, and patched
up all over the place, but that’s because each of those patches was
needed. They each represent something we learned about the product
after we thought we were done.

The new system is much more likely to look good from the developer’s
perspective
, but what about the users’? Do they want to pay again
for development of a new system when they already have one that mostly
works? Do they want to learn again how to use the software? We have
this strange introspective notion that professionalism in software
development means things that make code look good to other coders:
Clean Code, “well-crafted” code. But we should also have some
responsibility to those people who depend on us and who pay our way,
and that might mean taking the decision to fix the mostly-working
thing.

A digression: Lehman’s Laws

Manny Lehman identified three different categories of software system:
those that are exactly specified, those that implement
well-understood procedures, and those that are influenced by the
environment in which they run. Most software (including ours) comes
into that last category, and as the environment changes so must the
software, even if there were no (known) problems with it at an earlier
point in its evolution.

He expressed
Laws governing the evolution of software systems,
which govern how the requirements for new development are in conflict
with the forces that slow down maintenance of existing systems. I’ll
not reproduce the full list here, but for example on the one hand the
functionality of the system must grow over time to provide user
satisfaction, while at the same time the complexity will increase and
perceived quality will decline unless it is actively maintained.

Legacy Code

Michael Feather’s definition of legacy code is code without tests. I’m
going to be a bit picky here: rather than saying that legacy code is
code with no tests, I’m going to say that it’s code with
insufficient tests
. If I make a change, can I be confident that I’ll
discover the ramifications of that change?

If not, then it’ll slow me down. I even sometimes discard changes
entirely, because I decide the cost of working out whether my change
has broken anything outweighs the interest I have in seeing the change
make it into the codebase.

Feathers refers to the tests as a “software vice”. They clamp the
software into place, so that you can have more control when you’re
working on it. Tests aren’t the only tools that do this: assertions
(and particularly Design by Contract) also help pin down the software.

How do I test untested code?

The apparent way forward then when dealing with legacy code is to
understand its behaviour and encapsulate that in a collection of unit
tests. Unfortunately, it’s likely to be difficult to write unit tests
for legacy code, because it’s all tightly coupled, has weird and
unexpected dependencies, and is hard to understand. So there’s a
catch-22: I need to make tests before I make changes, but I need to
make changes before I can make tests.

Seams

Almost the entire book is about resolving that dilemma, and contains a
collection of patterns and techniques to help you make low-risk
changes to make the code more testable, so you can introduce the tests
that will help you make the high-risk changes. His algorithm is:

  1. identify the “change points”, the things that need modifying to
    make the change you have to make.
  2. find the “test points”, the places around the change points where
    you need to add tests.
  3. break dependencies.
  4. write the tests.
  5. make the changes.

The overarching model for breaking dependencies is the “seam”. It’s a
place where you can change the behaviour of some code you want to
test, without having to change the code under test itself. Some examples:

  • you could introduce a constructor argument to inject an object
    rather than using a global variable
  • you could add a layer of indirection between a method and a
    framework class it uses, to replace that framework class with a
    test double
  • you could use the C preprocessor to redefine a function call to use
    a different function
  • you can break an uncohesive class into two classes that collaborate
    over an interface, to replace one of the classes in your tests

Understanding the code

The important point is that whatever you, or someone else, thinks
the behaviour of the code should be, actually your customers have paid
for the behaviour that’s actually there and so that (modulo bugs) is
the thing you should preserve.

The book contains techniques to help you understand the existing code
so that you can get those tests written in the first place, and even
find the change points. Scratch refactoring is one technique: look
at the code, change it, move bits out that you think represent
cohesive functions, delete code that’s probably unused, make notes in
comments…then just discard all of those changes. This is like Fred
Brooks’s recommendation to “plan to throw one away”, you can take what
you learned from those notes and refactorings and go in again with a
more structured approach.

Sketching is another technique recommended in the book. You can draw
diagrams of how different modules or objects collaborate, and
particularly draw networks of what parts of the system will be
affected by changes in the part you’re looking at.

On the extremes of computer science

I didn’t study computer science at school or university, and still manage to work as a programmer.

That is not to say that I don’t need to know some things that are taught on computer science courses. Just this week I’ve had to build a couple of different data structures and understand their running time: very CS.

I’ve also needed to know things that aren’t on a CS degree, too. The acceptance criteria for one of my projects are written in French, and none of the CS courses I’ve seen in UK universities include that in the syllabus.

I’m neither arguing for nor against the validity of a CS background in professional software development. I’m arguing against taking either side. You need to know some CS things to write software, you need to know some other things, multiple backgrounds are appropriate and welcome.

Staying power

You would imagine that by now I would have come to realise how long my attention span is and worked to find projects that fit within it, but no. This is one of the changes I need to make soon.

So often I start a project really excited by it, but am really excited by something else before the end. Book projects always work that way, and quite a few software projects. Sometimes even talks, given a long enough lead time between being asked for a topic and actually giving the talk.

The usual result is that I become distracted before the end of the project, which leads to procrastination. That then makes it take longer, which only increases the distraction and disengagement.

What I’m saying is that if I ever say that I’m thinking of starting a PhD, you have my permission to chastise me. Four years is not within my observed boredom limit. Six months is closer to the mark.

It depends? It depends.

Sometimes you ask a question which has a small collection of actionable answers: yes or no. You ask someone who should be able to give that yes or no answer, and they go for the third: it depends.

Maybe they can’t actually answer your question, but want to sound profound.

Maybe they don’t realise that you’re a beginner in this particular area and would benefit from knowing what they’d do in this situation.

Maybe they don’t realise that you’re an expert in this area and are looking for confirmation or refutation of your existing choice.

Maybe they don’t realise that you’re uninterested in this area, and just need a decision made so you can move on.

Maybe they don’t know what the answer is, but are hoping you’ll help them think through the alternatives.

Sometimes, the answer to a question is “it depends”. But is that the answer you should give? It depends.

That can’t possibly work.

A while back I was at a Facebook developer event, talking about techniques for analysing Objective-C. My summary of the problem was something like “it’s one of those things that works pretty well in the ivory towers of practice but completely falls apart when you try to use it in theory.”

That’s true of many things in building software, but as people who get paid for removing bugs from things we’re all too good at the situations where the theory doesn’t pan out. Problems arise when we report the overly general conclusion: “that doesn’t work” rather than “there exists a condition in which that doesn’t work”.

Often, the appropriate thing to do is to build the thing anyway and extract value from it in the cases where it does work. Most Ruby code is just Java with different punctuation: if you build a thing that can’t possibly work because of monkey patching then build it anyway and 100% (give or take) of Ruby you actually encounter will work with it.

If it were up to programmers we wouldn’t have paper money. You can’t promise to back 10× as much money as you’ve actually got gold to make good on, in case everyone asks for it back. Ah, but what if they don’t?

And even if you find that the edge situations arise too frequently to make the thing you built worthwhile, you’ll have learned something about the problem. You may find a different approach, or even that solving a slightly (or greatly) different problem is what you really want to do.

When someone tells you the thing you want to build can’t work, build it and work with it. If only for the look on their face.