Two ways of thinking

I’ve used this idea in conversations for years, and can’t find a post on it, which I find surprising but there you go. There are, broadly speaking, two different ways to look at programming languages, and I think they imply two different, asymmetric ways of selecting programming languages. The two can still lead to the same choice, viewed in different ways: if you work in one of those languages, you need to know that both views exist in order to understand the decisions made by the programmers working in it.

On the one hand, the Language Lawyer seeks out languages with plenty of esoterica and specifics, in order to become proficient at recalling those esoterica and demonstrating their correct application. The Language Lawyer loves that reference and value types in Swift are actually called class and struct, or that the difference between class and struct in C++ is the default member visibility, and that neither is actually related to the words class or struct. They are delighted both to know and to explain that in Perl 5, $var, @var and %var name three different variables, and that which of them an expression like $var[0] actually accesses depends on the context. They are excited to explain that technically the admonition that you must always return a value from a non-void function in C is not true, because since C99, if main returns control without returning a value, the program behaves as if main had returned 0.
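
That last piece of trivia is easy enough to check for yourself. Here’s a minimal sketch (the file name and the message are mine) of a main that simply falls off its closing brace; any C99-or-later compiler turns it into a program whose exit status is 0:

    #include <stdio.h>

    /* Since C99 (5.1.2.2.3), reaching the closing brace of main without a
     * return statement behaves as if main had returned 0. Only main gets
     * this favour: falling off the end of any other non-void function and
     * then using its result is undefined behaviour. */
    int main(void)
    {
        puts("falling off the end of main...");
        /* no return statement */
    }

Compile it with cc -std=c99 falloff.c, run it, and echo $? reports 0.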

The Language Lawyer seeks out a language with such esoterica. But learning it is complex and time-consuming, so they probably don’t change particularly often. In fact, changing language may be deleterious, because then other people will know the esoterica that they get caught out on. Imagine knowing how the function overloading rules in C++ work, then trying something similar in Java and finding that you’re wrong. And that somebody else knows that. The shame! Once their language is chosen, the Language Lawyer will form heuristic rules of which language feature to use when solving which problem.

Standing in the other corner, the Lazy Evaluator wants to avoid those edge cases, because they make them think about the programming language, not the software problem. They’d much prefer an environment in which there’s at most one way to express their solution, and can then worry about how to express it using that one tool. The Lazy Evaluator loves Ruby because Everything Is An Object (and they love Io more, because in Ruby there are also classes). The Lazy Evaluator is delighted to know that in Lisp, Everything Is A List. They are excited to be programming in Haskell, where Everything Is A Function.

Both of these people can be happy with the same programming language, though clearly this will be for different reasons, and their different thought processes will bring them into conflict. The Lazy Evaluator will be happy enough using the third-most common C-family language, C++– (full name: “C++98 but never use exceptions or templates, avoid multiple inheritance, and [never/always] mark methods as virtual”). The Language Lawyer will enjoy demonstrating const-correctness and the difference between l-, r-, pr-, gl- and x-values, but the two will come to blows over whether the Lazy Evaluator is denying the full elegant expressiveness of the language by proscribing exceptions.

Similarly, the Lazy Evaluator can look at JavaScript, see the simple classless object model with dynamic dispatch remembered from Self or Io, even notice that Functions Are Objects Too and work in that Everything Is An Object paradigm. The Language Lawyer can be confident that at some point, all of the weird coercion behaviour, the Which this Is That this Anyway question, and the You Need That Flag With This Polyfill To Get That Language Feature issues will come up. But the Lazy Evaluator will notice that That Flag also enables class syntax and arrow functions, we already have prototypes and function functions, and disagreement will ensue.

So in what way are those ways of thinking asymmetric? Invoking Hickey’s Corollary, you can never be recomplecting. It’s easier to compromise on C++ But Without Exceptions than it is to compromise on Lisp But We’ll Add Other Atom Types. You can choose JavaScript But Never Use this, you can’t choose Haskell But With Type-Escaping Side Effects.

This isn’t really about programming languages, it’s about any form of model or abstraction. Languages are a great example though. I think it’s a mindset thing. Your task is to solve this problem, using this tool. Are you focusing on the bit that’s about the problem, or the bit that’s about the tool? Both are needed.

Linus’s Bystanders

For some reason, when Eric S. Raymond wanted to make a point about the “bazaar” model of open source software development, he named it after someone else. Thus we have Linus’s Law:

Linus was directly aiming to maximize the number of person-hours thrown at debugging and development, even at the possible cost of instability in the code and user-base burnout if any serious bug proved intractable. Linus was behaving as though he believed something like this:

8. Given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone.

Or, less formally, “Given enough eyeballs, all bugs are shallow.” I dub this: “Linus’s Law”.

The “proof” of this law, such as it is, is a reductio ad absurdum painting the contradiction of the law as a universe in which the Linux kernel couldn’t be created.

If “Linus’s Law” is false, then any system as complex as the Linux kernel, being hacked over by as many hands as that kernel was, should at some point have collapsed under the weight of unforeseen bad interactions and undiscovered “deep” bugs. If it’s true, on the other hand, it is sufficient to explain Linux’s relative lack of bugginess and its continuous uptimes spanning months or even years.

Let us remember the time that Linux development did collapse under its own weight. Raymond’s original essay was written in 1997; by 2006 it seemed that “we’re adding bugs at a higher rate than we’re fixing them”. In between, more developers had (presumably) contributed, so what happened there? Why weren’t all of the bugs even shallower?

I’d like to examine a result from the social sciences here. From the Wikipedia (I am researcher, hear me roar) page on bystander effect:

The bystander effect, or bystander apathy, is a social psychological phenomenon that refers to cases in which individuals do not offer any means of help to a victim when other people are present. The probability of help is inversely related to the number of bystanders. In other words, the greater the number of bystanders, the less likely it is that any one of them will help.

And then, a few paragraphs later when talking about computer-mediated communication:

Research suggests that the bystander effect may be present in computer-mediated communication situations.

The experiments into bystander apathy are specifically about someone in distress, either asking for help or visibly in trouble. Maybe it applies here, too, when the only “distress” is that some software you didn’t pay for could be better than it is. Perhaps, even before we get into discussions of whether the projects have welcoming communities that accept contributors, our projects are putting people off just because they know that there are plenty of others out there who could fix these bugs before they do.

I’ve met software team leaders who, when asked how their techniques and processes could be made to scale to a team of 1,000, answer that they’d start by giving the project to a team of ten. Maybe they’re onto something.

Is Social Psychology Biased Against Republicans?

Pretty interesting, and an often unmentioned aspect of diversity (probably because political leaning is supposed to be a secret in democratic countries, if not because it’s usually acceptable to display ingroup/outgroup bias politically). But it’s very relevant in the social sciences, especially if it means that particular political views are more likely to be treated favourably or argued for by researchers.

The Software Leviathan

Thomas Hobbes viewed society as a meta-person, a gigantic creature whose parts were human and which was in the service of those humans. Left to their own devices, people would not work well together, as their notions of individualism and search for personal gain lead directly to conflict: strong government is needed to instil a sense of cooperation and of social obligation. This idea of “government through social contract” is pervasive in Western political thought, being as it is the basis for the “government of the people, by the people, for the people” with which Abraham Lincoln hoped to lead post-Civil War America.

Software systems themselves can also be thought of as Leviathans. From a purely technical sense, all of “professional” software construction is based on notions of composition, of software systems that are themselves made of software systems. So we have structured or procedural programming, with routines composed of subroutines. And functional programming, with functions composed of functions. And object-oriented programming, with objects composed of objects. So central are these ideas to expressions of thought in software that they are considered paradigmatic by many, representing fundamental world-views of the art/craft/science.

There’s a second formulation of software-as-Leviathan, which is closer to Hobbesian meaning. The technical aspect of our software systems is merely a substrate[*] through which a social system—that of the people interacting with the software, the people acting on the software, and the people interacting with the other people—is reified. So the descriptions Hobbes made of his Leviathan can be made of these socio-technical systems:

  • First the Matter thereof, and the Artificer; both which is Man[sic].
  • Secondly, How, and by what Covenants it is made; what are the Rights and just Power or Authority of a Soveraigne; and what it is that Preserveth and Dissolveth it.
  • Thirdly, what is a Christian Common-Wealth.
  • Lastly, what is the Kingdome of Darkness.

[*] I wonder what form of substance gives the best sense of the analogy. Scaffolding? Lubricant? Mortar? Framework?

OK, maybe not so much the third one, except that it is really an attempt to define the values and norms of a society, which in the context of Hobbes’s writing, meant a Christian society.

Of course, any attempt to describe such a system is going to be filtered by the preconceptions, ideas and values of the person creating the description. Which brings me onto today’s topic: the pun in the new domain of this blog. Evidently it’s a contraction of “Structure and Interpretation of Computer Programmers”, based on the Abelson and Sussman book title. That book is abbreviated to SICP, so it’s not too difficult to see how it might be adapted to SICPers.

We can also see it as being a Latin abbreviation: sic pers., meaning such a person. So there is both the Structure and Interpretation of Computer Programmers, and there is this person who is doing the interpreting, in the domain name.

Software, Science?

Is there any science in software making? Does it make sense to think of software making as scientific? Would it help if we could?

Hold on, just what is science anyway?

Good question. The medieval French philosopher-monk Buridan said that the source of all knowledge is experience, and Richard Feynman paraphrased this as “the test of all knowledge is experiment”.

If we accept that science involves some experimental test of knowledge, then some of our friends who think of themselves as scientists will find themselves excluded. Consider the astrophysicists and cosmologists. It’s all very well having a theory about how stars form, but if you’re not willing to swirl a big cloud of hydrogen, helium, lithium, carbon etc. about and watch what happens to it for a few billion years then you’re not doing the experiment. If you’re not doing the experiment then you’re not a scientist.

Our friends in what are called the biological and medical sciences also have some difficulty now. A lot of what they do is tested by experiment, but some of the experiments are not permitted on ethical grounds. If you’re not allowed to do the experiment, maybe you’re not a real scientist.

Another formulation (OK, I got this from the Wikipedia entry on Science) sees science as a sort of systematic storytelling: the making of “testable explanations and predictions about the universe”.

Under this definition, there’s no problem with calling astronomy a science: you think this is how things work, then you sit, and watch, and see whether that happens.

Of course a lot of other work fits into the category now, too. There’s no problem with seeing the “social sciences” as branches of science: if you can explain how people work, and make predictions that can (in principle, even if not in practice) be tested, then you’re doing science. Psychology, sociology, economics: these are all sciences now.

Speaking of the social sciences, we must remember that science itself is a social activity, and that the way it’s performed is really defined as the explicit and implicit rules and boundaries created by all the people who are doing it. As an example, consider falsificationism. This is the idea that a good scientific hypothesis is one that can be rejected, rather than confirmed, by an appropriately-designed experiment.

Sounds pretty good, right? Here’s the interesting thing: it’s also pretty new. It was mostly popularised by Karl Popper in the 20th Century. So if falsificationism is the hallmark of good science, then Einstein didn’t do good science, nor did Marie Curie, nor Galileo, or a whole load of other people who didn’t share the philosophy. Just like Dante’s Virgil was not permitted into heaven because he’d been born before Christ and therefore could not be a Christian, so all of the good souls of classical science are not permitted to be scientists because they did not benefit from Popper’s good message.

So what is science today is not necessarily science tomorrow, and there’s a sort of self-perpetuation of a culture of science that defines what it is. And of course that culture can be critiqued. Why is peer review appropriate? Why do the benefits outweigh the politics, the gazumping, the gender bias? Why should it be that if falsification is important, negative results are less likely to be published?

Let’s talk about Physics

Around a decade ago I was studying particle physics pretty hard. Now there are plenty of interesting aspects to particle physics. Firstly, it’s a statistics-heavy discipline, in which results are defined by how happy you are with them, not by some binary right/wrong criterion.

It turns out that particle physicists are a pretty conservative bunch. They’ll only accept a particle as “discovered” if the signal indicating its existence is measured at a five-sigma confidence: i.e. if there’s under a one-in-a-million chance that the signal arose randomly in the absence of the particle’s existence. Why five sigma? Why not three (a 99.7% confidence) or six (to keep Motorola happy)? Why not repeat it three times and call it good, like we did in middle school science classes?
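
For the curious, the arithmetic behind those sigmas is small enough to fit in a throwaway C program (mine, not anything a physicist would actually run) that prints the one-sided Gaussian tail probability, erfc(n/√2)/2, for three, five and six sigma:

    #include <math.h>
    #include <stdio.h>

    /* One-sided Gaussian tail probability of an n-sigma excess:
     * P(Z > n) = erfc(n / sqrt(2)) / 2. */
    static double tail(double n_sigma)
    {
        return 0.5 * erfc(n_sigma / sqrt(2.0));
    }

    int main(void)
    {
        const double sigmas[] = { 3.0, 5.0, 6.0 };
        for (int i = 0; i < 3; i++) {
            double p = tail(sigmas[i]);
            printf("%g sigma: p = %.2e, about 1 in %.0f\n",
                   sigmas[i], p, 1.0 / p);
        }
        return 0;
    }

Build it with cc -std=c99 sigma.c -lm and it reports roughly 1.3 in a thousand for three sigma, about one in 3.5 million for five, and about one in a billion for six.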

Also, it’s quite a specialised discipline, with a clear split between theory and practice and other divisions inside those fields. It’s been a long time since you could be a general particle physicist, and even longer since you could be simply a “physicist”. The split leads to some interesting questions about the definition of science again: if you make a prediction which you know can’t be verified during your lifetime due to the lag between theory and experimental capability, are you still doing science? Does it matter whether you are or not? Is the science in the theory (the verifiable, or falsifiable, prediction) or in the experiment? Or in both?

And how about Psychology, too

Physicists are among the most rational people I’ve worked with, but psychologists up the game by bringing their own feature to the mix: hypercriticality. And I mean that in the technical sense of criticism, not in the programmer “you’re grammar sucks” sense.

You see, psychology is hard, because people are messy. Physics is easy: the apple either fell to earth or it didn’t. Granted, quantum gets a bit weird, but it generally (probably) does its repeatable thing. We saw above that particle physics is based on statistics (as is semiconductor physics, as it happens); but you can still say that you expect some particular outcome or distribution of outcomes with some level of confidence. People aren’t so friendly. I mean, they’re friendly, but not in a scientific sense. You can do a nice repeatable psychology experiment in the lab, but only by removing so many confounding variables that it’s doubtful the results would carry over into the real world. And the experiment only really told you anything about local first year psychology undergraduates, because they’re the only people who:

  1. walked past the sign in the psychology department advertising the need for participants;
  2. need the ten dollars on offer for participation desperately enough to turn up.

In fact, you only really know about how local first year psychology undergraduates who know they’re participating in a psychology experiment behave. The ethics rules require informed consent which is a good thing because otherwise it’s hard to tell the difference between a psychology lab and a Channel 4 game show. But it means you have to say things like “hey this is totally an experiment and there’ll be counselling afterward if you’re disturbed by not really electrocuting the fake person behind the wall” which might affect how people react, except we don’t really know because we’re not allowed to do that experiment.

On the other hand, you can make observations about the real world, and draw conclusions from them, but it’s hard to know what caused what you saw because there are so many things going on. It’s pretty hard to re-run the whole of a society with just one thing changed, like “maybe if we just made Hitler an inch taller then Americans would like him, or perhaps try the exact same thing as prohibition again but in Indonesia” and other ideas that belong in Philip K. Dick novels.

So there’s this tension: repeatable results that might not apply to the real world (a lack of “ecological validity”), and real-world phenomena that might not be possible to explain (a lack of “internal validity”). And then there are all sorts of other problems too, so that psychologists know that for a study to hold water they need to surround what they say with caveats and limitations. Thus is born the “threats to validity” section on any paper, where the authors themselves describe the generality (or otherwise) of their results, knowing that such threats will be a hot topic of discussion.

But all of this—the physics, the psychology, and the other sciences—is basically a systematised story-telling exercise, in which the story is “this is why the universe is as it is” and the system is the collection of (time-and-space-dependent) rules that govern what stories may be told. It’s like religion, but with more maths (unless your religion is one of those ones that assigns numbers to each letter in a really long book then notices that your phone number appears about as many times as a Poisson distribution would suggest).

Wait, I think you were talking about software

Oh yeah, thanks. So, what science, if any, is there in making software? Does there exist a systematic approach to storytelling? First, let’s look at the kinds of stories we need to tell.

The first are the stories about the social system in which the software finds itself: the story of the users, their need (or otherwise) for a software system, their reactions (or otherwise) to the system introduced, how their interactions with each other change as a result of introducing the system, and so on. Call this requirements engineering, or human-computer interaction, or user experience; it’s one collection of stories.

You can see these kinds of stories emerging from the work of Manny Lehman. He identifies three types of software:

  • an S-system is exactly specified.
  • a P-system executes some known procedure.
  • an E-system must evolve to meet the needs of its environment.

It may seem that E-type software is the type in which our stories about society are relevant, but think again: why do we need software to match a specification, or to follow a procedure? Because automating that specification or procedure is of value to someone. Why, or to what end? Why that procedure? What is the impact of automating it? We’re back to telling stories about society. All of these software systems, according to Lehman, arise from discovery of a problem in the universe of discourse, and provide a solution that is of possible interest in the universe of discourse.

The second kind are the stories about how we worked together to build the software we thought was needed in the first stories. The practices we use to design, build and test our software are all tools for facilitating the interaction between the people who work together to make the things that come out. The things we learned about our own society, and that we hope we can repeat (or avoid) in the future, become our design, architecture, development, testing, deployment, maintenance and support practices. We even create our own tools—software for software’s sake—to automate, ease or disrupt our own interactions with each other.

You’ll mostly hear the second kind of story at most developer conferences. I believe that’s because the people who have most time and inclination to speak at most developer conferences are consultants, and they understand the second stories to a greater extent than the first because they don’t spend too long solving any particular problem. It’s also because most developer conferences are generally about making software, not about whatever problem it is that each of the attendees is trying to solve out in the world.

I’m going to borrow a convention that Rob Rix told me in an email, of labelling the first type of story as being about “external quality” and the second type about “internal quality”. I went through a few stages of acceptance of this taxonomy:

  1. Sounds like a great idea! There really are two different things we have to worry about.
  2. Hold on, this doesn’t sound like such a good thing. Are we really dividing our work into things we do for “us” and things we do for “them”? Labelling the non-technical identity? That sounds like a recipe for outgroup homogeneity effect.
  3. No, wait, I’m thinking about it wrong. The people who make software are not the in-group. They are the mediators: it’s just the computers and the tools on one side of the boundary, and all of society on the other side. We have, in effect, the Janus Thinker: looking on the one hand toward the external stories, on the other toward the internal stories, and providing a portal allowing flow between the two.

[Image: JANUS (from Vatican collection), by Flickr user jinnrouge]

So, um, science?

What we’re actually looking at is a potential social science: there are internal stories about our interactions with each other and external stories about our interactions with society and of society’s interactions with the things we create, and those stories could potentially be systematised and we’d have a social science of sorts.

Particularly, I want to make the point that we don’t have a clinical science, an analogy drawn by those who investigate evidence-based software engineering (which has included me, in my armchair way, in the past). You can usefully give half of your patients a medicine and half a placebo, then measure survival or recovery rates after that intervention. You cannot meaningfully treat a software practice, like TDD as an example, as a clinical intervention. How do you give half of your participants a placebo TDD? How much training will you give your ‘treatment’ group, and how will you organise placebo training for the ‘control’ group? [Actually I think I’ve been on some placebo training courses.]

In constructing our own scientific stories about the world of making software, we would run into the same problems that social scientists do in finding useful compromises between internal and ecological validity. For example, the oft-cited Exploratory experimental studies comparing online and offline programming performance (by Sackman et al., 1968) is frequently used to support the notion that there are “10x programmers”, that some people who write software just do it ten times faster than others.

However, this study does not have much ecological validity. It measures debugging performance, using either an offline process (submitting jobs to a batch system) or an online debugger called TSS, which probably isn’t a lot like the tools used in debugging today. The problems were well-specified, thus removing many of the real problems programmers face in designing software. Participants were expected to code a complete solution with no compiler errors, then debug it: not all programmers work like that. And where did they get their participants from? Did they have a diverse range of backgrounds, cultures, education, experience? It does not seem that any results from that study could necessarily apply to modern software development situated in a modern environment, nor could the claim of “10x programmers” necessarily generalise as we don’t know who is 10x better than whom, even at this one restricted task.

In fact, I’m also not convinced of its internal validity. There were four conditions (two programming problems and two debugging setups), each of which was assigned to six participants. The variance is so large that most of the variables are independent of each other (the independent variables are the programming problem and the debugging mode, and the dependent variables are the amount of person-time and CPU-time), unless the authors correlate them with “programming skill”. How is this skill defined? How is it measured? Why, when the individual scores are compared, is “programming skill” not again taken into consideration? What confounding variables might also affect the wide variation in the scores reported? Is it possible that the fastest programmers had simply seen the problem and solved it before? We don’t know. What we do know is that the reported 28:1 ratio between best and worst performers is across both online and offline conditions (as pointed out in, e.g., The Leprechauns of Software Engineering), so that’s definitely a confounding factor. If we just looked at two programmers using the same environment, what difference would be found?

We had the problem that “programming skill” is not well-defined when examining the Sackman et al. study, and we’ll find that this problem is one we need to overcome more generally before we can make the “testable explanations and predictions” that we seek. Let’s revisit the TDD example from earlier: my hypothesis is that a team that adopts the test-driven development practice will be more productive some time later (we’ll defer a discussion of how long) than a team in the null condition.

OK, so what do we mean by “productive”? Lines of code delivered? Probably not, their amount varies with representation. OK, number of machine instructions delivered? Not all of those would be considered useful. Amount of ‘customer value’? What does the customer consider valuable, and how do we ensure a fair measurement of that across the two conditions? Do bugs count as a reduction in value, or a need to do additional work? Or both? How long do we wait for a bug to not be found before we declare that it doesn’t exist? How is that discovery done? Does the expense related to finding bugs stay the same in both cases, or is that a confounding variable? Is the cost associated with finding bugs counted against the value delivered? And so on.

Software dogma

Because our stories are not currently very testable, many of them arise from a dogmatic belief that some tool, or process, or mode of thought, is superior to the alternatives, and that there can be no critical debate. For example, from the Clean Coder:

The bottom line is that TDD works, and everybody needs to get over it.

No room for alternatives or improvement, just get over it. If you’re having trouble defending it, apply a liberal sprinkle of argumentum ab auctoritate and tell everyone: Robert C. Martin says you should get over it!

You’ll also find any number of applications of the thought-terminating cliché, a rhetorical technique used to stop cognitive dissonance by allowing one side of the issue to go unchallenged. Some examples:

  • “I just use the right tool for the job”—OK, I’m done defending this tool in the face of evidence. It’s just clearly the correct one. You may go about your business. Move along.
  • “My approach is pragmatic”—It may look like I’m doing the opposite of what I told you earlier, but that’s because I always do the best thing to do, so I don’t need to explain the gap.
  • “I’m passionate about [X]”—yeah, I mean your argument might look valid, I suppose, if you’re the kind of person who doesn’t love this stuff as much as I do. Real craftsmen would get what I’m saying.
  • and more.

The good news is that out of such religious foundations spring the shoots of scientific thought, as people seek to find a solid justification for their dogma. So just as physics has strong spiritual connections, with Stephen Hawking concluding in A Brief History of Time:

However, if we discover a complete theory, it should in time be understandable by everyone, not just by a few scientists. Then we shall all, philosophers, scientists and just ordinary people, be able to take part in the discussion of the question of why it is that we and the universe exist. If we find the answer to that, it would be the ultimate triumph of human reason — for then we should know the mind of God.

and Einstein debating whether quantum physics represented a kind of deific Dungeons and Dragons:

[…] an inner voice tells me that it is not yet the real thing. The theory says a lot, but does not really bring us any closer to the secret of the “old one.” I, at any rate, am convinced that He does not throw dice.

so a (social) science of software could arise as an analogous form of experimental theology. I think the analogy could be drawn too far: the context is not similar enough to the ages of Islamic Science or of the Enlightenment to claim that similar shifts to those would occur. You already need a fairly stable base of rational science (and its application via engineering) to even have a computer at all upon which to run software, so there’s a larger base of scientific practice and philosophy to draw on.

It’s useful, though, when talking to a programmer who presents themselves as hyper-rational, to remember to dig in and to see just where the emotions, fallacious arguments and dogmatic reasoning are presenting themselves, and to wonder about what would have to change to turn any such discussion into a verifiable prediction. And, of course, to think about whether that would necessarily be a beneficial change. Just as we can critique scientific culture, so should we critique software culture.

The First Flaw

As she left her desk at the grandiosely-named United States Robotics, Susan reflected on her relationship with the engineering team she was about to meet. Many of its members were juvenile and frivolous in her opinion, and she refused to play along with any of their jokes.

Even the title they gave her was mocking. They called her “the robopsychologist,” a term with no real meaning as USR had yet to make a single product. They had not even sold any customers on the promise of a robot. All they had to their name was a rented office, some venture capital and their founder’s secret recipe that was supposed to produce an intelligent sponge from a mixture of platinum and iridium.

While Susan might not be the robopsychologist, she certainly was a psychologist, of sorts. She seemed to spend most of her time working out what was wrong with the people making the robots, and how to get them to quit goofing off and start making this company some much needed profit. Steeling herself for whatever chaotic episode this dysfunctional group was going through, she opened the door to the meeting room. She was waved to a seat by director of product development Roger Meadows.

“Thanks for coming, Doctor Ca-“

Susan cut him off. “You’ve called me Susan before, Roger, you can do it again now. What’s up?”

“It’s Pal. We’ve lost another programmer.”

Susan refused to call their unborn (and soon stillborn, if the engineers didn’t buck up soon) product by its nickname, short for Proprietary Artificial Lifeform. She had at least headed off a scheme to call it “Robot 2 Delegate 2”, which would have cost their entire budget before they even started.

“Well, look. I know it’s hard, but those kids work too much and burn themselves out. Of course they’re going to quit if-“

“No, I don’t mean that. Like I said, it’s Pal. He killed Tanya.”

“Killed?” Susan suddenly realised how pale Roger looked, and that she had probably just gone a similar hue. “But how, no, wait. You said another programmer?”

“Er, yes. I mean, first Pal got Steve, but we thought, you know, that we could, uh, keep that quiet until the next funding round, so-“

The blood suddenly came back to Susan’s face. “Are you telling me,” she snapped, “that two people had to die before you thought to ask for any help? Did you come to me now because you’re concerned, or because you’ve run out of programmers?”

“Well, you know, I’d love to recruit more, but as they say, adding people to a late project…”

Yes, thought Susan, I do know what they say. You can boil the whole programming field down to damned aphorisms like that one. Probably they just give you a little phrasebook in CS101 and test you on it after three years, see if you have them all down pat.

“But what about the ethics code? Isn’t there some module in that positronic brain you’ve built to stop that sort of thing happening?”

“Of course, the One Law of Robotics. The robot may not harm a human being. That was the first story we built. We haven’t added the inaction thing the VCs wanted, but that can’t be it. That mess in the lab was hardly the result of inaction.”

“Right, the lab. I suppose I’d better go down and see for myself.”


She quite quickly wished she hadn’t. Despite having a strong constitution, Susan’s stomach turned at the sight of barely-recognisable pieces of former colleague. Or possibly colleagues, she wasn’t convinced Roger would have let a cleaner in between accidents.

The robot had evidently launched itself directly at and subsequently through Tanya, stopping only when the umbilical connecting it to the workstation had become disconnected, removing its power source. Outwardly and, Susan knew, internally, it lay dormant in its new macabre gloss coat.

“I take it you did think to do a failure analysis? Do you know what happened here?”

“If I knew that, Susan, I wouldn’t have gotten you involved.” She believed it, knowing her reputation at USR. “We’ve checked the failure tree and no component could cause the defect mode seen here.”

“Defect mode! Someone’s dead, Roger! Two people! People you’re responsible for! Look, it went wrong somehow, and you’re saying it can’t. Well it can. How did the software check out?”

“I don’t know, the software isn’t in scope for the safety analysis.”

Susan realised she was slowly counting to ten. “Well I’m making it in scope now. I took a couple CS classes at school, and I know they’re using the same language I learnt. I’ll probably not find anything, but I can at least take a look before we involve anyone else.”


Hours later, and Susan’s head hurt. She wasn’t sure whether it was the hack-and-hope code she was reading or the vat of coffee she had drunk while she was doing it, but it hurt. So far she had found that the robot’s one arm was apparently thought of as a specific type of limb, itself a particular appendage, in its turn a type of protuberance. She wasn’t sure what other protuberances the programmers had in mind, but she did know the arm software looked OK.

So did the movement software. It had clearly been built quickly, in a slapdash way, and she’d noted down all sorts of problems as she read. But nothing major. Nothing that said “kill the person in front of you,” rather than “switch on the wheel motors”.

She didn’t really expect to see that, anyway. The robot’s ethics module, the One Law that Roger had quoted at her, was meant to override all the robot’s other functions. Where was that code, anyway? She hadn’t seen it in her study, and now couldn’t find a file called ethics, laws or anything similar. Were the programmers over-abstracting again? she thought. A law is a rule, but there was no rules file either.

Susan finally cursed the programmer mind as she found a source file called jude. Of course. But it definitely was what she was looking for: here was the moral code built into their first and, assuming USR wasn’t shut down, all subsequent robots. Opening it, she saw a comment on the first two lines.

// "The robot may not harm a human being."
// Of course, we know that words like MUST, SHOULD and MAY are to be interpreted in accordance with RFC2119 ;-)

The bloody idiots, she thought. Typical programmers, deliberately misinterpreting a clear statement because they think it’s funny. Poor Pal had not been taught good from bad. Susan realised that she had used his name for the first time. She was beginning to empathise more with the robot than she did with the people who built him. Without making any changes, she closed her editor and phoned Roger.

“Meadows? Oh, Susan, did you find out what’s up?”

“Yes, I looked into the software. You can send all the programmers you want in there with Pal now.”

The Ignoble Programmer

Two programmers are taking a break from their work, relaxing on a bench in the park across from their office. As they discuss their weekend plans, a group of people jog past, each carrying their laptop in a yoke around their neck and furiously typing as they go.

“Oh, there goes the Smalltalk team,” says the senior of the two programmers on the bench. “They have to do everything at run-time.”

I love jokes. And not just because they’re sometimes funny, though that helps: I certainly find I enjoy a conversation and can relax more when at least two of the people involved are having fun. When only one person is joking, it gets awkward (particularly if everyone else is from HR). But a little levity can go a long way toward disarming an unpleasant truth so that it can be discussed openly. Political leaders through the ages have taken advantage of this by appointing jesters and fools to keep them aware of intrigues in the courts: even the authors of the American bill of rights remembered the satirist before the shooter.

I also like jokes because of the thought that goes into constructing a good (or deliberately bad) one. There’s a certain kind of creativity that goes into identifying an apparently absurd connection, exactly because of the absurdity. Being able to construct a joke, and being practised at constructing jokes, means being able to see new contexts and applications for existing ideas. Welcome to the birthplace of reuse and exploring the bounds of a construct’s application: welcome to the real home of software architecture.

But there’s a problem, or at least an opportunity (or maybe just a few thousand consulting dollars to be made and a book to be written). That problem is this: everyone else puts way more effort into their jokes than programmers do. Take this one, from the scientists:

Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic Salmon

They didn’t just joke about doing a brain scan of a dead fish, they did a brain scan of a dead fish. And published the (serendipitous and unexpected) results. But they didn’t just angle for a laugh, they had a real point. The subtitle of their paper:

An Argument For Proper Multiple Comparisons Correction

And isn’t it fun that some microbiologists demonstrated that beards are significant vectors for microbial infections?

Both of these examples were lifted from the Annals of Improbable Research’s Ig Nobel Prizes, awarded for “achievements that first make people laugh, and then make them think”. The Ig Nobels have been awarded every year since 1991, and in that time only one computer science award has been granted. That award was given to the developer behind PawSense, a utility that detects and blocks typing caused by your cat walking across your keyboard.

Jokes that first make you laugh, and then make you think, are absolutely the best jokes you can make about my work. If I conclude “you’re right, that is absurd, but what if…” then you’ve done it right. Jokes that are thought-terminating statements can make us laugh, and maybe make us feel good about what we’re doing, but cannot make us any better at it because they don’t give us the impetus to reflect on our craft. Rather, they make us feel smug about knowing better than the poor sap who’s the butt of the joke. They confirm that we’ve nothing to learn, which is never the correct outlook.

We need more Ig Nobel-quality achievements in computing. Disarming the absurd and the downright broken in programming and presenting them as jokes can first make us laugh, and then make us think.

N.B. My complete connection to the Annals of Improbable Research is that I helped out on the AV desk at a couple of their talks. At their talk in Oxford in 2006 I was inducted into the Luxuriant and Flowing Hair Club for Scientists.

How to answer questions the smart way

You may have read how to ask questions the smart way by Eric S. Raymond. You may have even quoted it when faced with a question you thought was badly-formed. I want you to take a look at a section near the end of the article.

How to answer questions in a helpful way is the part I’m talking about. It’s a useful section. It reminds us that questions are part of a dialogue, which is a two-way process. Sometimes questions seem bad, but then giving bad answers is certainly no way to make up for that. What else should we know about answering questions?

The person who asked the question has had different experiences than you. The fact that you do not understand why the question should be asked does not mean that the question should not be asked. “Why would you even want to do that?” is not an answer.

Answer at a level appropriate to the question. If the question shows a familiarity with the basics, there’s no need to mansplain trivial details in the answer. On the other hand, if the question shows little familiarity with the basics, then an answer that relies on advanced knowledge is just pointless willy-waving.

The shared values that pervade your culture are learned, not innate. Not everyone has learned them yet, and they are not necessarily even good, valuable or correct. This is a point that Raymond misses with quotes like this:

You shouldn’t be offended by this; by hacker standards, your respondent is showing you a rough kind of respect simply by not ignoring you. You should instead be thankful for this grandmotherly kindness.

What this says is: this is how we’ve always treated outsiders, so this is how you should expect to be treated. Fuck that. You’re better than that. Give a respectful, courteous answer, or don’t answer. It’s really that simple. We can make a culture of respect and courtesy normative, by being respectful and courteous. We can make a culture of inclusion by not being exclusive.

I’m not saying that I’m any form of authority on answering questions. I’m far from perfect, and by exploring the flaws I perceive in myself and making them explicit, I make them conscious, with the aim of detecting and correcting them in the future.

Story points: because I don’t know what I’m doing

The scenario

[Int. developer’s office. Developer sits at a desk that faces the wall. Two of the monitors on Developer’s desk are on stands; if you look closely you see that the third is balanced on the box set of The Art of Computer Programming, which is still in its shrink-wrap. Developer notices you and identifies an opportunity to opine about why the world is wrong, as ever.]

Every so often, people who deal with the real world instead of the computer world ask us developers annoying questions about how our work interacts with so-called reality. You’re probably thinking the same thing I do: who cares, right? I’m right in the middle of a totally cool abstraction layer on top of the operating system’s abstraction layer that abstracts their abstraction so I can interface it to my abstraction and abstract all the abstracts, what’s that got to do with reality and customers and my employer and stuff?

Ugh, damn, turning up my headphones and staring pointedly at the screen hasn’t helped, they’re still asking this question. OK, what is it?

Apparently they want to know when some feature will be done. Look, I’m a programmer, I’m absolutely the worst person to ask about time. OK, I believe that you might want to know whether this development effort is going to deliver value to the customers any time soon, and whether we’re still going to be ahead financially when we’re done, or whether it’d be better to take on some other work. And really I’d love to answer this question, except for one thing:

I have absolutely no idea what I’m doing. Seriously, don’t you remember all the other times that I gave you estimates and they were way off? The problem isn’t some systematic error in the way I think about how long it’ll take me to do stuff, it’s that while I can build abstractions on top of other abstractions I’m not so great at going the other way. Give me a short description of a task, I’ll try and work out what’s involved but I’m likely to miss something that will become important when I go to do it. It’s these missed details that add time, and I don’t know how many of those there will be until I get started.

The proposed solution

[Developer appears to have a brainwave]

Wait, remember how my superpower is adding layers of abstraction? Well your problem of estimation looks quite a lot like a nail to me, so I’ll apply my hammer! Let’s add a layer of abstraction on top of time!

Now you wanted to know how long it’ll take to finish some feature. Well I’ll tell you, but I won’t tell you in units of hours or days, I’ll use BTUs (Bullshit Time Units) instead. So this thing I’m working on will be about five BTUs. What do you mean, that doesn’t tell you when I’ll be done? It’s simple, duh! Just wait a couple of months, and measure how many BTUs we actually managed to complete. Now you know how many BTUs per day we can do, and you know how long everything takes!

[Developer puts their headphones back in, and turns to face the monitor. The curtain closes on the scene, and the Humble(-ish) Narrator takes the stage.]

The observed problem

Did you notice that the BTU doesn’t actually solve the stated problem? If it’s possible to track BTU completion over time until we know how many BTUs get completed in an iteration, then we are making the assumption that there is a linear relationship between BTUs and units of time. Just as there are 40 (or 90, if you picked the wrong recruiter) hours to the work week, so there are N BTUs to the work week. A BTU is worth x hours, and we just need to measure for a bit until we find the value of x.
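
If you want to see that assumption written down, here’s the arithmetic that any such measure-then-forecast scheme amounts to (a toy sketch; the numbers and the variable names are invented):

    #include <stdio.h>

    /* The unstated assumption: BTUs convert into time at a fixed rate.
     * Measure the exchange rate for a while, then forecast with it. */
    int main(void)
    {
        double completed_btus = 30.0;  /* what we finished...          */
        double elapsed_days   = 60.0;  /* ...over a couple of months   */
        double remaining_btus = 5.0;   /* the feature you asked about  */

        double btus_per_day  = completed_btus / elapsed_days;  /* the x */
        double forecast_days = remaining_btus / btus_per_day;

        printf("forecast: %.1f days\n", forecast_days);
        return 0;
    }

The forecast is only as good as the assumption that the next five BTUs turn into days the same way the last thirty did, which is exactly the part Developer has already admitted to not knowing.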

But Developer’s problem was not a failure to understand how many hours there are in an hour. Developer’s problem was a failure to know what work is outstanding. An inability to foresee what work needs to be done cannot be corrected by any change to the way in which work to be done is mapped onto time. It is, to wear out even further an already tired saw, an unknown unknown.

What to do about it

We’re kind of stuck, really. We can’t tell how long something will take until we do it, not because we’re bad at estimating how long it’ll take to do something but because we’re bad at knowing what it is we need to do.

The little bit there about “until we do it” is, I think, what we need to focus on. I can’t tell you how long something I haven’t done will take, but I can probably tell you what problems are outstanding on the thing I’m doing now. I can tell you whether it’s ready now, or whether I think it’ll be ready “soon” or “not soon”.

So here’s the opportunity: we’ll keep whatever we’ve already got ready for immediate release. We’ll share information about which of the acceptance tests are passing, and if we were to release right now you’d know what customers will get from that. Whatever the thing we’re working on now is, we’ll be in a position to decide whether to switch away if we can do some more valuable work instead.

The whole ‘rockstar developer’ thing is backwards

Another day, another clearout of junk from people who want ‘rockstar iPhone developers’ for their Shoreditch startups. I could just say “no”, or I could launch into a detailed discussion of the problems in this picture.

Rockstars are stagnant

No-one, and I mean no-one, wants to listen to your latest album. They want you to play Free Bird, or Jessica, or Smoke on the Water. OK, so they’ll pay more for their tickets than people listening to novel indie acts will, and you’ll make more money from them (after your promoter has taken their 30%). But you had better use exactly the right amount of sustain in that long note in Parisienne Walkways, just like you did back in ’79, or there’ll be trouble. Your audience doesn’t care whether you’ve incorporated new styles or interesting techniques from other players, or bought new equipment; you’re playing Apache on that pink Stratocaster the way you always have.

That’s exactly the opposite of a good model in software. Solving the same problem over and over, using the same tools and techniques, is ossification. It’s redundant. No-one needs it any more. Your audience are more like New York jazz fans than VH-1 viewers: they want tradition with a twist. Yes, it needs to be recognisable that you’re solving a problem they have – that you’re riffing on a standard. But if you’re not solving new problems, you’re no longer down with the cool cats. As the rock stars might say: who wants yesterday’s papers?

Home taping is killing music

That riff you like to throw out every night, that same problem that needs solving over and over again? Some student just solved the same thing, and they put it on GitHub. The change in code-sharing discourse of the late 1990s – from “Free Software” to “Open Source” – brought with it the ability for other people to take that solution and incorporate it into their own work with few obligations. So now everyone has a solution to that problem, and is allowed to sell it to everyone who has the problem. Tomorrow night, your stadium’s going to have plenty of empty seats.

Programming groupie culture

Programming has a very small number of big names: not many people would be as well-known in the industry as, say, Linus Torvalds, Richard Stallman, DHH. Some people might choose to call these people “polarising”. Others might choose “rude and arrogant”. Either way, they seem to bring their harems of groupies to the internet: cadres of similarly-“polarising” males who want to be seen to act in the same way as their heroes.

A primatologist might make the case that they are imitating the alpha male baboon in order to gain recognition as the highest-status beta.

Now the groupies have moved the goalposts for success from solving new problems to being rude about solutions that weren’t solved by the “in” group. What, you want to patch our software to fix a bug? You’re not from round these parts, are you?

Embrace the boffin

Somehow for the last few years I managed to hang on to the job title “Security Boffin”. Many people ask what a boffin is: the word was World War 2 slang among the British armed forces referring to the scientists working on the war effort. Like “nerd” or “geek”, it meant someone who was clever but perhaps a bit, well, different.

Boffins were also known at the time as “the back room boys”[*] for their tendency to stay out of the way and solve important – and expedient – technical problems. We need these messages decrypting, the boffins in the back room have done it but they keep talking about this “computer” thing they built. Those boffins have come up with a way to spot planes before we can even see them.

The rockstar revels in former glories while their fans insist that nothing made later even comes close to the classics. If you need a problem solving, look for boffins, not Bonos.

[*] Unfortunately in the military establishment of the 1940s it was assumed that the clever problem solvers were all boys. In fact histories of early computing in the States show that the majority-female teams who actually programmed and operated the wartime computers often knew more about the machines’ behaviours than did the back room boys, diagnosing and fixing problems without reporting them. A certain Grace Hopper, PhD, invented the compiler while the back room boys were sure computers couldn’t be used for that.