Fuck. This. Shit.

Enough with the subtle allusions of the previous posts. What’s going on here is not right.

It’s not right that I get to pass as a member of the group of people who can work in technology, while others have to justify their very presence in the field.

It’s not right that “looking like me” is a pass to being considered for the best-paid jobs, while “not looking like me” is not.

[That last one took me a long time to understand. To me, it seemed like I had worked hard to get where I am, but I needed to understand that I was given the opportunity to work at the things I worked at. All I needed to do was to work at it; I didn’t need to work at it and also convince everyone else that I was eligible to work at it.]

It’s not right that while I get a free pass into various fields of endeavour, others are told that they either slept their way into the field or are groupies or are unfuckable.

Previously, I avoided writing about this stuff because I thought I might get it wrong. Fuck. This. Shit. I’ve got social capital to burn; it’s much easier for me to get another job around this sector than it is for plenty of people who are as good as or better than me at doing the work. I might be worried about treading the line between being supportive and getting into trouble, but that’s nothing like as bad as the line that women, trans people, non-white people, non-straight people and disabled people have to tread between asking to be considered equally and keeping their jobs. I have one job: doing my job. I do not have two jobs: doing my job, and convincing people that someone like me should be allowed to do my job. If the cost of equality is giving up my free ride, then I give up my free ride.

The pipeline is not the problem, it leads to a vat of acid. No-one wants to lean in to a vat of acid. (Thanks to Cate Huston for that metaphor, it made a lot of sense to me.)

Our industry is exclusive, and needs to be inclusive. What should you do about this? I don’t know; I’m far from knowledgeable. If your position is “I agree with the straight white guy that the world is broken, I should ask the straight white guy how to fix it” then perhaps you are the problem, just as I have been and am the problem.

What should I do about this? First step for me is to listen. To not tell people who are describing their experiences what my experiences are. To avoid thinking about my reply to people, and to think about what they’ve said. To stop looking for holes in arguments and to listen for opportunities to grow. Not just to grow me, but to grow others.

What it takes to “win” a discussion

You may have been to some kind of debate club at school, or at least had a debate in a class. If so, the debate you had was probably a competitive debate, and went something along these lines (causality is not presented as its usual wibbly-wobbly self to keep the sentences short):

  • A motion is proposed.
  • Someone presents a statement in favour of the motion.
  • Someone else presents a statement against the motion.
  • A second statement in favour is made.
  • A second opposing statement is made.
  • Questions are asked and answered.
  • A favouring summary is made.
  • An opposing summary is made.
  • Somehow one “side” or the other “wins”, perhaps by a vote.

Or you may have been to court. If so, you probably saw something that went along these lines:

  • A charge is proposed.
  • Someone presents a case, including evidence, supporting the charge.
  • Someone else presents a case, including evidence, refuting the charge.
  • Questions are asked and answered.
  • A supporting summary is made.
  • A refuting summary is made.
  • Somehow one “side” or the other “wins”, perhaps by the agreement of a group of people.

Both forms of conversation are very formal and confrontational. They’re also pretty hard to get right without a lot of practice.

And here’s a secret that the Internet Illuminati apparently tries to keep shielded from many people: not every conversation needs to work like that.

Back in the time of the Wars of the Three Kingdoms, the modern party system of British politics didn’t exist in the same way it does now. Members of parliament would form associations based on agreement on the matter under discussion, but the goal of parliament was to reach consensus on that matter. “Winning” was achieved by coming to the best conclusion available.

And, it turns out, that’s a possible outcome for today’s discussions too. Let’s investigate what that might mean.

  • Someone wants to know what tool to use to achieve some goal. This conversation is “won” by exploring the possibilities and their pros and cons. Shouting across other people’s views until they give up doesn’t count as winning, because nothing is learned. That’s losing.
  • Two people have different experiences. Attempting to use clever rhetorical tricks to demonstrate that the other person’s views are invalid doesn’t count as winning, because nothing is learned. That’s losing.

Learning things, by the way, is pretty cool.

Intra-curricular activities

I’m apparently fascinated by the idea of defining curricula for learning programming. I’ve written about how we need to be careful what we try to pay forward from the way we learned in the past, and I’ve talked about how we do need to pay it forward so that the second hundred years see faster progress than the first hundred years.

I’m a fan (with reservations, as seen below) of the book series as a form of curriculum. Take something like Kent Beck’s signature series, which covers a decent subset of both technical and social approaches in software development in breadth and in depth. You could probably imagine developers who would benefit from reading some or all of the books in the series. In fact, you may be one.

Coping with people approaching the curriculum from different skill levels and areas of experience is hard. Not just for the book series, it’s hard in general. Universities take the simplifying approach of assuming that everybody wants to learn the same stuff, and teaching that stuff. And to some extent that’s easy for them, because the backgrounds of prospective students are relatively uniform. Even so, my university course organised incoming students into two groups: those who had studied complex numbers at A-level and those who had not. The only difference was that the group who had not were given a couple of lectures on complex numbers; from the fourth week onwards it was assumed that everybody knew the topic.

Now consider selling a programming book to the public. Part of the proposal process with all of the publishers I’ve worked with has been describing the target audience. Is this a book for people who have never programmed before? For people who have programmed a little, but never used this particular tool or technique? People who have programmed a lot but never used this tool? Is this thing similar to what they have used before, or very different? For people who are somewhat familiar with the tool? For experts (and how is that defined)? Is it for readers comfortable with maths? For readers with no maths background?

Every “no” in answer to one of those questions is an opportunity to improve the experience for a subset of the potential audience by tailoring it to that subset. It’s also an opportunity to exclude a subset of the audience by making the content less relevant to them.

[I’ll digress here to explain how I worked that out for my books: whether it’s selfishness or a failure of empathy, I wrote books that I wanted to read but that didn’t exist. Therefore the expected experience is something similar to mine, back when I filled in the proposal form.]

Clearly no single publication will cover the whole phase space of potential readers and be any good. The interesting question is how much it’s worth covering with multiple publications; whether the idea of series-as-curriculum pulls in the general direction as much as scope-limiting each book pulls in the specific. Should the curriculum take readers on a straight line from novice to master? Should it “fan in” from multiple introductions? Should it “fan out” in multiple directions of interest and enquiry? Would a non-linear curriculum be inclusive or offputtingly confusing? Should the questions really be answered by substituting the different question “how many people would buy that”?

Preparing for Computing’s Big One-Oh-Oh

However you slice the pie, we’re between two and three decades away from the centenary celebration for applied computing (which is of course significantly after theoretical or hypothetical advances made by the likes of Lovelace, Turing and others). You might count the anniversary of Colossus in 2043, the ENIAC in 2046, or maybe something earlier (and arguably not actually applied) like the Z3 or ABC (both 2041). Whichever one you pick, it’s not far off.

That means that the time to start organising the handover from the first century’s programmers to the second is now, or perhaps a little earlier. You can see the period from the 1940s to around 1980 as a time of discovery, when people invented new ways of building and applying computers because they could, and because there were no old ways yet. The next three and a half decades—a period longer than my life—have been a period of rediscovery, in which a small number of practices have become entrenched and people occasionally find existing, but forgotten, tools and techniques to add to their arsenal, and incrementally advance the entrenched ones.

My suggestion is that the next few decades be a period of uncovery, in which we purposefully seek out those things that have been tried, and tell the stories of how they are:

  • successful because they work;
  • successful because they are well-marketed;
  • successful because they were already deployed before the problems were understood;
  • abandoned because they don’t work;
  • abandoned because they are hard;
  • abandoned because they are misunderstood;
  • abandoned because something else failed while we were trying them.

I imagine a multi-volume book✽, one that is to the art of computer programming as The Art Of Computer Programming is to the mechanics of executing algorithms on a machine. Such a book✽ would be mostly a guide, partly a history, with some, all or more of the following properties:

  • not tied to any platform, technology or other fleeting artefact, though with examples where appropriate (perhaps in a platform invented for the purpose, as MIX, Smalltalk, BBC BASIC and Oberon all were)
  • informed both by academic inquiry and practical experience
  • more accessible than the Software Engineering Body of Knowledge
  • as accepting of multiple dissenting views as Ward’s Wiki
  • at least as honest about our failures as The Mythical Man-Month
  • at least as proud of our successes as The Clean Coder
  • more popular than The Celestial Homecare Omnibus

As TAOCP is a survey of algorithms, so this book✽ would be a survey of techniques, practices and modes of thought. As this century’s programmer can go to TAOCP to compare algorithms and data structures for solving small-scale problems then use selected algorithms and data structures in their own work, so next century’s applier of computing could go to this book✽ to compare techniques and ways of reasoning about problems in computing then use selected techniques and reasons in their own work. Few people would read such a thing from cover to cover. But many would have it to hand, and would be able to get on with the work of invention without having to rewrite all of Doug Engelbart’s work before they could get to the new stuff.

It’s dangerous to go alone! Take this.

✽: don’t get hung up on the idea that a book is a collection of quires of some pigmented flat organic matter bound into a codex, though.

Intuitive is the Enemy of Good

In the previous instalment, I discussed an interview in which Alan Kay maligned growth-restricted user interfaces. Here’s the quote again:

There is the desire of a consumer society to have no learning curves. This tends to result in very dumbed-down products that are easy to get started on, but are generally worthless and/or debilitating. We can contrast this with technologies that do have learning curves, but pay off well and allow users to become experts (for example, musical instruments, writing, bicycles, etc. and to a lesser extent automobiles).

This is nowhere more evident than in the world of the mobile app. Any one app comprises a very small number of very focussed, very easy to use features. This has a couple of different effects. One is that my phone as a whole is an incredibly broad, incredibly shallow experience. For example, one goal I want help with from technology is:

As an obese programmer, I want to understand how I can improve my lifestyle in order to live longer and be healthier.

Is there an app for that? No; I have six apps that kind of work together to provide an OK, but pretty disjointed, experience that gets me some dissatisfying way toward my goal. I can tell three of these apps how much I run, but I have to remember which subset can feed information to the others and which cannot. I can tell a couple of them how much I ate, but if I do it in one of them then another won’t count it correctly. Putting enough software to fulfil my goal into one app presumably breaks the cardinal rule of making every feature available within two gestures of the app’s launch screen. Therefore every feature is instead hidden behind the externalised myriad gestures required to navigate my home screens and their folders to get to the disparate subsets of utility.

The second observable effect is that there is a lot of wasted potential in both the device, and the person operating that device. You have never met an expert iPhone user, for the simple reason that someone who’s been using an iPhone for six years is no more capable than someone who has spent a week with their new device diligently investigating. There is no continued novelty, there are no undiscovered experiences. There is no expertise. Welcome to the land of the perpetual beginner.

Thankfully, marketing provided us with a thought-terminating cliché, to help us in our discomfort with this situation. They gave us the cars and trucks analogy. Don’t worry that you can’t do everything you’d expect with this device. You shouldn’t expect to do absolutely everything with this device. Notice the sleight of brain?

Let us pause for a paragraph to notice that even if making the most simple, dumbed-down (wait, sorry, intuitive) experience were our goal, we use techniques that keep it from our grasp. An A/B test will tell you whether this version is incrementally “better” than that version, but will not tell you whether the peak you are approaching is the tallest mountain in the range. Just as with evolution, valley crossing is hard without a monumental shake-up or an interminable period of neutral drift.
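
To make the hill-climbing point concrete, here’s a toy sketch in Python. Everything in it is invented for illustration (it is not a model of any real product or experiment): a procedure that only ever ships the winner of an incremental A/B test climbs to whichever peak is nearest, which need not be the tallest one in the range.

    # A toy illustration: greedy, incremental improvement finds the nearest
    # peak of a made-up "appeal" landscape, not the highest one.

    def appeal(x):
        # Invented landscape: a small peak near x=1 and a much taller one near x=6.
        return 2.0 * max(0.0, 1 - abs(x - 1)) + 5.0 * max(0.0, 1 - abs(x - 6) / 2)

    def ab_test_hill_climb(x, step=0.1, iterations=200):
        for _ in range(iterations):
            best = x
            for candidate in (x - step, x + step):    # the "B" variants
                if appeal(candidate) > appeal(best):  # ship whichever wins the test
                    best = candidate
            if best == x:
                break   # no incremental change wins any more: a local maximum
            x = best
        return x

    print(ab_test_hill_climb(0.5))   # settles near 1.0 (appeal 2), the smaller peak
    print(ab_test_hill_climb(4.5))   # settles near 6.0 (appeal 5), the taller peak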

Desktop environments didn’t usually do this any better. The learning path for most WIMP interfaces can be listed thus:

  1. cannot use mouse.
  2. can use mouse, cannot remember command locations.
  3. can remember command locations.
  4. can remember keyboard shortcuts.
  5. ???
  6. programming.

A near-perfect example of this would be emacs. You start off with a straightforward modeless editor window, but you don’t know how to save, quit, load a file, or anything. So you find yourself some cheat-sheet, and pretty soon you know those things, and start to find other things like swapping buffers, opening multiple windows, and navigating around a buffer. Then you want to compose a couple of commands, and suddenly you need to learn LISP. Many people will cap out at level 4, or even somewhere between 3 and 4 (which is where I am with most IDEs unless I use them day-in, day-out for months).

The lost magic is in level 5. Tools that do a good job of enabling improvement without requiring that you adopt a skill you don’t identify with (i.e. programming, learning the innards of a computer) invite greater investment over time, rewarding you with greater results. Photoshop gets this right. Automator gets it right. AppleScript gets it wrong; that’s just programming (in fact it’s all the hard bits from Smalltalk with none of the easy or welcoming bits). Yahoo! Pipes gets it right but markets it wrong. Quartz Composer nearly gets it right. Excel is, well, a bit of a boundary case.

The really sneaky bit is that level 5 is programming, just with none of the trappings associated with the legacy way that professionals program. No code (usually), no expressing your complex graphical problem as text, no expectation that you understand git, no philosophical wrangling over whether squares are rectangles or not. It’s programming, but with a closer affinity with the problem domain than bashing out semicolons and braces. Level 5 is where we can enable people to get the most out of their computers, without making them think that they’re computering.

Software, Science?

Is there any science in software making? Does it make sense to think of software making as scientific? Would it help if we could?

Hold on, just what is science anyway?

Good question. The medieval French philosopher Buridan said that the source of all knowledge is experience, and Richard Feynman paraphrased this as “the test of all knowledge is experiment”.

If we accept that science involves some experimental test of knowledge, then some of our friends who think of themselves as scientists will find themselves excluded. Consider the astrophysicists and cosmologists. It’s all very well having a theory about how stars form, but if you’re not willing to swirl a big cloud of hydrogen, helium, lithium, carbon etc. about and watch what happens to it for a few billion years then you’re not doing the experiment. If you’re not doing the experiment then you’re not a scientist.

Our friends in what are called the biological and medical sciences also have some difficulty now. A lot of what they do is tested by experiment, but some of the experiments are not permitted on ethical grounds. If you’re not allowed to do the experiment, maybe you’re not a real scientist.

Another formulation (OK, I got this from the Wikipedia entry on science) sees science as a sort of systematic storytelling: the making of “testable explanations and predictions about the universe”.

Under this definition, there’s no problem with calling astronomy a science: you think this is how things work, then you sit, and watch, and see whether that happens.

Of course a lot of other work fits into the category now, too. There’s no problem with seeing the “social sciences” as branches of science: if you can explain how people work, and make predictions that can (in principle, even if not in practice) be tested, then you’re doing science. Psychology, sociology, economics: these are all sciences now.

Speaking of the social sciences, we must remember that science itself is a social activity, and that the way it’s performed is really defined as the explicit and implicit rules and boundaries created by all the people who are doing it. As an example, consider falsificationism. This is the idea that a good scientific hypothesis is one that can be rejected, rather than confirmed, by an appropriately-designed experiment.

Sounds pretty good, right? Here’s the interesting thing: it’s also pretty new. It was mostly popularised by Karl Popper in the 20th Century. So if falsificationism is the hallmark of good science, then Einstein didn’t do good science, nor did Marie Curie, nor Galileo, or a whole load of other people who didn’t share the philosophy. Just like Dante’s Virgil was not permitted into heaven because he’d been born before Christ and therefore could not be a Christian, so all of the good souls of classical science are not permitted to be scientists because they did not benefit from Popper’s good message.

So what is science today is not necessarily science tomorrow, and there’s a sort of self-perpetuation of a culture of science that defines what it is. And of course that culture can be critiqued. Why is peer review appropriate? Why do the benefits outweigh the politics, the gazumping, the gender bias? Why should it be that if falsification is important, negative results are less likely to be published?

Let’s talk about Physics

Around a decade ago I was studying particle physics pretty hard. Now, there are plenty of interesting aspects to particle physics. Firstly, it’s a statistics-heavy discipline, in which results are defined by how happy you are with them, not by some binary right/wrong criterion.

It turns out that particle physicists are a pretty conservative bunch. They’ll only accept a particle as “discovered” if the signal indicating its existence is measured at a five-sigma confidence: i.e. if there’s under a one-in-a-million chance that the signal arose randomly in the absence of the particle. Why five sigma? Why not three (a 99.7% confidence) or six (to keep Motorola happy)? Why not repeat it three times and call it good, like we did in middle school science classes?
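
For the curious, the arithmetic behind those thresholds is easy to check. Here’s a minimal sketch in Python (standard library only) computing the one-sided tail probability of a normal distribution, which is the convention usually quoted for five sigma; whether you quote one-sided or two-sided numbers is, like the choice of five rather than three or six, a matter of convention, which is rather the point.

    # Tail probability of a standard normal distribution beyond n sigma.
    from math import erfc, sqrt

    def one_sided_tail(n_sigma):
        # P(Z > n_sigma) for a standard normal variable Z
        return erfc(n_sigma / sqrt(2)) / 2

    for n in (3, 5, 6):
        p = one_sided_tail(n)
        print(f"{n} sigma: p = {p:.2e} (about 1 in {round(1 / p):,})")

    # Roughly: 3 sigma is about 1 in 740, 5 sigma about 1 in 3.5 million,
    # and 6 sigma about 1 in a billion (one-sided).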

Also, it’s quite a specialised discipline, with a clear split between theory and practice and other divisions inside those fields. It’s been a long time since you could be a general particle physicist, and even longer since you could be simply a “physicist”. The split leads to some interesting questions about the definition of science again: if you make a prediction which you know can’t be verified during your lifetime due to the lag between theory and experimental capability, are you still doing science? Does it matter whether you are or not? Is the science in the theory (the verifiable, or falsifiable, prediction) or in the experiment? Or in both?

And how about Psychology, too

Physicists are among the most rational people I’ve worked with, but psychologists up the game by bringing their own feature to the mix: hypercriticality. And I mean that in the technical sense of criticism, not in the programmer “you’re grammar sucks” sense.

You see, psychology is hard, because people are messy. Physics is easy: the apple either fell to earth or it didn’t. Granted, quantum gets a bit weird, but it generally (probably) does its repeatable thing. We saw above that particle physics is based on statistics (as is semiconductor physics, as it happens); but you can still say that you expect some particular outcome or distribution of outcomes with some level of confidence. People aren’t so friendly. I mean, they’re friendly, but not in a scientific sense. You can do a nice repeatable psychology experiment in the lab, but only by removing so many confounding variables that it’s doubtful the results would carry over into the real world. And the experiment only really told you anything about local first year psychology undergraduates, because they’re the only people who:

  1. walked past the sign in the psychology department advertising the need for participants;
  2. need the ten dollars on offer for participation desperately enough to turn up.

In fact, you only really know about how local first year psychology undergraduates who know they’re participating in a psychology experiment behave. The ethics rules require informed consent which is a good thing because otherwise it’s hard to tell the difference between a psychology lab and a Channel 4 game show. But it means you have to say things like “hey this is totally an experiment and there’ll be counselling afterward if you’re disturbed by not really electrocuting the fake person behind the wall” which might affect how people react, except we don’t really know because we’re not allowed to do that experiment.

On the other hand, you can make observations about the real world, and draw conclusions from them, but it’s hard to know what caused what you saw because there are so many things going on. It’s pretty hard to re-run an entire society with just one thing changed, like “maybe if we just made Hitler an inch taller then Americans would like him, or perhaps try the exact same thing as prohibition again but in Indonesia” and other ideas that belong in Philip K. Dick novels.

So there’s this tension: repeatable results that might not apply to the real world (a lack of “ecological validity”), and real-world phenomena that might not be possible to explain (a lack of “internal validity”). And then there are all sorts of other problems too, so that psychologists know that for a study to hold water they need to surround what they say with caveats and limitations. Thus is born the “threats to validity” section on any paper, where the authors themselves describe the generality (or otherwise) of their results, knowing that such threats will be a hot topic of discussion.

But all of this—the physics, the psychology, and the other sciences—is basically a systematised story-telling exercise, in which the story is “this is why the universe is as it is” and the system is the collection of (time-and-space-dependent) rules that govern what stories may be told. It’s like religion, but with more maths (unless your religion is one of those ones that assigns numbers to each letter in a really long book then notices that your phone number appears about as many times as a Poisson distribution would suggest).

Wait, I think you were talking about software

Oh yeah, thanks. So, what science, if any, is there in making software? Does there exist a systematic approach to storytelling? First, let’s look at the kinds of stories we need to tell.

The first are the stories about the social system in which the software finds itself: the story of the users, their need (or otherwise) for a software system, their reactions (or otherwise) to the system introduced, how their interactions with each other change as a result of introducing the system, and so on. Call this requirements engineering, or human-computer interaction, or user experience; it’s one collection of stories.

You can see these kinds of stories emerging from the work of Manny Lehman. He identifies three types of software:

  • an S-system is exactly specified.
  • a P-system executes some known procedure.
  • an E-system must evolve to meet the needs of its environment.

It may seem that E-type software is the type in which our stories about society are relevant, but think again: why do we need software to match a specification, or to follow a procedure? Because automating that specification or procedure is of value to someone. Why, or to what end? Why that procedure? What is the impact of automating it? We’re back to telling stories about society. All of these software systems, according to Lehman, arise from discovery of a problem in the universe of discourse, and provide a solution that is of possible interest in the universe of discourse.

The second kind are the stories about how we worked together to build the software we thought was needed in the first stories. The practices we use to design, build and test our software are all tools for facilitating the interaction between the people who work together to make the things that come out. The things we learned about our own society, and that we hope we can repeat (or avoid) in the future, become our design, architecture, development, testing, deployment, maintenance and support practices. We even create our own tools—software for software’s sake—to automate, ease or disrupt our own interactions with each other.

You’ll mostly hear the second kind of story at most developer conferences. I believe that’s because the people who have most time and inclination to speak at most developer conferences are consultants, and they understand the second stories to a greater extent than the first because they don’t spend too long solving any particular problem. It’s also because most developer conferences are generally about making software, not about whatever problem it is that each of the attendees is trying to solve out in the world.

I’m going to borrow a convention that Rob Rix told me in an email, of labelling the first type of story as being about “external quality” and the second type about “internal quality”. I went through a few stages of acceptance of this taxonomy:

  1. Sounds like a great idea! There really are two different things we have to worry about.
  2. Hold on, this doesn’t sound like such a good thing. Are we really dividing our work into things we do for “us” and things we do for “them”? Labelling the non-technical people as the out-group? That sounds like a recipe for the outgroup homogeneity effect.
  3. No, wait, I’m thinking about it wrong. The people who make software are not the in-group. They are the mediators: it’s just the computers and the tools on one side of the boundary, and all of society on the other side. We have, in effect, the Janus Thinker: looking on the one hand toward the external stories, on the other toward the internal stories, and providing a portal allowing flow between the two.

JANUS (from Vatican collection) by Flickr user jinnrouge

So, um, science?

What we’re actually looking at is a potential social science: there are internal stories about our interactions with each other and external stories about our interactions with society and of society’s interactions with the things we create, and those stories could potentially be systematised and we’d have a social science of sorts.

Particularly, I want to make the point that we don’t have a clinical science, an analogy drawn by those who investigate evidence-based software engineering (which has included me, in my armchair way, in the past). You can usefully give half of your patients a medicine and half a placebo, then measure survival or recovery rates after that intervention. You cannot meaningfully treat a software practice, such as TDD, as a clinical intervention. How do you give half of your participants a placebo TDD? How much training will you give your ‘treatment’ group, and how will you organise placebo training for the ‘control’ group? [Actually I think I’ve been on some placebo training courses.]

In constructing our own scientific stories about the world of making software, we would run into the same problems that social scientists do in finding useful compromises between internal and ecological validity. For example, the oft-cited “Exploratory experimental studies comparing online and offline programming performance” (Sackman et al., 1968) is frequently used to support the notion that there are “10x programmers”: that some people who write software just do it ten times faster than others.

However, this study does not have much ecological validity. It measures debugging performance, using either an offline process (submitting jobs to a batch system) or an online debugger called TSS, which probably isn’t a lot like the tools used in debugging today. The problems were well-specified, thus removing many of the real problems programmers face in designing software. Participants were expected to code a complete solution with no compiler errors, then debug it: not all programmers work like that. And where did they get their participants from? Did they have a diverse range of backgrounds, cultures, education, experience? It does not seem that any results from that study could necessarily apply to modern software development situated in a modern environment, nor could the claim of “10x programmers” necessarily generalise as we don’t know who is 10x better than whom, even at this one restricted task.

In fact, I’m also not convinced of its internal validity. There were four conditions (two programming problems and two debugging setups), each of which was assigned to six participants. The variance is so large that the dependent variables (the amount of person-time and CPU-time) look almost unrelated to the independent variables (the programming problem and the debugging mode), unless the authors correlate them with “programming skill”. How is this skill defined? How is it measured? Why, when the individual scores are compared, is “programming skill” not again taken into consideration? What confounding variables might also affect the wide variation in the scores reported? Is it possible that the fastest programmers had simply seen the problem and solved it before? We don’t know. What we do know is that the reported 28:1 ratio between best and worst performers is taken across both the online and offline conditions (as pointed out in, e.g., The Leprechauns of Software Engineering), so that’s definitely a confounding factor. If we just looked at two programmers using the same environment, what difference would be found?
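
As a toy illustration of that last point (the numbers below are invented, not taken from the study): if debugging in one condition is simply slower on average than in the other, then pooling both conditions inflates the best-to-worst ratio even when the programmers within each condition are quite similar.

    # Invented numbers showing how pooling two conditions inflates a
    # best-to-worst ratio; this is not Sackman et al.'s data.
    import random

    random.seed(1)

    def debugging_hours(mean):
        # an individual's time: the condition's mean, plus or minus up to 30%
        return mean * random.uniform(0.7, 1.3)

    online  = [debugging_hours(4.0)  for _ in range(6)]   # six participants each
    offline = [debugging_hours(12.0) for _ in range(6)]   # offline is just slower

    def best_to_worst(times):
        return max(times) / min(times)

    print(f"within online:  {best_to_worst(online):.1f}:1")
    print(f"within offline: {best_to_worst(offline):.1f}:1")
    print(f"pooled:         {best_to_worst(online + offline):.1f}:1")
    # The pooled ratio mostly measures the gap between the conditions,
    # not the gap between individual programmers.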

We had the problem that “programming skill” is not well-defined when examining the Sackman et al. study, and we’ll find that this problem is one we need to overcome more generally before we can make the “testable explanations and predictions” that we seek. Let’s revisit the TDD example from earlier: my hypothesis is that a team that adopts the test-driven development practice will be more productive some time later (we’ll defer a discussion of how long) than a comparable team that does not (the null condition).

OK, so what do we mean by “productive”? Lines of code delivered? Probably not, their amount varies with representation. OK, number of machine instructions delivered? Not all of those would be considered useful. Amount of ‘customer value’? What does the customer consider valuable, and how do we ensure a fair measurement of that across the two conditions? Do bugs count as a reduction in value, or a need to do additional work? Or both? How long do we wait for a bug to not be found before we declare that it doesn’t exist? How is that discovery done? Does the expense related to finding bugs stay the same in both cases, or is that a confounding variable? Is the cost associated with finding bugs counted against the value delivered? And so on.

Software dogma

Because our stories are not currently very testable, many of them arise from a dogmatic belief that some tool, or process, or mode of thought, is superior to the alternatives, and that there can be no critical debate. For example, from The Clean Coder:

The bottom line is that TDD works, and everybody needs to get over it.

No room for alternatives or improvement, just get over it. If you’re having trouble defending it, apply a liberal sprinkle of argumentum ab auctoritate and tell everyone: Robert C. Martin says you should get over it!

You’ll also find any number of applications of the thought-terminating cliché, a rhetorical technique used to stop cognitive dissonance by allowing one side of the issue to go unchallenged. Some examples:

  • “I just use the right tool for the job”—OK, I’m done defending this tool in the face of evidence. It’s just clearly the correct one. You may go about your business. Move along.
  • “My approach is pragmatic”—It may look like I’m doing the opposite of what I told you earlier, but that’s because I always do the best thing to do, so I don’t need to explain the gap.
  • “I’m passionate about [X]”—yeah, I mean your argument might look valid, I suppose, if you’re the kind of person who doesn’t love this stuff as much as I do. Real craftsmen would get what I’m saying.
  • and more.

The good news is that out of such religious foundations spring the shoots of scientific thought, as people seek to find a solid justification for their dogma. So just as physics has strong spiritual connections, with Stephen Hawking concluding in A Brief History of Time:

However, if we discover a complete theory, it should in time be understandable by everyone, not just by a few scientists. Then we shall all, philosophers, scientists and just ordinary people, be able to take part in the discussion of the question of why it is that we and the universe exist. If we find the answer to that, it would be the ultimate triumph of human reason — for then we should know the mind of God.

and Einstein debating whether quantum physics represented a kind of deific Dungeons and Dragons:

[…] an inner voice tells me that it is not yet the real thing. The theory says a lot, but does not really bring us any closer to the secret of the “old one.” I, at any rate, am convinced that He does not throw dice.

so a (social) science of software could arise as an analogous form of experimental theology. I think the analogy could be drawn too far: the context is not similar enough to the ages of Islamic Science or of the Enlightenment to claim that similar shifts to those would occur. You already need a fairly stable base of rational science (and its application via engineering) to even have a computer at all upon which to run software, so there’s a larger base of scientific practice and philosophy to draw on.

It’s useful, though, when talking to a programmer who presents themselves as hyper-rational, to remember to dig in and to see just where the emotions, fallacious arguments and dogmatic reasoning are presenting themselves, and to wonder about what would have to change to turn any such discussion into a verifiable prediction. And, of course, to think about whether that would necessarily be a beneficial change. Just as we can critique scientific culture, so should we critique software culture.

It’s about solving problems

As ever, there’s a touchstone issue on the programmers’ corner of the intarwebs (the programmers’ corner is actually the same intarwebs everyone else is using, just we model it with geometry so it can have a corner). Here it is:

Alan Kelly predicted that by 2022, TDD would become a prerequisite for employment as a programmer. Let’s leave aside for a moment the issue discussed here and in APPropriate Behaviour, that there is no arbiter of programmer employability.

Responses did not take long in coming. Some TDDers cannot believe that TDD is not already required. Some non-TDDers are incensed that non-TDD is not considered valid. I particularly like this tweet about TDD’s place:

I like to explain that a lot of [TDD is] just a substitute for a decent type system ;).

You know what? That might be true, I’m not sure. Functional programming education has, so far, been to me like a pure map of me onto a slightly older version of me without the side effect of imparting knowledge of functional programming. Bear in mind that I’ve been paid to write LISP in the past, and I struggle with FP. I just can’t get from knowing what functions are to big-picture views of functional programming.

I wouldn’t know the mathematics of a type system if it came up and quacked at me. As far as I know, Hindley is the name of a serial killer and now I’m looking warily at Milner. But I’m fine with that, and I’m fine with you not being fine with that. It’s just that we think about things in different ways.

Other people think about things in other different ways too. And I’m fine with that, too. Let’s look some more at tests.

Many forms of automated test follow a common pattern: Arrange, Act, Assert or Given, When, Then. Let’s talk about Eiffel, a language invented by the only person I know whose opinions have a stronger identity than the person who holds them.

An Eiffel programmer would look at Given and see that this is describing a particular state of the preconditions of the system under test. Or, using theories, a range of preconditions. And they would look at Then, and observe that this is making statements about the postconditions of the system under test.
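
To make that mapping concrete, here’s a small hypothetical sketch in Python (the stack class and all of its names are invented for illustration): the same expectation expressed once as a Given/When/Then-style test, and once as precondition and postcondition checks that live with the code itself, in the spirit of Eiffel’s contracts.

    # A hypothetical example: the same expectation as a test and as a contract.

    class Stack:
        def __init__(self):
            self.items = []

        def push(self, item):
            # contract-style checks, in the spirit of Eiffel's require/ensure
            old_count = len(self.items)  # capture state for the postcondition
            self.items.append(item)
            assert len(self.items) == old_count + 1, "postcondition: count grew by one"
            assert self.items[-1] is item, "postcondition: the item is on top"

        def pop(self):
            assert self.items, "precondition: the stack must not be empty"
            return self.items.pop()

    def test_push_puts_item_on_top():
        stack = Stack()             # Given an empty stack
        stack.push("x")             # When an item is pushed
        assert stack.pop() == "x"   # Then it comes back off the top

    test_push_puts_item_on_top()
    print("test passed")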

This Eiffel programmer would probably then ask why you expect them to do all of this, when they already generalised this out in designing the system’s contract. Why write all these tests when the program itself can evaluate its own preconditions and postconditions as it’s chugging along, and you can design to those?

Just as this starts to sound reasonable, a Prolog programmer asks why you’d ever write the When bit at all. Surely you just tell the computer what you’ve got and what you want, and it works out how to get there? (I tried, and it said “NO.” Oh well.)

There are loads of different ways to write software. Many of these are neither better nor worse than others. In fact, to even have that discussion you’d need to be able to express and agree upon what “better” means, and I’m not sure we’re there. Some are prevalent because they let people reason about their problems in a way that feels comfortable, or efficient. Others are prevalent because a vendor threw lots of money into marketing. Some of them feel comfortable or efficient because we’ve become used to thinking in those ways, rather than because they were a natural fit onto our ab initio thoughts.

But remember, your goal is not to write software. Your goal is to solve problems, and often software is part of the solution. The tools, practices and principles we have now may be adequate, but they may not be best. They might suit some of us, and not others.

That’s fine. I know how I like to work, and I’m happy to help other people find out about it and try it out. But I’m also happy to find out about and try out other things. I’ll champion what I do, while learning about what others do. And all along I’ll champion the higher-level goal of solving problems, and the professional standard of being able to demonstrate confidence in our solutions.

Maybe I should go and read about type systems now.

A sneaky preview of ClassBrowser

Let me start with a few admissions. Firstly, I have been computering for a good long time now, and I still don’t really understand compilers. Secondly, work on my GNUstep Web side-project has tailed off for a while, because I decided I wanted to try something out to learn about the compiler before carrying on with that work. This post will mostly be about that something.

My final admission: if you saw my presentation on the ObjC runtime you will have seen an app called “ClassBrowser” where I showed that some of the Foundation classes are really in the CoreFoundation library. Well, there were two halves to the ClassBrowser window, and I only showed you the top half that looked like the Smalltalk class browser. I’m sorry.

So what’s the bottom half?

This is what the bottom half gives me. It lets me go from this:

ClassBrowser before

via this:

Who are you calling a doIt?

to this:

ClassBrowser after

What just happened?

You just saw some C source being compiled to LLVM bitcode, which is compiled just-in-time to native code and executed, all inside that browser app.

Why?

Well why not? Less facetiously:

  • I’m a fan of Smalltalk. I want to build a thing that’s sort of a Smalltalk, except that rather than being the Smalltalk language on the Objective-C runtime (like F-Script or objective-smalltalk), it’ll be the Objective-C(++) language on the Objective-C runtime. So really, a different way of writing Objective-C.
  • I want to know more about how clang and LLVM work, and this is as good a way as any.
  • I think, when it actually gets off the ground, this will be a faster way of writing test-first Objective-C than anything Xcode can do. I like Xcode 5, I think it’s the better Xcode 4 that I always wanted, but there are gains to be had by going in a completely different direction. I just think that whoever strikes out in such a direction should not release their research project as a new version of Xcode :-).

Where can I get me a ClassBrowser?

You can’t, yet. There’s some necessary housekeeping that needs to be done before a first release, replacing some “research” hacks with good old-fashioned tested code and ensuring GNUstep-GUI compatibility. Once that’s done, a rough-and-nearly-ready first open source release will ensue.

Then there’s more work to be done before it’s anything like useful. In particular, while it’s possible to use it to run Objective-C, it’s far from pleasant. I’ve had some great advice from LLVM IRC on how to address that and will try to get it to happen soon.

The Ignoble Programmer

Two programmers are taking a break from their work, relaxing on a bench in the park across from their office. As they discuss their weekend plans, a group of people jog past, each carrying their laptop in a yoke around their neck and furiously typing as they go.

“Oh, there goes the Smalltalk team,” says the senior of the two programmers on the bench. “They have to do everything at run-time.”

I love jokes. And not just because they’re sometimes funny, though that helps: I certainly find I enjoy a conversation and can relax more when at least two of the people involved are having fun. When only one person is joking, it gets awkward (particularly if everyone else is from HR). But a little levity can go a long way toward disarming an unpleasant truth so that it can be discussed openly. Political leaders through the ages have taken advantage of this by appointing jesters and fools to keep them aware of intrigues in the courts: even the authors of the American Bill of Rights remembered the satirist before the shooter.

I also like jokes because of the thought that goes into constructing a good (or deliberately bad) one. There’s a certain kind of creativity that goes into identifying an apparently absurd connection, exactly because of the absurdity. Being able to construct a joke, and being practised at constructing jokes, means being able to see new contexts and applications for existing ideas. Welcome to the birthplace of reuse and exploring the bounds of a construct’s application: welcome to the real home of software architecture.

But there’s a problem, or at least an opportunity (or maybe just a few thousand consulting dollars to be made and a book to be written). That problem is this: everyone else puts way more effort into their jokes than programmers do. Take this one, from the scientists:

Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic Salmon

They didn’t just joke about doing a brain scan of a dead fish, they did a brain scan of a dead fish. And published the (serendipitous and unexpected) results. But they didn’t just angle for a laugh, they had a real point. The subtitle of their paper:

An Argument For Proper Multiple Comparisons Correction

And isn’t it fun that some microbiologists demonstrated that beards are significant vectors for microbial infections?

Both of these examples were lifted from the Annals of Improbable Research’s Ig Nobel Prizes, awarded for “achievements that first make people laugh, and then make them think”. The Ig Nobels have been awarded every year since 1991, and in that time only one computer science award has been granted. That award was given to the developer behind PawSense, a utility that detects and blocks typing caused by your cat walking across your keyboard.

Jokes that first make you laugh, and then make you think, are absolutely the best jokes you can make about my work. If I conclude “you’re right, that is absurd, but what if…” then you’ve done it right. Jokes that are thought-terminating statements can make us laugh, and maybe make us feel good about what we’re doing, but cannot make us any better at it because they don’t give us the impetus to reflect on our craft. Rather, they make us feel smug about knowing better than the poor sap who’s the butt of the joke. They confirm that we’ve nothing to learn, which is never the correct outlook.

We need more Ig Nobel-quality achievements in computing. Disarming the absurd and the downright broken in programming and presenting them as jokes can first make us laugh, and then make us think.

N.B. My complete connection to the Annals of Improbable Research is that I helped out on the AV desk at a couple of their talks. At their talk in Oxford in 2006 I was inducted into the Luxuriant and Flowing Hair Club for Scientists.

How to answer questions the smart way

You may have read how to ask questions the smart way by Eric S. Raymond. You may have even quoted it when faced with a question you thought was badly-formed. I want you to take a look at a section near the end of the article.

How to answer questions in a helpful way is the part I’m talking about. It’s a useful section. It reminds us that questions are part of a dialogue, which is a two-way process. Sometimes questions seem bad, but then giving bad answers is certainly no way to make up for that. What else should we know about answering questions?

The person who asked the question has had different experiences than you. The fact that you do not understand why the question should be asked does not mean that the question should not be asked. “Why would you even want to do that?” is not an answer.

Answer at a level appropriate to the question. If the question shows a familiarity with the basics, there’s no need to mansplain trivial details in the answer. On the other hand, if the question shows little familiarity with the basics, then an answer that relies on advanced knowledge is just pointless willy-waving.

The shared values that pervade your culture are learned, not innate. Not everyone has learned them yet, and they are not necessarily even good, valuable or correct. This is a point that Raymond misses with quotes like this:

You shouldn’t be offended by this; by hacker standards, your respondent is showing you a rough kind of respect simply by not ignoring you. You should instead be thankful for this grandmotherly kindness.

What this says is: this is how we’ve always treated outsiders, so this is how you should expect to be treated. Fuck that. You’re better than that. Give a respectful, courteous answer, or don’t answer. It’s really that simple. We can make a culture of respect and courtesy normative, by being respectful and courteous. We can make a culture of inclusion by not being exclusive.

I’m not saying that I’m any form of authority on answering questions. I’m far from perfect, and by exploring the flaws I know I perceive in myself and making them explicit I make them conscious, with the aim of detecting and correcting them in the future.