Why is programming so hard?

I have been reflecting recently on what it was like to learn to program. The problem is, I don’t clearly remember: I do remember that there was a time when I was no good at it. When I could type a program in from INPUT or wherever, and if it ran correctly I was golden. If not, I was out of luck: I could proof-read the listing to make sure I had introduced no mistakes in creating my copy from the magazine, but if it was the source listing itself that contained the error, I wasn’t about to understand how to fix it.

The programs I could create at this stage were incredibly trivial, of the INPUT "WHAT IS YOUR NAME"; N$: IF N$="GRAHAM" THEN PRINT "HELLO, MY LORD" ELSE PRINT "GO AWAY ";N$ order of complexity. But that program contains most of what there is to computing: input, output, memory storage and branches. What made it hard? Later, I’ll investigate whether it was BASIC itself that made things difficult. Evidently I didn’t have a good grasp of what the computer was doing, anyway.

I then remember a time, a lot later, when I could build programs of reasonable complexity that used standard library features, in languages like C and Pascal. That means I could use arrays and record types, procedures, and library functions to read and write files. But how did I get there? How did I get from 10 PRINT "DIXONS IS CRAP" 20 GOTO 10 to building histograms of numeric data? That’s the bit I don’t remember: not that things were hard, but the specific steps or insights required to go from not being able to do a thing to finding it a natural part of the way I work.

I could repeat that story over and over, for different aspects of programming. My first GUI app, written in Delphi, was not much more than a “fill in the holes” exercise using its interface builder and code generator. I have an idea that my understanding of what I thought objects and classes were supposed to do was sparked by a particular training course I took in around 2008, but I still couldn’t point to what that course told me or what gaps in my knowledge it filled. Did it let me see the bigger picture around facts I already knew, did it correct a fallacious mental model, or did it give me new facts? How did it help? Indeed, is my memory even correct in pinpointing this course as the turning point? (The course, by the way, was Object-Oriented Analysis and Design Using UML.)

Maybe I should be writing down instances when I go from not understanding something to understanding it. That would work if such events can be identified: maybe I spend some time convincing myself that I do understand these things while I still don’t, or tell myself I don’t understand these things long after I do.

One place I can look for analogies to my learning experience is at teaching experience. A full litany of the problems I’ve seen in teaching programming to neophytes (as opposed to professional training, like teaching Objective-C programming to Rubyists, which is a very different thing) would be long and hard to recall. Tim Love has seen and recorded problems similar to those I’ve seen (as have colleagues I’ve talked to about teaching programming).

A particular issue from that list that I’ll dig into here is the conflation of assignment and equality. The equals sign (=) was created in the form of two parallel lines of identical length, as no two things could be more equal. But it turns out that when used in many programming languages, in fact two things related by = could be a lot more equal. Here’s a (fabricated, but plausible) student attempt to print a sine table in C (preprocessor nonsense elided).

int main() {
  double x,y;
  y = sin(x);
  for (x = 0; x <= 6.42; x = x + 0.1)
    printf("%lf %lf\n", x, y);
}

Looks legit, especially if you’ve done any maths (even to secondary school level). In algebra, it’s perfectly fine for y to be a dependent variable related to x via the equality expressed in that program, effectively introducing a function y(x) = sin(x). In fact that means the program above doesn’t look legit, as there are not many useful solutions to the simultaneous equations x = 0 and x = x + 0.1. Unfortunately programming languages take a Humpty-Dumpty approach and define common signs like = to mean what they take them to mean, not what everybody else conventionally accepts them to mean.

Maybe the languages themselves make learning this stuff harder, with their idiosyncrasies like redefining equality. This is where my musing on BASIC enters back into the picture: did I find programming hard because BASIC makes programming hard? It’s certainly easy to cast, pun intended, programming expertise as accepting the necessity to work around more roadblocks imposed by programming tools than inexpert programmers are capable of accepting. Anyone who has managed to retcon public static void main(String[] args) into a consistent vision of something that it’s reasonable to write every time you write a program (and to read every time you inspect a program, too) seems more likely to be subject to Stockholm Syndrome than to have a deep insight into how to start a computer program going.

We could imagine it being sensible to introduce neophytes to a programming environment that exposes the elements of programming with no extraneous trappings or aggressions. You might consider something like Self, which has the slot and message-sending syntax as its two features. Or LISP, which just has the list and list syntax. Or Scratch, which doesn’t even bother with having syntax. Among these friends, BASIC doesn’t look so bad: it gives you tools to access its model of computation (which is not so different from what the CPU is trying to do) and not much more, although after all this time I’m still not entirely convinced I understand how READ and DATA interact.

Now we hit a difficult question: if those environments would be best for beginners, why wouldn’t they be best for anyone else? If Scratch lets you compute without making mistakes associated with all the public static void nonsense, why not just carry on using Scratch? Are we mistaking expertise at the tools with expertise at the concepts, or what we currently do with what we should do, or complexity with sophistication? Or is there a fundamental reason why something like C++, though harder and more complex, is better for programmers with some experience than the environments in which they gained that experience?

If we’re using the wrong tools to introduce programming, then we’re unnecessarily making it hard for people to take their first step across the threshold, and should not be surprised when some of them turn away in disgust. If we’re using the wrong tools to continue programming, then we’re adding cognitive effort unnecessarily to a task which is supposed to be about automating thought. Making people think about not making people think. Masochistically imposing rules for ourselves to remember and follow, when we’re using a tool specifically designed for remembering and following rules.

Intra-curricular activities

I’m apparently fascinated by the idea of defining curricula for learning programming. I’ve written about how we need to be careful what we try to pay forward from the way we learned in the past, and I’ve talked about how we do need to pay it forward so that the second hundred years see faster progress than the first hundred years.

I’m a fan (with reservations, as seen below) of the book series as a form of curriculum. Take something like Kent Beck’s signature series, which covers a decent subset of both technical and social approaches in software development in breadth and in depth. You could probably imagine developers who would benefit from reading some or all of the books in the series. In fact, you may be one.

Coping with people approaching the curriculum from different skill levels and areas of experience is hard. Not just for the book series, it’s hard in general. Universities take the simplifying approach of assuming that everybody wants to learn the same stuff, and teaching that stuff. And to some extent that’s easy for them, because the backgrounds of prospective students are relatively uniform. Even so, my University course organised incoming students into two groups: those who had studied complex numbers at A-level and those who had not. The difference was simply that the group who had not were given a couple of lectures on complex numbers, then it was assumed that they also knew the topic from the fourth week.

Now consider selling a programming book to the public. Part of the proposal process with all of the publishers I’ve worked with has been describing the target audience. Is this a book for people who have never programmed before? For people who have programmed a little, but never used this particular tool or technique? People who have programmed a lot but never used this tool? Is this thing similar to what they have used before, or very different? For people who are somewhat familiar with the tool? For experts (and how is that defined)? Is it for readers comfortable with maths? For readers with no maths background?

Every “no” in answer to one of those questions is an opportunity to improve the experience for a subset of the potential audience by tailoring it to that subset. It’s also an opportunity to exclude a subset of the audience by making the content less relevant to them.

[I’ll digress here to explain how I worked that out for my books: whether it’s selfishness or a failure of empathy, I wrote books that I wanted to read but that didn’t exist. Therefore the expected experience is something similar to mine, back when I filled in the proposal form.]

Clearly no single publication will cover the whole phase space of potential readers and be any good. The interesting question is how much it’s worth covering with multiple publications; whether the idea of series-as-curriculum pulls in the general direction as much as scope-limiting each book pulls in the specific. Should the curriculum take readers on a straight line from novice to master? Should it “fan in” from multiple introductions? Should it “fan out” in multiple directions of interest and enquiry? Would a non-linear curriculum be inclusive or offputtingly confusing? Should the questions really be answered by substituting the different question “how many people would buy that”?

One meeellleeon

A teacher recently asked her computing class if there was any question they would like to ask me. One of the students came up with a question: how could they make a million pounds?

I think my answer would be one of these:

  1. Facebook has order of a billion users and is worth order of 100 billion pounds. Network value scales as the square of the number of users, so to merely make a million pounds you could build a network with just over three million users.

  2. A lowest-tier iOS app nets its developer roughly 40p per sale. To make a million pounds you need simply build a cheap app, then attract two and a half million sales.

If the app were a value-add for the network, you could easily make more than two million pounds.

[In fact I’m pretty sure I’ve already made a million pounds; it’s just that the costs worked out to about a million pounds.]