I was doing a literature search for a different subject (which will appear soon), and found a couple of articles related to teaching programming. I don’t know if you remember when you learnt programming, but you probably found it hard. I’ve had some experience of teaching programming: specifically, teaching C to undergraduates. Said undergraduates, as it happens, weren’t on a computing course (they studied Physics), and only turned up to the few classes they had a year because attendance was mandatory. The lectures, which weren’t compulsory, had fewer students showing up.
Teaching Python to Undergraduates
When I took the course, we were taught a Pascal variant on NeXTSTEP. I have some evidence that Algol had been the first programming language taught on the course; probably on an HLH Orion minicomputer. While Pascal was developed in part as a vehicle to teach structured programming concepts, the academics in the computing course at my department were already starting to see it as a toy language with no practical utility. Such justification was used to look for a different language to teach. As you can infer from the previous paragraph, we settled on C: but not before a test where interested students (myself included) who had, for the most part, already taken the Pascal course. The experiences with Python were written up in a Masters’ thesis by Michael Williams, the student who had converted the (Pascal-based, of course) teaching materials to Python.
Like Wirth, when Guido van Rossum designed Python he had teaching in mind; though knowing the criticisms of Pascal he also made it extensible so that it could be used as a “real” language. This extensibility was put to use in the Python experiment at Oxford, giving students the numpy module which they used mainly for its matrix datatype (an important facility in Physics).
What the report shows is that it’s possible to teach someone enough Python to get onto problems with numerical computation in a day; although clearly this is also true of Pascal and C. One interesting observation is the benefit of enforced layout (Python’s meaningful indentation) to both the students, who reported that they did not find it difficult to indent a program correctly; and to the teachers, who found that because students were coerced into laying out their code consistently, it was easier to read and understand the intention of code.
An interesting open question is whether that means enforced indentation leads to more efficient code reviews in general, not just in an expert/neophyte relationship. Many developers using languages that don’t enforce layout choose to add the enforcement themselves. Whether this is an issue at all when modern IDEs can lay out code automatically (assuming developers with enough experience of the IDE to use that feature) also needs answering.
The conclusion of this study was that Python is appropriate as a teaching language for Oxford’s Physics course, though clearly it was not adopted and C was favoured. Why was this? As this decision was made after the report was produced, it doesn’t say, and my own recollection is hazy. I recall the “not for real world use” lobby was involved, that it was also possible to teach C in the time involved, and that while many people wanted to teach Java this was overruled due to a desire to avoid OO. The spurned Java crowd preferred C for its Java-like syntax.
The next part of this story wasn’t published, but I’ll cover it anyway just for completeness. The year after this Python study, the teaching course did an A/B test where half of the first year course was taught Python, and half C. Whatever conclusions were drawn from this test, C won out so either there was no significant difference in that type of course or the “real worldness” of C was thought to be greater than that of Python. I remember both being given as justifications, but don’t know whether either or both were retrofitted.
Whatever the cause, teaching C was sufficiently not bad that the course is still based on the language.
Going back to that Java decision. How good is Java as a teaching language?
Analyses of Student Programming Errors In Java Programming Courses
I’m going to use the results of this paper to argue that Java is not good as a teaching language.
Programming errors can be categorized as syntax, semantic and logic. A syntax error is an error due to incorrect grammar. Syntax errors are often detected by a program called a compiler, if the language is a compiled language such as Java. A semantic error is an error due to misuse of a programming concept, despite correct syntactic structure. Semantic errors are caught when the program code is compiled. A logic error occurs when the program does not solve the problem that the programmer meant for it to solve.
[Notice that the study is only investigating errors: it’s not completely obvious but “bugs” aren’t included. The author’s only reporting on things that are either compiler or runtime errors in Java-land, like typos and indices out of bounds.]
Categorization of errors of the present study into syntax, semantic, runtime and logic revealed that syntax errors made up 94.1%, semantic errors 4.7% and logic
In the ideal world, a programming course teaches students the principles of programming and how to combine these to solve some computational problem. In learning these things, you expect people to make semantic and logic errors: they don’t yet know how these things work. Syntax errors, on the other hand, are the compiler’s way of saying “meh, you know what you meant but I couldn’t be bothered to work it out”, or “I require you to jump through some hoops and you didn’t”.
You don’t want syntax errors when you’re teaching programming. You want people to struggle with the problems, not the environment in which those problems are presented. Imagine failing a student because they pushed the door to the exam room when it was supposed to be pulled: that’s a syntax error. One of the roles of a demonstrator in a computing course is to be the magic compiler pixie, fixing syntax errors so the students can get back on track.
OK, so not Java. What else is out there?
The Oxford Physics investigation didn’t publish any results on the difficulty of teaching C. Thankfully, the Other Place is more forthcoming. Tim Love, author of Tim Love’s Cricket for the Dragon 32 and teacher of C++ to Engineering undergraduates blogged about difficulties encountered defining functions in C++. There’s no frequency information here, just representative problems. While most of the problems are semantic rather than syntactic, with some logic problems too, we can’t really compare these with the Java analysis above anyway.
The sorts of problems described in this blog post are largely the kind of problems you want, or at least don’t mind, people experiencing while you’re teaching them programming: as long as they get past them, and understand the difference between what they were trying and what eventually worked.
In that context, comments like this are worrying:
I left one such student to read the documentation for a few minutes, but when I returned to him he was none the wiser. The Arrays section of the doc might be sub-optimal, but it can’t be that bad – it’s much the same as last year’s.
So the teacher knows that the course might have problems, but not what they are or how to correct them despite seeing the failure modes in first person. This is not isolated: the Oxford course linked above has not changed substantially since the version I wrote (which added all the Finder & Xcode stuff).
So, we know something about the problems encountered by students of programming. Do we know anything about how they try to solve those problems?
An analysis of patterns of debugging among novice computer science students
We’ve already seen that students didn’t think to use the interactive interpreter feature of Python when the course handbook stopped telling them about it. In this paper, Ahmadzadeh et al modified the Java compiler to collect analytics about errors encountered by students on their course (the methodology looks very similar to the other Java paper, above). An interesting statistic noticed in §3.1, a statistical analysis of compiler errors:
It can be seen from this table that the error that is most common amongst all the subjects is failing to define a variable before it is used. This was almost always the highest frequency error when teaching a range of different concepts.
It’s possible that you could avoid 30-50% of the problems discovered in this study by using a language that doesn’t require explicit declaration of variables. Would the errors then be replaced by something else? Maybe.
In section 4, the authors note that there’s a distinction between being able to debug effectively and being able to program well: most people who are good at debugging are also good programmers, but a minority of good programmers are good at debugging. Of course this is measuring a class of neophytes so it’s possible that this gap eventually closes, but more work would need to be done to demonstrate or disprove that.
I notice that the students in this test are (at least, initially) fixing problems introduced into a program by someone else. Is that skill related to fixing problems in your own code? Might you be more frustrated if you think you’ve finished an assignment only to find there’s a problem you don’t understand in it? Does debugging someone else’s program support the educational goal? This paper suggests that the skills are in fact different, which is why “good” programmers can be “bad” debuggers: they understand programming, but not the problem solved by someone else’s code. They also suggest that “bad” programmers who do well at fixing bugs do it because they understand the aim of the program, and can reason about what the software should be doing. Perhaps being good at fixing bugs means more understanding of specifications than of code—traditionally the outlook of the tester (who is called on to find bugs but rarely to fix them).
Conclusions and Questions
There’s a surprising amount of data out there on the problems faced by students being taught programming—some of it leads directly to actionable conclusions, or at least testable hypotheses. Despite that, some courses look no different from courses I taught in 2004-2006 nor indeed any different from a course I took in 2000-2001.
Judicious selection of language could help students avoid some of the “syntactic” problems in programming, by choosing languages with less ceremony. Whether such a change would lead to students learning faster, enjoying the topic more, or just bumping up against a different set of syntax errors needs to be tested. But can we extrapolate from this? Are environments that are good for student programmers good for novices in general, including inexperienced professionals? Can this be taken further? Could we conclude that some languages waste time for all programmers, or that becoming expert in programming just means learning to cope with your environment’s idiosyncrasies?
And what should we make of this result that being good at programming and debugging do not go together? Should a programming course aim to develop both skills, or should specialisation be noticed and encouraged early? [Is there indeed a degree in software testing offered at any university?]
But, perhaps most urgently, why are so many different groups approaching this problem independently? Physics and Engineering academics are not experts at teaching computing, and as we’ve seen science code is not necessarily the best code. Could someone aggregate these results from various courses and produce the killer course in undergraduate computing?