Choosing the correct openings and closures

Plenty of programmers will have heard of the Open-Closed Principle of object-oriented design. It is, after all, one of the five SOLID principles. You may not, however, have seen the principle as originally stated. You’ve probably heard this formulation by Robert C. Martin, or a variation on the theme:

Modules that conform to the open-closed principle have two primary attributes.

  1. They are “Open For Extension”.
    This means that the behavior of the module can be extended. That we can make the module behave in new and different ways as the requirements of the application change, or to meet the needs of new applications.

  2. They are “Closed for Modification”.
    The source code of such a module is inviolate. No one is allowed to make source code changes to it.

Source: “The Open-Closed Principle”, the Engineering Notebook, Robert C. Martin, 1996

OK, so how can we add stuff to a module or make it behave “in different ways” if we’re not allowed to make source code changes? Martin’s solution involves abstract classes (because he’s writing for a C++ journal, read “interfaces” or “protocols” as appropriate to your circumstances). Your code makes use of the abstract idea of, say, a view. If views need to work in a different way, for some reason (say someone needs to throw away some third-party licensed display server and replace it with something written in-house) then you don’t edit the view you’ve already provided, you replace that class with your new one.
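Here’s a minimal sketch of the idea in Java rather than C++ (the class and method names are invented for illustration): client code depends only on the abstraction, so the in-house implementation can replace the licensed one without anyone editing the client or the class it previously used.

interface View {
    void draw(String content);
}

// The original implementation, backed by the third-party display server.
class LicensedServerView implements View {
    public void draw(String content) {
        System.out.println("[licensed display server] " + content);
    }
}

// The in-house replacement. Nothing that depends on View has to change.
class InHouseView implements View {
    public void draw(String content) {
        System.out.println("[in-house renderer] " + content);
    }
}

// Closed for modification, open for extension via new View implementations.
class ReportScreen {
    private final View view;

    ReportScreen(View view) {
        this.view = view;
    }

    void show() {
        view.draw("quarterly report");
    }
}

The module being kept closed is ReportScreen: its source stays untouched while its behaviour is extended by supplying a different View.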

The Open-Closed Principle was originally stated by Bertrand Meyer in the first edition of his book, Object-Oriented Software Construction. Here’s what he had to say:

A satisfactory modular decomposition technique must satisfy one more requirement: it should yield modules that are both open and closed.

  • A module will be said to be open if it is still available for extension. For example, it should be possible to add fields to the data structures it contains, or new elements to the set of functions it performs.
  • A module will be said to be closed if [it] is available for use by other modules. This assumes that the module has been given a well-defined, stable description (the interface in the sense of information hiding). In the case of a programming language module, a closed module is one that may be compiled and stored in a library, for others to use. In the case of a design or specification module, closing a module simply means having it approved by management, adding it to the project’s official repository of accepted software items (often called the project baseline), and publishing its interface for the benefit of other module designers.

Source: Object-Oriented Software Construction, Bertrand Meyer, 1988 (p.23)

Essentially the idea behind a “closed” module for Meyer is one that’s baked; it has been released, people are using it, no more changes. He doesn’t go quite as far as Martin later did in declaring the source inviolate: closing a module just means no further changes to its published data structures or functionality. But if a module has been closed, how can it still be open? “Aha,” we hear Meyer say, “that’s the beauty of inheritance. Inheritance lets you borrow the implementation of a parent type, so you can open a new module that has all the behaviour of the old.” There’s no abstract supertype involved, everything’s concrete, but we still get this idea of letting old clients carry on as they were while new programmers get to use the new shiny.
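In a language with Meyer-style implementation inheritance, the move looks something like this minimal Java sketch (the class names are mine, not Meyer’s): the closed module stays exactly as published for its existing clients, while a new module inherits its implementation and extends it.

// The closed module: described, compiled into a library, never edited again.
class Account {
    protected long balanceInPence;

    public void deposit(long pence) {
        balanceInPence += pence;
    }

    public long balance() {
        return balanceInPence;
    }
}

// The open part: a new module borrows the old implementation wholesale and
// adds behaviour, without touching the original source or its clients.
class InterestBearingAccount extends Account {
    private final double annualRate;

    InterestBearingAccount(double annualRate) {
        this.annualRate = annualRate;
    }

    public void applyAnnualInterest() {
        balanceInPence += Math.round(balanceInPence * annualRate);
    }
}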

Both of these programmers were suggesting the “closedness” of a module as a workaround to limitations in their compilers: if you add fields to a module’s data structure, you previously needed to recompile clients of that module. Compilers no longer have that restriction: in [and I can’t believe I’m about to say this in 2013] modern languages like Objective-C and Java you can add fields with aplomb and old clients will carry on working. Similarly with methods: while there are limitations in C++ on how you can add member functions to classes without needing a recompile, other languages let you add methods without breaking old clients. Indeed in Java you can add new methods, and even change the implementation of existing ones, without old clients needing a recompile; and in Smalltalk-derived languages you can add and replace methods on the fly via the runtime library.

But without the closed part of the open-closed principle, there’s not much point to the open part. It’s no good saying “you should be able to add stuff”, of course you can. That’s what the 103 keys on your keyboard that aren’t backspace or delete are for. This is where we have to remember that the compiler isn’t the only reader of the code: you and other people are.

In this age where we don’t have to close modules to avoid recompiles, we should say that modules should be closed to cognitive overload. Don’t make behavioural changes that break a programmer’s mental model of what your module does. And certainly don’t make people try to keep two or more mental models of what the same class does (yes, NSTableView cell-based and view-based modes, I am looking at you).

There’s already a design principle supposed to address this. The Single Responsibility Principle says not to have a module doing more than one thing. Our new version of the Open-Closed Principle needs to say that a module is open to providing new capabilities for the one thing it does do, but closed to making programmers think about differences in the way that thing is done. If you need to change the implementation in a way that clients of the module have to care about, stop pretending it’s the same module.

Posted in OOP | Leave a comment

NIMBY Objects

Members of comfortable societies such as English towns have expectations of the services they will receive. They want their rubbish disposed of before it builds up too much, for example. They don’t so much care how it’s dealt with, they just want to put the rubbish out there and have it taken away. They want electricity supplied to their houses, it doesn’t so much matter how as long as the electrons flow out of the sockets and into their devices.

Some people do care about the implementation, in that they want it to be far enough away that they never have to pay it any mind. These people are known as NIMBYs, after the phrase Not In My Back Yard. Think what it will do to traffic/children walking to school/the skyline/property prices etc. to have this thing I intend to use near my house!

A NIMBY wants to have their rubbish taken away, but does not want to be able to see the landfill or recycling centre during their daily business. A NIMBY wants to use electricity, but does not want to see a power station or wind turbine on their landscape.

What does this have to do with software? Modules in applications (which could be—and often are—objects) should be NIMBYs. They should want to make use of other services, but not care where the work is done except that it’s nowhere near them. The specific where I’m talking about is the execution context. The user interface needs information from the data model but doesn’t want the information to be retrieved in its context, by which I mean the UI thread. The UI doesn’t want to wait while the information is fetched from the model: that’s the equivalent of residential traffic being slowed down by the passage of the rubbish truck. Drive the trucks somewhere else, but Not In My Back Yard.

There are two ramifications to this principle of software NIMBYism. Firstly, different work should be done in different places. It doesn’t matter whether that’s on other threads in the same process, scheduled on work queues, done in different processes or even on different machines, just don’t do it anywhere near me. This is for all the usual good reasons we’ve been breaking work into multiple processes for forty years, but a particularly relevant one right now is that it’s easier to make fast-ish processors more numerous than it is to make one processor faster. If you have two unrelated pieces of work to do, you can put them on different cores. Or on different computers on the same network. Or on different computers on different networks. Or maybe on the same core.

The second is that this execution context should never appear in API. Module one doesn’t care where module two’s code is executed, and vice versa. That means you should never have to pass a thread, an operation queue, process ID or any other identifier of a work context between modules. If an object needs its code to run in a particular context, that object should arrange it.
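Here’s a deliberately small Java sketch of the idea (the class, the method and the “forecast” domain are all invented): the model owns its execution context outright, so no thread, queue or executor identifier ever crosses the API. Callers say what they want done and how they’d like to hear about it, never where.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// A NIMBY object: the work happens in its own back yard, not the caller's.
final class ForecastModel {
    // The execution context is a private implementation detail.
    private final ExecutorService backYard = Executors.newSingleThreadExecutor();

    public void fetchForecast(String city, Consumer<String> completion) {
        backYard.submit(() -> {
            String forecast = slowLookup(city);  // stand-in for the real work
            completion.accept(forecast);         // delivering the result is also the model's business
        });
    }

    private String slowLookup(String city) {
        return "Sunny in " + city;               // pretend this went to a remote service
    }
}

If a single-threaded executor later turns out to be the wrong choice, swapping in a queue, a thread pool or a remote service is a change to this one class and nothing else.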

Why do this? Objects are supposed to be a technique for encapsulation, and we can use that technique to encapsulate execution context in addition to code and data. This has benefits because Threading Is Hard. If a particular task in an application is buggy, and that task is the sole responsibility of a single class, then we know where to look to understand the buggy behaviour. On the other hand, if the task is spread across multiple classes, discovering the problem becomes much more difficult.

NIMBY Objects apply the Single Responsibility Principle to concurrent programming. If you want to understand surprising behaviour in some work, you don’t have to ask “where are all the places that schedule work in this context?”, or “what other places in this code have been given a reference to this context?” You look at the one class that puts work on that context.

The encapsulation offered by OOP also makes for simple substitution of a class’s innards, if nothing outside the class cares about how it works. This has benefits because Threading Is Hard. There have been numerous different approaches to multiprocessing over the years, and different libraries to support the existing ones: whatever you’re doing now will be replaced by something else soon.

NIMBY Objects apply the Open-Closed Principle to concurrent programming. You can easily replace your thread with a queue, your IPC with RPC, or your queue with a Disruptor if only one thing is scheduling the work. Replace that one thing. If you pass your multiprocessing innards around your application, then you have multiple things to fix or replace.

There are existing examples of patterns that fit the NIMBY Object description. The Actor model as implemented in Erlang’s processes and many other libraries (and for which a crude approximation was described in this very blog) is perhaps the canonical example.

Android’s AsyncTask lets you describe the work that needs doing while it worries about where it needs to be done. So does IKBCommandBus, which has been described in this very blog. Android also supports a kind of “get off my lawn” cry to enforce NIMBYism: exceptions are raised for doing (slow) network operations in the UI context.
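For the Android case the usage looks roughly like this (a sketch assuming the Android SDK; the task name and the pretend lookup are mine): the UI code describes the work and never touches a thread or an executor.

import android.os.AsyncTask;
import android.util.Log;

class ForecastTask extends AsyncTask<String, Void, String> {
    @Override
    protected String doInBackground(String... cities) {
        // AsyncTask runs this on its own background executor; the UI code
        // that created the task never sees or chooses that executor.
        return "Sunny in " + cities[0];  // pretend this was a slow network call
    }

    @Override
    protected void onPostExecute(String forecast) {
        // AsyncTask delivers the result back on the UI thread.
        Log.i("Forecast", forecast);
    }
}

// From UI code: new ForecastTask().execute("Oxford");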

There are plenty of non-NIMBY APIs out there too, which paint you into particular concurrency corners. Consider -[NSNotificationCenter addObserverForName:object:queue:usingBlock:] and ignore any “write ALL THE BLOCKS” euphoria going through your mind (though this is far from the worst offence in block-based API design). Notification Centers are for decoupling the source and sink of work, so you don’t readily know where the notification is coming from. So there’s now some unknown number of external actors defecating all over your back yard by adding operations to your queue. Good news: they’re all being funnelled through one object. Bad news: it’s a global singleton. And you can’t reorganise the lawn because the kids are on it: any attempt to use a different parallelism model is going to have to be adapted to accept work from the operation queue.

By combining a couple of time-honoured principles from OOP and applying them to execution contexts we come up with NIMBY Objects, objects that don’t care where you do your work as long as you don’t bother them with it. In return, they won’t bother you with details of where they do their work.

Posted in Uncategorized | Leave a comment

Dogma-driven development

You can find plenty of dogmatic positions in software development, in blogs, in podcasts, in books, and even in academic articles. “You should (always/never) write tests before writing code.” “Pair programming is a (good/bad) use of time.” “(X/not X) considered harmful.” “The opening brace should go on the (same/next) line.”

Let us ignore, for the moment, that at most half of these commandments can actually be beneficial. Let us skip past the fact that demonstrating which is the correct position to take is fraught with problems. Instead we shall consider this question: dogmatic rules in software engineering are useful to whom?

The Dreyfus model of skill acquisition tells us that novices at any skill, not just programming, understand the skill in only a superficial way. Their recollection of rules is non-situational; in other words they will try to apply any rule they know at any time. Their ability to recognise previously encountered scenarios is small-scale, not holistic. They make decisions by analysis.

The Dreyfus brothers proposed that the first level up from novice was competence. Competents have moved to a situational recollection of the rules. They know which do or do not apply in their circumstances. Those who are competent can become proficient, when their recognition becomes holistic. In other words, the proficient see the whole problem, rather than a few small disjointed parts of it.

After proficiency comes expertise. The expert is no longer analytical but intuitive, they have internalised their craft and can feel the correct way to approach a problem.

“Rules” of software development mean nothing to the experts or the proficient, who are able to see their problem for what it is and come up with a context-appropriate solution. They can be confusing to novices, who may be willing to accept the rule as a truth of our work but unable to see in which contexts it applies. Only the competent programmers have a need to work according to rules, and the situational awareness to understand when the rules apply.

But competent programmers are also proficient programmers waiting to happen. Rather than being given more arbitrary rules to follow, they can benefit from being shown the big picture, from being led to understand their work more holistically than as a set of distinct parts to which rules can be mechanistically – or dogmatically – applied.

Pronouncements like coding standards and methodological commandments can be useful, but not on their own. By themselves they help competent programmers to remain competent. They can be situated and evaluated, to help novices progress to competence. They can be presented as a small part of a bigger picture, to help competents progress to proficiency. As isolated documents they are significantly less valuable.

Dogmatism considered harmful.

Posted in Uncategorized | 1 Comment

The whole ‘rockstar developer’ thing is backwards

Another day, another clearout of junk from people who want ‘rockstar iPhone developers’ for their Shoreditch startups. I could just say “no”, or I could launch into a detailed discussion of the problems in this picture.

Rockstars are stagnant

No-one, and I mean no-one, wants to listen to your latest album. They want you to play Free Bird, or Jessica, or Smoke on the Water. OK, so they’ll pay more for their tickets than people listening to novel indie acts, you’ll make more money from them (after your promoter has taken their 30%). But you had better use exactly the right amount of sustain in that long note in Parisienne Walkways, just like you did back in ’79, or there’ll be trouble. Your audience doesn’t care whether you’ve incorporated new styles or interesting techniques from other players, or bought new equipment, you’re playing Apache on that pink Stratocaster the way you always have.

That’s exactly the opposite of a good model in software. Solving the same problem over and over, using the same tools and techniques, is ossification. It’s redundant. No-one needs it any more. Your audience are more like New York jazz fans than VH-1 viewers: they want tradition with a twist. Yes, it needs to be recognisable that you’re solving a problem they have – that you’re riffing on a standard. But if you’re not solving new problems, you’re no longer down with the cool cats. As the rock stars might say: who wants yesterday’s papers?

Home taping is killing music

That riff you like to throw out every night, that same problem that needs solving over and over again? Some student just solved the same thing, and they put it on github. The change in code-sharing discourse of the late 1990s – from “Free Software” to “Open Source” – brought with it the ability for other people to take that solution and incorporate it into their own work with few obligations. So now everyone has a solution to that problem, and is allowed to sell it to everyone who has the problem. Tomorrow night, your stadium’s going to have plenty of empty seats.

Programming groupie culture

Programming has a very small number of big names: not many people would be as well-known in the industry as, say, Linus Torvalds, Richard Stallman, DHH. Some people might choose to call these people “polarising”. Others might choose “rude and arrogant”. Either way, they seem to bring their harems of groupies to the internet: cadres of similarly-“polarising” males who want to be seen to act in the same way as their heroes.

A primatologist might make the case that they are imitating the alpha male baboon in order to gain recognition as the highest-status beta.

Now the groupies have moved the goalposts for success from solving new problems to being rude about solutions that weren’t solved by the “in” group. What, you want to patch our software to fix a bug? You’re not from round these parts, are you?

Embrace the boffin

Somehow for the last few years I managed to hang on to the job title “Security Boffin”. Many people ask what a boffin is: the word was World War 2 slang among the British armed forces referring to the scientists working on the war effort. Like “nerd” or “geek”, it meant someone who was clever but perhaps a bit, well, different.

Boffins were also known at the time as “the back room boys”[*] for their tendency to stay out of the way and solve important – and expedient – technical problems. We need these messages decrypting, the boffins in the back room have done it but they keep talking about this “computer” thing they built. Those boffins have come up with a way to spot planes before we can even see them.

The rockstar revels in former glories while their fans insist that nothing made later even comes close to the classics. If you need a problem solving, look for boffins, not Bonos.

[*] Unfortunately in the military establishment of the 1940s it was assumed that the clever problem solvers were all boys. In fact histories of early computing in the States show that the majority-female teams who actually programmed and operated the wartime computers often knew more about the machines’ behaviours than did the back room boys, diagnosing and fixing problems without reporting them. A certain Grace Hopper, PhD, invented the compiler while the back room boys were sure computers couldn’t be used for that.

Posted in advancement of the self, social-science | Leave a comment

Programmer Values

A question and answer exchange over at programmers.stackexchange.com reveals something interesting about how software is valued. The question asked whether there is any real-world data regarding costs and benefits of test-driven development.[*] One of the answers contained, at time of writing, the anthropologist’s money shot:

The first thing that needs to be stated is that TDD does not necessarily increase the quality of the software (from the user’s point of view). […] TDD is done primarily because it results in better code. More specifically, TDD results in code that is easier to change. [Emphasis original]

Programmers are contracted, via whatever means, by people who see quality in one way: presumably that quality is embodied in software that they can use to do some thing that they wanted to do. Maybe it’s safer to say that the person who provided this answer believes that their customers value quality in software in that way, rather than make an assumption on everybody’s behalf.

This answer demonstrates that (the author believed, and thought it uncontentious enough to post without defence in a popularity-driven forum) programmers value attributes of the code that are orthogonal to the values of the people who pay them. One could imagine programmers making changes to some software that either have no effect or even a negative effect as far as their customers are concerned, because the changes have a positive effect in the minds of the programmers. This issue is also mentioned in one of the other answers to the question:

The problem with developers is they tend to implement even things that are not required to make the software as generic as possible.

The obvious conclusion is that the quality of software is normative. There is no objectively good or bad software, and you cannot discuss quality independent of the value system that you bring to the evaluation.

The less-obvious conclusion is that some form of reconciliation is still necessary: that management has not become redundant despite the discussions of self-organised teams in the Agile development community. Someone needs to mediate between the desire of the people who need the software to get something that satisfies their norms regarding quality software, and the desire of the people who make the software to produce something that satisfies their norms instead. Whether this is by aligning the two value systems, by ignoring one of them or by ensuring that the project enables attributes from both value systems to be satisfied is left as an exercise for the reader.

[*] There is at least one relevant study. No, you might not think it relevant to your work: that’s fine.

Posted in social-science, TDD | Leave a comment

Know what counts

In Make it Count, Harry Roberts describes blacking out on stage at the end of a busy and sleepless week. Ironically, he was at the start of a talk in which he was to discuss being selective over side projects, choosing only those that you can actually “cash in” on and use to advance your career.

If you’re going to take on side projects and speaking and writing and open source and suchlike then please, make them fucking count. Do not run yourself into the ground working on ‘career moves’ if you’re not going to cash in on them. [emphasis original]

Obviously working until you collapse is not healthy. At that point, choosing which projects to accept is less important than just getting some damned sleep and putting your health back in order. In the 1950s, psychologist Abraham Maslow identified a “hierarchy of needs”, and sleep is at the base of the hierarchy meaning that, along with eating and drinking, you should take care of that before worrying about self-actualisation or esteem in the eyes of your peers.

Maslow's hierarchy, Wikipedia image

Here’s the little secret they don’t tell you in the hiring interview at Silicon Valley start-ups: you’re allowed to do things that aren’t career-centric. This includes, but is not limited to, sleeping, drinking enough water, eating non-pizza foodstuffs, having fun, seeing friends, taking breaks, and indulging in hobbies. It sometimes seems that programmers are locked in an arms race to see who can burn out first^W^W^Wdo more work than the others. That’s a short-term, economist-style view of work. I explained in APPropriate Behaviour that economists take things they don’t want to consider, or can’t work out a dollar value for, and call them “externalities” that lie outside the system.

Your health should not be an externality. Roberts attempted to internalise the “accounting” for all of his side projects by relating them in value to his career position. If you’re unhealthy, your career will suffer. So will the rest of your life. Don’t externalise your health. Worry not whether what you’re doing is good for your position in the developer community, but whether it’s good for you as a healthy individual. If you’ve got the basic things like food, shelter, sleep and safety, then validation in the eyes of yourself and your peers can follow.

Posted in advancement of the self, learning, psychology, Responsibility, social-science | Leave a comment

The future will be just like the past, right?

I’ve been having a bit of a retro programming session:

Z88 adventure game

The computer in the photo is a Cambridge Z88, and it won’t surprise you to know that I’ve owned it for years. However, it’s far from my first computer.

I was born less than a month before the broadcast of The Computer Programme, the television show that brought computers into many people’s living rooms for the first time. This was the dawn of the personal computer era. The Computer Programme was shown at the beginning of 1982: by the end of that year the Commodore VIC-20 had become the first computer platform ever to sell more than one million units.

My father being an early adopter (he’d already used a Commodore PET at work), we had a brand new Dragon 32 computer before I was a year old. There’s not much point doing the “hilarious” comparisons of its memory capacity and processor speed with today’s computers: the social systems into which micros were inserted and the applications to which they were put render most such comparisons meaningless.

In 1982, computers were seen by many people as the large cupboards in the back of “James Bond film” sets. They just didn’t exist for a majority of people in the UK, the US or anywhere else. The micros that supposedly revolutionised home life were, for the most part, mainly useful for hobbyists to find out how computers worked. Spreadsheets like VisiCalc might already have been somewhat popular in the business world, but anyone willing to spend $2000 on an Apple ][ and VisiCalc probably wasn’t the sort of person about to diligently organise their home finances.

Without being able to sell their computers on the world-changing applications, many manufacturers were concerned about price and designed their computers down to a price. The Register’s vintage hardware section has retrospectives on many of the microcomputer platforms from the early 1980s, many of which tell this tale. (Those that don’t, tell the tale of focusing on time to market, and running out of money.) The microprocessors were all originally controllers for disk drives and other peripherals in “real” computers, repurposed as the CPUs of the micro platforms. Sinclair famously used partially faulty 64-kilobit RAM chips to supply part of the 48kB RAM in the ZX Spectrum, to get a good price from the supplier.

So the manufacturers were able to make the hardware cheap enough that people would buy computers out of interest, but what would they then make of them? We can probably tell quite a lot by examining the media directed at home computer users. Start with The Computer Programme, as we’ve already seen that back at the beginning of the post. What you have is Ian “Mac” McNaught-Davis, positioned at the beginning of episode 1 as a “high priest” of the mainframe computer, acting as the Doctor to Chris Serle’s bemused and slightly apprehensive assistant. Serle is the perfectly ordinary man on the perfectly ordinary street, expressing (on behalf of us, the perfectly ordinary public) amazement at how a computer can turn a perfectly ordinary television set and a perfectly ordinary domestic cassette recorder into something that’s able to print poorly-defined characters onto perfectly ordinary paper.

During his perfectly ordinary tenure of ten episodes, Serle is taught to program in BBC BASIC by McNaught-Davis. In the first episode he demonstrates a fear of touching anything, confirming the spelling of every word (“list? L-I-S-T?”) he’s asked to type. If the computer requires him to press Return, he won’t do it until instructed by McNaught-Davis (thus making January 11, 1982 the first ever outing of The Return of the Mac). By the end of the series, Serle is able to get on a bit more autonomously, suggesting to Mac what the programs mean (“If temperature is more than 25 degrees, I would assume…”).

Chris Serle suffered his way through nine weeks of BASIC tuition because there was no other choice for a freelance journalist to get any use out of a personal computer. Maybe as many as 8,000 hipster programmers would opt for a Jupiter Ace and the FORTH language, but for normal people it was BASIC or nothing. Even loading a game required typing the correct incantation into the BASIC prompt. Feedback was minimal because there wasn’t a lot of ROM in which to store the error messages: “Subscript at line 100” or even the Dragon’s “?BS ERROR” might be all you’re told about an error. If you didn’t have a handy McNaught-Davis around (perhaps the first user-friendly Mac in the computer field) you could easily lose ages working out what the computer thought was BS about your code.

Typing errors became manifold when using the common application distribution platform: the printed magazine. Much software was distributed as “type-ins”, often split over two (monthly) issues of a magazine: the program being presented in buggy form in one edition and an errata being supplied in the next. When you typed not one LOAD command, but a few hundred lines of BASIC in, only to find that your database program didn’t work as expected, you first had a tedious proof-reading task ahead to check that you’d typed it without error. If you had, and it still didn’t work, then out came the pencil and paper as you tried to work out what mistakes were in the listing.

Microcomputers represented seriously constrained hardware with limited application. The ability to get anything done was hampered by the primary interface being an error-prone, cryptic programming language. While the syntax of this language was hailed as simpler than many alternatives, it did nothing to smooth over or provide a soft landing for complex underlying concepts.

I’m willing to subject myself to those trials and terrors for the purpose of nostalgia. There are other people, though, who want to revert to this impression of computers as a way to get young people interested in programming. The TinyBASIC for Raspberry Pi announcement hails:

we’ve also had a really surprising number of emails from parents who haven’t done any programming since school, but who still have books on BASIC from when they were kids, remember enjoying computing lessons, and want to share some of what they used to do with their kids. It’s actually a great way to get kids started, especially if you have some enthusiasm of your own to share: enthusiasm’s contagious.

Undoubtedly there are some genuine, remembered benefits to programming on these platforms, which modern computer tuition could learn from. There was, as discussed above, no hurdle to jump to get into the programming environment. Try teaching any programming language on widely-available computing platforms today, and you’ve got to spend a while discussing what versions of what software are needed, differences between platforms, installation and so on. Almost anyone on a microcomputer could turn on, and start typing in BASIC code that would, if restricted to a limited subset of commands, work whatever they’d bought.

The cost of a “tyre-kicking” setup was modest, particularly as you could use your own TV and cassette deck (assuming you had them). Unlike many modern platforms, there was no need to have two computers tethered to program on one and run on the other, and no developer tithe to pay to the platform vendors. Where they were error-free and well documented, the type-ins gave you actually working applications that you could tweak and investigate. Such starting points are better for some learners than a blank screen and a blinking prompt.

Complete applications though these type-ins may have been, they would not satisfy the expectations of modern computer-using learners. There’s an important difference: people today have already used computers. They’re no longer magical wonder-boxes that can make a TV screen flash blue and yellow if you get the numbers correct in a PAPER command. People know what to expect from a laptop, tablet or smartphone: being able to print an endless march of RUMBELOWS IS SHIT to the screen is no longer sufficient to retain interest.

It’s not just the users of computers, nor the uses of computers, that have moved on in the last three decades. Teaching has evolved, too. There should probably be a name for the fallacy that assumes that however I was taught things is however everybody else should be taught them. A modern curriculum for novice programmers should reflect not only the technological and social changes in computing in the last thirty years, but also the educational changes. It should borrow from the positives of microcomputer programming courses, but not at the expense of throwing out a generation of evolution.

There are certainly things we can learn from the way microcomputers inspired a generation of programmers. There’s a place for ultra-cheap computers like the Raspberry Pi in modern computing pedagogy. But it would be a mistake to assume that if I gave a child my copy of “Super-Charge Your Spectrum”, that child would learn as much and be as enthused about programming as my rose-tinted model of my younger self apparently was.

Posted in social-science | Leave a comment

On what makes a “good” comment

I have previously discussed the readability of code:

The author must decide who will read the code, and how to convey the important information to those readers. The reader must analyse the code in terms of how it satisfies this goal of conveyance, not whether they enjoyed the indentation strategy or dislike dots on principle.

Source code is not software written in a human-readable notation. It’s an essay, written in executable notation.

Now how does that relate to comments? Comments are a feature of programming languages that allow all other text-based languages—executable or otherwise—to be injected into the program. The comment feature has no effect on the computer’s interpretation of the software, but wildly varying effects on the reader’s interpretation. From APPropriate Behaviour:

[There are] problems with using source code as your only source of information about the software. It does indeed tell you exactly what the product does. Given a bit of time studying, you can discover how it does it, too. But will the programming language instructions tell you why the software does what it does? Is that weird if statement there to fix a bug reported by a customer? Maybe it’s there to workaround a problem in the APIs? Maybe the original developer just couldn’t work out a different way to solve the problem.

So good documentation should tell you why the code does what it does, and also let you quickly discover how.

We need to combine these two quotes. Yes, the documentation—comments included—needs to express the why and the how, but different readers will have different needs and will not necessarily want these questions answered at the same level.

Take the usual canonical example of a bad comment, also given in APPropriate Behaviour and used for a very similar discussion:

//add one to i
i++;

To practiced developers, this comment is just noise. It says the same thing as the line below it.

To novice developers it also says the same thing as the line below it, but they have not yet learned to read the notation fluently. This means they cannot necessarily tell at a glance that the two say the same thing: therefore the comment adds value.

Where someone familiar with the (programming) language might say that the comment only reiterates what the software does, and therefore adds no value, a neophyte might look at the function name to decide what it does and look to comments like this to help them comprehend how it does it.

Outside of very limited contexts, I would avoid comments like that though. I usually assume that a reader will be about as comfortable with the (computer) language used as I am, and either knows the API functions or (like me) knows where to find documentation on them. I use comments sparingly: to discuss trade-offs being made, to record information relied on that isn’t evident in the code itself, or to explain why the code does what it does where that might otherwise seem odd.

Have I ever written a good comment?

As examples, here are some real comments I’ve written on real code, with all the context removed and with reviews added. Of course, as with the rest of the universe “good” and “bad” are subjective, and really represent conformance with the ideas of comment quality described above and in linked articles.

 /*note - answer1.score < answer2.score, but answer1 is accepted so should
 *still be first in the list of answers.
 */

This is bad. You could work this one out with a limited knowledge of the domain, or from the unit tests. This comment adds nothing.

/* NASTY HACK ALERT
 * The UIWebView loads its contents asynchronously. If it's still doing
 * that when the test comes to evaluate its content, the content will seem
 * empty and the test will fail. Any solution to this comes down to "hold
 * the test back for a bit", which I've done explicitly here.
 * http://stackoverflow.com/questions/7255515/why-is-my-uiwebview-empty-in-my-unit-test
 */

This is good. I’ve explained that the code has a surprising shape, but for a reason I understand, and I’ve provided a reference that goes into more detail.

    //Knuth Section 6.2.2 algorithm D.

This is good, if a bit too brief. I’ve cited the reference description (to me, anyway: obviously Knuth got it from somewhere else) of the algorithm. If you want to know why it does what it does, you can go and read the discussion there. If there’s a bug you can compare my implementation with Knuth’s. Of course Knuth wrote more than one book, so I probably should have specified “The Art of Computer Programming” in this comment.

/**
 * The command bus accepts commands from the application and schedules work
 * to fulfil those commands.
 */

This is not what I mean by a comment. It’s API documentation, it happens to be implemented as a comment, but it fills a very particular and better-understood role.

What do other people’s comments look like?

Here are some similarly-annotated comments, from a project I happen to have open (GNUstep-base).

/*
 * If we need space allocated to store a return value,
 * make room for it at the end of the callframe so we
 * only need to do a single malloc.
 */

Explains why the programmer wrote it this way, which is a good thing.

  /* The addition of a constant '8' is a fudge applied simply because
   * some return values write beynd the end of the memory if the buffer
   * is sized exactly ... don't know why.
   */

This comment is good in that it explains what is otherwise a very weird-looking bit of code. It would be better if the author had found the ultimate cause and documented that, though.

/* This class stores objects inline in data beyond the end of the instance.
 * However, when GC is enabled the object data is typed, and all data after
 * the end of the class is ignored by the garbage collector (which would
 * mean that objects in the array could be collected).
 * We therefore do not provide the class when GC is being used.
 */

This is a good comment, too. There’s a reason the implementation can’t be used in particular scenarios, here’s why a different one is selected instead.

/*
 *  Make sure the array is 'sane' so that it can be deallocated
 *  safely by an autorelease pool if the '[anObject retain]' causes
 *  an exception.
 */

This is a bad comment, in my opinion. Let’s leave aside for the moment the important issue of our industry’s relationship with mental illness. What exactly does it mean for an array to be ‘sane’? I can’t tell from this comment. I could look at the code, and find out what is done near this comment. However, I could not decide what in that code contributes to this particular version of ‘sanity’: particularly, what if anything could I remove before it was no longer ‘sane’? Why is it that this particular version of ‘sanity’ is required?

What do other people say about comments?

For many people, the go-to (pun intended) guide on coding practice is, or was, Code Complete, 2nd Edition. As with this blog and APPropriate Behaviour, McConnell promotes the view that comments are part of documentation and that documentation is part of programming as a social activity. From the introduction to Chapter 32, Self-Documenting Code:

Like layout, good documentation is a sign of the professional pride a programmer puts into a program.

He talks, as do some of the authors in 97 Things Every Programmer Should Know, about documenting the design decisions, both at overview and detailed level. That is a specific way to address the “why” question, because while the code shows you what it does it doesn’t express the infinitude of things that it does not do. Why does it not do any of them? Good question, someone should answer it.

Section 32.3 is, in a loose way, a Socratic debate on the value of comments. In a sidebar to this is a quote attributed to “B. A. Sheil”, from an entry in the bibliography, The Psychological Study of Programming. This is the source that most directly connects the view on comments I’ve been expressing above and in earlier articles to the wider discourse. The abstract demonstrates that we’re in for an interesting read:

Most innovations in programming languages and methodology are motivated by a belief that they will improve the performance of the programmers who use them. Although such claims are usually advanced informally, there is a growing body of research which attempts to verify them by controlled observation of programmers’ behavior. Surprisingly, these studies have found few clear effects of changes in either programming notation or practice. Less surprisingly, the computing community has paid relatively little attention to these results. This paper reviews the psychological research on programming and argues that its ineffectiveness is the result of both unsophisticated experimental technique and a shallow view of the nature of programming skill.

Here is not only the quote selected by McConnell but the rest of its paragraph, which supplies some necessary context. The emphasis is Sheil’s.

Although the evidence for the utility of comments is equivocal, it is unclear what other pattern of results could have been expected. Clearly, at some level comments have to be useful. To believe otherwise would be to believe that the comprehensibility of a program is independent of how much information the reader might already have about it. However, it is equally clear that a comment is only useful if it tells the reader something she either does not already know or cannot infer immediately from the code. Exactly which propositions about a program should be included in the commentary is therefore a matter of matching the comments to the needs of the expected readers. This makes widely applicable results as to the desirable amount and type of commenting so highly unlikely that behavioral experimentation is of questionable value.

So it turns out that at about the time I was being conceived, so was the opinion on comments (and documentation and code readability in general) to which I subscribe: that you should write for your audience, and your audience probably needs to know more than just what the software is up to.

That Sheil reference also contains a cautionary tale about the “value” of comments:

Weissman found that appropriate comments caused hand simulation to proceed significantly faster, but with significantly more errors.

That’s a reference to Laurence Weissman’s 1974 PhD Thesis.

Posted in documentation | 1 Comment

Did that restructuring work actually help?

Before getting into the meat of this post, I’d like to get into the meta of this post. This essay, and I imagine many in this blog [Ed: by which I meant the blog this has been imported from], will be treading a fine line. The intended aim is to question accepted industry practice, and find results consistent or inconsistent with the practice as a beneficial task to perform. I’m more likely to select papers that appear to refute the practice, as that’s more interesting and makes us introspect the way we work more than does affirmation. The danger is that this skates too close to iconoclasm, as expressed in the Goto Copenhagen talk title Is it just me or is everything shit?. My intention isn’t to say that whatever we’re doing is wrong, just to provide some healthy inspection and analysis of our industry.

Legacy Software Restructuring: Analyzing a Concrete Case

The thread in this paper is that metrics that have long been used to measure the quality of source code—metrics related to coupling and cohesion—may not actually be relevant to the problems developers have to solve. Firstly, the jargon:

  • coupling refers to the connections between the part of the software (module, class, function, whatever) under consideration and the rest of the software system. Received wisdom is that lower coupling (i.e. fewer connections that are less-tightly intertwined) is better.
  • cohesion refers to the relatedness of the tasks performed by the (module, class, function, whatever) under consideration. The more different responsibilities a component provides or uses, the lower its cohesion. Received wisdom is that higher cohesion (i.e. fewer responsibilities per module) is better.

We’re told that striving for low coupling and high cohesion will make the parts of our software reusable and replaceable, and will reduce the number of code sites we need to change when we want to fix bugs in the future. The focus of this paper is on whether the metrics we use as proxies for these properties actually represent enhancements to the code; in other words, whether we have a systematic way to decide whether a change is an improvement or not.
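To pin the jargon down, here is a deliberately tiny Java illustration of the received wisdom (the classes are mine, and the paper’s metrics are of course computed over much larger units than this):

// Before: one module doing two unrelated jobs (low cohesion), so a caller
// that only wants statistics also depends on CSV formatting (extra coupling).
class ReportModule {
    double mean(double[] samples) {
        double total = 0;
        for (double s : samples) total += s;
        return total / samples.length;
    }

    String formatAsCsv(double[] samples) {
        StringBuilder csv = new StringBuilder();
        for (double s : samples) csv.append(s).append('\n');
        return csv.toString();
    }
}

// After: one job per module (higher cohesion); callers depend only on the
// piece they actually use (lower coupling).
class Statistics {
    double mean(double[] samples) {
        double total = 0;
        for (double s : samples) total += s;
        return total / samples.length;
    }
}

class CsvFormatter {
    String format(double[] samples) {
        StringBuilder csv = new StringBuilder();
        for (double s : samples) csv.append(s).append('\n');
        return csv.toString();
    }
}

Whether splits like this one actually make later changes cheaper is exactly the question the paper is probing.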

Approach

The way in which the authors test their metrics is necessarily problematic. There is no objective standard against which they can be compared—if there were, we’d have an objective standard and we could all go home. They hypothesise that any restructuring effort by a development team must represent an improvement in the codebase: if you didn’t think a change was better, why would you make that change?

Necessity is one such reason. Consider the following thought process:

I need to add this feature to my product, this change was unforeseen at design time, so the architecture doesn’t really support it. I’m not very happy about this, but shoehorning it in here is the simplest way to support what I need.

To understand other problems with this methodology, a one-paragraph introduction to the postmodern philosophy of software engineering is required. Software, it says, supports not some absolute set of requirements that were derived from studying the universe, but the ad-hoc set of interactions between the various people who engage with the software system. Indeed, the software system itself modifies these interactions, creating a feedback loop that in fact modifies the requirements of the software that was created. Some of the results of this philosophy[*] are expressed in “Manny” Lehman’s laws of software evolution, which are also cited by the paper I’m talking about here. The authors offer one of Lehman’s laws as:

[*] I don’t consider Lehman’s laws to be objectively true of software artefacts, but to be hypotheses that arise from a particular philosophy of software. I also think that philosophy has value.

Considering Lehman’s law of software evolution, such systems would already have suffered a decrease in their quality due to the maintenance. This would increase the probability that the restructuring has a better modular quality.

This statement is inconsistent. On the one hand, this change improves the quality of some software. On the other hand, the result of a collection of such changes is to decrease the quality. Now there’s nothing to say that a particular change won’t be an improvement; but there’s also nothing to say that the observed change has this property.[*] The postmodern philosophy adds an additional wrinkle: even if this change is better, it’s only better _as perceived by the people currently working with the system_. Others may have different ideas. We saw, in discussing the teaching of programming, that even experienced programmers can have difficulty reading somebody else’s code. I wouldn’t find it a big stretch to posit that different people have different ideas of what constitutes “good” modular decomposition, and that therefore a different set of programmers would think this change to be worse.

[*]Actually I think the sentence in the paper might just be broken; remember that I found this on the preprint server so it might not have been reviewed yet. One of Lehman’s laws says that, for “E-type” software (by which he means systems that evolve with their environment—in other words, systems where a postmodern appraisal is applicable), the software will gradually be perceived as _reducing_ in quality if no maintenance work goes into it. That’s because the system is evolving while the software isn’t; the requirements change without the software catching up.

Results and Discussion

The authors found that, for three particular revisions of Eclipse, the common metrics for coupling and cohesion did not monotonically “improve” with successive restructuring efforts. In some cases, both coupling and cohesion decreased in the same effort. In addition, they found that the number and extent of cyclic dependencies between Java packages increased with every successive version of the platform.

It’s not really possible to choose a conclusion to draw from these results:

  • maybe the dogma of increasing cohesion and decreasing coupling is misleading.
  • maybe the metrics used to measure those properties were poorly chosen (though they are commonly-chosen).
  • maybe the Eclipse developers use some other measurement of quality that the authors didn’t ask about.
  • maybe some of the Eclipse engineers do take these properties into account, and some others don’t, and we can’t even draw general conclusions about Eclipse.

So this paper doesn’t demonstrate that cohesion and coupling metrics are wrong. But it does raise the important question: might they be not right? If you’re relying on some code metrics derived from received wisdom or dogma, it’s time to question whether they really apply to what you do.

Posted in academia | Leave a comment

Teaching Programming to People. It’s easy, right?

I was doing a literature search for a different subject (which will appear soon), and found a couple of articles related to teaching programming. I don’t know if you remember when you learnt programming, but you probably found it hard. I’ve had some experience of teaching programming: specifically, teaching C to undergraduates. Said undergraduates, as it happens, weren’t on a computing course (they studied Physics), and only turned up to the few classes they had a year because attendance was mandatory. The lectures, which weren’t compulsory, had fewer students showing up.

Teaching Python to Undergraduates

When I took the course, we were taught a Pascal variant on NeXTSTEP. I have some evidence that Algol had been the first programming language taught on the course; probably on an HLH Orion minicomputer. While Pascal was developed in part as a vehicle to teach structured programming concepts, the academics in the computing course at my department were already starting to see it as a toy language with no practical utility. Such justification was used to look for a different language to teach. As you can infer from the previous paragraph, we settled on C: but not before a trial in which interested students (myself included), most of whom had already taken the Pascal course, were taught Python instead. The experiences with Python were written up in a Master’s thesis by Michael Williams, the student who had converted the (Pascal-based, of course) teaching materials to Python.

Like Wirth, when Guido van Rossum designed Python he had teaching in mind; though knowing the criticisms of Pascal he also made it extensible so that it could be used as a “real” language. This extensibility was put to use in the Python experiment at Oxford, giving students the numpy module which they used mainly for its matrix datatype (an important facility in Physics).

What the report shows is that it’s possible to teach someone enough Python to get onto problems with numerical computation in a day; although clearly this is also true of Pascal and C. One interesting observation is the benefit of enforced layout (Python’s meaningful indentation) to both the students, who reported that they did not find it difficult to indent a program correctly; and to the teachers, who found that because students were coerced into laying out their code consistently, it was easier to read and understand the intention of code.

An interesting open question is whether that means enforced indentation leads to more efficient code reviews in general, not just in an expert/neophyte relationship. Many developers using languages that don’t enforce layout choose to add the enforcement themselves. Whether this is an issue at all when modern IDEs can lay out code automatically (assuming developers with enough experience of the IDE to use that feature) also needs answering.

The conclusion of this study was that Python is appropriate as a teaching language for Oxford’s Physics course, though clearly it was not adopted and C was favoured. Why was this? As this decision was made after the report was produced, it doesn’t say, and my own recollection is hazy. I recall the “not for real world use” lobby was involved, that it was also possible to teach C in the time involved, and that while many people wanted to teach Java this was overruled due to a desire to avoid OO. The spurned Java crowd preferred C for its Java-like syntax.

Wait, C?

The next part of this story wasn’t published, but I’ll cover it anyway just for completeness. The year after this Python study, the teaching course did an A/B test where half of the first year course was taught Python, and half C. Whatever conclusions were drawn from this test, C won out so either there was no significant difference in that type of course or the “real worldness” of C was thought to be greater than that of Python. I remember both being given as justifications, but don’t know whether either or both were retrofitted.

Whatever the cause, teaching C was sufficiently not bad that the course is still based on the language.

Going back to that Java decision. How good is Java as a teaching language?

Analyses of Student Programming Errors In Java Programming Courses

I’m going to use the results of this paper to argue that Java is not good as a teaching language.

Programming errors can be categorized as syntax, semantic and logic. A syntax error is an error due to incorrect grammar. Syntax errors are often detected by a program called a compiler, if the language is a compiled language such as Java. A semantic error is an error due to misuse of a programming concept, despite correct syntactic structure. Semantic errors are caught when the program code is compiled. A logic error occurs when the program does not solve the problem that the programmer meant for it to solve.

[Notice that the study is only investigating errors: it’s not completely obvious but “bugs” aren’t included. The author’s only reporting on things that are either compiler or runtime errors in Java-land, like typos and indices out of bounds.]

Categorization of errors of the present study into syntax, semantic, runtime and logic revealed that syntax errors made up 94.1%, semantic errors 4.7% and logic errors 1.2%.

In the ideal world, a programming course teaches students the principles of programming and how to combine these to solve some computational problem. In learning these things, you expect people to make semantic and logic errors: they don’t yet know how these things work. Syntax errors, on the other hand, are the compiler’s way of saying “meh, you know what you meant but I couldn’t be bothered to work it out”, or “I require you to jump through some hoops and you didn’t”.

You don’t want syntax errors when you’re teaching programming. You want people to struggle with the problems, not the environment in which those problems are presented. Imagine failing a student because they pushed the door to the exam room when it was supposed to be pulled: that’s a syntax error. One of the roles of a demonstrator in a computing course is to be the magic compiler pixie, fixing syntax errors so the students can get back on track.
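For concreteness, here’s what the three categories look like in a little Java program of my own (not one from the study):

public class ErrorKinds {
    public static void main(String[] args) {
        // Syntax error: incorrect grammar, rejected by the compiler.
        //     int total = 0        (missing semicolon)
        //
        // Semantic error: well-formed but misuses a concept; in Java this is
        // also caught at compile time.
        //     int total = "zero";  (incompatible types)
        //
        // Logic error: compiles and runs, but solves the wrong problem.
        int[] marks = { 70, 80, 90 };
        int total = 0;
        for (int mark : marks) {
            total += mark;
        }
        int average = total / 4;     // should divide by marks.length
        System.out.println("average = " + average);
    }
}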

OK, so not Java. What else is out there?

C++

The Oxford Physics investigation didn’t publish any results on the difficulty of teaching C. Thankfully, the Other Place is more forthcoming. Tim Love, author of Tim Love’s Cricket for the Dragon 32 and teacher of C++ to Engineering undergraduates, blogged about difficulties encountered defining functions in C++. There’s no frequency information here, just representative problems. While most of the problems are semantic rather than syntactic, with some logic problems too, we can’t really compare these with the Java analysis above anyway.

The sorts of problems described in this blog post are largely the kind of problems you want, or at least don’t mind, people experiencing while you’re teaching them programming: as long as they get past them, and understand the difference between what they were trying and what eventually worked.

In that context, comments like this are worrying:

I left one such student to read the documentation for a few minutes, but when I returned to him he was none the wiser. The Arrays section of the doc might be sub-optimal, but it can’t be that bad – it’s much the same as last year’s.

So the teacher knows that the course might have problems, but not what they are or how to correct them despite seeing the failure modes in first person. This is not isolated: the Oxford course linked above has not changed substantially since the version I wrote (which added all the Finder & Xcode stuff).

So, we know something about the problems encountered by students of programming. Do we know anything about how they try to solve those problems?

An analysis of patterns of debugging among novice computer science students

We’ve already seen that students didn’t think to use the interactive interpreter feature of Python when the course handbook stopped telling them about it. In this paper, Ahmadzadeh et al modified the Java compiler to collect analytics about errors encountered by students on their course (the methodology looks very similar to the other Java paper, above). An interesting statistic noticed in §3.1, a statistical analysis of compiler errors:

It can be seen from this table that the error that is most common amongst all the subjects is failing to define a variable before it is used. This was almost always the highest frequency error when teaching a range of different concepts.

It’s possible that you could avoid 30-50% of the problems discovered in this study by using a language that doesn’t require explicit declaration of variables. Would the errors then be replaced by something else? Maybe.

In section 4, the authors note that there’s a distinction between being able to debug effectively and being able to program well: most people who are good at debugging are also good programmers, but a minority of good programmers are good at debugging. Of course this is measuring a class of neophytes so it’s possible that this gap eventually closes, but more work would need to be done to demonstrate or disprove that.

I notice that the students in this test are (at least, initially) fixing problems introduced into a program by someone else. Is that skill related to fixing problems in your own code? Might you be more frustrated if you think you’ve finished an assignment only to find there’s a problem you don’t understand in it? Does debugging someone else’s program support the educational goal? This paper suggests that the skills are in fact different, which is why “good” programmers can be “bad” debuggers: they understand programming, but not the problem solved by someone else’s code. They also suggest that “bad” programmers who do well at fixing bugs do it because they understand the aim of the program, and can reason about what the software should be doing. Perhaps being good at fixing bugs means more understanding of specifications than of code—traditionally the outlook of the tester (who is called on to find bugs but rarely to fix them).

Conclusions and Questions

There’s a surprising amount of data out there on the problems faced by students being taught programming—some of it leads directly to actionable conclusions, or at least testable hypotheses. Despite that, some courses look no different from courses I taught in 2004-2006 nor indeed any different from a course I took in 2000-2001.

Judicious selection of language could help students avoid some of the “syntactic” problems in programming, by choosing languages with less ceremony. Whether such a change would lead to students learning faster, enjoying the topic more, or just bumping up against a different set of syntax errors needs to be tested. But can we extrapolate from this? Are environments that are good for student programmers good for novices in general, including inexperienced professionals? Can this be taken further? Could we conclude that some languages waste time for all programmers, or that becoming expert in programming just means learning to cope with your environment’s idiosyncrasies?

And what should we make of this result that being good at programming and debugging do not go together? Should a programming course aim to develop both skills, or should specialisation be noticed and encouraged early? [Is there indeed a degree in software testing offered at any university?]

But, perhaps most urgently, why are so many different groups approaching this problem independently? Physics and Engineering academics are not experts at teaching computing, and as we’ve seen science code is not necessarily the best code. Could someone aggregate these results from various courses and produce the killer course in undergraduate computing?

Posted in academia | Leave a comment