Meta-writing

Barely 4,000 years ago, documents were written on heavy clay tablets. The Epic of Gilgamesh, one of the earliest known works of fiction, was written on 11 such tablets with a 12th added later. There was only one thing you could do with these tablets: read. Fast forward to the 21st century and things are very different. The word “tablet” has taken on a new meaning, and documents can be delivered wirelessly, updated as new versions are written. They can also contain rich media and hyperlinked references to other content. And with these new capabilities come new considerations when preparing your documents—or “docs”—for your readers.

The above story seems rambling and pointless, doesn’t it? But change the timescale and the technology, and every single bloody report on mobile technology starts in exactly the same way.

APPropriate Behaviour is complete!

APPropriate Behaviour, the book on things programmers do that aren’t programming, is now complete! The final chapter – a philosophy of software making – has been added, concluding the book.

Just because it’s complete doesn’t mean it’s finished: as my understanding of what we do develops, I’ll probably want to correct things, or add new anecdotes or ideas. Readers of the book automatically get free updates whenever I create them in the future, so I hope that this is a book that grows with us.

As ever, the introduction to the book has instructions on joining the book’s Glassboard to discuss the content or omissions from the content. I look forward to reading what you have to say about the book in the Glassboard.

While the recommended purchase price of APPropriate Behaviour is $20, the minimum price now that it’s complete is just $10. Looking at the prices paid by the 107 readers who bought it while it was still being written, $10 is below both the median price (so most people chose to pay more than $10) and the modal price (so the most common price chosen by readers was higher than $10).
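
If the two averages mentioned there are unfamiliar, here’s a minimal sketch of the distinction in Python, using entirely made-up prices (the actual prices paid by those 107 readers aren’t published):

    # Median vs. mode, with hypothetical prices; the real data aren't public.
    from statistics import median, mode

    prices = [5, 10, 10, 15, 20, 20, 20, 25]  # made-up prices paid by readers

    print(median(prices))  # 17.5 -- half the readers paid more than this
    print(mode(prices))    # 20   -- the single most common price chosen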

A little about writing the book: I had created the outline of the book last summer, while thinking about the things I believed should’ve been mentioned in Code Complete but were missing. I finally decided that it actually deserved to be written toward the end of the year, and used National Novel Writing Month as an excuse to start on the draft. A sizeable portion of the draft typescript was created in that month; enough to upload to LeanPub and start getting feedback from early readers. I really appreciate the help and input that those early readers, along with other people I’ve talked to about the material, have given both in preparing APPropriate Behaviour and in understanding my career and our industry.

Over the next few months, I tidied up that first draft, added new chapters, and extended the existing material. The end result – the 11th release including that first draft – is 141 pages of reflection over the decade in which I’ve been paid to make software: not a long time, but still nearly 15% of the sector’s total lifespan. I invite you to grab a copy from LeanPub and share in my reflections on that decade, and consider what should happen in the next.

APPropriate Behaviour is almost done

I just pushed another update to APPropriate Behaviour, my work on the things programmers do that aren’t programming. There’s some refinement to the existing material to be done, and a couple of short extra chapters to finish and add. But then it will be complete!

The recommended price of APPropriate Behaviour is $20. While it’s been under development, I’ve allowed readers interested in a sneak peek to buy APPropriate Behaviour at any price above $5. Once the final chapters are in place, the recommended price will remain $20 but the minimum price will be increasing. If you’ve been pondering buying it but haven’t yet, I recommend you do so now to get a bargain. Even if you buy it while I’m still working on it, you’ll get free updates for life as I add new material and make corrections.

As a little taster of things to come, the two remaining chapters are:

  • The ethics of making software
  • The philosophy of making software

Can’t wait to see what that means? Neither can I!

I just updated APPropriate Behaviour

The new release of APPropriate Behaviour—the book about things programmers should do that aren’t programming—is now up. The most obvious, and most awesome, change in this update is a fabulous new cover, designed by Sebastian Hermida of leanpubcovers.com. Should you be in the market for a cover page, I’d strongly recommend him.

Other changes in this release include additions to the (ends of the) chapters on coding practices and learning, and I’ve added part of a new chapter on requirements engineering. As ever, discussion of the book is welcome in its Glassboard, details of which are in the introduction.

I’ve found it really interesting in researching this book that I can go back decades and find information that has either been forgotten, or was seemingly ignored even at the time of publication. I think it’s quite clear that there’s a gulf between software as practised by people who make software, and software as researched by academics; it’s therefore not surprising to see journal articles that apparently never got read by commercial-sector developers.

What is more interesting is the extent to which “mainstream” programming books, including ones that apparently made a big splash at the time of their publication, no longer seem relevant. They’ve either been completely dropped from our consciousness (hands up everyone who’s read Peopleware in the last five years), or have been adapted into a one-sentence précis that’s become part of the mythology of programming. A thought experiment by way of an example of this mythologising: quote any sentence from The Mythical Man-Month except the one about adding people to a late project. What was the rest of the book about? Is anything else in it relevant to what we do today? Do we know that even that sentence is relevant, or does it just sound plausible?

I’ve been having lots of fun discovering these forgotten entries in our history and bringing some of them into a modern story about programming. But APPropriate Behaviour is not a history book; if anything, it’s a book on social anthropology. The lesson to learn from this post is that it’s not the first anthropological study of programmers; I’d argue that 1971’s The Psychology of Computer Programming is more anthropology than it is psychology. It’s very different from APPropriate Behaviour, but they both tread the same ground, analysing the problems faced by a programmer that aren’t directly related to telling a computer what to do.

I imagine the history book would be fun to write, though for the moment I present this, which I hope is also fun to read.

Does the history of making software exist?

A bit of a repeated theme in the construction of APPropriate Behaviour has been that I’ve tried to position certain terms or concepts in their historical context, and found it difficult, or even impossible, to do so with sufficient rigour. There’s an extent to which I don’t want the book to become historiographical, so I’ve avoided going too deep into that angle; but I’ve discovered that either no-one else has gone there, or, if they have, I can’t find their work.

What often happens is that I can find a history, or even many histories, but that these aren’t reliable. I already wrote in the last post on this blog about the difficulties in interpreting references to the 1968 NATO conference; well, today I read another two sources that have another two descriptions of the conference and how it kicked off the software crisis. Articles like the one linked in the above post help to defuse some of the myths and partisan histories, but only in very specific domains such as the term “software crisis”.

Occasionally I discover a history that has been completely falsified, such as the great sequence of research papers that “prove” how some programmers are ten (or 25, or 1000) times more productive than others, or those that “prove” bugs cost 100x more to fix in maintenance. Again, it’s possible to find specific deconstructions of these memes (mainly by reading Laurent Bossavit), but having discovered the emperor is naked, we have no replacement garments with which to clothe him.

There are a very few subjects where I think the primary and secondary literature necessary to construct a history exist, but that I lack the expertise or, frankly, the patience to pursue it. For example you could write a history of the phrase “software engineering”, and how it was introduced to suggest a professionalism beyond the craft discipline that went before it, only to become a symbol of lumbering lethargy among adherents of the craft discipline that came after it. Such a thing might take a trained historian armed with a good set of library cards a few years to complete (the book The Computer Boys Take Over covers part of this story, though it is written for the lay reader and not the software builder). But what of more technical ideas? Where is the history of “Object-Oriented”? Does that phrase mean the same thing in 2013 as in 1983? Does it even mean the same thing to different people in 2013?

Of course there is no such thing as an objective history. A history is an interpretation of a collection of sources, which are themselves interpretations drawn from biased or otherwise limited founts of knowledge. The thing about a good history is that it lets you see what’s behind the curtain. The sources used will all be listed, so you can decide whether they lead you to the same conclusions as the author. It concerns me that we either don’t have, or I don’t have access to, resources allowing us to situate what we’re trying to do today in the narrative of everything that has gone before and will go hence. That we operate in a field full of hype and innuendo, and lack the tools to detect Humpty Dumptyism and other revisionist rhetoric.

With all that said, are the histories of the software industry out there? I don’t mean the collectors like the museums, who do an important job but not the one I’m interested in here. I mean the histories that help us understand our own work. Do degrees in computer science, to the extent they consider “real world” software making at all, teach the history of the discipline? Not the “assemblers were invented in 1949 and the first binary tree was coded in 19xy” history, but the rise and fall of various techniques, fads, disciplines, and so on? Or have I just volunteered for another crazy project?

I hope not; I haven’t got a good track record at remembering my library cards. Answers on a tweet, please.

An observation designed to aid the reading of books on software

Wherever a book on writing software describes the 1968 NATO conference on Software Engineering in Garmisch, consider whether the clarity of the argument can be improved by adding the following parenthetical clause:

[…], a straw man version of an otherwise real conference that took place in 1968, […]

Usually it can. The proceedings of the conference, which were written post facto by the editors and typists locking themselves in a hotel room with tapes of the sessions and typewriters in various states of repair, are available at Brian Randell’s website along with reflections on their creation. Does the report actually contain the fact presented in whichever book you’re reading now?

Probably not. The article “‘Crisis, What Crisis?’ Reconsidering the Software Crisis of the 1960s and the Origins of Software Engineering” investigates the position of the 1968 report in the rhetoric of the software industry and the reliance of secondary authors on its content. The conclusion is that the report was largely ignored for about a decade, until it suddenly became the thing that kickstarted the software crisis and software engineering.

It would only be a little satirical to say “the software crisis was invented circa 1980 by Edsger Dijkstra, who postulated its origins in the NATO conference of 1968, a straw man conference” etc.

I published a new book!

Executive summary: it’s called APPropriate Behaviour, head over to the LeanPub site to check it out.

For quite a while, I’ve noticed that posts here are moving away from nuts and bolts code towards questions about evaluating my own performance, working with other developers and the industry in general.

I decided to spend some time working on these and related thoughts, trying to derive some consistent narrative as well as satisfying myself that these ideas were indeed going somewhere. I quickly ended up with about half of a novel-length book.

The other half is coming soon, but in the meantime the book is already published in preview state. To quote from the introduction:

this book is about the things that go into being a programmer that aren’t specifically the programming. It starts fairly close to home, with chapters on development tools, on supporting your own programming needs, and on other “software engineering” practices that programmers should understand and make use of. But by the end of the book we’ll be talking about psychology and metacognition — about understanding how you the programmer function and how to improve that functioning.

As I said, this is currently in very much a preview state—only about half of the content is there, it hasn’t been reviewed, and the thread that runs through it has dropped a few stitches. However, even if you buy the book now you’ll get free updates forever so you’ll get to find out as chapters are added and as changes are made.

At this early stage I’m particularly interested in any feedback readers have. I’ve set up a Glassboard for the book—in the Glassboard app, use invite code XVSSV to join the discussion.

I hope you enjoy APPropriate Behaviour!

An apology to readers of Test-Driven iOS Development

I made a mistake. Not a typo or a bug in some pasted code (actually I’ve made some of those, too). I perpetuated what seems (now that I’ve analysed it) to be a big myth in software engineering. I uncritically quoted some work without checking its authority, and now find it lacking. As an author, it’s my responsibility to ensure that what I write represents the best of my knowledge and ability, so that you can take what I write and use it to be better at this than I am.

In propagating incorrect information, I’ve let you down. I apologise unreservedly. It’s my fervent hope that this post explains what I got wrong, why it’s wrong, and what we can all learn from this.

The mistake

There are two levels to this. The problem lies in Table 1-1 at the top of page 4 (in the print edition) of Test-Driven iOS Development. Here it is:

Table 1-1 from TDiOSD

As explained in the book, this table is reproduced from an earlier publication:

Table 1.1, reproduced from Code Complete, 2nd Edition, by Steve McConnell (Microsoft Press, 2004), shows the results of a survey that evaluated the cost of fixing a bug as a function of the time it lay “dormant” in the product. The table shows that fixing bugs at the end of a project is the most expensive way to work, which makes sense…

The first mistake I made was simply that I seem to have made up the bit about it being the result of a survey. I don’t know where I got that from. In McConnell, it’s titled “Average Cost of Fixing Defects Based on When They’re Introduced and Detected” (Table 3-1, at the top of page 30). It’s introduced in a paragraph marked “HARD DATA”, and is accompanied by an impressive list of references in the table footer. McConnell:

The data in Table 3-1 shows that, for example, an architecture defect that costs $1000 to fix when the architecture is being created can cost $15,000 to fix during system test.

As already covered, the first problem is that I misattributed the data in the table. The second problem, and the one that in my opinion I’ve let down my readers the most by not catching, is that the table is completely false.

Examining the data

I was first alerted to the possibility that something was fishy with these data by the book The Leprechauns of Software Engineering by Laurent Bossavit. His Appendix B has a broad coverage of the literature that claims to report the exponential increase in cost of bug fixing.

It was this that got me worried, but I thought that deciding the table was broken on the basis of another book would be just as bad as relying on it because it appeared in CC2E in the first place. I therefore set myself the following question:

Is it possible to go from McConnell’s Table 3-1 to data that can be used to reconstruct the table?

My results

The first reference is Design and Code Inspections to Reduce Errors in Program Development. In a perennial problem for practising software engineers, I can’t read this paper: I subscribe to the IEEE Xplore library, but they still want more cash to read this title. Laurent Bossavit, author of the aforementioned Leprechauns book, pointed out that the IEEE often charge for papers that are available for free elsewhere, and that this is the case with this paper (download link).

The paper anecdotally mentions a 10-100x factor as a result of “the old way” of working (i.e. without code inspections). The study itself looks at the amount of time saved by adding code reviews to projects that hitherto didn’t do code reviews; even if it did have numbers that correspond to this table I’d find it hard to say that the study (based on a process where the code was designed such that each “statement” in the design document corresponded to 3-10 estimated code statements, and all of the code was written in the PL/S language before a compiler pass was attempted) has relevance to modern software practice. In such a scheme, even a missing punctuation symbol is a defect that would need to be detected and reworked (not picked up by an IDE while you’re typing).

The next reference I examined was Boehm and Turner’s “Balancing Agility and Discipline”. McConnell doesn’t tell us where in this book he was looking, and it’s a big book. Appendix E has a lot of secondary citations supporting “The steep version of the cost-of-change curve”, but quotes figures from “100:1” to “5:1” comparing “requirements phase” changes to “post-delivery” changes. All defect fixes are changes, but not all changes are defect fixes, so these numbers can’t be used to build Table 3-1.

The graphs shown from studies of agile Java projects have “stories” on the x-axis and “effort to change” in person-hours on the y-axis; again, not about defects. These numbers are inconsistent with the table in McConnell anyway. As we shall see later, Boehm has trouble getting his data to fit agile projects.

“Software Defect Removal” by Dunn is another book, which I couldn’t find.

“Software Process Improvement at Hughes Aircraft” (Humphrey, Snyder, and Willis 1991): the only reference to cost here is a paragraph on “cost/performance index” on page 22. The authors say (with no supporting data; the report is based on a confidential study) that costs for software projects at Hughes were 6% over budget in 1987, and 3% over budget in 1990. There’s no indication of whether this was related to the costs of fixing defects, or the “spread” of defect discovery/fix phases. In other words, this reference is irrelevant to constructing Table 3-1.

The other report from Hughes Aircraft is “Hughes Aircraft’s Widespread Deployment of a Continuously Improving Software Process” by Ron R. Willis, Robert M. Rova et al. This is the first reference I found to contain useful data! The report is looking at 38,000 bugs: the work of nearly 5,000 developers over dozens of projects, so this could even be significant data.

It’s a big report, but Figure 25 is the part we need. It’s a set of tables relating the fix time (in person-days) of defects to the phase in which they were detected and the phase in which they were fixed.

Unfortunately, this isn’t the same as comparing the phase they’re discovered with the phase they’re introduced. One of the three tables (the front one, which obscures parts of the other two) looks at “in-phase” bugs: ones that were addressed with no latency. Wide differences in the numbers in this table (0.36 days to fix a defect in requirements analysis vs 2.00 days to fix a defect in functional test) make me question the presentation of Table 3-1: why put “1” in all of the “in-phase” entries in that table?

Using these numbers, and a little bit of guesswork about how to map the headings in this figure to Table 3-1, I was able to construct a table like Table 3-1 from this reference. Unfortunately, my “table like Table 3-1” was nothing like Table 3-1. Far from showing an incremental increase in bug cost with latency, the data look like a mountain range. In almost all rows the relative cost of a fix in System Test is greater than in maintenance.
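
To make the reconstruction concrete, here’s a minimal sketch of the normalisation involved, in Python, with invented fix times standing in for the Figure 25 data and a phase mapping that reflects my guesswork rather than anything in the report:

    # Building a Table 3-1 style table from detected-phase/fixed-phase fix
    # times. All person-day values here are invented, not the Hughes data.
    fix_days = {
        "requirements": {"requirements": 0.36, "design": 0.8,
                         "code": 1.1, "system test": 3.0},
        "design": {"design": 0.6, "code": 1.4, "system test": 4.1},
        "code": {"code": 0.9, "system test": 2.7},
    }

    # Dividing each row by its in-phase fix time forces every in-phase
    # entry to 1 -- exactly the presentational choice I question above,
    # given how widely the raw in-phase times differ.
    for detected, row in fix_days.items():
        in_phase = row[detected]
        ratios = {fixed: round(days / in_phase, 1)
                  for fixed, days in row.items()}
        print(f"{detected:>12}: {ratios}")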

I then looked at “Calculating the Return on Investment from More Effective Requirements Management” by Leffingwell. I have to hope that this 2003 webpage is a reproduction of the article cited by McConnell as a 1997 publication, as I couldn’t find a paper of that date.

This reference contains no primary data, but refers to “classic” studies in the field:

Studies performed at GTE, TRW, and IBM measured and assigned costs to errors occurring at various phases of the project life-cycle. These statistics were confirmed in later studies. Although these studies were run independently, they all reached roughly the same conclusion: If a unit cost of one is assigned to the effort required to detect and repair an error during the coding stage, then the cost to detect and repair an error during the requirements stage is between five to ten times less. Furthermore, the cost to detect and repair an error during the maintenance stage is twenty times more.

These numbers are promisingly similar to McConnell’s, although he talks about the cost to “fix” defects while this talks about “detecting and repairing” errors. Are these the same things? Was the testing cost included in McConnell’s table or not? How was it treated? Is the cost of assigning a tester to a project amortised over all bugs, or did they fill in time sheets explaining how long they spent discovering each issue?

Unfortunately Leffingwell himself is already relying on secondary citation: the reference for “Studies performed at GTE…” is a 520-page book, long out of print, called “Software Requirements: Objects, Functions and States”. We’re still some degrees away from actual numbers. Worse, the citation at “confirmed in later studies” is to Understanding and Controlling Software Costs by Boehm and Papaccio, which gets its numbers from the same studies at GTE, TRW and IBM! Leffingwell is bolstering the veracity of one set of numbers by using the same set of numbers.

A further reference in McConnell, “An Economic Release Decision Model”, is to the proceedings of a 1999 conference on Applications of Software Measurement. If these proceedings have actually been published anywhere, I can’t find them: the one URL I discovered was to a “cybersquatter” domain. I was privately sent the PowerPoint slides that comprise this citation. It’s a summary of the then-retired Grady’s experiences with software testing, and again contains no primary data or even verifiable secondary citations. Bossavit describes a separate problem where one of the graphs in this presentation is consistently misattributed and misapplied.

The final reference provided by McConnell is What We Have Learned About Fighting Defects, a non-systematic literature review carried out in an online “e-Workshop” in 2002.

Section 3.1 of the report is “the effort to find and fix”. The 100:1 figure is supported by “general data” which are not presented and not cited. Actual cited figures are 117:1 and 135:1 for “severe” defects from two individual studies, and 2:1 for “non-severe” defects (a small collection of results).

The report concludes:

“A 100:1 increase in effort from early phases to post-delivery was a usable heuristic for severe defects, but for non-severe defects the effort increase was not nearly as large. However, this heuristic is appropriate only for certain development models with a clearly defined release point; research has not yet targeted new paradigms such as extreme programming (XP), which has no meaningful distinction between “early” and “late” development phases.”

A “usable heuristic” is not the verification I was looking for – especially one that’s only useful when practising software development in a way that most of my readers wouldn’t recognise.

Conclusion

If there is real data behind Table 3-1, I couldn’t find it. It was unprofessional of me to incorporate the table into my own book—thereby claiming it to be authoritative—without validating it, and my attempts to do so have come up wanting. I therefore no longer consider Table 1-1 in Test-Driven iOS Development to be representative of defect fixing in software engineering, I apologise for including it, and I resolve to critically appraise material I use in my work in the future.

It’s not @jnozzi’s fault!

My last post was about how we don’t use evidence-based techniques in software engineering. If we don’t rely on previous results to guide us, what do we use?

The answer is that the industry is guided by anecdote. Plenty of people give their opinions on whether one thing is better than another and why, and we read those opinions, combining them with our experiences into a world-view of our own.

Not every opinion has equal weight. Often we’ll identify some people as experts (or “rock star thought leaders”, if they’re Marcus) and consider their opinions as more valuable than the average.

So how do these people come to be experts? Usually it’s tautological: we’ve come to value what they say because they’ve repeatedly said things that we value. Whatever opinion you hold of the publishing industry, writing a book is a great way to get your thoughts out to a wide subset of the community, and to become recognised as an expert on the book’s content (there’s a very good reason why we spell the word AUTHORity).

I noticed that in my own career. I was “the Mac security guy” for a long time, a reputation gained through the not-very-simple act of writing one book published in 2010.

In just the same way, Joshua is the Xcode guy. His book, Mastering Xcode 4, is a comprehensive guide to using Xcode, based on Joshua’s experiences and opinions. As an author on Xcode, he becomes known as the Xcode guy.

And here’s where things get confusing. See, being an authority and having authority are not the same thing. Someone who told you how a thing works is not necessarily best placed to change how that thing works, or even justify why the thing works that way. And yet some people do not make this distinction. Hence Joshua being on the receiving end of complaints about Xcode not working the way some people would like it to work.

These people are frustrated because “the expert” is saying “I’m not going to tell you how to do that”. And while that’s true, truth is a nuanced thing with many subtleties. In this case, you’re not being blown off; it can’t be done.

So yeah, the difference between being an authority and having authority. If you want to tell someone your opinion about Xcode not working, talk to someone with the authority to change how Xcode works. Someone like the product manager for Xcode. If you want to find out how Xcode does work, talk to someone who is an authority on how Xcode works. Someone like Joshua.

Illuminative-C

In addition to being a mildly accomplished software engineer, I’ve done some studying and armchair research in the field of ancient languages and palaeography. What happens if we smoosh those fields together?

In a very slight way, art historian and fellow Oxenafordisc Dr. Janina Ramirez did that in her series on Illuminations: the Private Lives of Medieval Kings (erm, Kings and Ælfgifu). In the series she showed off many manuscripts in the British Library collection, but when she went out in the field she took an iPad. It turns out that the BL isn’t too hot on letting you run around with their thousand-year-old kidskin.

You already know my opinion on our digital heritage. This puts it into stark relief: in one hundred years’ time, barring some epic fire in London (those never happen), the BL and its collection will still be there. Will it still be possible to even launch the iPad app she was using? I very much doubt it.

How about if we put the same effort into storing our source code as the scriptoria did into storing their indentures and gospels? Well, I sharpened a goose feather and had a go at just that (warning: very much draft document impending).

Example 2-1 from PCAS

What you see up there is the first sample code in Professional Cocoa Application Security – Listing 2-1. Ignore the fact that you don’t recognise all the letter shapes: things have changed over the centuries. There were a few contortions required to get the source code to work in manuscript form: let me show you them.

First is that in the hand in which I wrote the source, some of the characters needed for Objective-C source code don’t exist. Like ‘v’. I used the fact that u and v are actually the same letter to get around that. Punctuation was harder: I went for roughly accurate rendering, with a single misplaced comma to suggest that the scribe didn’t really understand punctuation.

When it came to comments, I decided they have the same meaning as the gloss in the Lindisfarne Gospels – rendering the difficult language required by the church^Wcompiler into plain English. I therefore roughly scratched them in smaller text with different ink, letting them flow around the code as if they’d been written later. I also put in a few Old English spellings – though again not consistently[*].

The return value posed some difficulty, because we didn’t borrow 0 from the Middle East until a few centuries after the time this script is mimicking. I realised that if a scribe were to illuminate any part of a C function, it’d probably be the return value because that’s the consistent and – from the perspective of the rest of the code – important part. Thus the 0 is highly decorated, with six legs in the fashion of a bug :-).

Bugs hark back to the days of illuminated manuscripts anyway. Any good scribe would know that a mistake in the text was the fault of Titivillus, not of the scribe. Just as those bugs aren’t my fault. Honest.

[*] Next time you want to get angry at a teenager, remember that the word “ask” was once “acsian” with the s on the end, and think about which one of you is bastardising our language.