It’s not @jnozzi’s fault!

My last post was about how we don’t use evidence-based techniques in software engineering. If we don’t rely on previous results to guide us, what do we use?

The answer is that the industry is guided by anecdote. Plenty of people give their opinions on whether one thing is better than another and why, and we read those opinions, combining them with our experiences into a world-view of our own.

Not every opinion has equal weight. Often we’ll identify some people as experts (or “rock star thought leaders”, if they’re Marcus) and consider their opinions as more valuable than the average.

So how do these people come to be experts? Usually the sense is tautological: we’ve come to value what they say because they’ve repeatedly said things that we value. Whatever opinion you hold of the publishing industry, writing a book is a great way to get your thoughts out to a wide section of the community, and to become recognised as an expert on the book’s content (there’s a very good reason why we spell the word AUTHORity).

I noticed that in my own career. I was “the Mac security guy” for a long time, a reputation gained through the not-very-simple act of writing one book published in 2010.

In just the same way, Joshua is the Xcode guy. His book, Mastering Xcode 4, is a comprehensive guide to using Xcode, based on Joshua’s experiences and opinions. As an author on Xcode, he becomes known as the Xcode guy.

And here’s where things get confusing. See, being an authority and having authority are not the same thing. Someone who told you how a thing works is not necessarily best placed to change how that thing works, or even justify why the thing works that way. And yet some people do not make this distinction. Hence Joshua being on the receiving end of complaints about Xcode not working the way some people would like it to work.

These people are frustrated because “the expert” is saying “I’m not going to tell you how to do that”. And while that’s true, truth is a nuanced thing with many subtleties. In this case you’re not being blown off; it simply can’t be done.

So yeah, the difference between being an authority and having authority. If you want to tell someone your opinion about Xcode not working, talk to someone with the authority to change how Xcode works. Someone like the product manager for Xcode. If you want to find out how Xcode does work, talk to someone who is an authority on how Xcode works. Someone like Joshua.

Does that thing you like doing actually work?

Genuine question. I’ve written before about Test-Driven Development, and I’m sure some of you practice it: can you show evidence that it’s better than (or, for that matter, evidence that it’s worse than) some other practice? Statistically significant evidence?

How about security? Can you be confident that there’s a benefit to spending any money or time on information security countermeasures? On what should it be spent? Which interventions are most successful? Can you prove that?

I am, of course, asking whether there’s any evidence in software engineering. I ask rhetorically, because I believe that there isn’t—or there isn’t a lot that’s in a form useful to practitioners. A succinct summary of this position comes courtesy of Anthony Finkelstein:

For the most part our existing state-of-practice is based on anecdote. It is, at its very best quasi-evidence-based. Few key decisions from the choice of an architecture to the configuration of tools and processes are based on a solid evidential foundation. To be truthful, software engineering is not taught by reference to evidence either. This is unacceptable in a discipline that aspires to engineering science. We must reconstruct software engineering around an evidence-based practice.

Now there is a discipline of Evidence-Based Software Engineering, but herein lies a bootstrapping problem that deserves examination. Evidence-Based [ignore the obvious jokes, it’s a piece of specific jargon that I’m about to explain] practice means summarising the significant results in scientific literature and making them available to practitioners, policymakers and other “users”. The primary tools are the systematic literature review and its statistics-heavy cousin, the meta-analysis.

Wait, systematic literature review? What literature? Here’s the problem with trying to do EBSE in 2012. Much software engineering goes on behind closed doors in what employers treat as proprietary or trade-secret processes. Imagine that a particular project is delayed: most companies won’t publish that result because they don’t want competitors to know that their projects are delayed.

Even the studies, reports and papers that do exist aren’t necessarily accessible to the likes of us common programmers. Let’s imagine that I got bored and decided to do a systematic literature review of whether functional programming truly does lead to fewer concurrency issues than object-oriented programming.[*] I’d be able to look at articles in the ACM Digital Library, on the ArXiv pre-print server, and anything that’s in Leamington Spa library (believe me, it isn’t much). I can’t read IEEE publications, the BCS Computer Journal, or many others because I can’t afford to subscribe to them all. And there are probably tons of journals I don’t even know about.

[*]Results of asking about this evidence-based approach to paradigm selection revealed that either I didn’t explain myself very well or people don’t like the idea of evidence mucking up their current anecdotal world views.

So what do we do about this state of affairs? Actually, to be more specific: if our goal is to provide developers with better access to evidence from our field, what do we do?

I don’t think traditional journals can be the answer. If they’re pay-to-read, developers will never see them. If they’re pay-to-write, the people who currently aren’t supplying any useful evidence still won’t.

So we need something lighter weight, free to contribute to and free to consume; and we probably need to accept that it then won’t be subject to formal peer review (in exactly the same way that Wikipedia isn’t).

I’ve argued before that a great place for this work to be done is the Free Software Foundation. They’ve got the components in place: a desire to prove that their software is preferable to commercial alternatives; public development projects with some amount of central governance; volunteer coders willing to gain karma by trying out new things. They (or if not them, Canonical or someone else) could easily become the home of demonstrable quality in software production.

Could the proprietary software developers be convinced to open up on information about what practices do or don’t work for them? I believe so, but it wouldn’t be easy. Iteratively improving practices is a goal both for small companies following Lean Startup and similar techniques, and for large enterprises interested in process maturity models like CMMI. Both of these require you to know what metrics are important; to measure, test, improve and iterate on those metrics. This can be done much more quickly if you can combine your results with those of other teams—see what has or hasn’t already worked elsewhere and learn from that.

So that means that everyone will benefit if everyone else is publishing their evidence. But how do you bootstrap that? Who will be first to jump from a culture of silence to a culture of sharing, the people who give others the benefit of their experience before they get anything in return?

I believe that this is the role of the platform companies. These are the companies whose value lies not only in their own software, but in the software created by ISVs on their platforms. If they can help their ISVs to make better software more efficiently, they improve their own position in the market.

I made a web!

That is, I made a C program using the literate programming tool, CWEB. The product it outputs is, almost by definition, self-documenting, so find out about the algorithm and how I built it by reading the PDF. This post is about the process.

Unsurprisingly, I found it much more mentally taxing to understand a prose description of a complex algorithm, and to work out how to convert it into C, than to write the C itself. Given that, and acknowledging that this little project was a very artificial example, it was very helpful to be able to write long-form comments alongside the code.

That’s not to say that I don’t normally comment my code; I often do when I’m trying something I don’t think I understand. But often I’ll write out a prose description of what I’m trying to do in a notebook, or produce incredibly terse C comments. The literate programming environment encouraged me to marry these two ideas and create long prose that’s worth reading, but attach it to the code I’m writing.

I additionally found it useful to be able to break up code into segments by idea rather than by function/class/method. If I think “oh, I’ll need one of these” I can just start a new section, and then reference it in the place it’ll get used. It inverts my usual process, which is to write out the code I think I’ll need to do a task and then go back and pick out isolated sections with refactoring tools.
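
To make that concrete, here’s a minimal sketch of the sort of web I mean (not taken from my actual project; the section name and code are invented for illustration). The main routine refers to a named section that is only defined afterwards, so the exposition is ordered by idea rather than by what the compiler needs first:

    @* Greeting program. A contrived example of a named section being
    used before it is defined.

    @c
    #include <stdio.h>

    int main(void)
    {
      @<Print a greeting@>@;
      return 0;
    }

    @ The greeting lives in its own section; \.{ctangle} substitutes
    its code wherever the section is referenced above.

    @<Print a greeting@>=
    printf("Hello, literate world!\n");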

As a developer’s tool, it’s pretty neat too, though not perfect. The ctangle tool that generates the C source code inserts both comments referring to the section of the web that the code comes from, and (more usefully) preprocessor #line directives. If you debug the executable (which I needed to…) you’ll get told where in the human-readable source the PC is sitting, not where in the generated C.
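
To illustrate: assuming the sketch above is saved as hello.w, running ctangle hello.w yields a hello.c along these lines (the line numbers and the /*2:*/-style section markers here are illustrative, not verbatim output):

    #line 6 "hello.w"
    #include <stdio.h>

    int main(void)
    {
    /*2:*/
    #line 17 "hello.w"
    printf("Hello, literate world!\n");
    /*:2*/
    #line 10 "hello.w"
    return 0;
    }

Running cweave hello.w instead produces hello.tex, which plain TeX turns into the formatted documentation.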

The source file, a “web” that contains alternating TeX and C code, is eminently readable (if you know both TeX and C, obviously) and plays well with version control. Because this example was a simple project, I defined everything in one file but CWEB can handle multiple-file projects.
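
(For the curious: if I read the CWEB manual correctly, the multi-file support is a plain include mechanism. A master web pulls in subsidiary webs with the @i control code, and sections they define can be referenced as though they were local. The file names and section name below are invented for illustration.)

    @i lexer.w
    @i parser.w

    @* Main program. Sections defined in the included webs, such as
    the parser setup, can be referenced here just as if they were
    defined in this file.

    @c
    int main(void)
    {
      @<Initialise the parser@>@;
      return 0;
    }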

The main issue is that it’d be much better to have an IDE that’s capable of working with web files directly. A split-pane preview of the formatted documentation would be nice, and there are some capable TeX word processors out there that would make a good starting point. Code completion, error detection and syntax highlighting in both the C and TeX parts would be really useful. Refactoring support would be…a challenge, but probably doable.

So my efforts with CWEB haven’t exactly put me off, but do make me think that even three decades after being created it’s not in a state to be a day-to-day developer environment. Now if only I knew someone with enough knowledge of the Clang API to think about making a C or ObjC IDE…

A brief history of talking on the interwebs (or: why I’m not on app.net)

When I first went to university, I was part of an Actual September, though it took place in October. Going from a dial-up internet service shared with the telephone line to the latest iteration of SuperJANET with its multi-megabit connection to my computer opened many new possibilities for me and my peers.

One of these possibilities was Usenet, which we accessed via news.ox.ac.uk. Being new to this online society, my fellow neophytes and I made all of the social faux pas that our forebears had made this time last year, and indeed in prior years. We top-posted. We cross-posted. We fed the trolls. Some of us even used Outlook Express. Over time, those of us who were willing to make concessions to the rules became the denizens, and it was our job the next September to flame the latest crop of newbies.

The above description is vastly oversimplified, of course. By the time of my Actual September, Usenet was feeling the effects of the Neverending September. Various commercial ISPs – most notoriously America Online – had started carrying Usenet and their customers were posting. Now there was, all year round, an influx of people who didn’t know about the existing society and rules but were, nonetheless, posting to Usenet.

Between AOL and – much later – Google Groups incorporating Usenet into its content, the people who felt themselves the guardians and definition of all that Usenet stood for found that they were the minority of users. Three main ways of dealing with this arose. Some people just gave up and left for other services. Others joined in with the new way of using Usenet. Still others worked in the old way despite the rise of the new way, wielding their ability to plonk newbies into their kill file as a badge of honour.

By now I probably don’t need to ask the rhetorical question: what has all of this to do with twitter? Clearly it has everything to do with twitter. The details differ but the analogy is near watertight. In each instance, we find a community of early adopters for a service that finds a comfortable way to use that service. In each we find that as the community grows, latecomers use the service in different ways, unanticipated or frowned upon by the early adopters. In each case the newcomers outnumber the early adopters by orders of magnitude and successfully, whether by sheer scale or through the will of the owners of the service, redefine the culture of the service. Early adopters complain that the new majority don’t “get” the culture.

Moving to app.net does nothing except reset that early-adopter clock. Any postmodernist philosopher will tell you that: probably while painting your living room lilac and dragging a goldfish bowl on a leash. If app.net takes off then the population of users will be orders of magnitude greater than the number of “backers”. The people who arrive later will have their own ideas of how to use the service; and together they will have contributed orders of magnitude more cash to the founders than the initial tranche of “backers”. I wonder who the management will listen to.

Any publicly-accessible communication platform will go through this growth and change. When I joined Facebook it was only open to university members and was a very different beast than modern Facebook. I would not be surprised to read similar complaints made about citizens’ band radio or Morse telegraphy.

The people who move on don’t necessarily want a changed experience. It seems to me they want a selective experience, and moving into the wilderness allows them an approximation of that. In the short term, anyway. Soon the undesirables will move in next door and they’ll choose to move on again.

I suggest that what’s required is actually something more like Usenet. I run my own status.net server, initially to archive my tweet stream, though it turns out I’m not using it for that. If I chose, I could open that server up to selected people, just as news.ox.ac.uk was only open to members of one university. I could curate a list of servers that mine peers with. If there are some interesting people at status.cocoadev.com, I could peer with that server. If status.beliebers.net isn’t to my taste, I don’t peer with it. But that’s fine; their users don’t see what I write, in return for me not seeing what they write. In fact Usenet could’ve benefitted from more selective peering, and a lot of the paid-for access now has, easily-detectable spam aside, a higher signal-to-noise ratio than the service had a decade ago.

Another service that has some of the aspects of the curated experience is Glassboard. Theirs is entirely private, losing some of the discoverability of a public tweet stream. In return, all conversations are (to some extent) invitation-only and moderated. If you don’t like someone’s contributions, the board owner can kick-ban them.

So the problem long-term tweeters have with twitter is not a new problem. Moving wholesale to something that does the same thing means deferring, not solving, the problem.


I thought I’d update this post (nearly six months later) on the day that I joined app.net. It’s changed quite a lot—both by adding a cloud storage API and by going freemium—in the intervening time. I remain skeptical that the problem with a social network is the tool, and I also wonder how the people who joined to get away from people using Twitter really badly will react to the free tier allowing the unwashed masses like me to come and use app.net really badly. Still, there’s a difference between skeptical and closed-minded, so here I am.