Object-Oriented callback design

One of the early promises of object-oriented programming, encapsulated in the design of the Smalltalk APIs, was a reduction – or really an encapsulation – of the complexity of code. Many programmers believe that the more complex a method or function is, the harder it is to understand and to maintain. Some developers even use tools to measure the complexity quantitatively, in terms of the number of loops or conditions present in the function’s logic. Get this “cyclomatic complexity” figure too high, and your build fails. Unfortunately, many class APIs have been designed without taking the complexity of client code into account. Here’s an extreme example: the NSStreamDelegate protocol from Foundation.

-(void)stream:(NSStream *)stream handleEvent:(NSStreamEvent)streamEvent;

This is not so much an abstraction of the underlying C functions as a least-effort adaptation into Objective-C. Every return code the lower-level functionality exposes is mapped onto a code that’s funnelled into one place: this delegate callback. Were you trying to read or write the stream? Did it succeed or fail? Doesn’t matter; you’ll get this one callback. Any implementation of this protocol looks like a big bundle of if statements (or, more tersely, a big bundle of cases in a switch) to handle each of the possible codes. The default case has to handle the possibility that a future version of the API adds a new event to the list. Whenever I use this API, I drop in the following implementation that “fans out” the different events to different handler methods.

-(void)stream:(NSStream *)stream handleEvent:(NSStreamEvent)streamEvent
{
  switch(streamEvent)
  {
  case NSStreamEventOpenCompleted:
    [self streamDidOpen: stream];
    break;
  //...
  default:
    NSAssert(NO, @"Apple changed the NSStream API");
    [self streamDidSomethingUnexpected: stream];
    break;
  }
}

Of course, NSStream is an incredibly old class. We’ve learned a lot since then, so modern callback techniques are much better, aren’t they? In my opinion, they did indeed get better for a bit. But then something happened that led to a reduction in the quality of these designs. That thing was the overuse of blocks as callbacks. Here’s an example of what I mean, taken from Game Center’s authentication workflow.

@property(nonatomic, copy) void(^authenticateHandler)(UIViewController *viewController, NSError *error);
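
To see what that means for client code, here’s roughly what setting the handler ends up looking like, sketched inside a hypothetical UIViewController subclass. GKLocalPlayer and its authenticateHandler property are real; the particular chain of nil-checks is my guess at a typical implementation, not code from any shipping app.

#import <GameKit/GameKit.h>

// In some UIViewController subclass (hypothetical):
- (void)authenticateLocalPlayer
{
    GKLocalPlayer *localPlayer = [GKLocalPlayer localPlayer];
    __weak typeof(self) weakSelf = self;
    localPlayer.authenticateHandler = ^(UIViewController *viewController, NSError *error){
        if (viewController) {
            // The framework needs user input: present its view controller.
            [weakSelf presentViewController: viewController animated: YES completion: nil];
        } else if (error) {
            // Authentication failed.
            NSLog(@"Game Center authentication failed: %@", error);
        } else {
            // Neither a view controller nor an error: authentication succeeded.
            NSLog(@"Local player authenticated.");
        }
    };
}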

Let’s gloss over, for a moment, the fact that simply setting this property triggers authentication. There are three things that could happen as a result of setting it: two are the related ideas that authentication could succeed or fail (related, but diametrically opposed). The third is that the API needs some user input, so it wants the app to present a view controller for data entry. Three things, one entry point. Which do we need to handle on this run? We’re not told; we have to ask. This is the antithesis of accepted object-oriented practice. In this particular case, the behaviour required on event-handling is rich enough that a delegate protocol defining multiple methods would be a good way to handle the interaction:

@protocol GKLocalPlayerAuthenticationDelegate

@required
-(void)localPlayer: (GKLocalPlayer *)localPlayer needsToPresentAuthenticationInterface: (UIViewController *)viewController;
-(void)localPlayerAuthenticated: (GKLocalPlayer *)localPlayer;
-(void)localPlayer: (GKLocalPlayer *)localPlayer failedToAuthenticateWithError: (NSError *)error;

@end

The simpler case is where some API task either succeeds or fails. Smalltalk had a pattern for dealing with this which could be both supported in Objective-C, and extended to cover asynchronous design. Here’s how you might encapsulate a boolean success state with error handling in an object-oriented fashion.

typedef id(^conditionBlock)(NSError **error);
typedef void(^successHandler)(id result);
typedef void(^failureHandler)(NSError *error);
- (void)ifThis:(conditionBlock)condition then:(successHandler)success otherwise:(failureHandler)failure
{
  NSError *error = nil;
  id result;
  if ((result = condition(&error)))
    success(result);
  else
    failure(error);
}

Now you’re telling client code whether your operation worked, not requiring that it ask. Each of the conditions is explicitly and separately handled. This is a bit different from Smalltalk’s condition handling, which works by sending the ifTrue:ifFalse: message to an object that knows which Boolean state it represents. The ifThis:then:otherwise: message needs to deal with the common Cocoa idiom of describing failure via an error object – something a Boolean wouldn’t know about.[*] However, the Smalltalk pattern is possible while still supporting the above requirements: see the coda to this post. This method could be exposed directly as API, or it can be used to service conditions inside other methods:

@implementation NSFileManager (BlockDelete)

- (void)deleteFileAtPath:(NSString *)path success:(successHandler)success failure:(failureHandler)failure
{
    [self ifThis: ^(NSError **error){
        return [self removeItemAtPath: path error: error]?@(1):nil;
    } then: success otherwise: failure];
}

@end

int main(int argc, const char * argv[])
{
    @autoreleasepool {
        [[NSFileManager defaultManager] deleteFileAtPath: @"/private/tmp" success: ^(id unused){
            NSLog(@"Holy crap, you deleted the temporary folder!");
        } failure: ^(NSError *error){
            NSLog(@"Meh, that didn't work. Here's why: %@", error);
        }];
    }
    return 0;
}

[*]As an aside, there’s no real reason that Cocoa needs to use indirect error pointers. Consider the following API:

-(id)executeFetchRequest:(NSFetchRequest *)fetchRequest;

The return value could be an NSArray or an NSError. The problem with this is that in almost all cases it puts some ugly conditional code into the API’s client—though it’s only the same ugly condition you currently have to write when testing the return value before examining the error. This separation of success and failure handlers encapsulates that condition in code the client author doesn’t need to see.
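
To make that concrete, here’s the kind of client-side test the combined return value forces. This is a sketch against the hypothetical declaration above, not the real Core Data API; the context and fetchRequest variables are assumed to exist.

id result = [context executeFetchRequest: fetchRequest];
if ([result isKindOfClass: [NSError class]]) {
    // The failure branch: the same test you'd otherwise write against a nil
    // or NO return value before consulting an NSError ** parameter.
    NSLog(@"Fetch failed: %@", result);
} else {
    NSLog(@"Fetched %lu objects", (unsigned long)[result count]);
}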

Coda: related pattern

I realised after writing this post that the Smalltalk-esque ifTrue:ifFalse: style of conditional can be supported, and leads to some interesting possibilities. First, consider defining an abstract Outcome class:

@interface Outcome : NSObject

- (void)ifTrue:(successHandler)success ifFalse: (failureHandler)failure;

@end

You can now define two subclasses which know what outcome they represent and the supporting data:

@interface Success : Outcome

+ (instancetype)successWithResult: (id)result;

@end

@interface Failure : Outcome

+ (instancetype)failureWithError: (NSError *)error;

@end

The implementation of these two classes is very similar; you can infer the behaviour of Failure from the behaviour of Success:

@implementation Success
{
  id _result;
}

+ (instancetype)successWithResult: (id)result
{
  Success *success = [self new];
  success->_result = result;
  return success;
}

- (void)ifTrue:(successHandler)success ifFalse:(failureHandler)failure
{
  success(_result);
}
@end

But that’s not the end of it. You could use the -ifThis:then:otherwise: method above to implement a Deferred outcome, which doesn’t evaluate its result until someone asks for it. Or you could build a Pending result, which starts the evaluation in the background, resolving to success or failure on completion. Or you could do something else.
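
As an illustration, here’s one way the Deferred outcome might look – a sketch only, reusing the typedefs and the logic of -ifThis:then:otherwise: from above; the class and its constructor are my own invention:

@interface Deferred : Outcome

+ (instancetype)deferredWithCondition: (conditionBlock)condition;

@end

@implementation Deferred
{
  conditionBlock _condition;
}

+ (instancetype)deferredWithCondition: (conditionBlock)condition
{
  Deferred *deferred = [self new];
  deferred->_condition = [condition copy];
  return deferred;
}

- (void)ifTrue:(successHandler)success ifFalse:(failureHandler)failure
{
  // The condition is only evaluated now, when somebody asks for the outcome.
  NSError *error = nil;
  id result = _condition(&error);
  if (result)
    success(result);
  else
    failure(error);
}

@end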

Posted in code-level, OOP, software-engineering | Comments Off on Object-Oriented callback design

How to excel at IDE design

When people have the “which IDE is best” argument, what they’re actually discussing is “which slightly souped-up monospace text editor with a build button do you like using”. Eclipse, Xcode, IntelliJ, Visual Studio…all of these tools riff on the same design—letting you see the source code and change the source code. As secondary effects you can also do things like build the source code, run the built product, and debug it.

The most successful IDE in the world, I would contend (and then wave my hands unconvincingly when anyone asks for data), is one that’s not designed like that at all. It’s the one that is used by more non-software specialists than any of those named above. The one that doesn’t require you to practice being an IDE user for years before you get any good. The one that business analysts, office assistants, accountants and project managers alike all turn to when they need their computer to run through some custom algorithm.

The IDE whose name appears in the title of this post.

Now what makes a spreadsheet better as a development environment is difficult to say; I’m unaware of anyone having researched it. But what makes it a different development environment from an IDE can be clearly seen by using each of them.

In a spreadsheet, it’s the inputs and results that are front-and-centre in the presentation, not the intermediate stuff that gets you from one to the other. You can test your “code” by typing in a different input and watching the results change in front of you. You can see intermediate results, not by breaking and stepping through, or putting in a log statement then switching to the log view, but by breaking the algorithm up into smaller steps (or functions or procedures, if you want to call them that). You can then visualise how these intermediate results change right alongside the inputs and outputs. That’s quicker feedback than even REPLs can offer.

Many spreadsheet users naturally adopt a “test-first” approach; they create inputs for which they know what the results should be and make successively better attempts to build a formula that achieves these results. And, of course, interesting visualisations like graphs are available (though the quality does vary between products). Drawing a graph in Xcode is…challenging (and yes, nerds, I know about CorePlot).

All of this is not to say that spreadsheets are the future of IDEs. In many ways spreadsheets are more accessible as IDEs, and it’d be good to study and learn from the differences and incorporate some of them into the things that are the future of IDEs.

Posted in Uncategorized | Comments Off on How to excel at IDE design

An apology to readers of Test-Driven iOS Development

I made a mistake. Not a typo or a bug in some pasted code (actually I’ve made some of those, too). I perpetuated what seems (now, since I analyse it) to be a big myth in software engineering. I uncritically quoted some work without checking its authority, and now find it lacking. As an author, it’s my responsibility to ensure that what I write represents the best of my knowledge and ability so that you can take what I write and use it to be better at this than I am.

In propagating incorrect information, I’ve let you down. I apologise unreservedly. It’s my fervent hope that this post explains what I got wrong, why it’s wrong, and what we can all learn from this.

The mistake

There are two levels to this. The problem lies in Table 1.1 at the top of page 4 (in the print edition) of Test-Driven iOS Development. Here it is:

Table 1-1 from TDiOSD

As explained in the book, this table is reproduced from an earlier publication:

Table 1.1, reproduced from Code Complete, 2nd Edition, by Steve McConnell (Microsoft Press, 2004), shows the results of a survey that evaluated the cost of fixing a bug as a function of the time it lay “dormant” in the product. The table shows that fixing bugs at the end of a project is the most expensive way to work, which makes sense…

The first mistake I made was simply that I seem to have made up the bit about it being the result of a survey. I don’t know where I got that from. In McConnell, it’s titled “Average Cost of Fixing Defects Based on When They’re Introduced and Detected” (Table 3-1, at the top of page 30). It’s introduced in a paragraph marked “HARD DATA”, and is accompanied by an impressive list of references in the table footer. McConnell:

The data in Table 3-1 shows that, for example, an architecture defect that costs $1000 to fix when the architecture is being created can cost $15,000 to fix during system test.

As already covered, the first problem is that I misattributed the data in the table. The second problem, and the one that in my opinion I’ve let down my readers the most by not catching, is that the table is completely false.

Examining the data

I was first alerted to the possibility that something was fishy with these data by the book the Leprechauns of Software Engineering by Laurent Bossavit. His Appendix B has a broad coverage of the literature that claims to report the exponential increase in cost of bug fixing.

It was this that got me worried, but I thought deciding that the table was broken on the basis of another book would be just as bad as relying on it from CC2E was in the first place. I therefore set myself the following question:

Is it possible to go from McConnell’s Table 3-1 to data that can be used to reconstruct the table?

My results

The first reference is Design and Code Inspections to Reduce Errors in Program Development. In a perennial problem for practicing software engineers, I can’t read this paper: I subscribe to the IEEE Xplore library, but they still want more cash to read this title. Laurent Bossavit, author of the aforementioned Leprechauns book, pointed out that the IEEE often charge for papers that are available for free elsewhere, and that this is the case with this paper (download link).

The paper anecdotally mentions a 10-100x factor as a result of “the old way” of working (i.e. without code inspections). The study itself looks at the amount of time saved by adding code reviews to projects that hitherto didn’t do code reviews; even if it did have numbers that correspond to this table I’d find it hard to say that the study (based on a process where the code was designed such that each “statement” in the design document corresponded to 3-10 estimated code statements, and all of the code was written in the PL/S language before a compiler pass was attempted) has relevance to modern software practice. In such a scheme, even a missing punctuation symbol is a defect that would need to be detected and reworked (not picked up by an IDE while you’re typing).

The next reference I looked at was Boehm and Turner’s “Balancing Agility and Discipline”. McConnell doesn’t tell us where in this book he was looking, and it’s a big book. Appendix E has a lot of secondary citations supporting “The steep version of the cost-of-change curve”, but quotes figures from “100:1” to “5:1” comparing “requirements phase” changes to “post-delivery” changes. All defect fixes are changes but not all changes are defect fixes, so these numbers can’t be used to build Table 3-1.

The graphs shown from studies for agile Java projects have “stories” on the x axis and “effort to change” in person/hour on the y-axis; again not about defects. These numbers are inconsistent with the table in McConnell anyway. As we shall see later, Boehm has trouble getting his data to fit agile projects.

“Software Defect Removal” by Dunn is another book, which I couldn’t find.

“Software Process Improvement at Hughes Aircraft” (Humphrey, Snyder, and Willis 1991): the only reference to cost here is a paragraph on “cost/performance index” on page 22. The authors say (with no supporting data; the report is based on a confidential study) that costs for software projects at Hughes were 6% over budget in 1987, and 3% over budget in 1990. There’s no indication of whether this was related to the costs of fixing defects, or the “spread” of defect discovery/fix phases. In other words this reference is irrelevant to constructing Table 3-1.

The other report from Hughes Aircraft is “Hughes Aircraft’s Widespread Deployment of a Continuously Improving Software Process” by Ron R. Willis, Robert M. Rova et al. This is the first reference I found to contain useful data! The report is looking at 38,000 bugs: the work of nearly 5,000 developers over dozens of projects, so this could even be significant data.

It’s a big report, but Figure 25 is the part we need. It’s a set of tables that relate the fix time (in person-days) of defects when comparing the phase that they’re fixed with the phase in which they’re detected.

Unfortunately, this isn’t the same as comparing the phase they’re discovered with the phase they’re introduced. One of the three tables (the front one, which obscures parts of the other two) looks at “in-phase” bugs: ones that were addressed with no latency. Wide differences in the numbers in this table (0.36 days to fix a defect in requirements analysis vs 2.00 days to fix a defect in functional test) make me question the presentation of Table 3-1: why put “1” in all of the “in-phase” entries in that table?

Using these numbers, and a little bit of guesswork about how to map the headings in this figure to Table 3-1, I was able to use this reference to construct a table like Table 3-1. Unfortunately, my “table like Table 3-1” was nothing like Table 3-1. Far from showing an incremental increase in bug cost with latency, the data look like a mountain range. In almost all rows the relative cost of a fix in System Test is greater than in maintenance.

I then looked at “Calculating the Return on Investment from More Effective Requirements Management” by Leffingwell. I have to hope that this 2003 webpage is a reproduction of the article cited by McConnell as a 1997 publication, as I couldn’t find a paper of that date.

This reference contains no primary data, but refers to “classic” studies in the field:

Studies performed at GTE, TRW, and IBM measured and assigned costs to errors occurring at various phases of the project life-cycle. These statistics were confirmed in later studies. Although these studies were run independently, they all reached roughly the same conclusion: If a unit cost of one is assigned to the effort required to detect and repair an error during the coding stage, then the cost to detect and repair an error during the requirements stage is between five to ten times less. Furthermore, the cost to detect and repair an error during the maintenance stage is twenty times more.

These numbers are promisingly similar to McConnell’s: although he talks about the cost to “fix” defects while this talks about “detecting and repairing” errors. Are these the same things? Was the testing cost included in McConnell’s table or not? How was it treated? Is the cost of assigning a tester to a project amortised over all bugs, or did they fill in time sheets explaining how long they spent discovering each issue?

Unfortunately Leffingwell himself is already relying on secondary citation: the reference for “Studies performed at GTE…” is a 520-page book, long out of print, called “Software Requirements—Objects Functions and States”. We’re still some degrees away from actual numbers. Worse, the citation at “confirmed in later studies” is to Understanding and controlling software costs by Boehm and Papaccio, which gets its numbers from the same studies at GTE, TRW and IBM! Leffingwell is bolstering the veracity of one set of numbers by using the same set of numbers.

A further reference in McConnell, “An Economic Release Decision Model”, is to the proceedings of a 1999 conference on Applications of Software Measurement. If these proceedings have actually been published anywhere, I can’t find them: the one URL I discovered was to a “cybersquatter” domain. I was privately sent the PowerPoint slides that comprise this citation. It’s a summary of the then-retired Grady’s experiences with software testing, and again contains no primary data or even verifiable secondary citations. Bossavit describes a separate problem where one of the graphs in this presentation is consistently misattributed and misapplied.

The final reference provided by McConnell is What We Have Learned About Fighting Defects, a non-systematic literature review carried out in an online “e-Workshop” in 2002.

Section 3.1 of the report is “the effort to find and fix”. The 100:1 figure is supported by “general data” which are not presented and not cited. Actual cited figures are 117:1 and 135:1 for “severe” defects from two individual studies, and 2:1 for “non-severe” defects (a small collection of results).

The report concludes:

“A 100:1 increase in effort from early phases to post-delivery was a usable heuristic for severe defects, but for non-severe defects the effort increase was not nearly as large. However, this heuristic is appropriate only for certain development models with a clearly defined release point; research has not yet targeted new paradigms such as extreme programming (XP), which has no meaningful distinction between “early” and “late” development phases.”

A “usable heuristic” is not the verification I was looking for – especially one that’s only useful when practising software development in a way that most of my readers wouldn’t recognise.

Conclusion

If there is real data behind Table 3-1, I couldn’t find it. It was unprofessional of me to incorporate the table into my own book—thereby claiming it to be authoritative—without validating it, and my attempts to do so have come up wanting. I therefore no longer consider Table 1-1 in Test-Driven iOS Development to be representative of defect fixing in software engineering, I apologise for including it, and I resolve to critically appraise material I use in my work in the future.

Posted in books, Responsibility, software-engineering, TDiOSD | Comments Off on An apology to readers of Test-Driven iOS Development

It’s not @jnozzi’s fault!

My last post was about how we don’t use evidence-based techniques in software engineering. If we don’t rely on previous results to guide us, what do we use?

The answer is that the industry is guided by anecdote. Plenty of people give their opinions on whether one thing is better than another and why, and we read those opinions, combining them with our experiences into a world-view of our own.

Not every opinion has equal weight. Often we’ll identify some people as experts (or “rock star thought leaders”, if they’re Marcus) and consider their opinions as more valuable than the average.

So how do these people come to be experts? Usually it’s through the tautological sense: we’ve come to value what they say because they’ve repeatedly said things that we value. Whatever opinion you hold of the publishing industry, writing a book is a great way to get your thoughts out to a wide subset of the community, and to become recognised as an expert on the book’s content (there’s a very good reason why we spell the word AUTHORity).

I noticed that in my own career. I was “the Mac security guy” for a long time, a reputation gained through the not-very-simple act of writing one book published in 2010.

In just the same way, Joshua is the Xcode guy. His book, Mastering Xcode 4, is a comprehensive guide to using Xcode, based on Joshua’s experiences and opinions. As an author on Xcode, he becomes known as the Xcode guy.

And here’s where things get confusing. See, being an authority and having authority are not the same thing. Someone who told you how a thing works is not necessarily best placed to change how that thing works, or even justify why the thing works that way. And yet some people do not make this distinction. Hence Joshua being on the receiving end of complaints about Xcode not working the way some people would like it to work.

These people are frustrated because “the expert” is saying “I’m not going to tell you how to do that”. And while that’s true, we see that truth is a nuanced thing with many subtleties. In this case, you’re not being blown off; it can’t be done.

So yeah, the difference between being an authority and having authority. If you want to tell someone your opinion about Xcode not working, talk to someone with the authority to change how Xcode works. Someone like the product manager for Xcode. If you want to find out how Xcode does work, talk to someone who is an authority on how Xcode works. Someone like Joshua.

Posted in books | Comments Off on It’s not @jnozzi’s fault!

Does that thing you like doing actually work?

Genuine question. I’ve written before about Test-Driven Development, and I’m sure some of you practice it: can you show evidence that it’s better than (or, for that matter, evidence that it’s worse than) some other practice? Statistically significant evidence?

How about security? Can you be confident that there’s a benefit to spending any money or time on information security countermeasures? On what should it be spent? Which interventions are most successful? Can you prove that?

I am, of course, asking whether there’s any evidence in software engineering. I ask rhetorically, because I believe that there isn’t—or there isn’t a lot that’s in a form useful to practitioners. A succinct summary of this position comes courtesy of Anthony Finkelstein:

For the most part our existing state-of-practice is based on anecdote. It is, at its very best quasi-evidence-based. Few key decisions from the choice of an architecture to the configuration of tools and processes are based on a solid evidential foundation. To be truthful, software engineering is not taught by reference to evidence either. This is unacceptable in a discipline that aspires to engineering science. We must reconstruct software engineering around an evidence-based practice.

Now there is a discipline of Evidence-Based Software Engineering, but herein lies a bootstrapping problem that deserves examination. Evidence-Based [ignore the obvious jokes, it’s a piece of specific jargon that I’m about to explain] practice means summarising the significant results in scientific literature and making them available to practitioners, policymakers and other “users”. The primary tools are the systematic literature review and its statistics-heavy cousin, the meta-analysis.

Wait, systematic literature review? What literature? Here’s the problem with trying to do EBSE in 2012. Much software engineering goes on behind closed doors in what employers treat as proprietary or trade-secret processes. Imagine that a particular project is delayed: most companies won’t publish that result because they don’t want competitors to know that their projects are delayed.

Even for studies, reports and papers that do exist, they’re not necessarily accessible to the likes of us common programmers. Let’s imagine that I got bored and decided to do a systematic literature survey of whether functional programming truly does lead to fewer concurrency issues than object-oriented programming.[*] I’d be able to look at articles in the ACM Digital Library, on the ArXiv pre-print server, and anything that’s in Leamington Spa library (believe me, it isn’t much). I can’t read IEEE publications, the BCS Computer Journal, or many others because I can’t afford to subscribe to them all. And there are probably tons of journals I don’t even know about.

[*]Results of asking about this evidence-based approach to paradigm selection revealed that either I didn’t explain myself very well or people don’t like the idea of evidence mucking up their current anecdotal world views.

So what do we do about this state of affairs? Actually, to be more specific: if our goal is to provide developers with better access to evidence from our field, what do we do?

I don’t think traditional journals can be the answer. If they’re pay-to-read, developers will never see them. If they’re pay-to-write, the people who currently aren’t supplying any useful evidence still won’t.

So we need something lighter weight, free to contribute to and free to consume; and we probably need to accept that it then won’t be subject to formal peer review (in exactly the same way that Wikipedia isn’t).

I’ve argued before that a great place for this work to be done is the Free Software Foundation. They’ve got the components in place: a desire to prove that their software is preferable to commercial alternatives; public development projects with some amount of central governance; volunteer coders willing to gain karma by trying out new things. They (or if not them, Canonical or someone else) could easily become the home of demonstrable quality in software production.

Could the proprietary software developers be convinced to open up on information about what practices do or don’t work for them? I believe so, but it wouldn’t be easy. Iteratively improving practices is a goal for both small companies following Lean Startup and similar techniques, and large enterprises interested in process maturity models like CMMI. Both of these require you to know what metrics are important; to measure, test, improve and iterate on those metrics. This can be done much more quickly if you can combine your results from those of other teams—see what already has or hasn’t worked elsewhere and learn from that.

So that means that everyone will benefit if everyone else is publishing their evidence. But how do you bootstrap that? Who will be first to jump from a culture of silence to a culture of sharing, the people who give others the benefit of their experience before they get anything in return?

I believe that this is the role of the platform companies. These are the companies whose value lies not only in their own software, but in the software created by ISVs on their platforms. If they can help their ISVs to make better software more efficiently, they improve their own position in the market.

Posted in advancement of the self, documentation, software-engineering | Comments Off on Does that thing you like doing actually work?

I made a web!

That is, I made a C program using the literate programming tool, CWEB. The product it outputs is, almost by definition, self-documenting, so find out about the algorithm and how I built it by reading the PDF. This post is about the process.

Unsurprisingly I found it much more mentally taxing to understand a prose description of a complex algorithm and work out how I might convert that into C than to write the C itself. Given that, and acknowledging that this little project was a very artificial example, it was very helpful to be able to write long-form comments alongside the code.

That’s not to say that I don’t normally comment my code; I often do when I’m trying something I don’t think I understand. But often I’ll write out a prose description of what I’m trying to do in a notebook, or produce incredibly terse C comments. The literate programming environment encouraged me to marry these two ideas and create long prose that’s worth reading, but attach it to the code I’m writing.

I additionally found it useful to be able to break up code into segments by idea rather than by function/class/method. If I think “oh, I’ll need one of these” I can just start a new section, and then reference it in the place it’ll get used. It inverts my usual process, which is to write out the code I think I’ll need to do a task and then go back and pick out isolated sections with refactoring tools.
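
To give a flavour of what that looks like, here’s a minimal sketch of a web – not taken from my actual program – with one named section that’s referenced before it’s defined:

@ Computing the checksum is a separate idea, so it gets a named section,
which I simply reference from |main|.

@c
#include <stdio.h>

int data[] = {1, 2, 3, 4};
int length = 4;

int main(void)
{
  int i, checksum = 0;
  @<Accumulate the checksum@>;
  printf("checksum: %d\n", checksum);
  return 0;
}

@ The named section can be defined later in the web than the place where it
is used; |ctangle| stitches the pieces together when it generates the C.

@<Accumulate the checksum@>=
for (i = 0; i < length; i++)
  checksum += data[i];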

As a developer’s tool, it’s pretty neat too, though not perfect. The ctangle tool that generates the C source code inserts both comments referring to the section of the web that the code comes from, and (more usefully) preprocessor #line directives. If you debug the executable (which I needed to…) you’ll get told where in the human-readable source the PC is sitting, not where in the generated C.

The source file, a “web” that contains alternating TeX and C code, is eminently readable (if you know both TeX and C, obviously) and plays well with version control. Because this example was a simple project, I defined everything in one file but CWEB can handle multiple-file projects.

The main issue is that it’d be much better to have an IDE that’s capable of working with web files directly. A split-pane preview of the formatted documentation would be nice, and there are some good TeX word processors out there that would be a good starting point. Code completion, error detection and syntax highlighting in both the C and TeX parts would be really useful. Refactoring support would be…a challenge, but probably doable.

So my efforts with CWEB haven’t exactly put me off, but do make me think that even three decades after being created it’s not in a state to be a day-to-day developer environment. Now if only I knew someone with enough knowledge of the Clang API to think about making a C or ObjC IDE…

Posted in code-level, documentation, tool-support | Comments Off on I made a web!

A brief history of talking on the interwebs (or: why I’m not on app.net)

When I first went to university, I was part of an Actual September, though it took place in October. Going from a dial-up internet service shared with the telephone line to the latest iteration of SuperJANET with its multi-megabit connection to my computer opened many new possibilities for me and my peers.

One of these possibilities was Usenet, which we accessed via news.ox.ac.uk. Being new to this online society, my fellow neophytes and I made all of the social faux pas that our forebears had made this time last year, and indeed in prior years. We top-posted. We cross-posted. We fed the trolls. Some of us even used Outlook Express. Over time, those of us who were willing to make concessions to the rules became the denizens, and it was our job the next September to flame the latest crop of newbies.

The above description is vastly oversimplified, of course. By the time of my Actual September, Usenet was feeling the effects of the Neverending September. Various commercial ISPs – most notoriously America Online – had started carrying Usenet and their customers were posting. Now there was, all year round, an influx of people who didn’t know about the existing society and rules but were, nonetheless, posting to Usenet.

Between AOL and – much later – Google Groups incorporating Usenet into its content, the people who felt themselves the guardians and definition of all that Usenet stood for found that they were the minority of users. Three main ways of dealing with this arose. Some people just gave up and left for other services. Others joined in with the new way of using Usenet. Still others worked in the old way despite the rise of the new way, wielding their ability to plonk newbies into their kill file as a badge of honour.

By now I probably don’t need to ask the rhetorical question: what has all of this to do with twitter? Clearly it has everything to do with twitter. The details differ but the analogy is near watertight. In each instance, we find a community of early adopters for a service that finds a comfortable way to use that service. In each we find that as the community grows, latecomers use the service in different ways, unanticipated or frowned upon by the early adopters. In each case the newcomers outnumber the early adopters by orders of magnitude and successfully, whether by sheer scale or through the will of the owners of the service, redefine the culture of the service. Early adopters complain that the new majority don’t “get” the culture.

Moving to app.net does nothing except reset that early-adopter clock. Any postmodernist philosopher will tell you that: probably while painting your living room lilac and dragging a goldfish bowl on a leash. If app.net takes off then the population of users will be orders of magnitude greater than the number of “backers”. The people who arrive later will have their own ideas of how to use the service; and together they will have contributed orders of magnitude more cash to the founders than the initial tranche of “backers”. I wonder who the management will listen to.

Any publicly-accessible communication platform will go through this growth and change. When I joined Facebook it was only open to university members and was a very different beast than modern Facebook. I would not be surprised to read similar complaints made about citizens’ band radio or Morse telegraphy.

The people who move on don’t necessarily want a changed experience. It seems to me they want a selective experience, and moving into the wilderness allows them an approximation of that. In the short term, anyway. Soon the undesirables will move in next door and they’ll choose to move on again.

I suggest that what’s required is actually something more like Usenet. I run my own status.net server, initially to archive my tweet stream but it turns out I’m not using it for that. If I chose I could open that server up to selected people, just as news.ox.ac.uk was only open to members of one university. I could curate a list of servers that mine peers with. If there are some interesting people at status.cocoadev.com, I could peer with that server. If status.beliebers.net isn’t to my taste, I don’t peer with it. But that’s fine, their users don’t see what I write in return for me not seeing what they write. In fact Usenet could’ve benefitted from more selective peering, and a lot of the paid-for access now has, easily-detectable spam aside, a higher signal to noise ratio than the service had a decade ago.

Another service that has some of the aspects of the curated experience is Glassboard. Theirs is entirely private, losing some of the discoverability of a public tweet stream. In return all conversations are (to some extent) invitation only and moderated. If you don’t like someone’s contributions, the board owner can kick ban them.

So the problem long-term tweeters have with twitter is not a new problem. Moving wholesale to something that does the same thing means deferring, not solving, the problem.


I thought I’d update this post (nearly six months later) on the day that I joined app.net. It’s changed quite a lot—both by adding a cloud storage API and by going freemium—in the intervening time. I remain skeptical that the problem with a social network is the tool, and I also wonder how the people who joined to get away from people using Twitter really badly will react to the free tier allowing the unwashed masses like me to come and use app.net really badly. Still, there’s a difference between skeptical and closed-minded, so here I am.

Posted in Twitter, user-error | 2 Comments

What’s a software architect?

After a discussion on the twitters with Kellabyte and Iris Classon about software architects, I thought I’d summarise my position. Feel welcome to disagree.

What does a software architect do?

A software architect is there to identify risks that affect the technical implementation of the software product, and address those risks. Preferably before they stop or impede the development of the product.

That could mean doing tests to investigate the feasibility or attributes of a proposed solution. It could mean evangelising the developers to the clients or managers to avoid those people interrupting the development work. It could mean giving a junior developer a tutorial on a certain technology, or getting that developer to tutor the rest of the team on the thing that person is an expert on.

What a software architect doesn’t do

A software architect doesn’t micromanage the developers who work with them. The architect doesn’t rule by memos and UML diagrams. The architect doesn’t prognosticate on things they have no experience of. Perhaps confusingly, the role of software architect bears very little resemblance to the profession after which it’s named. If you want analogies with civil engineering, all developers are like architects. If you want to see the software analogue to the builder, that’s work done by the compiler and IDE.

Architects don’t make decisions where none is necessary. They don’t ignore or belittle suggestions that come from people who aren’t architects.

In one sentence

A software architect is there to make it easier for developers to develop.

Posted in Responsibility, software-engineering | 2 Comments

On free apps

This post is sort of a follow-on to @daveaddey’s post on the average app, although in reality it’s a follow-on to the response that comes out every time a post on app store revenue is written.

Events go like this:

  1. Some statistic about app store revenue.
  2. “Your numbers include free apps. You shouldn’t include free apps.”

Yes you should. The revenue that comes from the app store is indeed shared across all apps, free and paid. Free apps contribute significantly to the long-tail price distribution of apps on the store, and the consumer perception that apps shouldn’t cost much. Some of them generate revenue through in-app purchase: revenue that probably is counted in Apple’s “we’ve given $5bn to developers” number. Some of them will generate revenue through iAd: it’s not clear whether that’s included in the $5bn but it’s certainly money paid by Apple to developers. Similarly, it’s not clear whether money paid through Newsstand subscriptions (again, in free apps) counts: it probably does. Some of them have not always and will not always be free; again these apps make money directly from Apple.

A lot of free apps come from companies that do other things, but feel a marketing need to be on the app store: Amazon, O2, facebook and others go down this route. In these cases there is absolutely no money to be made from the app directly, though there are possibly many sorts of collateral benefits. It costs Amazon money to write the Windowshop app, but they bank on users buying more things from them than if the app didn’t exist.

On the other hand, some free apps are just written by developers who want to put a free app on the store so that it can act as their portfolio when they try to get work as iOS app developers.

So yes, other business models exist. But when talking about money made from the app store, you have to include all of those products that don’t make money on the app store. Ignoring the odd outlier is fine statistics. Ignoring large quantities of data that make your conclusions look bad is not science; it’s witchcraft.

Posted in AAPL, Business | Comments Off on On free apps

Inheritance is old and busted

Back when I started reading about Object-Oriented Programming (which was when Java was new, I was using Delphi and maybe the ArcGIS scripting language, which also had OO features) the entire hotness was inheritance. Class hierarchies as complicated as biological taxonomies were diagrammed, implemented and justified. Authors explained that while both a Dog and a Cat could makeNoise(), they’d do it in different ways. They also got their knickers in a knot over whether a Square is-a Rectangle or the other way around. If you believe in successive specialisation then the Square is-a (special kind of) Rectangle. If you believe in the Liskov Substitution Principle then it cannot be. If you don’t care then you’d only define a Rectangle in the first place.
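
To make the substitution argument concrete, here’s the textbook version of the problem sketched in Objective-C – the class names and mutable-size design are the usual strawmen, not anything from a real API:

@interface Rectangle : NSObject

@property (nonatomic) double width;
@property (nonatomic) double height;

- (double)area;

@end

@implementation Rectangle

- (double)area { return self.width * self.height; }

@end

// A Square is-a Rectangle whose sides must stay equal...
@interface Square : Rectangle
@end

@implementation Square

- (void)setWidth:(double)width
{
  [super setWidth: width];
  [super setHeight: width];
}

- (void)setHeight:(double)height
{
  [super setHeight: height];
  [super setWidth: height];
}

@end

// ...which breaks clients that rely on Rectangle's contract: set the width
// to 4 and the height to 3 and a Rectangle reports an area of 12, but a
// Square substituted in its place reports 9.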

Slightly earlier than this time it was noticed that you sometimes want to separate the idea of “this object does these things” from “this object is something of this type”. C++ programmers were inheriting from purely virtual classes. Objective-C gave us protocols to conform to; Java borrowed this idea and called the protocols interfaces.

Both inheritance and conformance (which is what I’ll call protocol adoption, to avoid the confusion between Objective-C and Java over what it means to “implement” an “interface”) are blunt and unwieldy instruments, however. In practice, I use them in a few specific places:

  • I most often inherit from a “base class”. My Python object is-an object, my Java object is-an (java.lang.)object, my ObjC object is-a NSObject.
  • I sometimes inherit from deeper classes because something requires me to build a specialised version of that class. For example, UIKit requires that my view controllers are specialised versions of UIViewController, and that my views are specialised versions of UIView.
  • I usually adopt a protocol or abstract interface in order to implement some design pattern. For example I might adopt a data source or delegate protocol. I might provide an implementation of a strategy or state or visitor.
  • I pretty much avoid even with bargepoles the “marker interfaces” that pervade pre-annotation Java, like Cloneable. I even hate NSCopying as a way of designing a protocol.

It’s time we added specialised syntax on top of inheritance to account for these cases particularly. For a start, any language that doesn’t assume you want its usual version of Object-ness when you create a class without saying anything about it is broken. Yes, I’m looking at you Objective-C. Yes, I know about NSProxy and I’m old enough to remember Object. I still say that not wanting an NSObject is rare enough that you should have to shout about it. I’ve never intentionally created a new Objective-C root class – but that’s the easiest thing the syntax permits.

Think of it like pointers. Yes, you can use pointers in ObjC and C++. Yes, arrays are built using pointers—as are object references. No, you don’t need to use pointer syntax to build arrays and object references – we’ve got higher-level syntax. Just like we’ve got higher-level syntax for getting at properties or at subscripted entries in Objective-C.

So yes, you can create a delegate using protocols. But why not have syntax saying “this class’s delegates must do this” and “this is a delegate of that class”? Why not say “this is a special version of that thing”? Why not say “this group of classes implements a state machine”?
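
For comparison, here’s all the ceremony that the first of those currently requires with protocols – declarations only, with invented names, which is exactly what a dedicated delegate syntax would replace:

// "This class's delegates must do this":
@protocol WidgetDelegate <NSObject>

- (void)widgetDidUpdate: (id)widget;

@end

// The class that has delegates:
@interface Widget : NSObject

@property (nonatomic, weak) id <WidgetDelegate> delegate;

@end

// "This is a delegate of that class":
@interface WidgetController : NSObject <WidgetDelegate>

@end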

Inheritance is old and busted.

Posted in code-level, OOP, software-engineering | Comments Off on Inheritance is old and busted