At the old/new interface: jQuery in WebObjects

It turns out to be really easy to incorporate jQuery into an Objective-C WebObjects app (targeting GNUstep Web). In fact, it doesn’t really touch the Objective-C source at all. I defined a WOJavaScript object that loads jQuery itself from the application’s web server resources folder, so it can be reused across multiple components:

jquery_script:WOJavaScript {scriptFile="jquery-2.0.2.js"}

Then in the components where it’s used, any field that needs uniquely identifying should have a CSS identifier, which can be bound via WebObjects’ id binding. In this example, a text field for entering an email address in a form will only be enabled if the user has checked a “please contact me” checkbox.

email_field:WOTextField {value=email; id="emailField"}
contact_boolean:WOCheckBox {checked=shouldContact; id="shouldContact"}

The script itself can reside in the component’s HTML template, or in a WOJavaScript that looks in the app’s resources folder or returns JavaScript that’s been prepared by the Objective-C code.

    <script>
    // Enable the email field only while the "please contact me"
    // checkbox is ticked, clearing any stale address when it isn't.
    function toggleEmail() {
        var emailField = $("#emailField");
        var isChecked = $("#shouldContact").prop("checked");
        emailField.prop("disabled", !isChecked);
        if (!isChecked) {
            emailField.val("");
        }
    }
    // Set the initial state on page load, then track the checkbox.
    $(document).ready(function() {
        toggleEmail();
        $("#shouldContact").click(toggleEmail);
    });
    </script>
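
For the variant where the Objective-C code prepares the script, here’s a minimal sketch, assuming GNUstep Web mirrors the scriptString binding of WebObjects’ WOJavaScript (the -contactScript accessor is an invented name):

contact_script:WOJavaScript {scriptString=contactScript}

- (NSString *)contactScript
{
    // Build the JavaScript at request time, so Objective-C state can
    // be interpolated into it before the page is served.
    return @"$(document).ready(function() { toggleEmail(); });";
}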

I’m a complete newbie at jQuery, but even so that was easier than expected. I suppose the lesson to learn is that old technology isn’t necessarily incapable technology. People like replacing their web backend frameworks every year or so; whether there’s a reason (beyond caprice) warrants investigation.

The code you wrote six months ago

We have this trope in programming that you should hate the code you wrote six months ago. This is a figurative way of saying that you should be constantly learning and assimilating new ideas, so that you can look at what you were doing earlier this year and have new ways of doing it.

It would be more accurate, though less visceral, to say “you should be proud that the code you wrote six months ago was the best you could do with the knowledge you then had, and you should be able to see ways to improve upon it with the learning you’ve accomplished since then”. If you actually hate the code, well, that suggests you think anyone who doesn’t have the knowledge you have now is an idiot. That kind of mentality is actually deleterious to learning, because you’re not going to listen to anyone for whom you have Set the Bozo Bit, including your younger self.

I wrote a lot about learning and teaching in APPropriate Behaviour, and thinking about that motivates me to scale this question up a bit. Never mind my code, how can we ensure that any programmer working today can look at the code I was writing six months ago and identify points for improvement? How can we ensure that I can look at the code any other programmer was working on six months ago, and identify points for improvement?

My suggestion is that programmers should know (or, given the existence of the internet, know how to use the index of) the problems that have already come before, how we solved them, and why particular solutions were taken. Reflecting back on my own career I find a lot of problems I introduced by not knowing things that had already been solved: it wasn’t until about 2008 that I really understood automated testing, a topic that was already being discussed back in 1968. Object-oriented analysis didn’t really click for me until later, even though Alan Kay and a lot of other really clever people had been working on it for decades. We’ll leave discussion of parallel programming aside for the moment.

So perhaps I’m talking about building, disseminating and updating a shared body of knowledge. The building part has already been done, but I’m not sure I’ve ever met anyone who’s read the whole SWEBOK or referred to any part of it in their own writing or presentations, so we’ll call the dissemination part a failure.

Actually, as I said, we only really need an index, not the whole BOK itself: these do exist for various parts of the programming endeavour. Well, maybe not indices so much as catalogues: summaries of the state of the art, occasionally with helpful references back to the primary material. Some of them are even considered “standards”, in that they are the go-to places for the information they catalogue:

  • If you want an algorithm, you probably want The Art of Computer Programming or Numerical Recipes. Difficulty: you probably won’t understand what’s written in there (the latter book in particular assumes a bunch of degree-level maths).
  • If you want idioms for your language, look for a catalogue called “Effective <name of your language>”. Difficulty: some people will disagree with the content here just to be contrary.
  • If you want a pattern, well! Have we got a catalogue for you! In fact, have we got more catalogues than distinct patterns! There’s the Gang of Four book, the PLoP series, and more. If you want a catalogue that looks like it’s about patterns but is actually composed of random internet commentators trying to prove they know more than Alistair Cockburn, you could try out the Portland Pattern Repository. Difficulty: you probably won’t know what you’re looking for until you’ve already read it—and a load of other stuff.

I’ve already discussed how conference talks are a double-edged sword when it comes to knowledge sharing: they reach a small fraction of the practitioners, take information from an even smaller fraction, and typically set up a subculture with its own values distinct from programming at large. The same goes for company-internal knowledge sharing programs. I know a few companies that run such programs (we do where I work, and Etsy publish the talks from theirs). They’re great for promoting research, learning and sharing within the company, but you’re always aware that you’re not necessarily discovering things from without.

So I consider this one of the great unsolved problems in programming at the moment. In fact, let me express it as two distinct questions:

  1. How do I make sure that I am not reinventing wheels, solving problems that no longer need solving or making mistakes that have already been fixed?
  2. A new (and, for the sake of this discussion, inexperienced) programmer joins my team. How do I help this person understand the problems that have already been solved, the mistakes that have already been made, and the wheels that have already been invented?

Solve this, and there are only two things left to do: fix concurrency, name things, and improve bounds checking.

Lighter UIViewControllers

The first issue of Objective-C periodical objc.io has just been announced:

Issue #1 is about lighter view controllers. The introduction tells you a bit more about this issue and us. First, Chris writes about lighter view controllers. Florian expands on this topic with clean table view code. Then Daniel explores view controller testing. Finally, in our first guest article, Ricki explains view controller containment.

Each of the articles is of a high quality; I thoroughly recommend that you read it. I powered through it this morning and have already put myself on the mailing list for the next issue. I think it’s worth quite a few dollars more than the $0 they’re asking.

On a more self-focussed note, I’m pleased to note that two of the articles cite my work. This isn’t out of some arrogant pleasure at seeing my name on someone else’s website. One of my personal goals has been to teach the people who teach the other people: to ensure that what I learn becomes a part of the memeplex and isn’t forgotten and reinvented a few years later. In APPropriate Behaviour a few of the chapters discuss the learning and teaching of software making. I consider it one of the great responsibilities of our discipline, to ensure the mistakes we had to make are not made by those who come after us.

The objc.io team have done a great job of researching their articles, taking the knowledge that has been found and stories that have been told and synthesising new knowledge and new stories from them. I’m both proud and humble about my small, indirect role in this effort.

APPropriate Behaviour is complete!

APPropriate Behaviour, the book on things programmers do that aren’t programming, is now complete! The final chapter – a philosophy of software making – has been added, concluding the book.

Just because it’s complete, doesn’t mean it’s finished: as my understanding of what we do develops I’ll probably want to correct things, or add new anecdotes or ideas. Readers of the book automatically get free updates whenever I create them in the future, so I hope that this is a book that grows with us.

As ever, the introduction to the book has instructions on joining the book’s Glassboard to discuss the content or omissions from the content. I look forward to reading what you have to say about the book in the Glassboard.

While the recommended purchase price of APPropriate Behaviour is $20, the minimum price now that it’s complete is just $10. Looking at the prices paid by the 107 readers who bought it while it was still being written, $10 is below the median price (so most people chose to pay more than $10) and the modal price (so the most common price chosen by readers was higher than $10).

A little about writing the book: I had created the outline of the book last summer, while thinking about the things I believed should’ve been mentioned in Code Complete but were missing. I finally decided that it actually deserved to be written toward the end of the year, and used National Novel Writing Month as an excuse to start on the draft. A sizeable portion of the draft typescript was created in that month; enough to upload to LeanPub and start getting feedback on from early readers. I really appreciate the help and input that those early readers, along with other people I’ve talked to about the material, have given both in preparing APPropriate Behaviour and in understanding my career and our industry.

Over the next few months, I tidied up that first draft, added new chapters, and extended the existing material. The end result – the 11th release including that first draft – is 141 pages of reflection over the decade in which I’ve been paid to make software: not a long time, but still nearly 15% of the sector’s total lifespan. I invite you to grab a copy from LeanPub and share in my reflections on that decade, and consider what should happen in the next.

Specifications for interchanging objects

One of the interesting aspects of Smalltalk and similar languages, including Objective-C and Ruby, is that while the object model exposes a hierarchy of classes, consumers of objects in these environments are free to ignore the position of an object in that hierarchy. The hierarchy can be thought of as a convenience: on the one hand for people building objects (“this object does all the same stuff as instances of its parent class, and then some”), and on the other for people consuming objects (“you can treat this object like it’s one of these types further up the hierarchy”).

So you might think that -isKindOfClass: represents a test for “I can use this object like I would use one of these objects”. There are two problems with this test and, as with any boolean test, they are the false positives and the false negatives.

A false positive is when an object passes the test, but actually can’t be treated as an instance of the parent type. In a lot of recent object-oriented code this is a rare problem. The idea of the Liskov Substitution Principle, if not its precise intent as originally stated, has become entrenched in the Object-Oriented groupthink.

I’ve worked with code from the 1980s, though, where these false positives exist: an obvious example is “closing off” particular selectors. A parent class defines some interface, then subclasses inherit from that class, overriding selectors to call [self doesNotRecognize: _cmd] on features of the parent that aren’t relevant in the subclass. This is still possible today, though done infrequently.
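
A sketch of the pattern (the tree classes here are invented for illustration; -doesNotRecognize: is the real NSObject method):

// The parent class declares -addNode:, which makes no sense for an
// immutable subclass, so the subclass "closes it off": sending the
// message now raises just as an unimplemented selector would.
@implementation ImmutableTree // subclass of the hypothetical Tree

- (void)addNode: (Node *)aNode
{
    [self doesNotRecognize: _cmd];
}

@end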

False negatives occur when an object fails the -isKindOfClass: test but actually could be used in the way your software intends. In Objective-C (though neither in Smalltalk[*] nor Ruby), nil does satisfy client code’s needs in a lot of cases but never passes the hierarchy test. Similarly, you could easily arrange for an object to respond to all the same selectors as another object, and to have the same dynamic behaviour, but to be in an unrelated position in the hierarchy. You can use an OFArray like you can use an NSArray, but it isn’t a kind of NSArray.

[*] There is an implementation of an Objective-C style Null object for Squeak.

Obviously if the test is broken, we should change the test. False negatives can be addressed by testing for protocols (again, in the languages I’ve listed, this only applies to Objective-C and MacRuby). Protocols are unfortunately named in this instance: they basically say “this object responds to any selector in this list”. We could then say that rather than testing for an object being a kind of UIView, we need an object that conforms to the UIDrawing protocol. This protocol doesn’t exist, but we could say that.
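
As a sketch, here’s the difference between the two tests (UIDrawing being, as said, a protocol that doesn’t exist):

// Hierarchy test: false negatives for compatible objects that sit
// elsewhere in the hierarchy.
if ([candidate isKindOfClass: [UIView class]]) { /* use it */ }

// Protocol test: passes any object that declares it responds to the
// selectors in the (hypothetical) UIDrawing list, wherever it sits.
if ([candidate conformsToProtocol: @protocol(UIDrawing)]) { /* use it */ }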

Problems exist here. An object that responds to all of the selectors doesn’t necessarily conform to the protocol, so we still have false negatives. The developer of the class might have forgotten to declare the protocol (though not in MacRuby, where protocol tests are evaluated dynamically), or the object could forward unknown selectors to another object which does conform to the protocol.

There’s still a false positive issue too: ironically protocol conformance only tells us what selectors exist, not the protocol in which they should be used. Learning an interface from a protocol is like learning a language from a dictionary, in that you’ve been told what words exist but not what order they should be used in or which ones it’s polite to use in what circumstances.

Consider the table view data source. Its job is to tell the table view how many sections there are, how many rows there are in each section, and what cell to display for each row. An object that conforms to the data source protocol does not necessarily do that. An object that tells the table there are three sections but crashes if you ask how many rows are in any section beyond the first conforms to the protocol, but doesn’t have the correct dynamic behaviour.
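
As a sketch of one behavioural check such a specification might run, this probes a data source the way a real table view would (the function is invented; the data source selectors are the real ones):

static BOOL SectionCountsAreConsistent(id <UITableViewDataSource> source,
                                       UITableView *tableView)
{
    // -numberOfSectionsInTableView: is optional; one section is the
    // documented default when it isn't implemented.
    NSInteger sections = 1;
    if ([source respondsToSelector: @selector(numberOfSectionsInTableView:)]) {
        sections = [source numberOfSectionsInTableView: tableView];
    }
    // Probe every section, not just the first: the broken data source
    // described above fails here, in the test, rather than in production.
    for (NSInteger section = 0; section < sections; section++) {
        if ([source tableView: tableView numberOfRowsInSection: section] < 0) {
            return NO;
        }
    }
    return YES;
}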

We have tools for verifying the dynamic behaviour of objects. In his 1996 book Superdistribution: Objects as Property on the Electronic Frontier, Brad Cox describes a black box test of an object’s dynamic behaviour, in which test code messages the object then asserts that the object responds in expected ways. This form of test was first implemented in a standard fashion, to my knowledge, in 1998 by Kent Beck as a unit test.

Unit tests are now also a standard part of the developer groupthink, including tests as specification under the name Test-Driven Development. But we still use them in a craft way, as bespoke specifications for our one-of-a-kind classes. What we should really do is make more use of these tests: substituting dynamic specification tests for our static, error-prone type tests.

A table view does not need something that responds to the data source selectors, it needs something that behaves like a data source. So let’s create some tests that any data source should satisfy, and bundle them up as a specification that can be tested at runtime. Notice that these aren’t quite unit tests in that we’re not testing our data source, we’re testing any data source. We could define some new API to test for satisfactory behaviour:

- (void)setDataSource: (id <UITableViewDataSource>)dataSource {
  NSAssert([Demonstrate that: dataSource satisfies: [Specification for: @protocol(UITableViewDataSource)]],
           @"data source does not satisfy its specification");
  _dataSource = dataSource;
  [self reloadData];
}

But perhaps with new language and framework support, it could look like this:

- (void)setDataSource: (id @<UITableViewDataSource>)dataSource {
  NSAssert([dataSource satisfiesSpecification: @specification(UITableViewDataSource)],
           @"data source does not satisfy its specification");
  _dataSource = dataSource;
  [self reloadData];
}

You could imagine that in languages that support design-by-contract, such as Eiffel, the specification of a collaborator could be part of the contract of a class.

In each case, the expression inside the assertion handler would find and run the test specification appropriate for the collaborating object. Yes, this is slower than doing the error-prone type hierarchy or conformance tests. No, that’s not a problem: we want to make it right before making it fast.

Treating test fixtures as specifications for collaboration between objects, rather than (or in addition to) one-off tests for one-off classes, opens up new routes for collaboration between the developers of the objects. Framework vendors can supply specifications as enhanced documentation. Framework consumers can supply specifications of how they’re using the frameworks as bug reports or support questions: vendors can add those specifications to a regression testing arsenal. Application authors can create specifications to send to contractors or vendors as acceptance tests. Vendors could demonstrate that their code is “a drop-in replacement” for some other code by demonstrating that both pass the same specification.

But finally it frees object-oriented software from the tyranny of the hierarchy. The promise of duck typing has always been tempered by the dangers, because we haven’t been able to show that our duck typed objects actually can quack like ducks until it’s too late.

On designing collections

Introduction

This post explores the pros and the cons of following the design rule “Objects responsible for collections of other objects should expose an interface to the collection, not the collection itself”. Examples and other technical discussion are in Objective-C, assuming Foundation classes and idioms.

Example Problem

Imagine you were building Core Data today. You get to NSEntityDescription, which encapsulates the information about an entity in the Managed Object Model including its attributes and relations, collectively “properties”. You have two options:

  1. Let client code have access to the actual collection of properties.
  2. Make client code work with the properties via API on NSEntityDescription or abstract collection interfaces.

In reality, NSEntityDescription does both, but not with the same level of control. As external clients of the class we can’t see whether the collection objects are internal state of the entity description or are themselves derived from its real state, although this LGPL implementation from GSCoreData does return its real instance variable in one case. However, this implementation is old enough not to show another aspect of Apple’s class: their entity description class itself conforms to NSFastEnumeration.

How to give access to the actual collection

This is the easiest case.

@implementation NSEntityDescription
{
    NSArray *_properties;
}

- (NSArray *)properties
{
    return _properties;
}

@end

A common (but trivial) extension to this is to return an autoreleased copy of the instance variable, particularly in the case where the instance variable itself is mutable but clients should be given access to a read-only snapshot of the state. A less-common technique is to build a custom subclass of NSArray using an algorithm specific to this application.
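
That copy-returning variant might look like this, assuming manual reference counting (as “autoreleased” implies) and a mutable backing ivar:

- (NSArray *)properties
{
    // Clients get an immutable snapshot; subsequent internal mutation
    // of _properties doesn't show through.
    return [[_properties copy] autorelease];
}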

How to provide abstract collection access

There are numerous ways to do this, so let’s take a look at a few of them.

Enumerator

It doesn’t matter how a collection of things is implemented: if you can supply each in turn when asked, you can give out an enumerator. This is how NSFileManager lets you walk through the filesystem. Until a few years ago, it was how NSTableView let you see each of the selected columns. It’s how PSFeed works, too.

The good news is, this is usually really easy. If you’re already using a Foundation collection class, it can already give you an appropriate NSEnumerator.

- (NSEnumerator *)propertyEnumerator
{
    return [_properties objectEnumerator];
}

You could also provide your own NSEnumerator subclass to work through the collection in a different way (for example if the collection doesn’t actually exist but can be derived lazily).
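
A sketch of such a subclass, leaning on the -countOfProperties and -objectInPropertiesAtIndex: accessors shown in the Key-Value Coding section below (only the enumerator class itself is invented, and memory management is elided):

@interface LazyPropertyEnumerator : NSEnumerator
{
    NSEntityDescription *_entity;
    NSUInteger _index;
}
- (id)initWithEntity: (NSEntityDescription *)entity;
@end

@implementation LazyPropertyEnumerator

- (id)initWithEntity: (NSEntityDescription *)entity
{
    if ((self = [super init])) {
        _entity = entity;
    }
    return self;
}

// NSEnumerator subclasses only need -nextObject: each property is
// fetched on demand rather than materialised up front.
- (id)nextObject
{
    if (_index >= [_entity countOfProperties]) return nil;
    return [_entity objectInPropertiesAtIndex: _index++];
}

@end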

Fast enumeration

This is basically the same thing as “Enumerator”, but has different implementation details. In addition, the overloaded for keyword in Objective-C provides a shorthand syntax for looping over collections that conform to the NSFastEnumeration protocol.

Conveniently, NSEnumerator conforms to the protocol so it’s possible to go from “Enumerator” to “Fast enumeration” very easily. All of the Foundation collection classes also implement the protocol, so you could do this:

- (id <NSFastEnumeration>)properties
{
    return _properties;
}
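
Client code can then loop with the shorthand syntax (assuming entity is an NSEntityDescription instance):

for (NSPropertyDescription *property in [entity properties]) {
    // each property in turn, with no knowledge of the backing collection
}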

Another option—the one that NSEntityDescription actually uses—is to implement the -countByEnumeratingWithState:objects:count: method yourself. A simple implementation passes through to a Foundation collection class that already gets the details right. There are a lot of details, but a custom implementation could do the things a custom NSEnumerator subclass could.
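
A minimal pass-through implementation might look like this:

- (NSUInteger)countByEnumeratingWithState: (NSFastEnumerationState *)state
                                  objects: (id __unsafe_unretained [])buffer
                                    count: (NSUInteger)len
{
    // Delegate to the backing collection, which already gets the
    // buffering and mutation-detection details right.
    return [_properties countByEnumeratingWithState: state
                                            objects: buffer
                                              count: len];
}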

Object subscripting

If you’ve got a collection that’s naturally indexed, or naturally keyed, you can let people access that collection without giving them the specific implementation that holds the collected objects. The subscripting methods let you answer questions like “what is the object at index 3?” or “which object is named ‘Bob’?”. As with “Fast enumeration” there is syntax support in the language for conveniently using these features in client code.

- (id)objectAtIndexedSubscript: (NSUInteger)idx
{
    return _properties[idx];
}
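
The keyed counterpart would use -objectForKeyedSubscript: (the _propertiesByName dictionary here is an invented backing store):

- (id)objectForKeyedSubscript: (NSString *)name
{
    // Supports entity[@"name"] lookups without handing out the
    // dictionary that answers them.
    return _propertiesByName[name];
}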

Key-Value Coding

Both ordered and unordered collections can be hidden behind the to-many Key-Value Coding accessors. These methods also give client code the opportunity to use KVC’s mutable proxy collections, treating the real implementation (whatever it is) as if it were an NSMutableArray or NSMutableSet.

- (NSUInteger)countOfProperties
{
    return [_properties count];
}

- (NSPropertyDescription *)objectInPropertiesAtIndex: (NSUInteger)index
{
    return _properties[index];
}

- (NSArray *)propertiesAtIndexes: (NSIndexSet *)indexes
{
    return [_properties objectsAtIndexes: indexes];
}
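
With those accessors in place, a client can ask KVC for a proxy collection; a sketch, assuming entityDescription is an instance of the class above (mutation through the proxy would additionally need the insert and remove accessors):

// Reads through the proxy are forwarded to -countOfProperties and
// -objectInPropertiesAtIndex:, so the real storage never escapes.
NSMutableArray *proxy = [entityDescription mutableArrayValueForKey: @"properties"];
NSLog(@"%lu properties", (unsigned long)[proxy count]);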

Block Application (or Higher-Order Messaging)

You could decide not to give out access to the collection at all, but to allow clients to apply work to the objects in that collection.

- (void)enumeratePropertiesWithBlock: (void (^)(NSPropertyDescription *propertyDescription))workBlock
{
    NSUInteger count = [_properties count];
    // dispatch_apply runs the block once per index (concurrently, if
    // _myQueue is a concurrent queue) and waits for all of them.
    dispatch_apply(count, _myQueue, ^(size_t index) {
        workBlock(_properties[index]);
    });
}

Higher-order messaging is basically the same thing but without the blocks. Clients could supply a selector or an invocation to apply to each object in the collection.
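
The selector-flavoured version might be sketched like this, leaning on NSArray’s existing -makeObjectsPerformSelector: (the wrapper method name is invented):

- (void)makePropertiesPerformSelector: (SEL)aSelector
{
    // Clients name the work to be done; the collection applies it,
    // keeping the storage hidden just as the block version does.
    [_properties makeObjectsPerformSelector: aSelector];
}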

Visitor

Like higher-order messaging, Visitor is a pattern that lets clients apply code to the objects in a collection regardless of the implementation or even logical structure of the collection. It’s particularly useful where client code is likely to need to maintain some sort of state across visits; a compiler front-end might expose a Visitor interface that clients use to construct indices or symbol tables.

- (void)acceptPropertyVisitor: (id <NSEntityDescriptionVisitor>)aVisitor
{
    for (NSPropertyDescription *property in _properties)
    {
        [aVisitor visit: property];
    }
}

Benefits of Hiding the Collection

By hiding the implementation of a collection behind its interface, you’re free to change the implementation whenever you want. This is particularly true of the more abstract approaches like Enumerator, Fast enumeration, and Block Application, where clients only find out that there is a collection, and none of the details of whether it’s indexed or not, sparse or not, precomputed or lazy and so on. If you started with an array but realise an indexed set or even an unindexed set would be better, there’s no problem in making that change. Clients could iterate over objects in the collection before, they can now, nothing has changed—but you’re free to use whatever algorithms make most sense in the application. That’s a specific example of the Principle of Least Knowledge.

With the Key-Value Coding approach, you may be able to answer some questions in your class, such as those posed through the Key-Value Coding collection operators, without actually doing all the work to instantiate the collected objects.

Additionally, there could be reasons to control use of the collected objects. For mutable collections it may be better to allow Block Application than to let clients take an array copy which will become stale. Giving out the mutable collection itself would lead to all sorts of trouble, but that can be avoided just with a copy.

Benefits of Providing the Collection

It’s easier. Foundation already provides classes that can do collections and all of the things you might want to do with them (including enumeration, KVC, and application) and you can just use those implementations without too much work: though notice that you can do that while still following the Fast Enumeration pattern.

It’s also conventional. Plenty of examples exist of classes that give you a collection of things they’re responsible for, like subviews, child view controllers, table view columns and so on (though notice that the table view previously gave an Enumerator of selected columns, and lets you Apply Block to row views). Doing the same thing follows a Principle of Least Surprise.

When integrating your components into an app, there may be expectations that a given collection is used in a context where a Foundation collection class is expected. If your application uses an array controller, you need an array: you could expect the application to provide an adaptor, or you could use Key-Value Coding and supply the proxy array, but it might be easiest to just give the application access to the collection in the first place.

Finally, the Foundation collection classes are abstract, so you still get some flexibility to change the implementation by building custom subclasses.

Conclusion

There isn’t really a “best” way to do collections; there are benefits and limitations to any technique. By understanding the alternatives we can choose the best one for any situation (or be like NSEntityDescription and do a bit of each). In general, though, giving interfaces to get at or apply work to elements of the collection gives you more flexibility and control over how the collection is maintained, at the cost of being more work.

Coda

“More than one thing” isn’t necessarily a collection. It could be better modelled as a source, that generates objects over time. It could be a cursor that shows the current value but doesn’t remember earlier ones, a bit like an Enumerator. It could be something else. The above argument doesn’t consider those cases.

On rewriting your application

I’m really far behind on podcasts. I have a long commute, and listen to one audiobook every month, filling the slack time with a selection of podcasts. It happens that between two really long books (Cryptonomicon by Neal Stephenson and The Three Musketeers by Alexandre Dumas, both of which I’d recommend) and quite a few snow days I’ve managed to fall behind.

This is the context in which I listened to Episode 77 of iDeveloper Live, on whether to rewrite or not rewrite an application. I was meant to be on that program but had to pull out due to a combination of being ill and pressing on with writing what would become Discworld: the Ankh-Morpork Map. I imagine that one of these two factors was the cause of the other.

Anyway, listening to the podcast made me decide to put my case forward. It’s unfortunate that I wasn’t part of the live discussion but I hope that what I have to say still has value. I’d recommend listening to what Scotty, Uli, John and Pilky have to say on the original podcast, too. If I make references to the discussion in the podcast, I’ll apologise in advance for failing to remember which person said what.

The view from the outside

The point that got me thinking about this was the marketing for a version x.0 of an app. I don’t remember what the app was, I think Danny Greg probably posted the link to it. The release described how the app was “completely rewritten” from the previous version; an announcement that’s not uncommon in software release notes.

My position is this: A complete rewrite is neither a feature nor a benefit. It’s a warning. It says “warning: you might like the previous version, in fact that’s probably why you’re buying it, but what we’re now selling you has nothing in common with it”. It says “warning: the workflow you’re accustomed to in this app may no longer work, or may have different bugs than the ones you’ve discovered how to work around”. It says “warning: this product and the previous version are similar in name alone”. This is what I infer when I read that your product has been rewritten.

The view from the inside

As programmers, we’re saying “but I’ve got to rewrite this, it’s so old and crappy”. Why is it old and crappy? Is it because we weren’t trained to write readable code, and we weren’t trained to read code? Someone on the podcast referred to the “you should hate code you wrote six months ago or you’re not learning” trope. No. You should look at code you were writing six months ago and see how it could be improved. You should be able to decide whether to make those improvements, depending on whether they would benefit the product.

Many of the projects I’ve worked on have taken more than six months to complete. In each case, we could either have released it, or still be in a cycle of finding code that was modified more than six months ago, looking at it in disgust, throwing it away and writing it again—and waiting for the new regression bug reports to come in.

Bear in mind that source code is a liability, not an asset. When you tear something out to rewrite it from scratch, you’re using up time and money to create a thing that provides the same value as the thing you’re replacing. It’s at times like this that we enjoy waving our hands and talking about Technical Debt. Martin Fowler:

The tricky thing about technical debt, of course, is that unlike money it’s impossible to measure effectively. The interest payments hurt a team’s productivity, but since we CannotMeasureProductivity, we can’t really see the true effect of our technical debt.

We can’t really see the true effect. So how do you know that this rewrite is going to pay off? If you have some problem now it might be clear that this problem can be addressed more cheaply with a rewrite than by extending or modifying existing code. This is the situation Uli described in the podcast; they’d used some third-party library in version 1, which got them a quick release but had its problems. Having learned from those problems, they decided to replace that library in version 2.

Where you have problems, you can solve them either by modification or by rewriting. If you think that some problem might occur in the future, then leave it: you don’t make a profit by solving problems that no-one has.

A case study

While I had a few jobs in computing beforehand, all of which required that I write code, my first job where the title meant “someone who writes code” started about six years ago, at a company that makes anti-virus software. I was a developer (and would quickly become lead developer, despite not being ready) on the Mac version of the software.

This software had what could be described as an illustrious history: it was older than some of the readers of this blog (which is not uncommon: Cocoa is old enough to vote in the UK and UNIX is middle-aged). It started life as a Lightspeed/THINK C product, then became a PowerPlant product at around the time of the PowerPC transition. When I started working on it in 2007 the PowerPlant user interface still existed, but it did look out of place and dated. In addition, Apple were making noises about the library not being supportable on future versions of Mac OS X, so the first project for the new team was to build a new UI in Cocoa.

We immediately set out getting things wrong more quickly than any other team in the company. The lead developer when I joined had plenty of experience on the MS-DOS and Windows versions of the product, but had never worked on a Mac nor in Objective-C: then I became the lead developer, having worked in Objective-C but never in a team of more than one person. I won’t go into all of the details but the project ended up taking multiple times its estimated duration: not surprising, when the team had never worked together and none of the members had worked on this type of project so our estimates were really random numbers.

At the outset of this project, being untrained in the reading of other people’s code, I was dismayed by what I saw. I repeatedly asked for permission to rewrite other parts of the system that had copyright dates from when Kurt Cobain was still making TV appearances. I was repeatedly refused: the correct decision as while the old code had its bugs it was a lot more stable than what we were writing, and as it had already been written it cost a lot less to supply to the customer[*].

Toward the eventual end of the project, I asked my manager why we hadn’t given up on it after a year of getting things wrong, declared that a learning exercise and started over. Essentially, why couldn’t we take the “I hate the code I wrote last year, let’s start from scratch” approach. His answer was that at least one person would’ve got frustrated and quit after having their code thrown away; then we’d have no product and also no team so would not be in a better position.[*]

Eventually the product did get out of the door, and while I’m no longer involved with it I can tell that the version shipping today still has most of the moving parts that were developed during my time and before. Gradual improvement, responding to changes in what customers want and what suppliers provide, has stood that product in good stead for over two decades.

[*] It’s important to separate these two arguments from the Sunk Cost Fallacy. In neither case are we including the money spent on prior work. The first paragraph says “from today’s perspective, what we already have is free and what you haven’t written is not free, but they both do the same thing”. The second paragraph says “from today’s perspective, finishing what you’ve done costs a lot of money. Starting afresh costs a lot of money and introduces social disruption. But they both do the same thing.”

A two-dimensional dictionary

What?

A thing I made has just been open-sourced by my employers at Agant: the AGTTwoDimensionalDictionary works a bit like a normal dictionary, except that the keys are CGPoints, meaning we can find all the objects within a given rectangle.

Why?

A lot of time on developing Discworld: The Ankh-Morpork Map was spent on performance optimisation: there’s a lot of stuff to get moving around a whole city. As described by Dave Addey, the buildings on the map were traced and rendered into separate images so that we could let characters go behind them. This means that there are a few thousand of those little images, and whenever you’re panning the map around the app has to decide which images are visible, put them in the correct place (in three dimensions; remember people can be in front of or behind the buildings) and draw everything.

A first pass involved creating a set containing all of the objects, then looping over them to find out which were within the screen region. This was too slow. Implementing this 2-d index instead made it take about 20% of the original time for only a few tens of kilobytes more memory, so that’s where we are now. It’s also why the data type doesn’t currently do any rebalancing of its tree; it had already become fast enough for the app it was built in. This is a key part of performance work: know which battles are worth fighting. About one month of full-time development went into optimising this app, and it would’ve been more if we hadn’t been measuring where the most benefit could be gained. By the time we started releasing betas, every code change was measured in Instruments before being accepted.

Anyway, we’ve open-sourced it so it can be fast enough for your app, too.

How?

There’s a data structure called the multidimensional binary tree or k-d tree, and this dictionary is backed by that data structure. I couldn’t find an implementation of that structure that I could use in an iOS app, so I cracked open the Objective-C++ and built this one.
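
As a sketch of the idea for k = 2, in the C++ the class is built from (the names are invented, and the real class wraps this behind an Objective-C interface, handles ownership, and adds the rebalancing hooks mentioned above):

// Each node splits the plane on alternating axes: x at even depths,
// y at odd. A rectangle query can then discard any subtree whose
// half-plane cannot intersect the search rect.
struct KDNode {
    CGPoint point;   // the key
    id object;       // the stored value (memory management elided)
    KDNode *lower;
    KDNode *higher;
};

static KDNode *KDInsert(KDNode *node, CGPoint p, id object, unsigned depth)
{
    if (node == NULL) {
        return new KDNode{p, object, NULL, NULL};
    }
    const bool splitOnX = (depth % 2 == 0);
    const CGFloat key = splitOnX ? p.x : p.y;
    const CGFloat pivot = splitOnX ? node->point.x : node->point.y;
    KDNode *&child = (key < pivot) ? node->lower : node->higher;
    child = KDInsert(child, p, object, depth + 1);
    return node;
}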

Objective-C++? Yes. There are two reasons for using C++ in this context: one is that the structure actually does get accessed often enough in the Discworld app that dynamic dispatch all the way down adds a significant time penalty. The other is that the structure contains enough objects that having a spare isa pointer per node adds a significant memory penalty.

But then there’s also a good reason for using Objective-C: it’s an Objective-C app. My controller objects shouldn’t have to be written in a different language just to use some data structure. Therefore I reach for the only application of ObjC++ that should even be permitted to compile: an implementation in one language that exposes an interface in the other. Even the unit tests are written in straight Objective-C, because that’s how the class is supposed to be used.

The Liskov Citation Principle

In her keynote speech at QCon London 2013 on The Power of Abstraction, Barbara Liskov referred to several papers contemporary with her work on abstract data types. I’ve collected these references and found links to free copies of the articles where available.

Dijkstra 1968 Go To statement considered harmful

Wirth 1971 Program development by stepwise refinement

Parnas 1971 Information distribution aspects of design methodology

Liskov 1972 A design methodology for reliable software systems

Schuman and Jorrand 1970 Definition mechanisms in extensible programming languages
Not apparently available online for free

Balzer 1967 Dataless Programming

Dahl and Hoare 1972 Hierarchical program structures
Not apparently available online for free

Morris 1973 Protection in programming languages

Liskov and Zilles 1974 Programming with abstract data types

Liskov 1987 Data abstraction and hierarchy

When all you have is a NailFactory…

…every problem looks like it can be solved by configuring a different nail.

We have an obsession with tools in the software industry. We’ve built tools for building software, tools for testing software, tools for recording how the software is broken, tools for recording when we fixed software. Tools for migrating data from the no-longer-cool tools into the cool tools. Tools for measuring how much other tools have been used.

Let’s call this Tool-Driven Development, and let’s give Tool-Driven Development the following manifesto (a real manifesto that outlines intended behaviour, not a green paper):

Given an observed lack of consideration toward attribute x, we Tool-Driven Developers commit to supplying a tool that automates the application of attribute x.

So, if your developers aren’t thinking about testing, we’ll make a tool to make the tests they don’t write run quicker! If your developers aren’t doing performance analysis, we’ve got all sorts of tools for getting the very reports they don’t know that they need!

This fascination with creating tools is a natural consequence of assuming that everyone[*] is like me. I’ve found this problem that I need to solve, surely everyone needs to solve this problem so I’ll write a tool. Then I can tell people how to use this tool and the problem will be solved for everyone!

[*]Obviously not everyone, just everyone who gets it. Those clueless [dinosaurs clinging to the old tools|hipsters jumping on the new bandwagons] don’t get it, and I’m not talking about them.

No. Well, not yet. We’ve skipped two important steps out of a three-step enlightenment scheme:

  1. Awareness. Tell me what the unknown that I don’t know is.
  2. Education. Tell me why this thing that I now know about is a big deal, what I’m missing out on, what the drawbacks are, and why solving it would be beneficial.
  3. Training. Now that I know this thing exists, and that I should do something about it, and what that something is, now is the time to show me the tools and how I can use them to solve my new problem.

One of the talks at QCon London was by Damian Conway on dead languages. It covered these three features almost in reverse, to make the point that the tools we use constrain our mental models of the problems we’re trying to solve. Training: here’s a language, this is how it works, this is a code problem solved in that language. Education: the language has these features which lets us write our code in this way with these limitations. Awareness: there are ways to write code, and hence to solve problems in software, that aren’t the way you’re currently doing it.

A lot of what I’ve worked on has covered awareness without going further. The App Makers’ Privacy Pledge raises awareness that privacy in mobile apps is a problem, without discussing the details of the problem or the mechanics of a solution. APPropriate Behaviour contains arguments expressing that programmers should be aware of the social scope in which their programming activities sit.

While I appreciate and even accept the charge of intellectual foreplay, I think a problem looking for a solution is more useful than a solution looking for a problem. Still, with some of us doing the awareness thing and others doing the training thing, a scheme by which we can empower ourselves and future developers is clear: let’s team up and meet in the middle.