Is it an anti-pattern to use properties everywhere?

I’ve seen questions about whether to always provide accessors for ivars, and recommendations such as the one in akosma software’s ObjC code standards: “Whenever possible, do not specify ivars in the header file; use only @property and @synthesize statements instead.”

This isn’t how I work, which led me to ask the question: is this a good recommendation? Obviously the question of whether coding standards are “good” or “bad” is subjective, so what I’m really asking is whether this is something I’d want to do myself.

How I currently work

If I need to be able to see the state of an object from outside that object, I’ll make a readonly property. If I need to be able to change the state of an object then I’ll create a readwrite property. Whether these are synthesised or dynamic properties depends on how I need to compute their values.

If, in implementing a method, I find I need to make use of some object or value that was generated in another method, I’ll create an instance variable to store the value away in the other method so it can be used from this method. I define this ivar in the @implementation of the class.

This acts as a de facto distinction between public and private data. Public data is accessed via properties, which can be seen and called from anywhere; private data is accessed via ivars that cannot be seen anywhere except inside the current class (don’t worry about the visibility modifiers @public and friends: because the ivars are inside the implementation, even subclasses don’t know they exist).

If I consistently use the property accessors even inside a class, I can see where an object is making use of state that is accessible to the outside world. That can indicate an encapsulation failure, making the class fragile to external prodding. Of course, it’s also expected that objects do get told about the outside world, so it’s not an automatic fail to do this: but it’s useful to be able to see where it happens.
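
To make that concrete, here’s a minimal sketch of the layout I’m describing (the class and member names are illustrative, and I’m assuming ARC):

#import <Foundation/Foundation.h>

// RecentDocument.h: only the externally useful state appears in the interface.
@interface RecentDocument : NSObject
@property (nonatomic, copy, readonly) NSString *title;   // outsiders can look at this
@property (nonatomic, getter=isPinned) BOOL pinned;      // outsiders can change this
- (void)noteOpened;
@end

// RecentDocument.m: private state stays inside the implementation.
@implementation RecentDocument
{
    NSDate *_lastOpenedDate;   // invisible outside this file, even to subclasses
}

- (void)noteOpened
{
    _lastOpenedDate = [NSDate date];   // private data: plain ivar access
    if (self.isPinned) {               // public data: go through the accessor, so uses of
        // ...                         // externally visible state stand out when reading
    }
}
@end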

About the “properties everywhere” approach

As I see it, there are benefits and drawbacks to that technique. The pros:

  • Encapsulate memory management. This is less of a benefit with automatic reference counting or garbage collection, but in a manually reference-counted environment you can put the memory management semantics in one place rather than sprinkling your code with retain, release and copy calls.
  • Consistency. Rather than having two distinct techniques for accessing ivars, you have just one.
  • Properties are new(ish), and ivars are old and busted. :-)

The cons:

  • Broken encapsulation. Now all of your ivars have accessors, and as previously discussed here, Objective-C doesn’t have method visibility modifiers. All accessors are public (even if declared in the class extension), so any dev armed with a copy of class-dump might decide to change the internal state of your classes (the sketch after this list shows why).
  • Consistency. This is where it gets subjective, because this was also a pro, but as I said earlier I deliberately make a distinction between internal state and externally-available properties, so making both of these the same is a problem.
  • Writing code you don’t need. Even where you never actually use an ivar outside of the object that owns it, you’re still writing public methods for that ivar.
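
Here’s a hedged sketch of the broken-encapsulation point (my example, with hypothetical names): a property declared only in a class extension still has perfectly ordinary accessor methods at runtime.

#import <Foundation/Foundation.h>

@interface ThingDoer : NSObject
@end

// Class extension: in a real project this would live in ThingDoer.m, out of sight.
@interface ThingDoer ()
@property (nonatomic, copy) NSString *secretToken;
@end

@implementation ThingDoer
@end

int main(void)
{
    @autoreleasepool {
        ThingDoer *doer = [[ThingDoer alloc] init];
        // The accessors exist at runtime regardless of where the property was declared,
        // so anyone who has run class-dump knows the selectors and can call them:
        [doer performSelector: @selector(setSecretToken:) withObject: @"not so secret"];
        NSLog(@"%@", [doer performSelector: @selector(secretToken)]);
    }
    return 0;
}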

Conclusion

I’m not going to be adopting the “everything is a property” recommendation: I value the privacy of my parts. I’ll carry on writing instance variables where I need them, and “promoting” them to properties with accessors where that’s necessary.

Posted in code-level | 10 Comments

Some LightReading about mobile app security

[This article was co-written with @securityninja]

If mobile app security is failing, it’s up to the security industry, not developers, to repair it.

An article published yesterday at security news site DarkReading announces “Developers not applying secure development life cycle practices in mobile app production”. The author finds many faults with the way application security is treated by mobile app developers, but doesn’t address what we believe to be the underlying problem: the engagement between security specialists and app makers.

Before addressing this point, though, there are several inaccuracies in the article that need to be corrected. Ericka Chickowski describes the mobile app landscape as “a development environment still in its infancy and no real standards to lead the way”, though in fact developers are able to bring a lot of the experience and tools they’ve used in building desktop and web apps to the party. The Cocoa Touch SDK used in Apple’s iOS shares a common heritage with Mac OS X’s Cocoa, just as the Java used in Android and the Silverlight used in Windows Phone 7 strongly resemble their desktop counterparts. For example, the Code Access Security approach introduced in Silverlight ended up being adopted for the whole .NET framework.

Taking the 50,000 foot view, mobile apps look a lot like any other software application: there’s a client that’s delivered to the user somehow, which has some local storage and (often) communicates to an online service via the internet. This is exactly the situation we see with web apps, and indeed a lot of the techniques used to secure web apps are directly applicable in the mobile world. Sure there are new challenges to address with mobile, but these represent a tweak, not a rewrite, of security advice for developers.

It is probably fair to say that security guidance and testing tools aren’t quite as mature as they are in the web application security world, but to say they don’t exist is false. One of us (David) is an author of open source tools which help reviewers find vulnerabilities in mobile applications, and has to correct the original author on this point. In addition to David’s own tools, others exist: there is a Live CD called the ARE (Android Reverse Engineering) VM, created solely to allow people to security review and test Android applications, and there are established commercial offerings from the likes of Veracode.

Chickowski also states that no secure coding guidelines for mobile applications exist: OWASP have a mobile security project which is very useful and will only get better over time. The OWASP project includes iGoat and GoatDroid: insecure apps for the iPhone and Android respectively that developers can use to understand what vulnerable code looks like and how to detect and fix security problems on those mobile platforms.

Both Apple and Android have good secure development documentation. The Apple “Introduction to Secure Coding” document has been around since 2006, with guidance for iOS development added in 2008. Apple’s World-Wide Developer Conference (WWDC) includes multiple sessions on application security; these sessions are freely available to registered developers. Google has produced similar guidance as part of their development reference site, so the claim that no guidance exists and that Apple and Google have only just started to think about it needs to be corrected. In fact, anyone who spends a small amount of time researching the architecture of those platforms (as well as others, including WP7) will understand that security has been a consideration pretty much since day one.

The final point we want to pick up from the article is the following line: “Rapid and Agile Development causes changes to happen in very short iterations, thus security gets overlooked and becomes a nice thing to do but rarely gets done.” This is certainly not a mobile-specific issue, and as people who work where security is integrated throughout an Agile process, we can tell you the security deliverables in the SDLC don’t really change much. Sure, it means you have to do security testing and reviews more often, but that doesn’t mean security should be, or always is, overlooked. You can try to blame developers, you can even try to place the blame at the door of security professionals, but if the business doesn’t want to produce secure code there is very little those people can do.

So what’s actually behind the apparent lack of security practice among mobile app developers? We see a clue in the fact that Chickowski’s article was published at a security news site, not at a site for mobile developers. Security practitioners telling each other how apps fail at security can be entertaining and make for good conversation in the bars at conferences, but we need to engage the people making the apps if we want to effect change in the way those apps are made. Developers, project managers and executive officers need to be able to evaluate the risk that they are exposing their customers and their businesses to. They need to know how to measure the security posture of their apps and to make decisions on what changes to make, feeding those decisions into the same process they use to prioritise features and bug fixes. In short, we need to help developers to get this right, not call them out when they get it wrong.

David Rook (@securityninja) is the Application Security Lead at Realex Payments.

Graham Lee (@iamleeg) is the Smartphone Security Boffin at O2 Labs.

Posted in Uncategorized | 2 Comments

Building an object-oriented dispatch system in Objective-C

iTunes was messing about rebuilding the device I was trying to use for development, so I had time over lunch to write a new message dispatch system in the Objective-C language. “But wait,” you say, “Objective-C already has a message dispatch system!” True, and it’s better than the one I’ve created. But it doesn’t use blocks, and blocks are cool :-). In the discussion below, I’ll build up an implementation of a “recent items” list, which is discussed in Kevlin Henney’s presentation linked in the acknowledgements.

The constructor

One important part of an object system is the ability to make new objects. Let’s declare an object type, and a constructor type that returns one of those objects:

typedef id (^BlockObject)(NSString *selector, NSDictionary *parameters);
typedef BlockObject(^BlockConstructor)(void);

I’d better explain the signature of the BlockObject type. Objects can be sent messages; what we’re doing is saying that if you execute the object with a selector name, the object will dispatch the correct implementation with the parameters you supply and will give you back the return value from the implementation. That’s what objc_msgSend() does in old-school Objective-C. The constructor is going to return this dispatch block – actually it’ll return a copy of that block, so invoking the constructor multiple times results in multiple copies of the object. Let’s see that in action.

BlockConstructor newRecentItemsList = ^ {
    BlockObject list = ^(NSString *selector, NSDictionary *parameters) {
        return (id)nil;
    };
    return (BlockObject)[list copy];
};

Yes, you have to cast nil to id. Who knew C could be so annoying?

Message dispatch

An object that can’t do anything isn’t very exciting, so we should add a way for it to look up and execute implementations. Method implementations are of the following type:

typedef id (^BlockIMP)(NSDictionary *parameters);

With that in place, I’ll show you an example of the object with a dispatch system in place, and discuss it afterward.

typedef void (^SelectorUpdater)(NSString *selector, BlockIMP implementation);

BlockConstructor newRecentItemsList = ^ {
    __block NSMutableDictionary *selectorImplementationMap = [NSMutableDictionary dictionary];
    __block SelectorUpdater setImplementation = ^(NSString *selector, BlockIMP implementation) {
        [selectorImplementationMap setObject: [implementation copy] forKey: selector];
    };

    BlockObject list = ^(NSString *selector, NSDictionary *parameters) {
        BlockIMP implementation = [selectorImplementationMap objectForKey: selector];
        return implementation(parameters);
    };
    return (BlockObject)[list copy];
};

The variables selectorImplementationMap and setImplementation are __block variables in the constructor block. This means that every time the constructor is called, the returned instance has its own copy of these variables that it is free to use and to modify. Let me put that another way: the entire message-dispatch system is encapsulated inside each instance. If a class, or even an individual instance, wants to implement message dispatch in a different way, that’s cool. It also means that an instance can change its own methods at runtime without affecting any other objects, including other instances of the same class. As long as the object still conforms to the contract that governs method dispatch, that’s cool too.

Implementing the recent items list

OK, now that we’ve got construction and messaging in place, we can start making useful objects. Here’s the implementation of the recent items list, where I’ve chosen to use an NSMutableArray for the internal storage; like the dispatch map, it’s effectively an instance variable of the list. Needless to say, you could change this to a C array, STL container or anything else without breaking external customers of the object.

BlockConstructor newRecentItemsList = ^ {
    __block NSMutableDictionary *selectorImplementationMap = [NSMutableDictionary dictionary];
    __block SelectorUpdater setImplementation = ^(NSString *selector, BlockIMP implementation) {
        [selectorImplementationMap setObject: [implementation copy] forKey: selector];
    };
    __block NSMutableArray *recentItems = [NSMutableArray array];
    setImplementation(@"isEmpty", ^(NSDictionary *parameters) {
        return [NSNumber numberWithBool: ([recentItems count] == 0)];
    });
    
    setImplementation(@"size", ^(NSDictionary *parameters) {
        return [NSNumber numberWithInteger: [recentItems count]];
    });
    
    setImplementation(@"get", ^(NSDictionary *parameters) {
        NSInteger index = [[parameters objectForKey: @"index"] integerValue];
        return [recentItems objectAtIndex: index];
    });
    
    setImplementation(@"add", ^(NSDictionary *parameters) {
        id itemToAdd = [parameters objectForKey: @"itemToAdd"];
        [recentItems removeObject: itemToAdd];
        [recentItems insertObject: itemToAdd atIndex: 0];
        
        return (id)nil;
    });
    
    BlockObject list = ^(NSString *selector, NSDictionary *parameters) {
        BlockIMP implementation = [selectorImplementationMap objectForKey: selector];
        return implementation(parameters);
    };
    return (BlockObject)[list copy];
};

Using the list

Here is an example of creating and using a recent items list. Thankfully, since I wrote this post, dictionary literals have appeared, so it doesn’t look so bad:

    BlockObject recentItems = newRecentItemsList();
    BOOL isEmpty = [recentItems(@"isEmpty", nil) boolValue];

    NSDictionary *getArgs = @{ @"index" : @0 };
    @try {
        id firstItem = recentItems(@"get", getArgs);
        NSLog(@"first item in empty list: %@", firstItem);
    }
    @catch (id e) {
        NSLog(@"can't get first item in empty list");
    }
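
And here’s a hedged sketch (my addition) of adding an item and reading it back; the item string is just an example:

    NSDictionary *addArgs = @{ @"itemToAdd" : @"document.txt" };
    recentItems(@"add", addArgs);

    NSLog(@"size is now %@", recentItems(@"size", nil));
    NSLog(@"most recent item: %@", recentItems(@"get", @{ @"index" : @0 }));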

Exercises for the reader

The above class is not complete. Here are some ways you could extend it that I haven’t covered (or, for that matter, tried).

  • Implement @"isEqual". Remember that no instance can see the ivars of any other instance, so you need to use the public interface of the other object to decide whether it’s equal to this object. You’ll need to provide a new method @"respondsToSelector" in order to build @"isEqual" properly.
  • Respond to unimplemented selectors well. The implementation shown above crashes if you send an unknown message: it’ll try to dereference a NULL block, which is bad. Objective-C objects have a mechanism that catches these messages, allowing the object a chance to lazily add a method implementation or forward the message to a different object (a sketch of one possible fallback follows this list).
  • Write an app using these objects. :-)
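
As a starting point for the second of those, here’s a hedged sketch (my addition, not a full solution): a dispatch block, replacing the one inside the constructor, that checks for a missing implementation and falls back to a handler registered under a reserved selector.

BlockObject list = ^(NSString *selector, NSDictionary *parameters) {
    BlockIMP implementation = [selectorImplementationMap objectForKey: selector];
    if (implementation == nil) {
        BlockIMP fallback = [selectorImplementationMap objectForKey: @"handleUnknownSelector"];
        if (fallback == nil) {
            [NSException raise: NSInvalidArgumentException
                        format: @"unrecognised selector %@", selector];
        }
        // The fallback is told which selector was missed; it could lazily add an
        // implementation to the map or hand the message to another object.
        return fallback([NSDictionary dictionaryWithObject: selector forKey: @"selector"]);
    }
    return implementation(parameters);
};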

Credit where it’s due

This work was inspired by Kevlin Henney’s presentation: It is possible to do OOP in Java. The implementation shown here isn’t even the first time this has been done using Objective-C blocks. The Security Transforms in Mac OS X 10.7 work in a very similar way. This is probably the first attempt to badly document a bad example of the art, though.

Posted in code-level, OOP, Uncategorized | Comments Off on Building an object-oriented dispatch system in Objective-C

On privacy, hashing, and your customers

I’ve talked before about not being a dick when it comes to dealing with private data and personally-identifying information. It seems events have conspired to make it worth diving into some more detail.

Only collect data you need to collect (and have asked for)

There’s plenty of information on the iPhone ripe for the taking, as fellow iOS security boffin Nicolas Seriot discussed in his Black Hat paper. You can access a lot of this data without prompting the user: should you?

Probably not: that would mean being a dick. Think about the following questions.

  • Have I made it clear to my customers that we need this data?
  • Have I already given my customers the choice to decline access to the data?
  • Is it obvious to my customer, from the way our product works, that the product will need this data to function?

If the answer is “no” to any of these, then you should consider gathering the data to be a risky business, and the act of a dick. By the way, you’ll notice that I call your subscribers/licensees “your customers” not “the users”; try doing the same in your own discussions of how your product behaves. Particularly when talking to your investors.

Should you require a long-form version of that discussion, there’s plenty more detail on appropriate handling of customer privacy in the GSMA’s privacy guidelines for mobile app developers.

Only keep data you need to keep

Paraphrasing Taligent: There is no data more secure than no data. If you need to perform an operation on some data but don’t need to store the inputs, just throw the data away. As an example: if you need to deliver a message, you don’t need to keep the content after it’s delivered.

Hash things where that’s an option

If you need to understand associations between facts, but don’t need to be able to read the facts themselves, you can store a one-way hash of the fact so that you can trace the associations anonymously.

As an example, imagine that you direct customers to an affiliate website to buy some product. The affiliates then send the customers back to you to handle the purchase. This means you probably want to track the customer’s visit to your affiliate and back into your purchase system, so that you know who to charge for what and to get feedback on how your campaigns are going. You could just send the affiliate your customer’s email address:

X-Customer-Identifier: iamleeg%40gmail.com

But now everybody who can see the traffic – including the affiliate and their partners – can see your customer’s email address. That’s oversharing, or “being a dick” in the local parlance.

So you might think to hash the email address using a function like SHA1; you can track the same hash in and out of the affiliate’s site, but the outsiders can’t see the real data.

X-Customer-Identifier: 028271ebf0e9915b1b0af08b297d3cdbcf290e3c

We still have a couple of problems though. Anyone who can see this hash can take some guesses at what the content might be: they don’t need to reverse the hash, just figure out what it might contain and have a go at that. For example if someone knows you have a user called ‘iamleeg’ they might try generating hashes of emails at various providers with that same username until they hit on the gmail address as a match.

Another issue is that if multiple affiliates all partner with the same third business, that business can match the same hash across those affiliate sites and build up an aggregated view of that customer’s behaviour. For example, imagine that a few of your affiliates all use an analytics company called “Slurry” to track use of their websites. Slurry can see the same customer being passed by you to all of those sites.

So an additional step is to combine the data with a different random value, called a salt, in each context before you hash it. Then the same data seen in different contexts cannot be associated, and it becomes harder to precompute a table of guesses at the meaning of each hash. So, let’s say that for one site you send the hash of “sdfugyfwevojnicsjno” + email. Then the header looks like:

X-Customer-Identifier: 22269bdc5bbe4473454ea9ac9b14554ae841fcf3

[OK, I admit I’m cheating in this case just to demonstrate the progressive improvement: in fact in the example above you could hash the user’s current login session identifier and send that, so that you can see purchases coming from a particular session and no-one else can track the same customer on the same site over time.]
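
For concreteness, here’s a hedged sketch (my addition) of computing a salted SHA-1 identifier like the one above using CommonCrypto; the salt is the example value from earlier, and the variable names are only illustrative:

#import <Foundation/Foundation.h>
#import <CommonCrypto/CommonDigest.h>

NSString *customerEmail = @"iamleeg@gmail.com";    // the address from the example above
NSString *salt = @"sdfugyfwevojnicsjno";           // use a different salt in each context
NSString *salted = [salt stringByAppendingString: customerEmail];
NSData *input = [salted dataUsingEncoding: NSUTF8StringEncoding];

unsigned char digest[CC_SHA1_DIGEST_LENGTH];
CC_SHA1([input bytes], (CC_LONG)[input length], digest);

NSMutableString *identifier = [NSMutableString stringWithCapacity: CC_SHA1_DIGEST_LENGTH * 2];
for (NSUInteger i = 0; i < CC_SHA1_DIGEST_LENGTH; i++) {
    [identifier appendFormat: @"%02x", digest[i]];
}
// identifier now holds the hex digest to send in the X-Customer-Identifier header.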

N.B. I previously discussed why Apple are making a similar change with device identifiers.

But we’re a startup, we can’t afford this stuff

Startups are all about iterating quickly, finding problems and fixing them or changing strategy, right? The old pivot/persevere choice? Validated learning? OK, tell me this: why doesn’t that apply to security or privacy?

I would say that it’s fine for a startup to release a first version that covers the following minimum requirements (something I call “Just Barely Good Enough” security):

  • Legal obligations to your customers in whatever countries those customers (and your data) reside
  • Standard security practices such as mitigating the OWASP top ten or OWASP mobile top ten
  • Not being a dick

In the O2 Labs I’ve been working with experts from various groups – legal, OFCOM compliance, IT security – to draw up checklists covering all of the above. Covering the baseline security won’t mean building the thing then throwing it at a pen tester to laugh at all the problems: it will mean going through the checklist. That can even be done while we’re planning the product.

Now, as with everything else in both product engineering and in running a startup, it’s time to measure, optimise and iterate. Do changes to your product change its conformance with the checklist issues? Are your customers telling you that something else you didn’t think of is important? Are you detecting intrusions that existing countermeasures don’t defend against? Did the law change? Measure those things, change your security posture, iterate: use the metrics to ensure that you’re pulling in the correct direction.

I suppose if I were willing to spend the time, I could package the above up as “Lean Security” and sell a 300-page book. But for now, this blog post will do. Try not to be a dick; check that you’re not being a dick; be less of a dick.

Posted in Business, Crypto, Data Leakage, Privacy, Responsibility | Comments Off on On privacy, hashing, and your customers

On home truths in iOS TDD

The first readers of Test-Driven iOS Development (currently available in Rough Cuts form on Safari Books Online: if you want to buy a paper, Kindle or iBooks edition, you’ll have to wait until it enters full production in a month or so) are giving positive feedback on the book’s content, which is gratifying. Bar last-minute corrections and galley proof checking, my involvement with the project is nearly over, so it’s time for me to reflect on the work that has dominated my schedule for over a year.

As explained in the book’s front matter, I chose to give all of the examples in the book and accompanying source code using OCUnit. As the BBC might say, “other unit test frameworks are available”. Some of the alternative frameworks are discussed in the book, so interested readers can try them out for themselves.

What made OCUnit the correct choice—or, to put it a different way, what made OCUnit the choice I made? It’s the framework that’s shipped with Xcode, so anyone who might want to try out unit testing can pick up the book and give it a go. There are no third-party dependencies to become unsupported or change beyond all recognition—though that does occasionally happen to Xcode. File-New Project…, include unit tests, and you’re away, following the examples and trying out your own things.

Additionally, the shared body of knowledge in the Cocoa development community is greatest when it comes to OCUnit. Aside from people who consider automated testing to be teh suck, plenty of developers on Mac, iOS and other platforms have got experience using OCUnit or something very much like it. Some of those people have switched to other frameworks, but plenty are using OCUnit. There’s plenty of experience out there, and plenty of help available.

The flip side to this is that OCUnit doesn’t represent the state of the art in testing. Far from it: the kit was first introduced in 1998, and hasn’t changed a great deal since. Indeed, many of the alternatives we see in frameworks like GHUnit and Google Toolbox for Mac are really not such great improvements, adding some extra macros and different reporting tools. Supporting libraries such as OCHamcrest and OCMock give us some additional features, but we can look over the fence into the neighbouring fields of Java, Ruby and C# to see greater innovations and more efficient testers.

Before you decide to take the book out of your Amazon basket, let me assure you that learning TDD via OCUnit is not wasted effort. The discipline of red-green-refactor, the way that writing tests guides the design of your classes, the introduction of test doubles to remove dependencies in tests: these are all things that (I hope) the book can teach you, and that you can employ whether you use OCUnit or some other framework. And, as I said, there’s plenty of code out there that is in an OCUnit harness. It’s not bad, but it could be better.

So what are the problems with OCUnit?

  • repetition. Every time you write STAssert, you’re saying two things. First, “hey, I’m using OCUnit”, which isn’t really useful information. Second, “what’s coming up is a test, read on to find out what kind of test”. Then you finally get to the end of the macro where you reveal what it is you’re going to do. This is the important information, but we bury it in the middle of the line behind some boilerplate.

    Imagine, instead, a hypothetical language where we could send messages to arbitrary expressions (OK, such a language exists, but imagine it’s Objective-C). Then you could write [[2+2 should] equal: 4]; which more closely reflects our intention.

  • repetition. In the same way that STAssert is boilerplate, so is subclassing SenTestCase and writing -(void)test at the beginning of every test method. It gives you no useful information, and hides the actual data about the test behind the boilerplate.

    Newer test frameworks in languages like C# and Java use the annotation features of those languages to take the fact that a method is a test out of its signature and make it metadata. ObjC doesn’t support annotations, so we can’t do that. But take a look at the way CATCH tests are marked up. You indicate that something is a test, and the fact that this means the framework needs to generate an Objective-C++ class and call a method on it is encapsulated in the framework’s implementation.

  • repetition. You might think that there’s a theme developing here :-). If you write descriptive method names, you might have a test named something like -testTheNetworkConnectionIsCleanedUpWhenADownloadFails. Should that test fail, you’re told what is going wrong: the network connection is not cleaned up when a download fails.

    So what should you write in the mandatory message parameter all of the STAssert…() macros require? How about @"the network connection was not cleaned up when a download failed"? Not so useful. (The sketch after this list shows all of this boilerplate in one place.)

  • organisation. I’ve already discussed how OCUnit makes you put tests into particular classes and name them in particular ways. What if you don’t want to do that? What if you want to define multiple groups of related tests in the same class, in the way BDD practitioners do to indicate they’re all part of the same story? What if you want to group some of the tests in one of those groups? You can’t do that.
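
Here’s a hedged sketch (my example, not one from the book) of an OCUnit test case showing that boilerplate in one place; the class under test and the check are hypothetical:

#import <SenTestingKit/SenTestingKit.h>

@interface NetworkConnectionTests : SenTestCase   // boilerplate: must subclass SenTestCase
@end

@implementation NetworkConnectionTests

// boilerplate: the method name must start with "test"
- (void)testTheNetworkConnectionIsCleanedUpWhenADownloadFails
{
    BOOL connectionCleanedUp = YES;   // stand-in for the real check
    // boilerplate: the macro restates the framework, and the mandatory message
    // restates what the method name already told us
    STAssertTrue(connectionCleanedUp,
                 @"the network connection was not cleaned up when a download failed");
}

@end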

I’m sure other people have other complaints about OCUnit, and that yet other people can find no fault with it. In this post I wanted to draw attention to the fact that there’s more than one way to crack a nut, and the vendor-supplied nutcracker is useful though basic.

Posted in books, code-level, TDD, tool-support | 7 Comments

Security: probably doing it wrong

Being knowledgeable in the field of information security is useful and beneficial. However, it’s not sufficient, and while it’s (somewhat) easy to argue that it’s necessary, there’s a big gap between being a security expert and making software better, or even making software more secure.

The security interaction on many projects goes something like this:

  • Develop software
  • Get a penetration tester in
  • Oh, shit
  • Fix anything that won’t take more than two days
  • Get remaining risk signed off by senior management
  • Ship
  • Observe that most of the time, this doesn’t cause much trouble

Now whether or not a company can afford to rely on that last bullet point being correct is a matter for the executives to decide, but let’s assume that they don’t want to depend on it. The problem they’ll have is that they must depend on it anyway, because the preceding software project was done wrong.

Security people love to think that they’re important and clever (and they are, just not any more than other software people). Throughout the industry you hear talk of “fail” or even “epic fail”. This is not jargon; it’s an example of the mentality that promotes calling developers idiots.

Did the developer get the security wrong because he’s an idiot, or was it because you didn’t tell him it was wrong until after he had finished?

“But we’re penetration testers; we weren’t engaged until after the developers had written the software.” Whose fault is that? Did you tell anyone you had advice to give in the earlier stages of development? Did you offer to help with the system architecture, or with the requirements, or with tool selection?

You may think at this point that I shouldn’t rock the boat; that if we carry on allowing people to write insecure software, there’ll be more money to be made in testing it and writing reports about how many high-severity issues there are that need fixing. That may be true, though it won’t actually lead to software becoming more secure.

Take another look at the list of actions above. Once the project manager knows that the software has a number of high-priority issues, the decision that project manager will have to take looks like this:

If I leave these problems in the software, will that cause more work in the project, or in maintenance? Do I look like my bonus depends on what happens in maintenance?

So, as intimated in the process at the top of the post, you’ll see the quick fixes done – anything that doesn’t affect the ship date – but more fundamental problems will be left alone, or perhaps documented as “nice to haves” for a future version. Anything that requires huge changes, like architectural modification or component rewrites, isn’t going to happen.

If we actually want to get security problems fixed, we have to distribute the importance assigned to it more evenly. It’s no good having security people who think that security is the most important thing ever, if they’re not going to be the people making the stuff: conversely it’s no good having the people who make the thing unaware of security if it really does have some importance associated with it.

Here’s my proposal: it should be the responsibility of the software architect to know security or to know someone who knows security. Security is a requirement of a software system, and it’s the architect’s job to understand what the requirements are, how the software is to implement them and how to make any trade-off needed if the requirements come into conflict. It’s the architect’s job to justify those decisions, and to make them and see them followed throughout development.

That makes the software architect the perfect person to ensure that the relative importance of security versus performance, correctness, responsiveness, user experience and other aspects of the product is both understood and correctly executed in building the software. It promotes (or demotes, depending on your position) software security to its correct position in the firmament: as an aspect of constructing software.

Posted in software-engineering | Comments Off on Security: probably doing it wrong

Irresponsible tolerance

Context

@unclebobmartin said:

One of the bad behaviors that destroys projects is “irresponsible tolerance”. Tolerating what you know you should fix.

This triggered a discussion between @phil_nash and myself. As far as this got on the Twitters, we agreed that it’s not necessarily irresponsible to ignore a problem for now as long as what you’re actually doing is deferring the fix until you’ve got time…except that it’s easy for deferral to slip into tacit acceptance as other work comes up. We may even be able to delude ourselves into thinking we still intend to fix that issue “some day”, even though the reality is that will never happen.

My >140char response

Yes, that is easy to do. I’ve done it myself. I’ve even – though not in a number of years – used tolerance of a badly-written component as an excuse to avoid not just cleaning it up, but also doing other useful work on the same component. “Touching that spaghetti code would be too risky, and rewriting it would take too long, so let’s just leave it as it is.”

Since reading the GTD book, I’ve tried a new approach which has, for the most part, been more successful. It’s not exactly a GTD technique, but borrows the spirit. In GTD there’s a two-minute rule: if you think of something you need to do that would take less than two minutes, just do it. If it would take longer, add it to your backlog.

The analogous approach for refusing to tolerate software problems is this: if you see something you think needs fixing, and you have time to fix it now, fix it now. If you do not have time to fix it, write a bug report.

What goes into the bug report?

All of the things a good software architect should be logging as part of their work anyway: a description of the problem, discussion of potential solutions, choice of solution and justification of that solution. So if there’s some ugly class that needs rewriting, explain why it’s ugly. Describe what would be better, and why.

What do I get from this?

In the first instance, the act of describing what it is that you dislike about the current code often makes it easier to see that the fix actually wouldn’t take too long. So that really disgusting class is full of long methods: what’s three minutes with the “extract method” tool between friends?

Oftentimes the solution will still be too big to work on right now. So hit the “report” button, and get the bug report into the tracker (or the backlog, or icebox, or whatever this week’s cool term is. I can no longer keep up; I’m in my thirties now). You know how they say a problem shared is a problem halved? It’s crap. A problem shared is a problem everyone is burdened with, so there are more people to go “oh crap, yeah, I hate that too”. Maybe one (or more) of you has the time to spend a day or two sorting the issue out, or is willing to make time. Maybe someone else knows enough about that code to propose a better alternative.

Even if not, the whole team can no longer ignore the issue: every time someone looks at the outstanding issues, there’s your problem, reminding everyone not to tolerate it. It’s harder to say “oh yeah, I thought about fixing that once but I didn’t have the time” if every time you read the bug list you are forced to think about it again. One of these days you, or someone else, will have time to fix it, and so will have to either do that or think of a convincing excuse to shelve it again. Then explain at the next bug review time why the issue is still there. If you wrote a good justification for why the proposed solution would be better, it looks like you’re actively trying to avoid making a better product.

Depending on how you track bugs, you may have an additional benefit: the ability to link your complaint to other issues. So maybe (and this is a real-world example from my experience), the problem is that a class for reading files has a hard-coded list of search paths. Then a request comes in saying that an extra filesystem has been provided by IT, and they want to put some of the files into a location on this new filesystem. Link them. Someone will be assigned the user request that’s been prioritised as a business issue, and when they do they’ll see your report that a good way to fix the problem would also clean up the product, so they can do both at the same time. If the issue is linked to enough problems in the product, it becomes clear that addressing the underlying issue will benefit the customers and the work will be scheduled. Then you really have no excuse for not finding the time to fix it: it’s your job.

So this is a silver bullet, is it?

No. It’s worked well for me, but it’s not foolproof. Some projects I’ve worked on have suffered from a form of bug tracker malaise, where the backlog is so great that it’s easy to ignore the vast number of open issues—some of which are no longer relevant—meaning that adding another straw to the haystack isn’t going to help anyone. That’s an extreme position for a project to get into; it’s basically a slippery-slope version of Uncle Bob’s “irresponsible tolerance” where even problems being reported by customers can be tolerated. In those cases, a special injection of enthusiasm into the development team is required: the whole product is already on a death march.

For most projects, though, reporting an issue is a good way to avoid ignoring that issue.

Posted in Uncategorized | Comments Off on Irresponsible tolerance

On standards in free software engineering

I have previously written on the economics of software insecurity, and I quote a couple of paragraphs from that post below:

One option that is not fully explored in the book, but which I believe could be worth exploring, is this: development of critical infrastructure software could be taken away from the free market.

Now the size of even the U.S. government IT budget probably isn’t sufficient to completely fund a bunch of infrastructure developers, but there are other options. Rice correctly notes the existence of not-for-profit software development organisations (particularly the Open Source Initiative and Free Software Foundation), and discusses the benefits and drawbacks of the open source model as it applies to commercial software. He does not explore the possibility that charity development organisations could withdraw from market competition, and focus on engineering practice, quality and security without feature parity or first-to-market speed.

Today I was re-reading Free Software, Free Society by Richard Stallman, a collection of his essays and speeches on topics including copyleft, the GNU system and the General Public Licence. In thinking about this book, I went wandering back to the idea of a non-commercial driver for good-quality software.

I am now convinced that the Free Software Foundation should be investigating, researching and promoting standards, practices and quality in software construction.

The principal immediate benefit the FSF would gain is in terms of visibility and support. Everywhere that software is used—public, private and academic sectors—organisations are interested in finding out ways to improve quality, reliability, deliverability: in other words, the success of their software. An entity that could offer to evaluate and report on whether particular techniques are feasible and offer improvement—in return for funding and staffing the production of their sought-after free software—would be welcomed and would be put to good use.

The FSF is well-placed to achieve this goal, because all of its output is copyleft. A large problem with analysing the success or otherwise of development practices is that the outputs are proprietary: not just the code, but the project documentation, meeting minutes and so forth. With an FSF project everything is (or should be) freely available so inspecting how a project was run, what the developers did and—crucially—whether users are happy with the end result is much easier. Conclusions should be reproducible because everybody can see everything that went on.

Notice that in this scheme, relationships between the FSF and other (proprietary, open source, whatever) organisations are mutually beneficial, not antagonistic as is often either actually the case or just assumed. The benefits seen by external parties are the improvements in process and technique; benefits that all developers can make use of. The discussion moves away from free vs. fettered, and becomes making the field of software engineering better for everyone.

Incidentally, such a focus would also put free software at the forefront of discussions on software quality and deliverability. This would be something of a coup for free software, which is often associated with chaotic management, lack of road maps, and paucity of documentation and support. OK, the FSF wants people to consider freedom as a value in itself, but there’s nothing wrong with ensuring that free software is the best software too, surely?

Posted in Business, software-engineering | Comments Off on On standards in free software engineering

On the economics of software insecurity

This post is mainly motivated by having read Geekonomics: the real cost of insecure software, by David Rice. Since writing the book Rice has apparently been hired by Apple, though his bio at the Geekonomics site doesn’t mention that (nor his LinkedIn profile).

Geekonomics is a thoroughly interesting read. It’s evidently designed as a call to arms for users to demand better security, and as a result resorts to hyperbole in parts. You are a crash test dummy for software manufacturers and are paying extravagantly for the privilege. In this way it reads as if it is to security as The Inmates are Running the Asylum was to user experience in 1999: do you realise just how shoddy all of this software you use is?

That said, once you actually dig into Rice’s arguments, the hyperbole disappears and the book becomes well-sourced, internally consistent and rational. He explains why the market forces in the software industry don’t lead to security (or even high quality) being either the primary customer requirement or the key focus of producers.

Interestingly, while Geekonomics only incidentally touches on the role of security researchers in the software economy, their position is roughly consistent with the one I outlined in On Securing Lion: they are in it to get money (and sometimes fame) from selling either the vulnerabilities they discover, or their skill at finding vulnerabilities.

The book ends by describing the different options a curated free market like the US market has for correcting the situation where market forces lead to socially undesirable outcomes: these options are redress via contract law, via tort law or via strict liability legislation. The impact of each of the above on the software industry is estimated.

One option that is not fully explored in the book, but which I believe could be worth exploring, is this: development of critical infrastructure software could be taken away from the free market.

Now the size of even the U.S. government IT budget probably isn’t sufficient to completely fund a bunch of infrastructure developers, but there are other options. Rice correctly notes the existence of not-for-profit software development organisations (particularly the Open Source Initiative and Free Software Foundation), and discusses the benefits and drawbacks of the open source model as it applies to commercial software. He does not explore the possibility that charity development organisations could withdraw from market competition, and focus on engineering practice, quality and security without feature parity or first-to-market speed.

Governments, trade groups, communications carriers and other organisations with an interest in using software as infrastructure (e.g. so-called “cloud” companies) could fund non-profits (maybe with money, maybe with staff) that develop infrastructure-grade software. Those non-profits would have a mission to do quality-centric development, and would put confidentiality, integrity, availability, reliability and correctness before feature richness or novelty. Their governance (the bit I haven’t fully thought through, admittedly) would be organised to promote and reward exactly that approach to development.

The software, its documentation and its engineering methodologies would be open, so that commercial software can take advantage of its advances at low cost. This matters partly because, where security is a “hygiene factor” to software purchasers, the “security gap” between infrastructure-grade and commercial-grade software would become clear and would artificially introduce infrastructure-grade robustness to the marketplace. Commercial vendors who could cheaply pick up parts of the infrastructure-grade software for their own products would be, in a self-interested manner, bringing that software’s quality into the commercial marketplace and making it a competition point.

“But,” some people say, “such software would be feature-poor. Why would anyone choose [SafeOS, SafeWebServer, SafeSmartPhone, whatever] over a feature-rich commercial offering?” The point is, that in infrastructure, correct function is more important than features. It’s only the fact that software exists purely in a competitive world that means the focus is on features.

Case in point: one analogy used throughout Geekonomics is that infrastructure software is like cement (actually, in a book I’m currently writing on software testing, I make the same analogy, though relating to design rather than function). Well even taking into account innovations like Portland cement, the feature list of cement hasn’t changed in thousands of years. It sticks aggregate together to make concrete or mortar. It’s only the quality of its stick-aggregate-together-ness that has changed.

In relation to software, most computers are still “stuck together” using RFC791 (Internet Protocol version 4), which was documented in 1981 but was in use already at that time. The main advantages of RFC2460 (Internet Protocol version 6), written in 1998, are increased address space and reduced overhead. It’s better at stick-computers-together-ness, but doesn’t really do anything new. There may have been new applications of networks recently (and of course, the late addition of confidentiality in the mid-1990s), but networking itself doesn’t frequently need new features.

Or even operating system software. The last version of Mac OS X that added any features for its users was version 10.5; said new features were:

  • Time Machine: computers have been doing backup for years, this added a new UI.
  • Spaces: an improvement on the ability to draw windows on the screen.
  • Back to my Mac: an integration of existing capabilities (VNC and wide-area zeroconf networking).
  • Boot Camp: managing partitions, and giving the primary bootloader compatibility with a 1983 computer standard.

All of the other enhancements were in the applications, which still all require the same things of the OS: schedule processes, protect memory, abstract the file systems, manage devices. Again, there may have been new applications of an operating system, but there hasn’t been much newness in the operating system itself.

The part where such non-profit infrastructure software becomes tricky is in integrating with the rest of the “stack”. On the hardware side, it would be inappropriate to require that a government-sponsored and not-for-profit software project run on proprietary hardware. On the other hand, it might be inappropriate to disbar deployment on proprietary hardware—but is infrastructure-grade software on commercial-grade hardware still an infrastructure-grade deployment?

That’s particularly difficult in our world—the world of smartphones—because there isn’t really any open hardware. There are somewhat open definitions: the Android Compatibility Definition Document for example. But as Ken Thompson taught us: in a trusted system, we need to question who we trust and why.

Going the other way, of course, is much easier. Anyone could write an application that interoperates with infrastructure-grade software, or a system partially constructed out of such software. But the same question would still exist: how much of an impact do the non-infrastructure-grade components have on the reliability of the system?

Posted in software-engineering | 1 Comment

These things are hard

Mike Lee recently wrote about his feelings on seeing those classic pictures from the American space program, in which the earth appears as a small blue marble set against the backdrop of space. His concluding paragraph:

Life has its waves. There are ups and downs. My not insubstantial gut and my lucky stars both are telling me 2012 is going to be an upswell. Let us do as we do where I grew up and catch that wave. Put aside your fear and cynicism. The future is ours to create. The system is ours to debug and refactor.

For me, the most defining and inspiring moment of the whole space program predates the Space Shuttle, the moon landing, even the Gemini program. It is the words of a politician, driving his country to one of its highest points of innovation and discovery. The whole speech is worth watching, but this sentence encapsulates the sentiment perfectly.

We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard, because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one which we intend to win, and the others, too.

So let us create the future. It is not going to be easy, but we shall choose to do it because it is hard.

Posted in Uncategorized | 2 Comments