On cryptographic file storage

In Chapter 3 of Professional Cocoa Application Security, I talk about using CommonCrypto to encrypt files stored on either Mac or iOS file systems. In Chapter 4, I talk about using CommonCrypto to generate Hashed Message Authentication Codes (HMACs) to verify the integrity of messages received by an app. Unfortunately, I didn’t connect the two in the book.

I actually did discuss the reasons for providing an HMAC in a recent talk on cryptographic storage for iOS. But I went to Justin Clark’s talk at Security B-Sides London, where he demonstrated the attacks that are possible against cryptosystems that don’t verify the integrity of their content. This talk was very informative: if the presentation ever makes its way online, I fully recommend that you check it out.

The talk reminded me that there are important attacks I didn’t discuss in the book. So I need to bring these concepts together for the readers of PCAS. The tl;dr version is: if you encrypt content, you must also verify the integrity of the ciphertext. The long version follows.

I recommended in the book using CBC, or Cipher Block Chaining, mode for AES encryption. In this mode, the first block of plaintext is XORed with an Initialization Vector (IV), and this is then encrypted. The resultant ciphertext is then XORed with the second block of plaintext, and this is encrypted. And so on. This offers some protection against certain cryptanalysis attacks that Electronic Code Book (EBC) mode does not: for example, blocks of ciphertext cannot be arbitrarily swapped without the decrypted plaintext becoming nonsensical. A change in the ciphertext not only affects the decryption of the changed block, but also subsequent blocks.

Well, it turns out that this propagating-change property of CBC mode can be used by attackers—if they have information about errors encountered during decryption—to decrypt and encrypt data without needing to discover the key. Ouch. See the B-Sides talk described above (or google for Padding Oracle Attack) for the full details.

Of course, not providing the padding oracle is an important solution to this attack: but refusing to handle content that isn’t verifiably valid avoids any attack involving modified ciphertext: including encrypted content that lies about its length, if that’s an important consideration for your file format. Therefore, if you’re encrypting data in your app, you should be generating an HMAC and verifying that before attempting a decryption operation. Tampered data is not data you want to deal with.

Categories will bite you

What I wanted to do was this:

+ (void)load {
    Method foo = class_getInstanceMethod(self, @selector(foo));
    Method newFoo = class_getInstanceMethod(self, @selector(FZA_swizzleFoo));
    method_exchangeImplementations(foo, newFoo);
}

However, my tests wouldn’t work when I did that. It turns out that for some reason +load was running twice, so the methods got swizzled twice meaning that each implementation ends up married to its original selector. So I thought I’d do this:

+ (void)load {
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        Method foo = class_getInstanceMethod(self, @selector(foo));
        Method newFoo = class_getInstanceMethod(self, @selector(FZA_swizzleFoo));
        method_exchangeImplementations(foo, newFoo);
    });
}

…and, *drum roll*, that doesn’t work either. What gives?

Well, it turns out that the reason that the category was being loaded twice is because it was included in two separate binary images: the library I was testing and the unit test bundle. Because of this, the “static” dispatch_once_t instance was actually a separate instance for each image, so dispatch_once() couldn’t tell it had already run the block.

In my case there’s a solution; I only need the category to be loaded once so removed it from one of the targets. But the general message is that category loading is very non-deterministic, and hard to rely on. Imagine if you rely on an open-source category, then one of your plugin developers relies on the same category. Or one of the frameworks your app links against. There are specific cases where the implementation of categories causes no effects, or effects that are understood and may be worked around, but those are really the exceptions.

On internal quality

I was asked by attendees at my VTM talk on test-driven development a small collection of questions on a similar theme, which I’ll summarise here.

  • How do I do TDD when my boss doesn’t want me to?
  • What do I do when my boss wants me to ship untested prototype code?
  • Can you give me rhetoric to convince my boss that I should be doing TDD?

I believe that I can adequately present these questions as facets of the following root issue, and then blabber away endlessly about that one instead.

I have externalised the obligation I have to myself to write software that makes me proud by delegating it up to my boss. Can you help me to sleep at night?

The fundamental problem here is one of cognitive dissonance. Belief: I, as a software engineer, believe that process/technique/tool is good for writing software. Introspection: I observe that I am not using said process/technique/tool. Justifications: it must be that my boss doesn’t hold my output to the same quality standard that I do. It must be that my deadlines and ever-shifting requirements don’t give me time to do things properly. It must be that this company doesn’t want great software engineers.

No. I have learned the hard way that such justifications are false. The only person responsible for how you write code is you. If you work on a team, or are handing your source code over to some other stakeholder (which means you’re working on a team), then you may have style guides that limit what source code you write; but ultimately how you get there is solely down to you.

You may be thinking “but I asked my manager for time to implement TDD/CI/buzzword-compliance on my current project and she said no. Also, by the way, you’re an arsehole for being so rude and presuming to know what I think.” Let me address the latter point first: suck it. Now onto the first point.

Your manager was correct to turn down your request for time on the project. Your customer doesn’t want unit tests, or comment documentation, or build server output. Your customer wants working features. The project exists to satisfy the user’s needs. Therefore the time allocated on the project should be dedicated to making working features.

Now it turns out that doing TDD is one way to write code that leads to working features, so what you should have done was to agree to the version of the product plan where you make working features, and then done TDD anyway. Where this all started going wrong was not when your boss turned down your request, but when you asked your boss in the first place. It turns out that you both have the same goal—making a good product—but that you have different views on the process and different motivators. By asking your boss how to write code you gave her permission to micro-manage you, then got frustrated when she did, and decided that the problem was all her fault.

I’ve seen this failure mode quite a few times now. As one example, I worked in a company where there was an ingrained antagonism between the product managers (who clearly just don’t get that software is a craft and a labour of love) and the programmers (who clearly just don’t get that we work in a competitive marketplace and would rewrite the product from scratch every day if they could).

As is common (and often correct) in our industry, the product managers owned the product requirements. As should be welcomed when we care about the products we make, the programmers were invited to criticise the requirements for the projects they were working on. Due to this antagonistic culture, the programmers would typically stuff “engineering requirements” onto the project, which were things like rewriting components from scratch, epic refactorings, or setting up new developer workflows.

Requirements were, as far as I could tell, prioritised based on how much revenue they would attract (i.e. new customers), how much revenue they would protect (retained customers) and how much they would cost. As you can probably already guess, so-called engineering requirements don’t protect any revenue, they don’t generate any revenue, and they do take time and money to implement, so they would inevitably get dropped or indefinitely postponed.

The moral of that little story is this: it is folly to try and express development processes and methodologies as business requirements, because they fail at being business requirements. It’s the equivalent of a taxi driver asking the customer to pay a $100,000 fare for one journey because he wants to switch to automatic transmission for the next journey and that means buying a new vehicle. The customer cares about being driven somewhere, and doesn’t care about how the driver operates the vehicle as long as the journey is bug-free. He certainly doesn’t intend to pay for the driver to select a particular mode of operation: if the driver wants to use automatic transmission, the driver should just get on and do that.

So what I’m saying is this: it’s not up to your boss, your customer or your project manager to choose how you write software. It’s up to you to choose how you write software, and as long as the approach you take leads to working software that delights your customers, the rest of the stakeholders won’t mind what you’re doing “under the hood”. My good friend and long-lost son @bmf once said “don’t let them see you making it.” I would go further: don’t let them know it was made. Let them use exciting, compelling software: the magic is your business. If you want to make the magic in one particular way, that would let you take additional pride in your creation, that’s entirely your call.

A first look at appCode, and the future of Cocoa IDEs?

It’s been almost a full rotation of this great rock about its axis since JetBrains announced the start of its appCode Early Access Program.

appCode is an Integrated Development Environment, just the same as Xcode. Just like Xcode, appCode works with Objective-C projects on Mac and iOS: in fact, under the hood it’s using xcodebuild, llvm and all of the same tools that Xcode uses.

Unlike Xcode, appCode was written by JetBrains, the company who make the IntelliJ IDEA environment for Java. Now I’ve never used IntelliJ, but I have used Eclipse and NetBeans (and even Xcode!) for Java development, and that’s kindof the point. There’s a lot of competition in the Java world for IDEs, so they all have a boatload of features timesaving devices that are missing from Xcode.

Now, before we get too far into details, let’s get this out of the way. appCode—though not itself a cross-platform app—is a Java app based on the same framework as IntelliJ. It therefore doesn’t look like a Cocoa app, it looks like this:

Screen shot 2011 04 06 at 16 53 22

That’s enough to put some of you off, I’m sure. As people who know that user experience is the first most important factor of any app, it’s easy to be distracted by an obviously alien user interface.

Maybe it’s partly because I’ve used so many cross-platform IDEs, but I’ve become inured to the look of the products. I don’t mind (or at least, can deal with) a nonstandard interface if it provides a better experience. It’s quite pleasing that in some cases, appCode does indeed come through.

What does it do that Xcode doesn’t?

One feature that every other IDE I’ve used has which is missing from Xcode is the ability to automatically stub out methods defined in the class interface (or the superclass interface, or a category, or a protocol…) in the implementation. This is a great feature for saving a boatload of typing (or even flipping between multiple views with copy and paste). Design your class’s header, then use the Override Methods, Implement Methods and Generate… items in appCode’s Code menu to automatically fill in the boilerplate bits in the implementation.

Another improvement appCode brings over Xcode is in its refactoring support. Xcode has long had a refactoring capability that only has a little more syntax awareness than sed, and I’ve previously talked about tops as a command-line equivalent. But what about when you want to change a method signature?

Xcode’s refactoring tools can’t support that, although I want to do it fairly frequently: in fact the last time was this morning. I wanted to change the return type and add a new parameter to an existing method, effectively going from -(void)frobulate: to -(BOOL)frobulate:error:. Well, it turns out that in appCode, you can do that:

Screen shot 2011 04 06 at 17 19 32

So I should be using appCode, then?

Well, that’s not my decision to make, but I definitely wouldn’t recommend using it for production yet. It’s got some bugs (yes, I have filed bug reports) particularly with managing Xcode projects and integrating with the underlying SDKs and tools. You probably don’t want to touch production code with appCode yet.

Some particular issues I’ve found are with the default keyboard shortcuts conflicting with Mac OS X’s default keyboard shortcuts, it doesn’t offer a way to run the static analyzer, and I’ve had a problem where it removed a file from the definition of a project’s target. I was able to address both of the last two problems by going back into Xcode, but of course it would be preferable if I didn’t need to. Another big deal for me is that there isn’t (yet?) any integration with the API documentation.

That said, it is just an early preview, so definitely download it and have a motor around. You’ll probably find some things you do when you’re coding that are easier in appCode, and other things that are easier in Xcode.

What’s all this about the future of Cocoa IDEs?

I sincerely hope that JetBrains take this product to release and that people start adopting it. Based on the current preview, I would expect developers coming to iOS from Java development to be the majority of appCode users, because they’ll be the people most comfortable with its way of working.

My biggest hope is that developers in Apple (both inside the Xcode team and beyond) pick up appCode and take a look at it. What we Cocoa developers need is some competition among IDE producers—particularly now that we need to pay for Xcode— to drive development of the tools we use for our day-to-day work. Xcode is to 2010s Objective-C IDEs as Internet Explorer was to 1990s browsers: it basically works, and you don’t get a choice of what you use so be thankful for whatever features you get from On High.

It’d be great for someone to come along with the Firefox of Objective-C IDEs: a product that reminds users and vendors alike that there’s more than one way to do an IDE and that people are open to trying the other ways.

On my own competency

There was a question on programmers.stackexchange.com about whether to put your Stack Overflow reputation in your CV. I don’t, and answered as much: there’s no point in writing for its own sake, unless you want to be a writer. If you want to be a programmer, then you should be a good programmer and any writing you do should be evidence of that.

There’s a great—if somewhat old—resource for evaluating your own capability as a software engineer called the programmer competency matrix. Writing the answer I gave to the above question renewed the memory of the matrix in my mind, so I decided to evaluate myself against it and use that to help build a research and personal development plan.

I’m sharing my own results in this post. My hope is that this will trigger some introspection of your own, and demonstrate just how easy it is to devote a little time to personal development (this took me about an hour to do, including the writing). I’m also being selfish: when I write for publication I’m a lot more organised than if I just push notes into iThoughtsHD or OmniOutliner.

It doesn’t matter whether you agree or disagree with anything I write. If we agree on any row of the matrix, then we’re both pulling in the same direction and pushing the bar a little higher in that area of software engineering. If we disagree, then we’re still both raising the bar, because we both have our special skills that (hopefully) contribute to making the whole industry that bit more skilful and knowledgeable. Of course, if you think that I’m completely brain-dead for choosing not to develop my skills in a particular area, feel free to flame me in the comments.

Notice that it’s hard to be objective about self-evaluation. I hope that my answers below are somewhat accurate, but it’s easy to have an over-inflated opinion of your own competence. In fact, when I was a junior software engineer I felt like I knew more than I do now :). It would definitely be trite to give a nonsense buzzphrase like “we should always be continually improving at everything we’re doing”, because there aren’t enough hours in the day for that. There should be things that we definitely need to work on, things that we “just do” but reflect on how they’re going, and things that we just automatically get on with. Of course, when I think that something should be automatic but I make a mistake, that’s a time to re-evaluate my perception of my skills.

Computer Science

  • data structures: Level 1. I’ve done some study into some of the level 2 features (and even know what some of the words mean in Level 3), but at heart I feel like that’s stuff for a framework programmer to deal with and I’m an applications programmer. I’m happy to believe that the framework engineers know how to implement List<Integer> or NSArray properly and leave them to it.
    Related tale: I once had an interview where the interviewer drew a list on the whiteboard, and asked how I’d sort it. I wrote a "[" at the beginning, and " sort];" at the end. I didn’t get the job.
  • algorithms: Level 2. Again this is largely the domain of the framework engineer, but there are times when it’s become important to my own work such as when I’ve had to do performance engineering (and of course cryptographic algorithms factor heavily into my work). I tend to address algorithmic problems on a case-by-case basis, and don’t feel the need to explicitly schedule some research time.
  • systems programming: Level 2. I do believe that I need to develop here, particularly in the areas of JIT compiling and interpretation. If BDD and DDD (and, to some extent, OOP) have taught us anything, it’s that there are great efficiency savings to be made by expressing a software problem in a way that’s understandable by, or at least explainable to, the user. And that means Domain Specific Languages. I’ve actually seen that work very well, when I was working for Diamond Light Source. Scientists could write working prototypes in a Jython-based DSL that would then either be cleaned up by engineers or reimplemented in the core engine.

    Of course, BDD and DDD are both also examples of RDD (Résumé-Driven Development), but that doesn’t mean they have nothing to offer us.

Software Engineering

  • source code version control: Level 3. To be honest I think that the matrix is outdated here, and that DVCS is well on its way to becoming the new established workflow. When the matrix was written, SVN was the de facto standard for VCS. Version control is just one of those things that you need to know and need to use properly. It will probably still change quite a lot (I don’t think that any of the DVCS platforms out there make the uber-common “star” distribution any easier than decentralised development), and my opinion is that I need to be good at the tools I need to use. Of course, being a contractor that often means being good at all of the tools.
  • build automation: Level 3. I’ve written here before about setting up tools like HeaderDoc and CruiseControl. I definitely think some of the tools need improving, so could find time to work on that.
  • automated testing: Level 3. Something I need to do more of, and can “learn on the job” while I’m doing that: but there isn’t any point in having a “do UI testing” research project.

Programming

  • problem decomposition: Level 3. That’s not to say I’m completely perfect at object-oriented analysis and design, and every time I see someone else’s code I’m interested in what patterns they’ve chosen, and how they’ve organised their code. Actually, my current writing project—a book, video training course and several conference talks on TDD—has been useful for learning more about decomposition and modularity.
  • systems decomposition: Level 2. This is an area I definitely need to focus on, because almost all modern applications are actually multiple-component systems that either do, or should, present the same data across multiple devices and feature both dedicated and incidental online components.
  • communication: Level 3. I think the amount of time I’ve spent recently on books, conference talks, workshops and the like means that I’ve actually over-engineered this aspect significantly and can dial back on it once my current obligations are discharged.
  • code organisation within a file: Unsure. Maybe level 3: well I definitely have a style I use and am comfortable with: is it “beautiful”? Does that even matter? Shouldn’t the important factor be: “reading this programmer’s code doesn’t get in the way of understanding and modifying this programmer’s code”?
  • code organisation across multiple files: Level 2. Who uses the file system any more, anyway? I organise my classes and other resources in the Xcode project navigator. Turns out I use Xcode as some sort of “project builder” tool, not the Finder. If the view of projects and groups in the IDE were the metric in question, then I think I’d be at Level 3 here.
  • source tree organisation: Level 2, for similar reasons.
  • code readability: Level 3. Again, this book/research project on TDD has been my most recent and valuable guide.
  • defensive coding: Level 3. Ensuring you don’t only code and test the happy path is one of my current rants, if you give me enough beer. There was a section in my first book on this.
  • error handling: Level 3. I think this is really just an aspect of defensive coding, though I agree that a consistent approach across projects is important. I’m not a person who avoids using exceptions in Objective-C code.
  • IDE: Level 2. I suppose if you count shell script build phases as ‘custom macros’, then I’ve done level 3 work, but I don’t have a library of those shell scripts that I carry around to various projects. I probably could make some efficiency savings by recording how I use Xcode and reducing any commonly-repeated steps or actions.
  • API: Level 1. No, srsly. I know of the existence and capabilities of many of the Cocoa and Cocoa Touch APIs (with, for obvious reasons, the most focus on Foundation and Core Foundation) but always have to look up the class reference docs. This is really the most important thing I need to work on, so my next research project will be to deepen my knowledge of the APIs.
  • framework: Level 2. In addition to the Cocoa frameworks, I’ve worked with WebObjects, Eclipse RCP, wxWidgets and (more’s the pity) Swing. Not sure I particularly need to write a framework at the moment, there are other people who do good jobs of that. Exploring MonoTouch would be useful.
  • requirements: Level 3. Being a contractor, it’s important to know when to try and change the client’s mind and do things differently, if you want to write good software rather than write crap for a good hourly rate. I’ve also worked on in-house teams where the product manager or architect didn’t have any Mac experience, and it’s impossible to work on those without being able to suggest improved requirements.
  • scripting: Level 2. Yes, automating common developer tasks is still in its infancy, and I probably should find, clean up and publish some of the scripts I’ve created over time.
    (By the way: different scripting languages are useful for different things. I’ve learnt a lot of bash, perl and python, but should probably pick up ruby too. Don’t dismiss any of them for looking like unintelligible spider scrawl; you’ll probably find a use for it one day.)
  • database: Level 1. EOF and Core Data do exist for a reason…to be honest I don’t think I’ll ever be a Level 3 database person, it’s probably cheaper to hire somebody if I ever need that. Most of the products I’m currently being asked to work on either have somebody else for that or don’t have strong performance requirements.

Experience

Actually, I’m going to skip this section. I don’t believe that you can do a research project on improving your experience; only your exposure. You gain experience by doing.

Knowledge

  • tool knowledge: Level 2. As I’ve said there are clear shortcomings in the tools I’ve used, and I’d love to improve them or write something new and get it out there.
  • languages exposed to: Level 2. I have done Prolog, but no concurrent programming languages. Additionally, Andre Pang’s talk on Functional Programming at NSConf 09 showed that I could learn a lot more from that arena.
  • knowledge of upcoming technologies: Level 3. It’s a particular point of professional pride to me that I know what’s coming up and how it changes what I’m working on. Of course, in a world of non-disclosure agreements then “sharing” has to be limited to immediate project teams and people at WWDC. You are coming to WWDC, right?
  • platform internals: Level 2. I’m not Amit Singh. I already have a project on the horizon to do some more hacking on Clang, and have spoken about that before at NSConf Mini. It’s questionable whether a deep understanding of the platform internals is necessary, or whether that should be abstracted by the platform frameworks and interfaces. Personally I believe it is something useful to know, particularly in diagnosing bug reports. It’s well known that almost no bugs reported in your software are really bugs in the OS, and a deep understanding of the OS can help to confirm that.
  • books: Level 1, because I haven’t read Peopleware. I’ve written 1.5 books, does that count for anything?
  • blogs: Level 3. You’re reading it. As identified above, I probably expend more effort on communication than is necessary.

So, that’s my current state of competency in a nutshell. From there, it looks like my priorities for research goals over the upcoming year or so should be:

  1. Improve breadth and depth of API knowledge
  2. Investigate systems decomposition of a smartphone/desktop/internet system.
  3. Learn how to create and support DSLs
  4. Write and publish improved developer support tools

OK, your turn: what are yours?

What happens when you jailbreak an iPad

Having played around with an iPad running a jailbreak OS yesterday, I thought it would be useful to explain one possible attack that doesn’t seem to get much coverage.

As I’ve discussed in numerous talks, the data protection feature of iOS (introduced in iOS 4, enabled by setting the NSFileProtectionComplete option on a file or writing data with the NSDataWritingFileProtectionComplete option) only works fully when the user has a passcode lock enabled. The operating system can derive a key to protect the files (indirectly, but that’s another talk) from the passcode, so when the device is locked the files are really inaccessible because the device has no idea what the unlock key is.

This can be seen when you try and access the content via SSH. Of course, the SSH daemon must be installed on a jailbreak operating system, but you don’t need the passcode to jailbreak:

$ ssh -l root@192.168.0.27
[key/auth exchange...the default password is still 'alpine']
# cd /User/Applications/C393CDBF-1A82-4D7B-A064-D6DFB8CC20DB/Documents
# cat UnprotectedFile
The flag. You haz it.
# cat ProtectedFile
cat: Error: Operation not permitted
[unlock]
# cat ProtectedFile
The flag. You haz it.

Now of course you do need physical access to jailbreak, but that doesn’t take particularly long. So here’s a situation that should probably appear in your threat models:

  1. Attacker retrieves target’s iPad
  2. Attacker installs jailbreak OS with data-harvesting tools
  3. Attacker returns iPad to the target
  4. Target uses iPad

Of course, an attacker who simply tea leafs the target’s iPad can’t perform this attack, and won’t be able to retrieve the files.

On counting numbers

While we were at NSConference, Alistair Houghton told me that he was working on static NSNumbers in clang. I soon thought: wouldn’t it be nice to have code like this?

for (NSNumber *i in [@10 times]) { /* ... */ }

That would work something like this. You must know three things: one is that the methods have been renamed to avoid future clashes with Apple methods. Another is that we automatically get the NSFastEnumeration support from NSEnumerator, though it certainly is possible to code up a faster implementation of this. Finally, that this code is available under the terms of the WTFPL though without warranty, to the extent permitted by applicable law.

NSNumber+FALEnumeration.h

@interface NSNumber (FALEnumeration)
- (NSEnumerator *)FALtimes;
- (NSEnumerator *)FALto: (NSNumber *)otherNumber;
- (NSEnumerator *)FALto: (NSNumber *)otherNumber step: (double)step;
@end

NSNumber+FALEnumeration.m

#import "NSNumber+FALEnumeration.h"
#import "FALNumberEnumerator.h"

@implementation NSNumber (FALEnumeration)

- (NSEnumerator *)FALtimes {
    double val = [self doubleValue];
    return [FALNumberEnumerator enumeratorFrom: 0.0
                                            to: val
                                          step: val > 0.0 ? 1.0 : -1.0];
}

- (NSEnumerator *)FALto: (NSNumber *)otherNumber {
    double val = [self doubleValue];
    double otherVal = [otherNumber doubleValue];
    double sgn = (otherVal - val) > 0.0 ? 1.0 : -1.0;
    return [self to: otherNumber step: sgn]; 
}

- (NSEnumerator *)FALto: (NSNumber *)otherNumber step: (double)step {
    double val = [self doubleValue];
    double otherVal = [otherNumber doubleValue];
    return [FALNumberEnumerator enumeratorFrom: val
                                            to: otherVal
                                          step: step];
}
@end

FALNumberEnumerator.h

@interface FALNumberEnumerator : NSEnumerator {
    double end;
    double step;
    double cursor;
}

+ (id)enumeratorFrom: (double)beginning to: (double)conclusion step: (double)gap;

- (id)nextObject;
- (NSArray *)allObjects;

@end

FALNumberEnumerator.m

#import "FALNumberEnumerator.h"

@implementation FALNumberEnumerator

+ (id)enumeratorFrom:(double)beginning to:(double)conclusion step:(double)gap {
    NSParameterAssert(gap != 0.0);
    NSParameterAssert((conclusion - beginning) * gap > 0.0);
    FALNumberEnumerator *enumerator = [[self alloc] init];
    if (enumerator) {
        enumerator->end = conclusion;
        enumerator->step = gap;
        enumerator->cursor = beginning;
    }
    return [enumerator autorelease];
}

- (id)nextObject {
    if ((step > 0.0 && cursor >= end) || (step < 1.0 && cursor <= end)) {
        return nil;
    }
    id answer = [NSNumber numberWithDouble: cursor];
    cursor += step;
    return answer;
}

- (NSArray *)allObjects {
    NSMutableArray *objs = [NSMutableArray array];
    id nextObj = nil;
    while ((nextObj = [self nextObject]) != nil) {
        [objs addObject: nextObj];
    }
    return [[objs copy] autorelease];
}

@end

On NSInvocation

I was going to get down to doing some writing, but then I got some new kit I needed to set up, so that isn’t going to happen. Besides which, I was talking to one developer about NSInvocation and writing to another about NSInvocation, then another asked about NSInvocation. So now seems like a good time to talk about NSInvocation.

What is NSInvocation?

Well, we could rely on Apple’s NSInvocation class reference to tell us that

An NSInvocation is an Objective-C message rendered static, that is, it is an action turned into an object.

This means that you can construct an invocation describing sending a particular message to a particular object, without actually sending the message. At some later point you can send the message as rendered, or you can change the target, or any of the parameters. This “store-and-forward” messaging makes implementing some parts of an app very easy, and represents a realisation of a design pattern called Command.

How is that useful?

The Gang of Four describes Command like this:

Encapsulate a request as an object, thereby letting you parameterize clients with different requests, queue or log requests, and support undoable operations.

Well, what is NSInvocation other than a request encapsulated as an object?

You can imagine that this would be useful in a distributed system, such as a remote procedure call (RPC) setup. In such a situation, code in the client process sends a message to its RPC library, which is actually acting as a proxy for the remote service. The library bundles up the invocation and passes it to the remote service, where the RPC implementation works out which object in the server process is being messaged and invokes the message on that target.

Spoiler alert: that really is how Distributed Objects on Mac OS X operates. NSInvocation instances can be serialised over a port connection and sent to remote processes, where they get deserialised and invoked.

An undo manager, similarly, works using the Command pattern and NSInvocation. Registering an undo action creates an invocation, describing what would need to be done to revert some user action. This invocation is placed on a queue, so the undo operations are all recorded in order. When the user hits Cmd-Z, the undo manager sends the most recent undo invocation to its target.

Similarly, an operation queue is just a list of requests that can be invoked later…this also sounds like it could be a job for NSInvocation (though to be sure, blocks are also used, which is another implementation of the same pattern).

The remaining common application of Command is for sending the same method to all of the objects in a collection. You would construct an invocation for the first object, then for each object in the collection change the invocation’s target before invoking it.

Got a Concrete Example?

OK, here’s one. You can use +[NSThread detachNewThreadSelector: toObject: withTarget:] to spawn a new thread. Because every thread in an iOS application needs its own autorelease pool, you need to create an autorelease pool at the beginning of the target selector’s method and release it at the end. Without using the Command pattern, this means one or more of:

  • Having a memory leak, if you can’t edit the method implementation
  • Having boilerplate autorelease pool code on every method that might – sometime – be called on its own thread
  • Having a wrapper method for any method that might – sometime – need to be called with or without a surrounding pool.

Sucks, huh? Let’s see if we can make that any better with NSInvocation and the Command pattern.

- (id)newResultOfAutoreleasedInvocation:(NSInvocation *)inv {
    id returnValue = nil;
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    [inv invoke];
    if ([[inv methodSignature] methodReturnLength] > 0) {
        if (strncmp([[inv methodSignature] methodReturnType],@encode(id), 1)) {
            char *buffer = malloc([[inv methodSignature] methodReturnLength]);
            if (buffer != NULL) {
                [inv getReturnValue: buffer];
                returnValue = [NSValue valueWithBytes: buffer objCType: [[inv methodSignature] methodReturnType]];
                free(buffer);
            }
        }
        else {
            [inv getReturnValue: &returnValue];
        }
        [returnValue retain];
    }
    [pool release];
    return returnValue;
}

Of course, we have to return a retained object, because the NSAutoreleasePool at the top of the stack when the invocation is fired off no longer exists. That’s why the method name is prefixed with “new”: it’s a hint to the analyser that the method will return a retained object.

The other trick here is the mess involving NSValue. That, believe it or not, is a convenience, so that the same method can be used to wrap invocations that have non-object return values. Of course, using NSInvocation means we’re subject to its limitations: we can’t use variadic methods or those that return a union type.

Now, for any method you want to call on a separate thread (or in an operation, or from a dispatch queue, or…) you can use this wrapper method to ensure that it has an autorelease pool in place without having to grub into the method implementation or write a specific wrapper method.

A side note on doing Objective-C properly: this method compares the result of -[NSMethodSignature methodReturnType] with a specific type using the @encode() keyword. Objective-C type encodings are documented to be C strings, and there’s even a page in the documentation listing the current values returned by @encode. It’s best not to rely on those, as Apple might choose to change or extend them in the future.

On comment docs

Something I’m looking at right now is generation of (in my case, HTML) API documentation from some simple markup format. The usual way to do this is by writing documentation markup inline in the source code, using specially formatted comments in header files.

The point

Some people argue that well-written source code should be its own documentation. Well, that’s true, it is: but it’s documentation with limited utility. Source code provides the following documentation:

  • Document, for the compiler’s benefit, the machine instructions that the compiler should generate
  • Document, for the programmer’s benefit, the machine instructions that the compiler should generate

Developing a high-level model of how software works from its source code is possible, but mentally taxing. It’s not designed for that. It would be like asking an ant to map the coastline of Africa: it can be done, but the information available is at entirely the wrong scale.

Several other approaches for high-level documentation of software systems exist. Of course, each of them is not actually the source code, but a model: a map of Africa is not actually Africa, but if you want to know what the coastline of Africa looks like then the map is a very useful model.

Comment documentation is one such model of software. Well-written comment docs explain why you might use a method or class, and how its properties or parameters help you to use it. It gives you a sense of how the classes fit together, and how you can exploit them. They’re called comment docs because they go into comments right alongside the source they document, usually marked up with particular tokens. As an example of a token, many documentation comment systems let me use a line like this:

@author Graham J. Lee

to indicate that I wrote a particular part of the project.

Of course, this documentation could go anywhere, so why put it into the header files? For a start, you already have the header files, so you’re not having to maintain two parallel hierarchies of content. Also, the proximity of the documentation to the source code means that there’s a higher probability (still not unity, but higher) that a developer who changes the intent or usage of a method will remember to update the documentation. Additionally, it means that the documentation can be no more verbose than required: anything that is obvious from the source (the names of methods and their parameters, for example) can be discovered from the source. This is not inconsistent with my earlier statement about source-as-documentation: you can easily find out a method signature without having to grub through the actual program instructions.

Finally, it means that when you’re working with the source, you have the documentation right there. This is one very common way to interact with comment documentation. The other way is to use a tool to create some friendly formatted output (for me, the goal is HTML) by reading the source files and extracting the useful information from the comments.

Doxygen

I have for the last forever (alright, three years) used Doxygen, for a couple of reasons:

  • The other people on my team at the time I adopted it were already using it
  • Since then, it has gained support for generating Xcode documentation sets, which can be viewed inside the Xcode organizer.

It’s not great, though. It’s a very complicated tool that tries to do all things for all people, so the configuration is huge and needs a lot of consideration. You can customise the look of the output using a CSS file but you pretty much need to as the default output looks like arse. Making it actually output different stuff is trickier, as it’s a C++ project and I’ve only ever learned enough C++ to customise Clang.

HeaderDoc

So I decided to fish around for alternatives. Of course, we know that Apple uses (and ships) HeaderDoc, but it uses its own comment format. Luckily, its comment format is identical to Javadoc, which is the format used by Doxygen. You can get headerdoc2html to look for the /** trigger by passing the -j flag.

Speaking of headerdoc2html, this is one of two programs in the HeaderDoc distribution. The other is gatherheaderdoc, which generates a table of contents from a collection of output files. Each of these is a well-written and well-documented perl script, which makes extension and modification super-easy.

Neither should be immediately required, in fact. The tool understands all of the Objective-C language features, and some extra bonus things like availability macros and groups of related methods. The default output basically looks like someone forgot to apply the style sheet to developer.apple.com. It’s trivial to configure HeaderDoc to use an external style sheet, and you can even create a custom template HTML file so that things appear in whatever fashion you want. Another useful configuration point is the C preprocessor, which you can get HeaderDoc to run through before interpreting the documentation.

Autogsdoc

The GNUstep project uses its own comment documentation tool called Autogsdoc. The fact that the autogsdoc documentation has buggy HTML does not fill one with confidence.

In fact, Autogsdoc’s output is unstyled XHTML 1.0, so it would be easy to style it to look more useful. Some projects that I’ve seen appear to have frames-based HTML, which is unfortunate.

Autogsdoc uses inline comment documentation, the same as the other options we’ve looked at. However, its markup is significantly different, as it uses SGML-style tags inside the documentation. It has a (well-documented, of course) Objective-C implementation with a good split of responsibilities between different classes. Hacking about on its innards shouldn’t be too hard for any motivated Cocoa coder.

Footnote

I know this has been annoying you all since the first paragraph in the section on HeaderDoc: */.

On Being a Software Person

Mike's Mum.On Wednesday I spoke at Qcon London, about “Mobile App Security and Privacy: You’re Doing It Wrong (and so am I)” as part of @akosma’s track on iOS and Android. The whole track was full of win: particularly, if you ever get the opportunity to hear @fraserspeirs talk about how his school is using iPads to change the way they teach, please do take the opportunity. You will find out a lot about how to write apps that people can (and will) use.

Thursday, I was part of a London iPhone Developer Group panel, alongside @akosma and @bmf, where we talked about $5 Xcode, when iOS and Mac OS X will finally converge, why you can’t sell software to an Android user. Oh, and we prototyped the UI for Photoshop for iPad.

Seriously, we did that. Why? Because there was sentiment in the discussion group that desktops couldn’t die because it was impossible to do Photoshop on an iPad. This is annoying. In order to show the group that doing Photoshop on the iPad might be possible, I made us do it. Now it turns out that once you have done it, it is possible.

[Update: @akosma since pointed out that Adobe has already done PhotoShop on iOS, and that this whole conversation was redundant before we began.]

What this really shows us is that in order to be a Software Person, you need to take seriously the idea that writing software might be possible. It often is possible, and if you try it you’re more likely to get a useful (and, of course, saleable) app out the end than if you don’t try it.

Conveniently, this is compatible with a meme that I started in my Qcon talk, which goes like this:

If you do not know x, then you cannot be a software engineer.

So, if you do not know tenacity, then you cannot be a software engineer. Conversely, you must also know when discretion is the better part of valour, and it is time to give up. No-one likes a death march project, when the cost of development has gone way beyond any likely return, or when the software you’re writing is no longer relevant.

The following is a list of things that you must know (or be able to hire someone to know for you) in order to be a software engineer. It is not exhaustive: these are merely the ones that I either came up with in my talk, or that Mike and I talked about while learning about evolution.

  • Tenacity
  • Discretion
  • Marketing
  • User experience
  • Estimation
  • Humility
  • When to be a dick
  • Testing
  • Human nature
  • Why phishing is successful
  • Requirements engineering
  • Why you would want to be a software engineer

Some people also add a task called “coding” to this list. Please add your own in the comments.