What happens when you jailbreak an iPad

Having played around with an iPad running a jailbreak OS yesterday, I thought it would be useful to explain one possible attack that doesn’t seem to get much coverage.

As I’ve discussed in numerous talks, the data protection feature of iOS (introduced in iOS 4, enabled by setting a file’s NSFileProtectionKey attribute to NSFileProtectionComplete or by writing data with the NSDataWritingFileProtectionComplete option) only works fully when the user has a passcode lock enabled. The operating system derives a key to protect the files (indirectly, but that’s another talk) from the passcode, so when the device is locked the files are really inaccessible because the device has no idea what the unlock key is.

This can be seen when you try to access the content via SSH. Of course, the SSH daemon must be installed on a jailbreak operating system, but you don’t need the passcode to jailbreak:

$ ssh root@192.168.0.27
[key/auth exchange...the default password is still 'alpine']
# cd /User/Applications/C393CDBF-1A82-4D7B-A064-D6DFB8CC20DB/Documents
# cat UnprotectedFile
The flag. You haz it.
# cat ProtectedFile
cat: Error: Operation not permitted
[unlock]
# cat ProtectedFile
The flag. You haz it.
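
For reference, here is a minimal sketch of how an app might create two files like the ones in that transcript. The function names and the directory parameter are my own inventions for illustration; only the file names are borrowed from the session above, and this is not necessarily how the original test app was written.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: writes the same content twice, once with no data
// protection and once with complete protection.
static BOOL FALWriteFlagFiles(NSString *directory, NSError **error) {
    NSData *flag = [@"The flag. You haz it.\n" dataUsingEncoding: NSUTF8StringEncoding];
    NSString *unprotectedPath = [directory stringByAppendingPathComponent: @"UnprotectedFile"];
    NSString *protectedPath = [directory stringByAppendingPathComponent: @"ProtectedFile"];

    // Explicitly opt out of protection: readable whether or not the device is locked.
    if (![flag writeToFile: unprotectedPath
                   options: NSDataWritingFileProtectionNone
                     error: error]) {
        return NO;
    }
    // Complete protection: readable only while the device is unlocked,
    // and only meaningful when the user has set a passcode.
    return [flag writeToFile: protectedPath
                     options: NSDataWritingFileProtectionComplete
                       error: error];
}

// Hypothetical helper: applies complete protection to a file that already exists.
static BOOL FALProtectExistingFile(NSString *path, NSError **error) {
    NSDictionary *attributes = [NSDictionary dictionaryWithObject: NSFileProtectionComplete
                                                           forKey: NSFileProtectionKey];
    return [[NSFileManager defaultManager] setAttributes: attributes
                                            ofItemAtPath: path
                                                   error: error];
}
```

Neither call fails just because there is no passcode set; the file is simply not protected in any useful way until there is one.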

Now of course you do need physical access to jailbreak, but that doesn’t take particularly long. So here’s a situation that should probably appear in your threat models:

  1. Attacker retrieves target’s iPad
  2. Attacker installs jailbreak OS with data-harvesting tools
  3. Attacker returns iPad to the target
  4. Target uses iPad

Of course, an attacker who simply tea leafs the target’s iPad can’t perform this attack, and won’t be able to retrieve the files.

On counting numbers

While we were at NSConference, Alistair Houghton told me that he was working on static NSNumbers in clang. I soon thought: wouldn’t it be nice to have code like this?

for (NSNumber *i in [@10 times]) { /* ... */ }

That would work something like this. You must know three things: one is that the methods have been renamed to avoid future clashes with Apple methods. Another is that we automatically get the NSFastEnumeration support from NSEnumerator, though it certainly is possible to code up a faster implementation of this. Finally, that this code is available under the terms of the WTFPL though without warranty, to the extent permitted by applicable law.

NSNumber+FALEnumeration.h

#import <Foundation/Foundation.h>

@interface NSNumber (FALEnumeration)
- (NSEnumerator *)FALtimes;
- (NSEnumerator *)FALto: (NSNumber *)otherNumber;
- (NSEnumerator *)FALto: (NSNumber *)otherNumber step: (double)step;
@end

NSNumber+FALEnumeration.m

#import "NSNumber+FALEnumeration.h"
#import "FALNumberEnumerator.h"

@implementation NSNumber (FALEnumeration)

- (NSEnumerator *)FALtimes {
    double val = [self doubleValue];
    return [FALNumberEnumerator enumeratorFrom: 0.0
                                            to: val
                                          step: val > 0.0 ? 1.0 : -1.0];
}

- (NSEnumerator *)FALto: (NSNumber *)otherNumber {
    double val = [self doubleValue];
    double otherVal = [otherNumber doubleValue];
    double sgn = (otherVal - val) > 0.0 ? 1.0 : -1.0;
    return [self FALto: otherNumber step: sgn];
}

- (NSEnumerator *)FALto: (NSNumber *)otherNumber step: (double)step {
    double val = [self doubleValue];
    double otherVal = [otherNumber doubleValue];
    return [FALNumberEnumerator enumeratorFrom: val
                                            to: otherVal
                                          step: step];
}
@end

FALNumberEnumerator.h

#import <Foundation/Foundation.h>

@interface FALNumberEnumerator : NSEnumerator {
    double end;
    double step;
    double cursor;
}

+ (id)enumeratorFrom: (double)beginning to: (double)conclusion step: (double)gap;

- (id)nextObject;
- (NSArray *)allObjects;

@end

FALNumberEnumerator.m

#import "FALNumberEnumerator.h"

@implementation FALNumberEnumerator

+ (id)enumeratorFrom:(double)beginning to:(double)conclusion step:(double)gap {
    NSParameterAssert(gap != 0.0);
    // An empty range (beginning == conclusion) is allowed and yields no objects.
    NSParameterAssert((conclusion - beginning) * gap >= 0.0);
    FALNumberEnumerator *enumerator = [[self alloc] init];
    if (enumerator) {
        enumerator->end = conclusion;
        enumerator->step = gap;
        enumerator->cursor = beginning;
    }
    return [enumerator autorelease];
}

- (id)nextObject {
    if ((step > 0.0 && cursor >= end) || (step < 0.0 && cursor <= end)) {
        return nil;
    }
    id answer = [NSNumber numberWithDouble: cursor];
    cursor += step;
    return answer;
}

- (NSArray *)allObjects {
    NSMutableArray *objs = [NSMutableArray array];
    id nextObj = nil;
    while ((nextObj = [self nextObject]) != nil) {
        [objs addObject: nextObj];
    }
    return [[objs copy] autorelease];
}

@end
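
With those four files in the project, using the category looks something like the sketch below. It assumes manual reference counting, as the enumerator code does, and it builds its numbers with +numberWithInt: so that it does not depend on compiler support for NSNumber literals. The function name is mine, purely for illustration.

```objective-c
#import <Foundation/Foundation.h>
#import "NSNumber+FALEnumeration.h"

// Hypothetical demonstration function: collects the values that two
// enumerations visit, so they can be inspected or logged.
static NSArray *FALCountingDemo(void) {
    NSMutableArray *visited = [NSMutableArray array];

    // Visits 0, 1 and 2.
    for (NSNumber *i in [[NSNumber numberWithInt: 3] FALtimes]) {
        [visited addObject: i];
    }

    // Visits 1, 3 and 5: the end value itself is never produced.
    NSNumber *one = [NSNumber numberWithInt: 1];
    NSNumber *seven = [NSNumber numberWithInt: 7];
    for (NSNumber *i in [one FALto: seven step: 2.0]) {
        [visited addObject: i];
    }
    return visited;
}
```

The values come back as doubles wrapped in NSNumber, so call -intValue or -integerValue on them if you need to index into something.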

On NSInvocation

I was going to get down to doing some writing, but then I got some new kit I needed to set up, so that isn’t going to happen. Besides which, I was talking to one developer about NSInvocation and writing to another about NSInvocation, then another asked about NSInvocation. So now seems like a good time to talk about NSInvocation.

What is NSInvocation?

Well, we could rely on Apple’s NSInvocation class reference to tell us that

An NSInvocation is an Objective-C message rendered static, that is, it is an action turned into an object.

This means that you can construct an invocation describing sending a particular message to a particular object, without actually sending the message. At some later point you can send the message as rendered, or you can change the target, or any of the parameters. This “store-and-forward” messaging makes implementing some parts of an app very easy, and represents a realisation of a design pattern called Command.
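
To make that concrete, here is a minimal sketch of building an invocation, firing it, then retargeting it and firing it again. The function name and the strings are made up for the example, and it assumes manual reference counting (under ARC the variables passed to -getReturnValue: would need to be __unsafe_unretained).

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical demonstration: one invocation, sent to two different targets.
static NSArray *FALDeferredGreetings(void) {
    NSString *firstTarget = @"Hello, ";
    NSString *secondTarget = @"Goodbye, ";
    NSString *argument = @"world";
    SEL selector = @selector(stringByAppendingString:);

    // Describe the message without sending it.
    NSMethodSignature *signature = [firstTarget methodSignatureForSelector: selector];
    NSInvocation *invocation = [NSInvocation invocationWithMethodSignature: signature];
    [invocation setTarget: firstTarget];
    [invocation setSelector: selector];
    // Indices 0 and 1 are self and _cmd, so the first real argument is at 2.
    [invocation setArgument: &argument atIndex: 2];

    // Some time later: send the message as rendered.
    NSString *firstResult = nil;
    [invocation invoke];
    [invocation getReturnValue: &firstResult];

    // Change the target and send the same message again.
    NSString *secondResult = nil;
    [invocation setTarget: secondTarget];
    [invocation invoke];
    [invocation getReturnValue: &secondResult];

    // Contains "Hello, world" and "Goodbye, world".
    return [NSArray arrayWithObjects: firstResult, secondResult, nil];
}
```

If the invocation is going to outlive the scope that created it, send it -retainArguments first, because by default it does not retain its target or its object arguments.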

How is that useful?

The Gang of Four describes Command like this:

Encapsulate a request as an object, thereby letting you parameterize clients with different requests, queue or log requests, and support undoable operations.

Well, what is NSInvocation other than a request encapsulated as an object?

You can imagine that this would be useful in a distributed system, such as a remote procedure call (RPC) setup. In such a situation, code in the client process sends a message to its RPC library, which is actually acting as a proxy for the remote service. The library bundles up the invocation and passes it to the remote service, where the RPC implementation works out which object in the server process is being messaged and invokes the message on that target.

Spoiler alert: that really is how Distributed Objects on Mac OS X operates. NSInvocation instances can be serialised over a port connection and sent to remote processes, where they get deserialised and invoked.

An undo manager, similarly, works using the Command pattern and NSInvocation. Registering an undo action creates an invocation, describing what would need to be done to revert some user action. This invocation is pushed onto a stack, so the undo operations are all recorded in order. When the user hits Cmd-Z, the undo manager sends the most recent undo invocation to its target.

Similarly, an operation queue is just a list of requests that can be invoked later…this also sounds like it could be a job for NSInvocation (though to be sure, blocks are also used, which is another implementation of the same pattern).

The remaining common application of Command is for sending the same method to all of the objects in a collection. You would construct an invocation for the first object, then for each object in the collection change the invocation’s target before invoking it.
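
That collection case can be sketched as follows. Both function names are hypothetical, and the mutable strings are only there to give the invocation something visible to do.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: retargets one invocation at each object in turn.
static void FALSendInvocationToAll(NSInvocation *invocation, NSArray *targets) {
    for (id target in targets) {
        [invocation setTarget: target];
        [invocation invoke];
    }
}

// Hypothetical demonstration: appends "!" to every string in an array.
static NSArray *FALShoutAll(void) {
    NSArray *strings = [NSArray arrayWithObjects:
                        [NSMutableString stringWithString: @"one"],
                        [NSMutableString stringWithString: @"two"],
                        nil];
    NSString *suffix = @"!";
    SEL selector = @selector(appendString:);

    // No target yet, so ask the class for the signature.
    NSMethodSignature *signature =
        [NSMutableString instanceMethodSignatureForSelector: selector];
    NSInvocation *invocation = [NSInvocation invocationWithMethodSignature: signature];
    [invocation setSelector: selector];
    [invocation setArgument: &suffix atIndex: 2];

    FALSendInvocationToAll(invocation, strings);
    return strings; // now "one!" and "two!"
}
```

For a message with no arguments or a single object argument, -[NSArray makeObjectsPerformSelector:withObject:] already does this for you; the invocation earns its keep when there are several arguments or non-object ones.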

Got a Concrete Example?

OK, here’s one. You can use +[NSThread detachNewThreadSelector:toTarget:withObject:] to spawn a new thread. Because every thread in an iOS application needs its own autorelease pool, you need to create an autorelease pool at the beginning of the method that the thread runs and release it at the end. Without using the Command pattern, this means one or more of:

  • Having a memory leak, if you can’t edit the method implementation
  • Having boilerplate autorelease pool code on every method that might – sometime – be called on its own thread
  • Having a wrapper method for any method that might – sometime – need to be called with or without a surrounding pool.

Sucks, huh? Let’s see if we can make that any better with NSInvocation and the Command pattern.

- (id)newResultOfAutoreleasedInvocation:(NSInvocation *)inv {
    id returnValue = nil;
    // Anything autoreleased while the invocation runs lands in this pool.
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    [inv invoke];
    // A return length of zero means the method returns void.
    if ([[inv methodSignature] methodReturnLength] > 0) {
        // strncmp is non-zero when the return type is not an object.
        if (strncmp([[inv methodSignature] methodReturnType],@encode(id), 1)) {
            // Box the non-object return value in an NSValue.
            char *buffer = malloc([[inv methodSignature] methodReturnLength]);
            if (buffer != NULL) {
                [inv getReturnValue: buffer];
                returnValue = [NSValue valueWithBytes: buffer objCType: [[inv methodSignature] methodReturnType]];
                free(buffer);
            }
        }
        else {
            [inv getReturnValue: &returnValue];
        }
        // Keep the result alive past the release of the pool.
        [returnValue retain];
    }
    [pool release];
    return returnValue;
}

Of course, we have to return a retained object, because the NSAutoreleasePool at the top of the stack when the invocation is fired off no longer exists. That’s why the method name is prefixed with “new”: it’s a hint to the analyser that the method will return a retained object.

The other trick here is the mess involving NSValue. That, believe it or not, is a convenience, so that the same method can be used to wrap invocations that have non-object return values. Of course, using NSInvocation means we’re subject to its limitations: we can’t use variadic methods or those that take union arguments.

Now, for any method you want to call on a separate thread (or in an operation, or from a dispatch queue, or…) you can use this wrapper method to ensure that it has an autorelease pool in place without having to grub into the method implementation or write a specific wrapper method.
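
Here is a sketch of the thread case. Note that it uses a cut-down, void-only variant of the wrapper rather than the method above: when a thread is detached nobody receives the method’s return value, so a retained result would leak. FALWorker, FALPoolRunner and FALRunOnNewThread are all hypothetical names, and the code assumes manual reference counting.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical worker whose method knows nothing about threads or pools.
@interface FALWorker : NSObject
- (void)doWork;
@end

@implementation FALWorker
- (void)doWork {
    // Creates autoreleased temporaries, as most Cocoa code does.
    NSString *message = [NSString stringWithFormat: @"working on %@",
                         [NSThread currentThread]];
    NSLog(@"%@", message);
}
@end

// Hypothetical home for the void-only variant of the wrapper.
@interface FALPoolRunner : NSObject
- (void)invokeInAutoreleasePool: (NSInvocation *)inv;
@end

@implementation FALPoolRunner
- (void)invokeInAutoreleasePool: (NSInvocation *)inv {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    [inv invoke];
    [pool release];
}
@end

// Runs -doWork on a new thread with a pool around it, without touching
// the implementation of -doWork.
static NSInvocation *FALRunOnNewThread(FALWorker *worker) {
    SEL selector = @selector(doWork);
    NSInvocation *inv = [NSInvocation invocationWithMethodSignature:
                         [worker methodSignatureForSelector: selector]];
    [inv setTarget: worker];
    [inv setSelector: selector];
    // The worker must stay alive until the thread has finished with it.
    [inv retainArguments];

    FALPoolRunner *runner = [[[FALPoolRunner alloc] init] autorelease];
    // NSThread retains the runner and the invocation while the thread runs.
    [NSThread detachNewThreadSelector: @selector(invokeInAutoreleasePool:)
                             toTarget: runner
                           withObject: inv];
    return inv;
}
```

The same runner works unchanged for any other target and selector that returns nothing, which is the point of the exercise.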

A side note on doing Objective-C properly: this method compares the result of -[NSMethodSignature methodReturnType] with a specific type using the @encode() keyword. Objective-C type encodings are documented to be C strings, and there’s even a page in the documentation listing the current values returned by @encode. It’s best not to hard-code those values, as Apple might choose to change or extend them in the future; comparing against @encode() lets the compiler supply whatever the current encoding is.

On comment docs

Something I’m looking at right now is generation of (in my case, HTML) API documentation from some simple markup format. The usual way to do this is by writing documentation markup inline in the source code, using specially formatted comments in header files.

The point

Some people argue that well-written source code should be its own documentation. Well, that’s true, it is: but it’s documentation with limited utility. Source code provides the following documentation:

  • Document, for the compiler’s benefit, the machine instructions that the compiler should generate
  • Document, for the programmer’s benefit, the machine instructions that the compiler should generate

Developing a high-level model of how software works from its source code is possible, but mentally taxing. It’s not designed for that. It would be like asking an ant to map the coastline of Africa: it can be done, but the information available is at entirely the wrong scale.

Several other approaches for high-level documentation of software systems exist. Of course, each of them is not actually the source code, but a model: a map of Africa is not actually Africa, but if you want to know what the coastline of Africa looks like then the map is a very useful model.

Comment documentation is one such model of software. Well-written comment docs explain why you might use a method or class, and how its properties or parameters help you to use it. They give you a sense of how the classes fit together, and how you can exploit them. They’re called comment docs because they go into comments right alongside the source they document, usually marked up with particular tokens. As an example of a token, many documentation comment systems let me use a line like this:

@author Graham J. Lee

to indicate that I wrote a particular part of the project.
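
In context, a documented header might look like the sketch below. The class is entirely made up for illustration, and the comments use the Javadoc-style /** opener with @param and @return tokens, which is one common convention rather than the only one.

```objective-c
#import <Foundation/Foundation.h>

/**
 * Records temperature readings and reports simple statistics on them.
 * Use one log per sensor; readings are kept in the order they were added.
 * @author Graham J. Lee
 */
@interface FALTemperatureLog : NSObject

/**
 * Adds a reading to the end of the log.
 * @param celsius The temperature in degrees Celsius.
 */
- (void)addReading: (double)celsius;

/**
 * Calculates the mean of every reading recorded so far.
 * @return The mean in degrees Celsius, or 0.0 if the log is empty.
 */
- (double)meanReading;

@end
```

Notice that the comments say what the class is for and what happens in the edge case, and leave the method names and parameter types to speak for themselves.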

Of course, this documentation could go anywhere, so why put it into the header files? For a start, you already have the header files, so you’re not having to maintain two parallel hierarchies of content. Also, the proximity of the documentation to the source code means that there’s a higher probability (still not unity, but higher) that a developer who changes the intent or usage of a method will remember to update the documentation. Additionally, it means that the documentation can be no more verbose than required: anything that is obvious from the source (the names of methods and their parameters, for example) can be discovered from the source. This is not inconsistent with my earlier statement about source-as-documentation: you can easily find out a method signature without having to grub through the actual program instructions.

Finally, it means that when you’re working with the source, you have the documentation right there. This is one very common way to interact with comment documentation. The other way is to use a tool to create some friendly formatted output (for me, the goal is HTML) by reading the source files and extracting the useful information from the comments.

Doxygen

I have for the last forever (alright, three years) used Doxygen, for a couple of reasons:

  • The other people on my team at the time I adopted it were already using it
  • Since then, it has gained support for generating Xcode documentation sets, which can be viewed inside the Xcode organizer.

It’s not great, though. It’s a very complicated tool that tries to do all things for all people, so the configuration is huge and needs a lot of consideration. You can customise the look of the output using a CSS file but you pretty much need to as the default output looks like arse. Making it actually output different stuff is trickier, as it’s a C++ project and I’ve only ever learned enough C++ to customise Clang.

HeaderDoc

So I decided to fish around for alternatives. Of course, we know that Apple uses (and ships) HeaderDoc, but it uses its own comment format. Luckily, the markup inside its comments is close to Javadoc’s, a format that Doxygen also understands. You can get headerdoc2html to look for the /** trigger by passing the -j flag.

Speaking of headerdoc2html, this is one of two programs in the HeaderDoc distribution. The other is gatherheaderdoc, which generates a table of contents from a collection of output files. Each of these is a well-written and well-documented perl script, which makes extension and modification super-easy.

Neither extension nor modification should be immediately required, in fact. The tool understands all of the Objective-C language features, and some extra bonus things like availability macros and groups of related methods. The default output basically looks like someone forgot to apply the style sheet to developer.apple.com. It’s trivial to configure HeaderDoc to use an external style sheet, and you can even create a custom template HTML file so that things appear in whatever fashion you want. Another useful configuration point is the C preprocessor, which you can get HeaderDoc to run the headers through before interpreting the documentation.

Autogsdoc

The GNUstep project uses its own comment documentation tool called Autogsdoc. The fact that the autogsdoc documentation has buggy HTML does not fill one with confidence.

In fact, Autogsdoc’s output is unstyled XHTML 1.0, so it would be easy to style it to look more useful. Some projects that I’ve seen appear to have frames-based HTML, which is unfortunate.

Autogsdoc uses inline comment documentation, the same as the other options we’ve looked at. However, its markup is significantly different, as it uses SGML-style tags inside the documentation. It has a (well-documented, of course) Objective-C implementation with a good split of responsibilities between different classes. Hacking about on its innards shouldn’t be too hard for any motivated Cocoa coder.

Footnote

I know this has been annoying you all since the first paragraph in the section on HeaderDoc: */.

On Being a Software Person

On Wednesday I spoke at Qcon London, about “Mobile App Security and Privacy: You’re Doing It Wrong (and so am I)” as part of @akosma’s track on iOS and Android. The whole track was full of win: particularly, if you ever get the opportunity to hear @fraserspeirs talk about how his school is using iPads to change the way they teach, please do take the opportunity. You will find out a lot about how to write apps that people can (and will) use.

Thursday, I was part of a London iPhone Developer Group panel, alongside @akosma and @bmf, where we talked about $5 Xcode, when iOS and Mac OS X will finally converge, and why you can’t sell software to an Android user. Oh, and we prototyped the UI for Photoshop for iPad.

Seriously, we did that. Why? Because there was sentiment in the discussion group that desktops couldn’t die because it was impossible to do Photoshop on an iPad. This is annoying. In order to show the group that doing Photoshop on the iPad might be possible, I made us do it. Now it turns out that once you have done it, it is possible.

[Update: @akosma since pointed out that Adobe has already done Photoshop on iOS, and that this whole conversation was redundant before we began.]

What this really shows us is that in order to be a Software Person, you need to take seriously the idea that writing software might be possible. It often is possible, and if you try it you’re more likely to get a useful (and, of course, saleable) app out the end than if you don’t try it.

Conveniently, this is compatible with a meme that I started in my Qcon talk, which goes like this:

If you do not know x, then you cannot be a software engineer.

So, if you do not know tenacity, then you cannot be a software engineer. Conversely, you must also know when discretion is the better part of valour, and it is time to give up. No-one likes a death march project, when the cost of development has gone way beyond any likely return, or when the software you’re writing is no longer relevant.

The following is a list of things that you must know (or be able to hire someone to know for you) in order to be a software engineer. It is not exhaustive: these are merely the ones that I either came up with in my talk, or that Mike and I talked about while learning about evolution.

  • Tenacity
  • Discretion
  • Marketing
  • User experience
  • Estimation
  • Humility
  • When to be a dick
  • Testing
  • Human nature
  • Why phishing is successful
  • Requirements engineering
  • Why you would want to be a software engineer

Some people also add a task called “coding” to this list. Please add your own in the comments.