Security consultancy from the other side

I used to run an application security consultancy business, back before the kinds of businesses who knew they needed to consider application security had got much past assessing whether they should be creating mobile apps at all. Whoops!

Something that occasionally, nay, often happened was that clients would get frustrated if I didn’t give them a direct answer to a question they asked, or if the answer to an apparently simple question was a one-day workshop. It certainly seems sketchy that someone who charges a day rate would rather engage in another day’s work than write a quick email, doesn’t it?

Today I was on the receiving end of this interaction. I’m defining the data encryption standards for our applications, so I talked to our tame infosecnician about the problem. Of course, rather than giving specific recommendations about protocols, modes of operation, key lengths, rotation frequencies and so on, he would only answer my questions with other questions.

And this is totally normal and OK. The reason that I want to encrypt (some of) this data is that its confidentiality and integrity are valuable to me (my employer) and it’s worth investing some of my (our) time in protecting those attributes. How much is it worth? How much risk am I willing to leave on the table? How long should the data be protected for? Who do I trust with the data, and how much do I trust them?

Only I can answer those questions, so it’s not up to someone else to answer them for me, but they can remind me that I need to ask them.

Password checking with CommonCrypto

I previously described a system for storing and checking credentials on Mac OS and iOS based on using many rounds of a hashing function to generate a key from the password. Time has moved on, and Apple has extended the CommonCrypto library to provide a simple, standard and supported way of doing this. If this is still a problem you need to solve, you should look at doing it this way instead of following the earlier post.

We’re going to use a key-stretching function called PBKDF2 to make an encryption key of a standard length. This key probably will be much longer than the user’s password. However, it can’t possibly be any more random: it must be deterministically derived from that password so we’re stuck with the same amount of randomness that this password contains. A good key-stretching function should “smooth out” the randomness from the initial password, so that you can’t guess anything about the password given the function’s output.

We’re then going to use this key to calculate an HMAC of some known data. An HMAC is a bit like a digital signature, in that it depends both on the key used and the input data. We’ll store the HMAC but not the password and not the key derived from the password. Whenever the password is needed in the future, it must be provided by a user and cannot be derived from any of the data in the app.

The idea, then, is that when you set a new password, the app calculates this key, uses that to calculate an HMAC and stores the HMAC. When you try to use the app, you present a password, from which the app generates a key and an HMAC of the same data. If this HMAC matches the one that was previously calculated, the same keys were used, which (hopefully) means that the same password was supplied.

The faster you scream, the slower we go.

One of the problems that a key-stretching function must address is that computers are really, really fast. Normally computers being really, really fast is a benefit, but the faster it is to compute all of the above stuff the more guesses an attacker can try at the password in some amount of time. We therefore want to make this function slow enough that brute force attacks are limited, but not so slow that people get frustrated with the app’s performance.

PBKDF2 has a tuneable parameter – the number of rounds of a hashing function it uses internally to stretch the key. With CommonCrypto you can ask for a number of rounds that will result in the function taking (approximately) a certain amount of time to work on a password of a certain length.

Requirements for the rounds parameter

  • The number of rounds used when checking a password should be the same as the number used when generating the stored HMAC, otherwise the keys generated won’t match.
  • The above means that you need to choose a single value to use across app installs and hardware – unless you’re happy with losing access to all user data when a customer upgrades their iPad.
  • You will want to revise this parameter upwards as faster hardware becomes available. This conflicts with the first requirement: you’ll need a fallback mechanism to try the same password with different “versions” of your rounds parameter so that you can upgrade users’ credentials with your app (a sketch of such a fallback appears after the checkPassword: example below).

With those in mind, you can construct an algorithm to choose a tuning parameter based on this call:

        #import <CommonCrypto/CommonKeyDerivation.h>

        const uint32_t oneSecond = 1000; // CCCalibratePBKDF takes a duration in milliseconds
        uint rounds = CCCalibratePBKDF(kCCPBKDF2,
                                       predictedPasswordLength,
                                       predictedSaltLength,
                                       kCCPRFHmacAlgSHA256,
                                       kCCKeySizeAES128,
                                       oneSecond);

You probably know what length of salt you’ll use: the salt should be a block of random data that you supply, used as additional input to the key-stretching function. If two different users supply the same password, the salt stops the function from generating the same output. You probably can’t know what length of password users will choose, but you can guess based on experience, data and knowledge of any password strength rules incorporated into your app.

Notice that you should derive this rounds property on the target hardware. The number of rounds that take a second to run through on your brand new iMac will take significantly longer on your customer’s iPhone 3GS.

Generating the HMAC data

Both storing and checking passwords use the same internal function:

- (NSData *)authenticationDataForPassword: (NSString *)password salt: (NSData *)salt rounds: (uint) rounds
{
    const NSString *plainData = @"Fuzzy Aliens";
    uint8_t key[kCCKeySizeAES128] = {0};
    int keyDerivationResult = CCKeyDerivationPBKDF(kCCPBKDF2,
                                                   [password UTF8String],
                                                   [password lengthOfBytesUsingEncoding: NSUTF8StringEncoding],
                                                   [salt bytes],
                                                   [salt length],
                                                   kCCPRFHmacAlgSHA256,
                                                   rounds,
                                                   key,
                                                   kCCKeySizeAES128);
    if (keyDerivationResult == kCCParamError) {
        //you shouldn't get here with the parameters as above
        return nil;
    }
    uint8_t hmac[CC_SHA256_DIGEST_LENGTH] = {0};
    CCHmac(kCCHmacAlgSHA256,
           key,
           kCCKeySizeAES128,
           [plainData UTF8String],
           [plainData lengthOfBytesUsingEncoding: NSUTF8StringEncoding],
           hmac);
    NSData *hmacData = [NSData dataWithBytes: hmac length: CC_SHA256_DIGEST_LENGTH];
    return hmacData;
}

Storing credentials for a new password simply involves generating a salt, computing the authentication data then writing it somewhere:

- (void)setPassword: (NSString *)password
{
    //generate a random salt…

    //do any checking (on complexity, or whether the passwords entered in two fields match)…

    NSData *hmacData = [self authenticationDataForPassword: password salt: salt rounds: [self roundsForKeyDerivation]];
    //store the HMAC and the salt, perhaps by concatenating them and putting them in the keychain…

}
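
For completeness, here is a minimal sketch of the elided salt-generation step, assuming Security.framework is linked and <Security/SecRandom.h> is imported; the method name is mine, not part of any API.

// A minimal sketch of generating the random salt elided above. Sixteen random
// bytes wrapped in an NSData, suitable for passing to
// -authenticationDataForPassword:salt:rounds:.
- (NSData *)generateSalt
{
    uint8_t saltBytes[16] = {0};
    if (SecRandomCopyBytes(kSecRandomDefault, sizeof(saltBytes), saltBytes) != 0) {
        //Randomization Services failed; don't fall back to anything weaker
        return nil;
    }
    return [NSData dataWithBytes: saltBytes length: sizeof(saltBytes)];
}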

Then testing the credentials involves applying the generation function to the password guess, and comparing the result with what you previously stored:

- (BOOL)checkPassword: (NSString *)password
{
    //recover the HMAC and the salt from wherever you stored them…

    NSData *guessedHmac = [self authenticationDataForPassword: password salt: salt rounds: [self roundsForKeyDerivation]];
    return [guessedHmac isEqualToData: hmacData];
}
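
To satisfy the third requirement from earlier (revising the rounds parameter as hardware gets faster), one approach is to keep a list of every rounds value your app has ever shipped with and fall back through them when a check fails. This is only a sketch: legacyRoundsValues is a hypothetical helper returning that list, newest first, and the storage comments are elided exactly as above.

- (BOOL)checkPasswordAllowingUpgrade: (NSString *)password
{
    //recover the HMAC and the salt from wherever you stored them…

    for (NSNumber *candidateRounds in [self legacyRoundsValues]) {
        uint rounds = [candidateRounds unsignedIntValue];
        NSData *guess = [self authenticationDataForPassword: password salt: salt rounds: rounds];
        if ([guess isEqualToData: hmacData]) {
            if (rounds != [self roundsForKeyDerivation]) {
                //an old rounds value matched: re-store the credential with the current value
                [self setPassword: password];
            }
            return YES;
        }
    }
    return NO;
}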

Conclusion

If you can avoid storing a password in your app, even in the keychain, you should; there’s then a much reduced chance that the password can be recovered from the app by an attacker. Deriving a key from the password using PBKDF2 then testing whether you can use that key to obtain a known cryptographic result obviates the need to store the password itself. Mac OS X and iOS provide an easy way to use PBKDF2 in the CommonCrypto library.

The key derived from the password could even be used to protect the content in the app, so none of the documents are available without the password being presented. Doing this offers additional confidentiality over simply using the password for access control. Building a useful protocol around this key requires key wrapping, the subject of a future post.

TDD and crypto in one place

Well, I suppose if I’ve written two books, it’s about time I wrote a contorted blog post that references both of those worlds.

I recently wrote an encryption module for an app, and thought it’d be useful to share something about the design process. Notice that the source code here was quickly thrown together for the purposes of demonstrating the design technique, and bears little resemblance to the code I was actually writing (which was in a different language). A lot of the complexity I was actually trying to deal with has been elided in order to bring the story to the fore.

Steps 1a, 2a, 1b, 2b etc: Design the API hand-in-hand with a fake implementation.

A pass-through implementation would work here, but I prefer to use something that actually changes its input, just so it’s clear that there’s a requirement that the input be acted on. You might end up with tests that look something like this:

@implementation CryptoDesignTests
{
    id <StringEncryptor>cryptor;
}

- (void)setUp {
    cryptor = [[ROT13StringEncryptor alloc] init];
}

- (void)tearDown {
    cryptor = nil;
}

- (void)testEncryptionOfPlainText {
    NSString *plainText = @"hello";
    NSString *cipherText = [cryptor encipherString: plainText];
    STAssertEqualObjects(@"uryyb", cipherText, @"This method should encrypt its parameter");
}

- (void)testDecryptionOfCipherText {
    NSString *cipherText = @"uryyb";
    NSString *plainText = [cryptor decipherString: cipherText];
    STAssertEqualObjects(@"hello", plainText, @"This method should decrypt its parameter");
}

@end

Where the protocol looks like this:

@protocol StringEncryptor <NSObject>

- (NSString *)encipherString: (NSString *)plainText;
- (NSString *)decipherString: (NSString *)cipherText;

@end

Of course you can implement this however you want. Here’s one potential implementation (which may have its problems, but is only being used to guide the API design):

@interface ROT13StringEncryptor : NSObject <StringEncryptor>

@end

@implementation ROT13StringEncryptor

- (NSString *)rot13: (NSString *)original {
    NSMutableData *originalBytes = [[original dataUsingEncoding: NSASCIIStringEncoding] mutableCopy];
    for (NSInteger i = 0; i < [originalBytes length]; i++) {
        NSRange cursor = NSMakeRange(i, 1);
        char c;
        [originalBytes getBytes: &c range: cursor];
        if       (c >= 'a' && c <= 'm') c += 13;
        else if  (c >= 'n' && c <= 'z') c -= 13;
        else if  (c >= 'A' && c <= 'M') c += 13;
        else if  (c >= 'N' && c <= 'Z') c -= 13;
        [originalBytes replaceBytesInRange: cursor withBytes: &c];
    }
    //initWithData: avoids relying on the mutable buffer being NUL-terminated
    return [[NSString alloc] initWithData: originalBytes encoding: NSASCIIStringEncoding];
}

- (NSString *)encipherString:(NSString *)plainText {
    return [self rot13: plainText];
}

- (NSString *)decipherString:(NSString *)cipherText {
    return [self rot13: cipherText];
}

@end

Step 3. Write an integration test.

Having designed the API for the class, I'd like to write an integration test that does whatever it is that the app is trying to achieve. In my real app, that task was for two cryptors hooked up to each other to agree on a key and pass a message between them. For this demo, we'll require that the cryptor can round-trip a message and end up with the original text.

@implementation CryptoRoundTripTests
{
    id <StringEncryptor> cryptor;
}

- (void)setUp {
    cryptor = [[ROT13StringEncryptor alloc] init];
}

- (void)tearDown {
    cryptor = nil;
}

- (void)testRoundTripEncryption {
    NSString *plain = @"Mary had a little lamb.";
    NSString *outOfTheSausageMachine = [cryptor decipherString: [cryptor encipherString: plain]];
    STAssertEqualObjects(plain, outOfTheSausageMachine, @"Content should survive an encryption round-trip");
}

@end

This test passes without any extra work.

Step 4. Give that lot to the poor sap who needs to write the app.

That's right, the other developers can start working straight away. They've got the protocol which tells them what the object can do, the unit tests which explain how the API should work and the integration test that explains how you expect the object to be used.

Leaving you free to:

Step 5. Point the integration test at a different implementation of the protocol.

But you haven't written that implementation yet! It must be time to:

Step 6. Write it.

You know when it works because you have integration tests. You know it'll work with the app the other person's writing because they've got the same integration test.

So the extra artefacts that some people see as a wasteful byproduct of code-level testing — the fake implementation of the protocol for example — are demonstrably useful in this scenario. Other developers on the team can use the fake implementation as a stand-in while you're head-down coding up the production code. It's also much easier to design a class when you're not also sweating over how the internal details are going to work out, so you probably get where you're going quicker too.

On the magic of key agreement

Imagine that you want to implement AirDrop, or something like it. Two computers that have (possibly) never communicated before are going to share a file. Now you know that you want to encrypt the file in transit so that only these two computers get to see the file, but how do you set up an encryption key? You can’t generate it on one of the computers and send it over, because that way lies the infinite stack of turtles: how do you keep the key secure in transit?

The solution (or at least, one solution known as the Diffie–Hellman protocol) turns out to be beautifully simple. First, arrange that everyone knows two numbers, which we’ll call the base or generator g and the modulus m. For example, in the SKIP protocol these two numbers are documented:

g = 0x cf ca d9 90 8d 98 c1 c4 4e 4e 88 20 69 e8 7b fb
d0 ca e1 ee 24 64 a9 6c 89 70 a0 8c 86 00 30 c6
c2 83 4a 0d 82 bd a5 ba 10 cf 80 af 61 96 11 4b
3b 87 a9 7e 59 98 c4 aa 0c bd b7 2b 23 18 33 64
f3 b6 2e 1a 4d e8 86 85 aa de 78 a1 5d ce 65 33
75 7a 9d 00 b9 9e 05 26 ed 79 62 15 97 c4 06 26
fa 51 e1 5e f5 1d cb d2 23 e2 73 e5 f2 c7 3c c4
de 58 cd 3b b6 15 3c aa 84 7c fa 5f cb 6b 8d 78

m = 0x dd c9 08 3e 8f ae ef 28 16 ad 50 a9 68 ac bd 04
8e 90 9e ab 5d 41 fc 0a 51 3f 86 0a c4 1b 22 b9
30 dc a3 0a 59 73 05 38 59 b7 85 66 df ff c6 5b
b9 1f fe 44 d9 d6 5e cb 9b 68 38 a1 fd 25 3f 01
51 88 9e 93 c3 22 24 f9 03 e6 9b 8b 07 34 9d 9f
9b 38 4b c0 97 03 dc e5 2e 92 47 4c 2c e1 59 26
14 82 49 dd 58 13 91 05 12 11 a1 45 06 ac 11 8f
b1 83 53 46 93 88 9a 46 b1 8a 01 50 cb 5e 82 55

The two computers each generate a random number, which we’ll call x_A and x_B. They keep these numbers secret, but can independently calculate:

r = g^(x_A) mod m

s = g^(x_B) mod m

and they send each other r and s. The first computer can use x_A and s to derive:

k_A = s^(x_A) mod m = g^(x_A.x_B) mod m

while the second uses x_B and r to derive:

k_B = r^(x_B) mod m = g^(x_A.x_B) mod m ≡ k_A

They got the same key! Because you can’t gain any of the secrets k, x_A or x_B knowing only r and/or s, the two systems can negotiate a secret key that’s only known to the two of them without having to first establish a secure channel. I just think that’s really cool.
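
To make the arithmetic concrete, here is a toy run of the exchange with deliberately tiny numbers (g = 5, m = 23); real deployments use values the size of the ones quoted above, and the powMod helper here is purely illustrative.

// Toy Diffie–Hellman run with tiny numbers: both sides derive the same key
// without ever transmitting it.
static unsigned powMod(unsigned base, unsigned exponent, unsigned modulus) {
    unsigned result = 1;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1) result = (result * base) % modulus;
        base = (base * base) % modulus;
        exponent >>= 1;
    }
    return result;
}

void demoKeyAgreement(void) {
    unsigned g = 5, m = 23;
    unsigned xA = 6, xB = 15;          // the secrets; never transmitted
    unsigned r = powMod(g, xA, m);     // 8, A sends this to B
    unsigned s = powMod(g, xB, m);     // 19, B sends this to A
    unsigned kA = powMod(s, xA, m);    // 2
    unsigned kB = powMod(r, xB, m);    // 2, the same shared key
    NSLog(@"keys agree: %@", kA == kB ? @"YES" : @"NO");
}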

If you want to use this stuff in an iOS app, OpenSSL contains an implementation of Diffie–Hellman. On Android javax.crypto.KeyAgreement can do it for you.

On privacy, hashing, and your customers

I’ve talked before about not being a dick when it comes to dealing with private data and personally-identifying information. It seems events have conspired to make it worth diving into some more detail.

Only collect data you need to collect (and have asked for)

There’s plenty of information on the iPhone ripe for the taking, as fellow iOS security boffin Nicolas Seriot discussed in his Black Hat paper. You can access a lot of this data without prompting the user: should you?

Probably not: that would mean being a dick. Think about the following questions.

  • Have I made it clear to my customers that we need this data?
  • Have I already given my customers the choice to decline access to the data?
  • Is it obvious to my customer, from the way our product works, that the product will need this data to function?

If the answer is “no” to any of these, then you should consider gathering the data to be a risky business, and the act of a dick. By the way, you’ll notice that I call your subscribers/licensees “your customers” not “the users”; try doing the same in your own discussions of how your product behaves. Particularly when talking to your investors.

Should you require a long-form version of that discussion, there’s plenty more detail on appropriate handling of customer privacy in the GSMA’s privacy guidelines for mobile app developers.

Only keep data you need to keep

Paraphrasing Taligent: There is no data more secure than no data. If you need to perform an operation on some data but don’t need to store the inputs, just throw the data away. As an example: if you need to deliver a message, you don’t need to keep the content after it’s delivered.

Hash things where that’s an option

If you need to understand associations between facts, but don’t need to be able to read the facts themselves, you can store a one-way hash of the fact so that you can trace the associations anonymously.

As an example, imagine that you direct customers to an affiliate website to buy some product. The affiliates then send the customers back to you to handle the purchase. This means you probably want to track the customer’s visit to your affiliate and back into your purchase system, so that you know who to charge for what and to get feedback on how your campaigns are going. You could just send the affiliate your customer’s email address:

X-Customer-Identifier: iamleeg%40gmail.com

But now everybody who can see the traffic – including the affiliate and their partners – can see your customer’s email address. That’s oversharing, or “being a dick” in the local parlance.

So you might think to hash the email address using a function like SHA1; you can track the same hash in and out of the affiliate’s site, but the outsiders can’t see the real data.

X-Customer-Identifier: 028271ebf0e9915b1b0af08b297d3cdbcf290e3c

We still have a couple of problems though. Anyone who can see this hash can take some guesses at what the content might be: they don’t need to reverse the hash, just figure out what it might contain and have a go at that. For example if someone knows you have a user called ‘iamleeg’ they might try generating hashes of emails at various providers with that same username until they hit on the gmail address as a match.

Another issue is that if multiple affiliates all partner with the same third business, that business can match the same hash across those affiliate sites and build up an aggregated view of that customer’s behaviour. For example, imagine that a few of your affiliates all use an analytics company called “Slurry” to track use of their websites. Slurry can see the same customer being passed by you to all of those sites.

So an additional step is to append a different random value called a salt to the data before you hash it in each context. Then the same data seen in different contexts cannot be associated, and it becomes harder to precompute a table of guesses at the meaning of each hash. So, let’s say that for one site you send the hash of “sdfugyfwevojnicsjno” + email. Then the header looks like:

X-Customer-Identifier: 22269bdc5bbe4473454ea9ac9b14554ae841fcf3

[OK, I admit I’m cheating in this case just to demonstrate the progressive improvement: in fact in the example above you could hash the user’s current login session identifier and send that, so that you can see purchases coming from a particular session and no-one else can track the same customer on the same site over time.]
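
For what it’s worth, computing that kind of salted identifier with CommonCrypto is only a few lines. This sketch uses SHA-1 to match the header examples above; the function name and the idea of a per-affiliate salt string are illustrative, not a standard API.

#import <CommonCrypto/CommonDigest.h>

// Sketch: derive a per-affiliate customer identifier by hashing a
// per-affiliate salt concatenated with the customer's email address.
NSString *customerIdentifierForAffiliate(NSString *affiliateSalt, NSString *email) {
    NSData *input = [[affiliateSalt stringByAppendingString: email]
                        dataUsingEncoding: NSUTF8StringEncoding];
    uint8_t digest[CC_SHA1_DIGEST_LENGTH] = {0};
    CC_SHA1([input bytes], (CC_LONG)[input length], digest);
    NSMutableString *identifier = [NSMutableString stringWithCapacity: CC_SHA1_DIGEST_LENGTH * 2];
    for (NSUInteger i = 0; i < CC_SHA1_DIGEST_LENGTH; i++) {
        [identifier appendFormat: @"%02x", digest[i]];
    }
    return identifier;
}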

N.B. I previously discussed why Apple are making a similar change with device identifiers.

But we’re a startup, we can’t afford this stuff

Startups are all about iterating quickly, finding problems and fixing them or changing strategy, right? The old pivot/persevere choice? Validated learning? OK, tell me this: why doesn’t that apply to security or privacy?

I would say that it’s fine for a startup to release a first version that covers the following minimum requirements (something I call “Just Barely Good Enough” security):

  • Legal obligations to your customers in whatever countries those customers (and your data) reside
  • Standard security practices such as mitigating the OWASP top ten or OWASP mobile top ten
  • Not being a dick

In the O2 Labs I’ve been working with experts from various groups – legal, OFCOM compliance, IT security – to draw up checklists covering all of the above. Covering the baseline security won’t mean building the thing then throwing it at a pen tester to laugh at all the problems: it will mean going through the checklist. That can even be done while we’re planning the product.

Now, as with everything else in both product engineering and in running a startup, it’s time to measure, optimise and iterate. Do changes to your product change its conformance with the checklist issues? Are your customers telling you that something else you didn’t think of is important? Are you detecting intrusions that existing countermeasures don’t defend against? Did the law change? Measure those things, change your security posture, iterate: use the metrics to ensure that you’re pulling in the correct direction.

I suppose if I were willing to spend the time, I could package the above up as “Lean Security” and sell a 300-page book. But for now, this blog post will do. Try not to be a dick; check that you’re not being a dick; be less of a dick.

On explaining stuff to people

An article that recently made the rounds, though it was written back in September, is called Apple’s Idioten Vektor. It’s a discussion of how the CCCrypt() function in Apple’s CommonCrypto library, when used in its default cipher block chaining mode, treats the IV (Initialization Vector) parameter as optional. If you don’t supply an IV, it provides its own IV of 0x0.

Professional Cocoa Application Security also covers CommonCrypto, CBC mode, and the Initialization Vector. Pages 79-88 discuss block encryption. The section includes sample code for both one-shot and staged use of the API. It explains how to set the IV using a random number generator, and why this should be done.[1] Mercifully when the author of the above blog post reviewed the code in my book section, he decided I was doing it correctly.

So both publications cover the same content. There’s a clear difference in presentation technique, though. I realise that the blog post is categorised as a “rant” by the author, and that I’m about to be the pot that calls the kettle black. However, I do not believe that the attitude taken in the post—I won’t describe it, you can read it—is constructive. Calling people out is not cool, helping them get things correct is. Laughing at the “fail” is not something that endears people to us, and let’s face it, security people could definitely be more endearing. We have a difficult challenge: we ask developers to do more work to bring their products to market, to spend more money on engineering (and often consultants), in return for potentially protecting some unquantified future lost revenue and customer hardship.

Yes, there is a large technical component in doing that stuff, but solving the above challenge also depends very strongly on relationship management. Security experts need to demonstrate that we’re all on the same side; that we want to work with the rest of the software industry to help make better software. Again, a challenge arises: a lot of the help provided by security engineers comes in the form of pointing out mistakes. But we shouldn’t be self-promoting douchebags about it. Perhaps we’re going about it wrong. I always strive to help the developers I work with by identifying and discussing the potential mistakes before they happen. Then there’s less friction: “we’re going to do this right” is a much more palatable story than “you did this wrong”.

On the other hand, the Idioten Vektor approach generated a load of discussion and coverage, while only a couple of thousand people ever read Professional Cocoa Application Security. So there’s clearly something in the sensationalist approach too. Perhaps it’s me that doesn’t get it.

[1]Note that the book was written while iPhone OS 3 was the current version, which is why the file protection options are not discussed. If I were covering the same topic today I would recommend eschewing CCCrypt() for all but the most specialised of purposes, and would suggest setting an appropriate file protection level instead. The book also didn’t put encryption into the broader context of cryptographic protocols, a mistake I have since rectified.

On the top 5 iOS appsec issues

Nearly 13 months ago, the Intrepidus Group published their top 5 iPhone application development security issues. Two of them are valid issues; the other three they should perhaps have thought longer over.

The good

Sensitive data unprotected at rest

Secure communications to servers

Yes, indeed, if you’re storing data on a losable device then you need to protect the data from being lost, and if you’re retrieving that data from elsewhere then you need to ensure you don’t give it away while you’re transporting it.

Something I see a bit too often is people turning off SSL certificate validation while they’re dealing with their test servers, and forgetting to turn it on in production.
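
One way to make that mistake harder is to compile the test-server exception out of release builds altogether. Here is a rough sketch using NSURLConnection’s challenge delegate; the host name is made up, and DEBUG is whatever macro your debug build configuration defines.

- (void)connection: (NSURLConnection *)connection
willSendRequestForAuthenticationChallenge: (NSURLAuthenticationChallenge *)challenge
{
    NSURLProtectionSpace *space = [challenge protectionSpace];
#if DEBUG
    //trust the self-signed certificate on the internal test server, and nowhere else
    if ([[space authenticationMethod] isEqualToString: NSURLAuthenticationMethodServerTrust]
        && [[space host] isEqualToString: @"test.internal.example.com"]) {
        [[challenge sender] useCredential: [NSURLCredential credentialForTrust: [space serverTrust]]
               forAuthenticationChallenge: challenge];
        return;
    }
#endif
    //release builds always fall through to full certificate validation
    [[challenge sender] performDefaultHandlingForAuthenticationChallenge: challenge];
}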

The bad

Buffer overflows and other C programming issues

While you can indeed crash an app this way, I’ve yet to see evidence that you can exploit an iOS app through any old buffer overflow, given the stack guards, restrictive sandboxes, address-space layout randomisation and other mitigations. While there are occasional targeted attacks, I would have preferred it if they’d been specific about which problems they think exist and what devs can do to address them.

Patching your application

Erm, no. Just get it right. If there are fast-moving parts that need to change frequently, extract them from the app and put them in a hosted component.

The platform itself

To quote Scott Pack in “The DMZ”: “If you can’t trust your users to implement your security plan, then your security plan must work without their involvement.” In other words, if you have a problem and the answer is to train 110 million people, then you have two problems.

Storing and testing credentials: Cocoa Touch Edition

This article introduces the concept of key stretching, using code examples to explain the ideas. For code you can use in an app that more closely resembles current practice, see Password checking with CommonCrypto.

There’s been quite the media circus regarding the possibility that Sony was storing authentication credentials for its PlayStation Network in plain text. I was even quoted in a UK national daily paper regarding the subject. But none of this helps you: how should you deal with user passwords?

The best solution is also the easiest: if you can avoid it, don’t store the passwords yourself. On the Mac, you can use the OpenDirectory framework to authenticate both local users and users with accounts on the network (where the Mac is configured to talk to a networked directory service). This is fully covered in Chapter 2 of Professional Cocoa Application Security.

On the iPhone, you’re not so lucky. And maybe on the Mac there’s a reason you can’t use the local account: your app needs to manage its own password. The important point is that you never need to see that password—you need to know that the same password was presented in order to know (or at least have a good idea) that the same user is at the touchscreen, but that’s not the same as seeing the password itself.

That means that we don’t even need to use encryption where we can protect the password and recover it when we must check the password. Instead we can use a cryptographic one-way hash function to store data derived from the password: we can never get the password back, but we can always generate the same hash value when we see the same password.

Shut up Graham. Show me the code.

Here it is. This code is provided under the terms of the WTFPL, and comes without any warranty to the extent permitted by applicable law.

The first thing you’ll need to do is generate a salt. This is a random string of bytes that is combined with the password to hash: the point here is that if two users on the same system have the same password, the fact that the salt is different means that they still have different hashes. So you can’t do any statistical analysis on the hashes to work out what some of the passwords are. Otherwise, you could take your knowledge that, say, 10% of people use “password” as their password, and look for the hash that appears 10% of the time.

It also protects the password against a rainbow tables attack by removing the one-one mapping between a password and its hash value. This mitigation is actually more important in the real world than the one above, which is easier to explain :-).

This function uses Randomization Services, so remember to link Security.framework in your app’s link libraries build phase.

#import <Security/SecRandom.h>

NSString *FZARandomSalt(void) {
    uint8_t bytes[16] = {0};
    int status = SecRandomCopyBytes(kSecRandomDefault, 16, bytes);
    if (status == -1) {
        NSLog(@"Error using randomization services: %s", strerror(errno));
        return nil;
    }
    NSString *salt = [NSString stringWithFormat: @"%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x",
                      bytes[0],  bytes[1],  bytes[2],  bytes[3],
                      bytes[4],  bytes[5],  bytes[6],  bytes[7],
                      bytes[8],  bytes[9],  bytes[10], bytes[11],
                      bytes[12], bytes[13], bytes[14], bytes[15]];
    return salt;
}

Now you pass this string, and the password, to the next function, which actually calculates the hash. In fact, it runs through the hashing function 5,000 times. That slows things down a little—on an A4-equipped iPad it takes nearly 0.088s to compute the hash—but it also slows down brute-force attacks.

#import <CommonCrypto/CommonDigest.h>

NSData *FZAHashPassword(NSString *password, NSString *salt) {
    NSCParameterAssert([salt length] >= 32);
    uint8_t hashBuffer[64] = {0};
    NSString *saltedPassword = [[salt substringToIndex: 32] stringByAppendingString: password];
    const char *passwordBytes = [saltedPassword cStringUsingEncoding: NSUTF8StringEncoding];
    NSUInteger length = [saltedPassword lengthOfBytesUsingEncoding: NSUTF8StringEncoding];
    CC_SHA512(passwordBytes, length, hashBuffer);
    for (NSInteger i = 0; i < 4999; i++) {
        CC_SHA512(hashBuffer, 64, hashBuffer);
    }
    return [NSData dataWithBytes: hashBuffer length: 64];
}

Where do I go now?

You now have two pieces of information: a random salt, like edbfe42b3da2995a159c16c0a7184211, and a hash of the password, like 855fec563d91576db0e66d8745a3a9cb71dbe40d7cb2615a82b1c87958dd2e8e56db02860739422b976f182a7055dd223a3037dd3dcc5e1ca28aaaf0bade8a08. Store both of these on the machine where the password will be tested. In principle there isn’t too much worry about this data being leaked, because it’s super-hard to get the password out of it, but it’s still best practice to restrict access as much as you can so that attackers have to brute-force passwords on your terms.

When you come to verify the user’s password, pass the string presented by the user and the stored salt to FZAHashPassword(). You should get the same hash out that you previously calculated, if the same password was presented.
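
Wrapping that check up is a one-liner; the helper name here is mine, not part of any API.

// Sketch: verify a presented password against the stored salt and hash.
BOOL FZACheckPassword(NSString *presentedPassword, NSString *storedSalt, NSData *storedHash) {
    return [FZAHashPassword(presentedPassword, storedSalt) isEqualToData: storedHash];
}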

Anything else?

Yes. The weakest part of this solution is no longer the password storage: it’s the password itself. The salt+hash shown above is actually for the password “password” (try it yourself), and no amount of software is going to change the fact that that’s a questionable choice of password…well, software that finally does away with password authentication will, but that’s a different argument.

If you want to limit a user’s ability to choose a simple password, you have to do this at password registration and change time. Just look at the (plain-text) password the user has given you and decide whether you want to allow its use.
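
As an illustration only, such a check can be as simple as the following; the rules themselves (a minimum length and a short list of known-bad choices) are placeholders, not recommendations.

// Sketch: decide whether to accept a candidate password at registration time.
BOOL FZAPasswordIsAcceptable(NSString *candidate) {
    if ([candidate length] < 8) return NO;
    NSArray *knownBadPasswords = [NSArray arrayWithObjects: @"password", @"12345678", @"qwertyui", nil];
    return ![knownBadPasswords containsObject: [candidate lowercaseString]];
}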

On cryptographic file storage

In Chapter 3 of Professional Cocoa Application Security, I talk about using CommonCrypto to encrypt files stored on either Mac or iOS file systems. In Chapter 4, I talk about using CommonCrypto to generate Hashed Message Authentication Codes (HMACs) to verify the integrity of messages received by an app. Unfortunately, I didn’t connect the two in the book.

I actually did discuss the reasons for providing an HMAC in a recent talk on cryptographic storage for iOS. But I went to Justin Clark’s talk at Security B-Sides London, where he demonstrated the attacks that are possible against cryptosystems that don’t verify the integrity of their content. This talk was very informative: if the presentation ever makes its way online, I fully recommend that you check it out.

The talk reminded me that there are important attacks I didn’t discuss in the book. So I need to bring these concepts together for the readers of PCAS. The tl;dr version is: if you encrypt content, you must also verify the integrity of the ciphertext. The long version follows.

I recommended in the book using CBC, or Cipher Block Chaining, mode for AES encryption. In this mode, the first block of plaintext is XORed with an Initialization Vector (IV), and this is then encrypted. The resultant ciphertext is then XORed with the second block of plaintext, and this is encrypted. And so on. This offers some protection against certain cryptanalysis attacks that Electronic Code Book (ECB) mode does not: for example, blocks of ciphertext cannot be arbitrarily swapped without the decrypted plaintext becoming nonsensical. A change in the ciphertext not only affects the decryption of the changed block, but also subsequent blocks.
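
In symbols, writing E_k for the block cipher under key k: C_1 = E_k(P_1 XOR IV), and C_i = E_k(P_i XOR C_(i-1)) for each subsequent block, so every ciphertext block depends on all of the plaintext blocks before it.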

Well, it turns out that this propagating-change property of CBC mode can be used by attackers—if they have information about errors encountered during decryption—to decrypt and encrypt data without needing to discover the key. Ouch. See the B-Sides talk described above (or google for Padding Oracle Attack) for the full details.

Of course, not providing the padding oracle is an important mitigation for this attack, but refusing to handle content that isn’t verifiably valid avoids any attack involving modified ciphertext, including encrypted content that lies about its length, if that’s an important consideration for your file format. Therefore, if you’re encrypting data in your app, you should be generating an HMAC and verifying it before attempting a decryption operation. Tampered data is not data you want to deal with.
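
To make that concrete, here is a rough encrypt-then-MAC sketch with CommonCrypto. Assumptions: AES-128 in CBC mode with PKCS#7 padding, a random IV supplied by the caller, and separate keys for encryption and for the HMAC; the function name and the packaging layout are illustrative, not a standard.

#import <CommonCrypto/CommonCryptor.h>
#import <CommonCrypto/CommonHMAC.h>
#import <CommonCrypto/CommonDigest.h>

// Sketch: encrypt, then MAC the IV and ciphertext so that tampering is caught
// before any decryption is attempted. Output layout: IV || ciphertext || HMAC.
NSData *FZAEncryptThenMAC(NSData *plaintext, NSData *encryptionKey, NSData *macKey, NSData *iv) {
    size_t outputAvailable = [plaintext length] + kCCBlockSizeAES128;
    NSMutableData *ciphertext = [NSMutableData dataWithLength: outputAvailable];
    size_t outputMoved = 0;
    CCCryptorStatus status = CCCrypt(kCCEncrypt,
                                     kCCAlgorithmAES128,
                                     kCCOptionPKCS7Padding,
                                     [encryptionKey bytes], [encryptionKey length],
                                     [iv bytes],
                                     [plaintext bytes], [plaintext length],
                                     [ciphertext mutableBytes], outputAvailable,
                                     &outputMoved);
    if (status != kCCSuccess) return nil;
    [ciphertext setLength: outputMoved];

    NSMutableData *packaged = [NSMutableData dataWithData: iv];
    [packaged appendData: ciphertext];

    //MAC covers the IV and the ciphertext, never the plaintext
    uint8_t mac[CC_SHA256_DIGEST_LENGTH] = {0};
    CCHmac(kCCHmacAlgSHA256, [macKey bytes], [macKey length], [packaged bytes], [packaged length], mac);
    [packaged appendBytes: mac length: CC_SHA256_DIGEST_LENGTH];
    return packaged;
}

On the way back in, split off the trailing HMAC, recompute it over the IV and ciphertext, and only decrypt if the two values match.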

On the broken(?) Mac App Store

A day after the Mac App Store was launched, people are reporting that it has been cracked. There are two separate stories here, a vapourware circumvention of the FairPlay DRM used to generate the receipts and a report that certain apps aren’t validating the receipts properly. We can ignore the first case for the moment: it’s important, and if it’s true then Apple needs to fix it (and co-ordinate updating the validation code with us third-party developers). But for the moment, it’s more important that developers are implementing the protections that are in place in their applications – it’s those applications that are supposed to be protected.

Let’s skip, for the moment, the question of whether DRM or anti-cracking mechanisms are ethically right, worthwhile, or how much effort you want to put into them. Apple have done most of the legwork, in providing a vendor-signed receipt that’s part of your signed app bundle. What you need to do is:

  • Check whether you have a receipt
  • Check whether Apple signed the receipt you have
  • Check whether the receipt is valid for your product
  • Check whether the receipt is valid for this version of your product
  • Check whether the receipt is valid for this computer

That’s it in a nutshell. Of course, some nutshells surround very big and complex nuts, and that’s true in this case:

  • There’s some good example code for receipt validation at github/roddi/ValidateStoreReceipt; if you’re going to use it, don’t just paste it in wholesale. If everyone uses the same code then it’s super-easy for someone to detect and strip that code from each instance.
  • Check at runtime, as well as before startup. If you just check at startup, then all an attacker needs to do is patch main() to jump straight into NSApplicationMain() and your app runs for free.
  • Code obfuscation is not a very effective tool. Having worked in anti-virus, I know it’s much easier to classify code based on what it does than what it is, and it’s quite easy to find the code that opens the receipt file, or calls exit(173). That said, some of the commercial obfuscation companies offer a guaranteed service, so you can still protect your revenue after the app gets cracked.
  • Update I have been advised privately and seen in a blog post that people are recommending hard-coding their app bundle IDs and version numbers into the binary rather than using Info.plist, because that file can be edited. Well, so can the app binary…and in either case you’d need to re-sign the product with a valid certificate to continue, because Apple have used the kill flag:
    heimdall:~ leeg$ codesign -dvvvv /Applications/Twitter.app/
    Executable=/Applications/Twitter.app/Contents/MacOS/Twitter
    Identifier=com.twitter.twitter-mac
    Format=bundle with Mach-O universal (i386 x86_64)
    CodeDirectory v=20100 size=12452 flags=0x200(kill) hashes=616+3 location=embedded
    CDHash=8e0736639d79a108a5a1ebe89f928d1da0d49d94
    Signature size=4169
    Authority=Apple Mac OS Application Signing
    Authority=Apple Worldwide Developer Relations Certification Authority
    Authority=Apple Root CA
    Info.plist entries=21
    Sealed Resources rules=4 files=78
    Internal requirements count=2 size=344
    

    Changing a hard-coded string in a binary file is not difficult. You can of course obfuscate the string, but the motivated cracker still finds the point where the comparison is made (particularly easily if you use NSStrings). Really, how far you want to go depends on how much you’re willing to spend.

Of course, Fuzzy Aliens Ltd has already been implementing receipt validation for customers, so if this is too hard for you or you don’t have the time… ;-)

A last word on publicising receipt-validation vulnerabilities

You and I both make our living by selling software, or by selling services to people who sell software. Crowing on the interwebs about how this application or that application doesn’t validate its receipts properly is not cool, because you are shitting on your own doorstep. There is no public benefit to public disclosure of that class of vulnerability, because DRM is not a user security feature. Don’t do that. Send the developer a private message explaining your findings. Give them a chance to put extra effort into protecting their product, if that’s what they want to do.