Confine ALL the things!

I was talking with Saul Mora at lunchtime about NSManagedObjectContext thread confinement. We launched into an interesting thought experiment: what if every object ran on its own thread?

This would be interesting. You can never use a method that returns a value, because then you’d need to block this object’s thread while waiting for that object. You’d need to rely solely on being able to pass messages from one object to another, without seeing return values. In other words, the “tell don’t ask” principle is enforced.

You get to avoid a lot of problems with concurrency issues in any app. Each object has its own data (something we’ve been able to get right for decades), and its own context to work on that data. No object should be able to trample on another object’s data because we have encapsulation, and that means that no thread should be able to trample on another thread’s data because one thread == one object.

Also notice that one object == one thread. You don’t have to worry about whether different methods on the same object are thread-safe, because you’re confining the object to a single thread so no code in that object will execute in parallel.

So yeah, I built that thing. Or really I built a very simple test app that demonstrates that thing in action. Check out the source from the GitHub project linked here. For your edification, here’s a description of the meat of the confinement behaviour, the FZAConfinementProxy. Starting with the interface:

@interface FZAConfinementProxy : NSProxy

- (id)initWithRemoteObject: (id)confinedObject;

@end

This class is a NSProxy subclass, because all it needs to do is forward messages to another object. Or rather, that’s what it claims to do.

static const void *FZAProxyAssociation = (const void *)@"FZAConfinementProxyAssociationName";

@implementation FZAConfinementProxy {
    id remoteObject;
    NSOperationQueue *operationQueue;
}

- (id)initWithRemoteObject:(id)confinedObject {
    id existingProxy = objc_getAssociatedObject(confinedObject, FZAProxyAssociation);
    if (existingProxy) {
        self = nil;
        return existingProxy;
    }

Ensure that there’s exactly one proxy per object, using Objective-C’s associated objects to map proxies onto implementations.

    remoteObject = confinedObject;
    if ([remoteObject isKindOfClass: [UIView class]]) {
        operationQueue = [NSOperationQueue mainQueue];
    } else {
        operationQueue = [[NSOperationQueue alloc] init];
        operationQueue.maxConcurrentOperationCount = 1;
    }
    objc_setAssociatedObject(confinedObject, FZAProxyAssociation, self, OBJC_ASSOCIATION_ASSIGN);
    return self;
}

If we need to create a new proxy, then create an operation queue to which it can dispatch messages. UIKit needs to be used on the main thread, so ensure proxies for UIViews use the main operation queue.

What this means is that if your controller uses confinement proxies for both views and models (just as the dummy app does), then there’s no need to mess with methods like -performSelectorOnMainThread:withObject:waitUntilDone: in your controller logic. That’s automatically handled by the proxies.

- (NSMethodSignature *)methodSignatureForSelector:(SEL)sel {
    if (sel == @selector(initWithRemoteObject:)) {
        return [NSMethodSignature signatureWithObjCTypes: "@@:@"];
    }
    else {
        return [remoteObject methodSignatureForSelector: sel];
    }
}

Ensure that the Objective-C machinery can tell what messages it can send to this proxy.

- (void)forwardInvocation:(NSInvocation *)invocation {
    [invocation setTarget: remoteObject];
    NSInvocationOperation *operation = [[NSInvocationOperation alloc] initWithInvocation: invocation];
    [operationQueue addOperation: operation];
}

Here’s all the smarts, and it isn’t really very smart. Objective-C notices that the proxy object itself doesn’t implement the received method, and so asks the proxy whether it wants to forward the method. It’s, well, it’s a proxy, so it does want to. It rewrites the invocation target to point to the confined object, then adds the invocation to the operation queue.

That’s really it, bar some cleanup that you can see in GitHub. One thread per object limits the ways in which you can design code, so it’s a great way to learn about writing code within those limitations.

Posted in code-level, software-engineering | 4 Comments

How to TDD with CATCH

Plenty of people have asked me about the TDD framework I use. While the book Test-Driven iOS Development has code using OCUnit (for pragmatic, and previously-covered, reasons); I am currently more frequently to be discovered using Phil Nash’s CATCH framework. Here’s how you can get from New Project to first passing test. I’m starting in the same place that the book’s sample code starts: I need a Topic class that has a name.

Set up the project.

File->New Project in Xcode. Create an app project that matches your needs: an empty app will be fine. I’ll assume in this project, just as in the book, that you’ll be using Automatic Reference Counting for this project.

Now you need to add another app target. The “Empty Application” template is sufficient for this, and I call this target “Unit Tests”. When you’ve got that target set up, remove the app delegate class and the main.m file: we’ll add a test runner from CATCH.

For this target, you also have to choose whether or not to use ARC. If you enabled ARC for the app target, then the classes you’re testing will use Automatic Reference Counting, but the CATCH internals cannot because they use scary runtime hacks to dynamically generate classes from the test cases. In this walkthrough, you should enable ARC but we’ll have to edit compiler flags for a couple of files to get manual reference counting where we need it.

Check out CATCH from the GitHub project linked above. Then go into the Build Rules for your Unit Tests target, and add the path to the Catch/include source folder. For example, if you clone CATCH to the top-level folder of your Xcode project, you’ll need to add "$(SRCROOT)/Catch/include".

Locate the file Catch/projects/runners/iTchRunner/itChRunnerMain.mm (capitalisation as supplied) in Finder, and drag it onto your Xcode project. Add this file to the Unit Tests target. Now you need to go to the Build Phases inspector for your target, and under Compile Sources, next to itChRunnerMain.mm add the flag -fno-objc-arc.

Add the first test

Add a new file to the Unit Tests target, called TopicTests.mm (you don’t need the matching header file). As with itChRunnerMain.mm, you’ll need to set the -fno-objc-arc flag in the Build Phases. Include the catch.hpp header and write the test that checks for the name:

#include "catch.hpp"

TEST_CASE("Topic/Name", "Require that Topic has a name") {
    Topic *topic = [[Topic alloc] initWithName: @"iPhone"];
    REQUIRE(topic != nil);
    REQUIRE([topic.name isEqualToString: @"iPhone"]);
    [topic release];
}

Get this test to compile

So currently, the test target won’t even build: we’re using a Topic class and its name property, and these don’t exist. Add a new file to the project, an NSObject class called Topic. You could add this file to both targets as it will be app code: don’t disable ARC here because the whole point is you want to write automatically reference-counted app code.

Just to fast-forward things a bit, here’s an implementation of Topic.[hm] that contains minimal implementations of the required methods.

Topic.h
#import <Foundation/Foundation.h>

@interface Topic : NSObject

@property (readonly) NSString *name;

- (id)initWithName: (NSString *)aName;

@end

Topic.m
#import "Topic.h"

@implementation Topic

@synthesize name = _name;

- (id)initWithName:(NSString *)aName {
    if (self = [super init]) {
    }
    return self;
}

@end

Observe that the test fails.

Run the “Unit Tests” app in Xcode. You’ll see a screen like this:

Default Catch Screen

It’s not very exciting, but at least you can tell what to do: tap the “Run All Tests” button.

Catch failed test

Still not exciting, and it’s possible that if you have certain types of colour-blindness it’s not clear what’s happened. Luckily the logs are more explicit:

2012-03-13 17:18:01.640 UnitTests[960:707] Application windows are expected to have a root view controller at the end of application launch

2012-03-13 17:18:03.915 UnitTests[960:707] topic != __null succeeded for: 0x2d11b0 != __null

2012-03-13 17:18:03.917 UnitTests[960:707] [topic.name isEqualToString: @"iPhone"] failed for: false

2012-03-13 17:18:03.923 UnitTests[960:707] 1 failures

Make this test pass

There’s only one line to add to the app code to get everything passing. Here’s the Topic implementation again with the change highlighted:

@implementation Topic

@synthesize name = _name;

- (id)initWithName:(NSString *)aName {
    if (self = [super init]) {
        _name = [aName copy];
    }
    return self;
}

@end

Run the tests again:

Passing CATCH tests

2012-03-13 17:46:17.644 UnitTests[1049:707] Application windows are expected to have a root view controller at the end of application launch

2012-03-13 17:46:43.746 UnitTests[1049:707] topic != __null succeeded for: 0x2ca7f0 != __null

2012-03-13 17:46:43.748 UnitTests[1049:707] [topic.name isEqualToString: @"iPhone"] succeeded for: true

2012-03-13 17:46:43.753 UnitTests[1049:707] no failures

Win.

Conclusions

The runner for CATCH tests is currently very basic: it lets you know when things have passed or failed but not which tests have failed, though the tests are named. You need to manually prod it to get tests to run. However, the syntax is a refreshing change from the verbosity of JUnit-derived test frameworks.

Posted in code-level, software-engineering, TDD | 1 Comment

The security apprentice

This originally appeared in a post at Sophos’ Naked Security blog.

There have been two recent occasions on which my computing life has been influenced by Lord Sugar, the business mogul in charge of Amstrad and star of BBC One’s reality show The Apprentice.

The first was on a visit to the National Museum of Computing at Bletchley Park, where I got to use a computer Amstrad produced back in the 1980s, that went by the catchy name “CPC 6128”. It was running a Tetris clone called Blocks created—as was proudly proclaimed on the game screen—by an upstart programmer called “G Cluley”.

The second was this weekend, when the device you see in the picture showed up in a charity shop. This is the E-mailer Plus, a sort of executive phone/internet thing released by Amstrad in 2002. Being a fan of old computers, especially oddball ones like the E-mailer, I bought it.

The key feature of this phone was that it also had e-mail and web capabilities, albeit delivered via a premium rate number that lined Lord Sugar’s pockets with every mail check. Users could configure the phone to automatically fetch their mail to be read on the attached LCD screen.

And, indeed, someone had used this E-mailer for e-mail. Somone I shall call “Colin” had set up two accounts on the device. How do I know this? Because Colin hadn’t deleted these accounts before taking his phone to the charity shop.

As I said, the E-mailer relies on a dial-up service which was discontinued earlier this year by its ultimate owners, BSkyB. That means that I couldn’t, should I want to, fetch Colin’s new e-mail messages. But there were messages already stored on the phone that I could have read. More surprisingly, the configuration screens let me see passwords assigned to Colin’s accounts: has he used the same passwords on any other services?

Hopefully you’re aware of the need to ensure there’s no sensitive information stored on old computers before you dispose of them, particularly if you’re going to sell them on to other users. My new (or should I say Colin’s old) E-mailer shows that this goes for any device that stores or accesses your data, including phones both smart and retro.

I now imagine a scene in Lord Sugar’s office. “Colin, you made a basic error. By failing to delete your accounts before giving away your phone, you put your e-mail messages and your passwords at risk. You compromised the privacy of your own and your company’s data, and for that reason, you’re fired.”

Posted in Uncategorized | Comments Off on The security apprentice

On advertising’s place in the tech industry

Dave Winer said the way is open for a non-ad-supported tech sector:

The tech industry has been absorbed by the ad industry, and vice versa.
However, there is, imho, still room for a tech industry that is not merged with the ad industry.
In fact, if we want to have a tech industry at all, we’d better invest in the “other” one, because advertising isn’t much to bet on long-term. Seriously.

Daniel Jalkut went further, saying that ads are on the decline:

I’ll take this a step further: advertising is on the way out. Technology loathes a middle-man, and advertising as an industry is the king of all middle-men. The purpose of advertising is to connect customers with companies, so as to facilitate a transfer of money in exchange for goods or services. As time goes by, customers and companies will be more and more capable of achieving this on their own.

I agree with Dave, but not so much with Daniel. As I was writing this post, WPP (the world’s largest ad company) were on the radio announcing a 20% increase in profits worldwide, with an increase in the fraction of revenue they get from digital adverts up to 30%. WPP say that concerns of disintermediation (i.e. the removal of middle-men Daniel talks about) are unfounded. The internet has produced its own new middle men, particularly Google and Facebook. “Traditional” advertisers such as WPP are profiting from advertising with those companies.

Ads can be on the way out, but if that’s a world we want to see, it’s up to us app makers to do something about it, it won’t just happen automatically. Why us? Because the future involves even more mobile internet, and that means that we are in a privileged position to control what customers of the future are looking at. If we want them to look at something other than adverts, we need to put something that isn’t an advert and is worth looking at on their screen.

That’s a world in which many apps will need a very different business model than today’s. There are a lot—particularly on the web and on Android—of apps that are free to users and generate revenue from ads. Those would need to be funded in different ways: I don’t think that the freemium model of “pay to remove ads” will work. It doesn’t solve the goal of reducing ads (every user still sees the ad version at least once), for a start. Also, conversion rates will likely be low as people can “put up with” adverts.

Apps that still want to rely on viral growth still need a really low barrier to entry for new users, but need to change why and how many people convert to a paid version. Companies like Evernote and Dropbox have a model that appears to work for them, where paying customers get extra features/capacity over free users, and provide enough income to support the cost of providing the free product. And then of course there’s the shareware model that’s worked for decades: where the store you’re selling in allows you to do that.

But not all products need to be promoted virally. You can just sell something on the promise that it solves a problem, in the same way that I made a living selling solutions to app security problems, and the Omni Group make even more money selling the solution to the problem of remembering all the stuff you need to do.

You can also sell the promise that people can come back later: that is, you make money by letting customers stick to you. Look at iTunes Match. For some amount of money, Apple manages your music library. Then, next year, they’ll ask you whether you want to carry on doing that. You get the choice between pulling all your music back down and looking after it yourself again, or you can pay Apple once more and they’ll carry on doing it for a year. Apple are betting on there being enough people who come back every year to make this worthwhile.

So, to conclude, app makers who want to see an ad-free world are well placed to make that happen: doing so may start with an uncomfortable review of the way our apps are going to make money. In the short term, it’ll mean having an expensive fully-customer-funded app competing in a world of cheap ad-supported apps. But in the longer term it could result in better, more transparent relationships with our customers who aren’t worried about what data we’re selling to advertisers.

Posted in Business | 4 Comments

Why we don’t trust -retainCount

I’m pretty sure @bbum must have worn through a few keyboards telling users of StackOverflow not to rely on the value of an Objective-C object’s -retainCount. Why? When we create an object, it has a retain count of 1, right? Retains (and, for immutable objects, copies) bump that up, releases (and, some time later, autoreleases) bring it down, right? If an attempt to release would bring the retain count to 0, that object gets released, right? Right?!?

Well, that’s not true for all objects, but leaving that aside, your code isn’t the only code running in your process. The system libraries along with any third-party code you’ve included in your app are all doing things, including retaining and releasing objects. Let’s take a look at a specific case of that.

The code in this post is from Gnustep-base, the LGPL implementation of OpenStep’s Foundation library. The behaviour shown is identical to behaviour in Apple’s Foundation. Here’s (most of) +[NSNumber initialize].

/*
 * Numbers from -1 to 12 inclusive that are reused.
 */
static NSNumber *ReusedInstances[14];
static NSBoolNumber *boolY;		// Boolean YES (integer 1)
static NSBoolNumber *boolN;		// Boolean NO (integer 0)

+ (void) initialize
{
  int i;

  if ([NSNumber class] != self)
    {
      return;
    }

  // ...

  boolY = NSAllocateObject (NSBoolNumberClass, 0, 0);
  boolY->value = 1;
  boolN = NSAllocateObject (NSBoolNumberClass, 0, 0);
  boolN->value = 0;

  for (i = 0; i < 14; i++)
    {
      NSIntNumber *n = NSAllocateObject (NSIntNumberClass, 0, 0);

      n->value = i - 1;
      ReusedInstances[i] = n;
    }
}

We see that sixteen instances of NSNumber subclasses, representing YES, NO and the integers -1 to 12, have been allocated and stored away in static variables. What’s that about? Well here’s (most of) +numberWithInt:.

/*
 * Macro for checking whether this value is the same as one of the singleton
 * instances.  
 */
#define CHECK_SINGLETON(aValue) \
if (aValue >= -1 && aValue <= 12)\
{\
  return ReusedInstances[aValue+1];\
}

// ...

+ (NSNumber *) numberWithInt: (int)aValue
{
  NSIntNumber *n;

  if (self != NSNumberClass)
    {
      return [[[self alloc] initWithBytes: (const void *)&aValue
        objCType: @encode(int)] autorelease];
    }

  CHECK_SINGLETON (aValue);
  n = NSAllocateObject (NSIntNumberClass, 0, 0);
  n->value = aValue;
  return AUTORELEASE(n);
}

So it looks like the retain count of an object you get back from +numberWithInt: depends on the value you pass, and whether anyone else is trying to use a number with the same value right now.

In other words, while it’s easy to get an object’s retain count using the -retainCount method, the number you expect to see may well be wrong. The real value depends on so many parameters that it’s reasonable to conclude that you don’t know what it’s expected to be right now, so don’t depend on it.

By the way, the NSNumber implementation is a great place to find out more about how Foundation is built. It’s a class cluster (which is why you can see NSIntNumber and NSBoolNumber classes), and you can see how to use marker pointers to encode the value of an NSNumber right into the object pointer.

Posted in code-level, Foundation, iPad, iPhone, Mac | Comments Off on Why we don’t trust -retainCount

Synthesized ivars are private

Perhaps this isn’t news. Perhaps it doesn’t matter because you’ve provided public accessors. But here are the results anyway.

#import <Foundation/Foundation.h>

@interface A: NSObject
@property (nonatomic, assign) int a;
@end

@interface B: A
- (int)differentGetter;
@end

@interface C: NSObject
@property (nonatomic, retain) A *anA;
- (int)aFromA;
@end

int main(int argc, char *argv[]) {
	@autoreleasepool {
		B *b = [[B alloc] init];
		b.a = 3;
		NSLog(@"[b differentGetter] = %d", [b differentGetter]);
		[b release];
		
		C *c = [[C alloc] init];
		c.anA = [[A alloc] init];
		c.anA.a = 4;
		NSLog(@"[c aFromA] = %d", [c aFromA]);
		[c release];
	}
	return 0;
}

@implementation A
@synthesize a=_a;
@end

@implementation B
- (int)differentGetter { return _a; } // must be at least @protected
@end

@implementation C
@synthesize anA = _anA;
- (int)aFromA { return _anA->_a; } // must be @public
- (void)dealloc { self.anA = nil; [super dealloc]; }
@end

Doesn’t compile:

Untitled.m:37:33: error: instance variable '_a' is private
- (int)differentGetter { return _a; }
                                ^
Untitled.m:42:30: error: instance variable '_a' is private
- (int)aFromA { return _anA->_a; }
                             ^
2 errors generated.
Posted in Uncategorized | Comments Off on Synthesized ivars are private

Is it an anti-pattern to use properties everywhere?

I’ve seen questions about whether to always provide accessors for ivars, and recommendations, such as in akosma software’s ObjC code standards, that say Whenever possible, do not specify ivars in the header file; use only @property and @synthesize statements instead.

This isn’t how I work, which led me to ask the question: is this a good recommendation? Obviously the question of whether coding standards are “good” or “bad” is subjective, so what I’m really asking is whether this is something I’d want to do myself.

How I currently work

If I need to be able to see the state of an object from outside that object, I’ll make a readonly property. If I need to be able to change the state of an object then I’ll create a readwrite property. Whether these are synthesised or dynamic properties depends on how I need to compute their values.

If, in implementing a method, I find I need to make use of some object or value that was generated in another method, I’ll create an instance variable to store the value away in the other method so it can be used from this method. I define this ivar in the @implementation of the class.

This effectively acts as a de facto distinction between public and private data. Public data is accessed via properties which can be seen and called from anywhere; private data is accessed via ivars that cannot be seen anywhere except inside the current class (don’t worry about the visibility modifiers @public and friends: because the ivars are inside the implementation even subclasses don’t know what they are).

If I consistently use the property accessors even inside a class, I can see where an object is making use of state that is accessible to the outside world. That can indicate an encapsulation failure, making the class fragile to external prodding. Of course, it’s also expected that objects do get told about the outside world, so it’s not an automatic fail to do this: but it’s useful to be able to see where it happens.

About the “properties everywhere” approach

As I see it, there are benefits and drawbacks to that technique. The pros:

  • Encapsulate memory management. This is less of a benefit with automatic reference counting or garbage collection, but in a manually reference-counted environment you can put the memory management semantics in one place rather than sprinkling your code with retain, release and copy calls.
  • Consistency. Rather than having two distinct techniques for accessing ivars, you have just one.
  • Properties are new(ish), and ivars are old and busted. :-)

The cons:

  • Broken encapsulation. Now all of your ivars have accessors, and as previously discussed here Objective-C doesn’t have method visibility modifiers. All accessors are public (even if declared in the class extension), so any dev armed with a copy of class-dump might decide to change the internal state of your classes.
  • Consistency. This is where it gets subjective, because this was also a pro, but as I said earlier I deliberately make a distinction between internal state and externally-available properties, so making both of these the same is a problem.
  • Writing code you don’t need. Even where you never actually use an ivar outside of the object that owns it, you’re still writing public methods for that ivar.

Conclusion

I’m not going to be adopting the “everything is a property” recommendation, I value the privacy of my parts. I’ll carry on with writing instance variables where I need them, and “promoting” them to properties with accessors where that’s necessary.

Posted in code-level | 10 Comments

Some LightReading about mobile app security

[This article was co-written with @securityninja]

If mobile app security is failing, it’s up to the security industry, not developers, to repair it.

An article published yesterday at security news site DarkReading announces “Developers not applying secure development life cycle practices in mobile app production”. The author finds many faults with the way application security is treated by mobile app developers, but doesn’t address what we believe to be the underlying problem: the engagement between security specialists and app makers.

Before addressing this point, though, there are several inaccuracies in the article that need to be corrected. Ericka Chickowski describes the mobile app landscape as “a development environment still in its infancy and no real standards to lead the way”, though in fact developers are able to bring a lot of the experience and tools they’ve used in building desktop and web apps to the party. The Cocoa Touch SDK used in Apple’s iOS shares a common heritage with Mac OS X’s Cocoa, just as the Java used in Android and SilverLight in Windows Phone 7 strongly resemble their desktop counterparts. For example the Code Access Security approach introduced in Silverlight ended up being adopted as the Code Access Security approach for the whole .NET framework.

Taking the 50,000 foot view, mobile apps look a lot like any other software application: there’s a client that’s delivered to the user somehow, which has some local storage and (often) communicates to an online service via the internet. This is exactly the situation we see with web apps, and indeed a lot of the techniques used to secure web apps are directly applicable in the mobile world. Sure there are new challenges to address with mobile, but these represent a tweak, not a rewrite, of security advice for developers.

It is probably fair to say security guidance and testing tools aren’t quite as mature as they are in the web application security world but to say they don’t exist is false. One of us (David) is an author of open source tools which help reviewers find vulnerabilities in mobile applications and has to correct the original author on this point. In addition to my own tools others exist: in fact there is a Live CD called the ARE (Android Reverse Engineering) VM which was created solely to allow people to security review/test Android applications and established commercial offerings from the likes of Veracode.

Chickowski also states that no secure coding guidelines for mobile applications exist: OWASP have a mobile security project which is very useful and will only get better over time. The OWASP project includes iGoat and GoatDroid: insecure apps for the iPhone and Android respectively that developers can use to understand what vulnerable code looks like and how to detect and fix security problems on those mobile platforms.

Both Apple and Android have good secure development documentation. The Apple “Introduction to Secure Coding” document has been around since 2006 with guidance for iOS development added in 2008. Apple’s World-Wide Developer Conference (WWDC) includes multiple sessions on application security; these sessions are freely available to registered developers. Google has produced similar guidance as part of their development reference site so to say no guidance exists and that Apple and Google have only just started to think about needs to be corrected. In fact any one who spends a small amount of time researching the architecture of those platforms (as well as others including WP7) will understand that security has been a consideration pretty much since day one.

The final point we want to pick up from the article is the following line: “Rapid and Agile Development causes changes to happen in very short iterations, thus security gets overlooked and becomes a nice thing to do but rarely gets done.” This is certainly not a mobile specific issue and as people who work in a company where we have security integrated throughout an Agile process, we can tell you the security deliverables in SDLC don’t really change much. Sure it means you have to do security testing and reviews more often but that doesn’t mean security should be, or always is overlooked. You can try to blame developers, you can even try to place the blame at the door of security professionals but if the business doesn’t want to produce secure code there is very little those people can do.

So what’s actually behind the apparent lack of security practice among mobile app developers? We see a clue in the fact that Chickowski’s article was published at a security news site, not at a site for mobile developers. Security practitioners telling each other how apps fail at security can be entertaining and make for good conversation in the bars at conferences, but we need to engage the people making the apps if we want to effect change in the way those apps are made. Developers, project managers and executive officers need to be able to evaluate the risk that they are exposing their customers and their businesses to. They need to know how to measure the security posture of their apps and to make decisions on what changes to make, feeding those decisions into the same process they use to prioritise features and bug fixes. In short, we need to help developers to get this right, not call them out when they get it wrong,

David Rook (@securityninja) is the Application Security Lead at Realex Payments.

Graham Lee (@iamleeg) is the Smartphone Security Boffin at O2 Labs.

Posted in Uncategorized | 2 Comments

Building an object-oriented dispatch system in Objective-C

iTunes was messing about rebuilding the device I was trying to use for development, so I had time over lunch to write a new message dispatch system in the Objective-C language. “But wait,” you say, “Objective-C already has a message dispatch system!” True, and it’s better than the one I’ve created. But it doesn’t use blocks, and blocks are cool :-). In the discussion below, I’ll build up an implementation of a “recent items” list, which is discussed in Kevlin Henney’s presentation linked in the acknowledgements.

The constructor

One important part of an object system is the ability to make new objects. Let’s declare an object type, and a constructor type that returns one of those objects:

typedef id (^BlockObject)(NSString *selector, NSDictionary *parameters);
typedef BlockObject(^BlockConstructor)(void);

I’d better explain signature of the BlockObject type. Objects can be sent messages; what we’re doing is saying that if you execute the object with a selector name, the object will dispatch the correct implementation with the parameters you supply and will give you back the return value from the implementation. That’s what objc_msgSend() does in old-school Objective-C. The constructor is going to return this dispatch block – actually it’ll return a copy of that block, so invoking the constructor multiple times results in multiple copies of the object. Let’s see that in action.

BlockConstructor newRecentItemsList = ^ {
    BlockObject list = ^(NSString *selector, NSDictionary *parameters) {
        return (id)nil;
    }
    return (BlockObject)[list copy];
}

Yes, you have to cast nil to id. Who knew C could be so annoying?

Message dispatch

An object that can’t do anything isn’t very exciting, so we should add a way for it to look up and execute implementations. Method implementations are of the following type:

typedef id (^BlockIMP)(NSDictionary *parameters);

With that in place, I’ll show you an example of the object with a dispatch system in place, and discuss it afterward.

typedef void (^SelectorUpdater)(NSString *selector, BlockIMP implementation);

BlockConstructor newRecentItemsList = ^ {
    __block NSMutableDictionary *selectorImplementationMap = [NSMutableDictionary dictionary];
    __block SelectorUpdater setImplementation = ^(NSString *selector, BlockIMP implementation) {
        [selectorImplementationMap setObject: [implementation copy] forKey: selector];
    };

    BlockObject list = ^(NSString *selector, NSDictionary *parameters) {
        BlockIMP implementation = [selectorImplementationMap objectForKey: selector];
        return implementation(parameters);
    };
    return (BlockObject)[list copy];
};

The variables selectorImplementationMap and setImplementation are __block variables in the constructor block. This means that every time the constructor is called, the returned instance has its own copy of these variables that it is free to use and to modify. Let me put that another way: the entire message-dispatch system is encapsulated inside each instance. If a class, or even an individual instance, wants to implement message dispatch in a different way, that’s cool. It also means that an instance can change its own methods at runtime without affecting any other objects, including other instances of the same class. As long as the object still conforms to the contract that governs method dispatch, that’s cool too.

Implementing the recent items list

OK, now that we’ve got construction and messaging in place, we can start making useful objects. Here’s the implementation of the recent items list, where I’ve chosen to use an NSMutableArray for the internal storage, as with the dispatch map it’s an instance variable of the list. Needless to say you could change this to a C array, STL container or anything else without breaking external customers of the object.

BlockConstructor newRecentItemsList = ^ {
    __block NSMutableDictionary *selectorImplementationMap = [NSMutableDictionary dictionary];
    __block SelectorUpdater setImplementation = ^(NSString *selector, BlockIMP implementation) {
        [selectorImplementationMap setObject: [implementation copy] forKey: selector];
    };
    __block NSMutableArray *recentItems = [NSMutableArray array];
    setImplementation(@"isEmpty", ^(NSDictionary *parameters) {
        return [NSNumber numberWithBool: ([recentItems count] == 0)];
    });
    
    setImplementation(@"size", ^(NSDictionary *parameters) {
        return [NSNumber numberWithInteger: [recentItems count]];
    });
    
    setImplementation(@"get", ^(NSDictionary *parameters) {
        NSInteger index = [[parameters objectForKey: @"index"] integerValue];
        return [recentItems objectAtIndex: index];
    });
    
    setImplementation(@"add", ^(NSDictionary *parameters) {
        id itemToAdd = [parameters objectForKey: @"itemToAdd"];
        [recentItems removeObject: itemToAdd];
        [recentItems insertObject: itemToAdd atIndex: 0];
        
        return (id)nil;
    });
    
    BlockObject list = ^(NSString *selector, NSDictionary *parameters) {
        BlockIMP implementation = [selectorImplementationMap objectForKey: selector];
        return implementation(parameters);
    };
    return (BlockObject)[list copy];
};

Using the list

Here is an example of creating and using a recent items list. Thankfully, since writing this post literal dictionaries have appeared, so it doesn’t look so bad:

    BlockObject recentItems = newRecentItemsList();
    BOOL isEmpty = [recentItems(@"isEmpty", nil) boolValue];

    NSDictionary *getArgs = @{ @"index" : @0 };
    @try {
        id firstItem = recentItems(@"get", getArgs);
        NSLog(@"first item in empty list: %@", firstItem);
    }
    @catch (id e) {
        NSLog(@"can't get first item in empty list");
    }

Exercises for the reader

The above class is not complete. Here are some ways you could extend it, that I haven’t covered (or, for that matter, tried).

  • Implement @"isEqual". Remember that no instance can see the ivars of any other instance, so you need to use the public interface of the other object to decide whether it’s equal to this object. You’ll need to provide a new method @"respondsToSelector" in order to build @"isEqual" properly.
  • Respond to unimplemented selectors well. The implementation shown above crashes if you send an unknown message: it’ll try to dereference a NULL block. That, well, it’s bad. Objective-C objects have a mechanism that catches these messages, allowing the object a chance to lazily add a method implementation or forward the message to a different object.
  • Write an app using these objects. :-)

Credit where it’s due

This work was inspired by Kevlin Henney’s presentation: It is possible to do OOP in Java. The implementation shown here isn’t even the first time this has been done using Objective-C blocks. The Security Transforms in Mac OS X 10.7 work in a very similar way. This is probably the first attempt to badly document a bad example of the art, though.

Posted in code-level, OOP, Uncategorized | Comments Off on Building an object-oriented dispatch system in Objective-C

On privacy, hashing, and your customers

I’ve talked before about not being a dick when it comes to dealing with private data and personally-identifying information. It seems events have conspired to make it worth diving into some more detail.

Only collect data you need to collect (and have asked for)

There’s plenty of information on the iPhone ripe for the taking, as fellow iOS security boffin Nicolas Seriot discussed in his Black Hat paper. You can access a lot of this data without prompting the user: should you?

Probably not: that would mean being a dick. Think about the following questions.

  • Have I made it clear to my customers that we need this data?
  • Have I already given my customers the choice to decline access to the data?
  • Is it obvious to my customer, from the way our product works, that the product will need this data to function?

If the answer is “no” to any of these, then you should consider gathering the data to be a risky business, and the act of a dick. By the way, you’ll notice that I call your subscribers/licensees “your customers” not “the users”; try doing the same in your own discussions of how your product behaves. Particularly when talking to your investors.

Should you require a long-form version of that discussion, there’s plenty more detail on appropriate handling of customer privacy in the GSMA’s privacy guidelines for mobile app developers.

Only keep data you need to keep

Paraphrasing Taligent: There is no data more secure than no data. If you need to perform an operation on some data but don’t need to store the inputs, just throw the data away. As an example: if you need to deliver a message, you don’t need to keep the content after it’s delivered.

Hash things where that’s an option

If you need to understand associations between facts, but don’t need to be able to read the facts themselves, you can store a one-way hash of the fact so that you can trace the associations anonymously.

As an example, imagine that you direct customers to an affiliate website to buy some product. The affiliates then send the customers back to you to handle the purchase. This means you probably want to track the customer’s visit to your affiliate and back into your purchase system, so that you know who to charge for what and to get feedback on how your campaigns are going. You could just send the affiliate your customer’s email address:

X-Customer-Identifier: iamleeg%40gmail.com

But now everybody who can see the traffic – including the affiliate and their partners – can see your customer’s email address. That’s oversharing, or “being a dick” in the local parlance.

So you might think to hash the email address using a function like SHA1; you can track the same hash in and out of the affiliate’s site, but the outsiders can’t see the real data.

X-Customer-Identifier: 028271ebf0e9915b1b0af08b297d3cdbcf290e3c

We still have a couple of problems though. Anyone who can see this hash can take some guesses at what the content might be: they don’t need to reverse the hash, just figure out what it might contain and have a go at that. For example if someone knows you have a user called ‘iamleeg’ they might try generating hashes of emails at various providers with that same username until they hit on the gmail address as a match.

Another issue is that if multiple affiliates all partner with the same third business, that business can match the same hash across those affiliate sites and build up an aggregated view of that customer’s behaviour. For example, imagine that a few of your affiliates all use an analytics company called “Slurry” to track use of their websites. Slurry can see the same customer being passed by you to all of those sites.

So an additional step is to append a different random value called a salt to the data before you hash it in each context. Then the same data seen in different contexts cannot be associated, and it becomes harder to precompute a table of guesses at the meaning of each hash. So, let’s say that for one site you send the hash of “sdfugyfwevojnicsjno” + email. Then the header looks like:

X-Customer-Identifier: 22269bdc5bbe4473454ea9ac9b14554ae841fcf3

[OK, I admit I’m cheating in this case just to demonstrate the progressive improvement: in fact in the example above you could hash the user’s current login session identifier and send that, so that you can see purchases coming from a particular session and no-one else can track the same customer on the same site over time.]

N.B. I previously discussed why Apple are making a similar change with device identifiers.

But we’re a startup, we can’t afford this stuff

Startups are all about iterating quickly, finding problems and fixing them or changing strategy, right? The old pivot/persevere choice? Validated learning? OK, tell me this: why doesn’t that apply to security or privacy?

I would say that it’s fine for a startup to release a first version that covers the following minimum requirements (something I call “Just Barely Good Enough” security):

  • Legal obligations to your customers in whatever country those countries (and your data) reside
  • Standard security practices such as mitigating the OWASP top ten or OWASP mobile top ten
  • Not being a dick

In the O2 Labs I’ve been working with experts from various groups – legal, OFCOM compliance, IT security – to draw up checklists covering all of the above. Covering the baseline security won’t mean building the thing then throwing it at a pen tester to laugh at all the problems: it will mean going through the checklist. That can even be done while we’re planning the product.

Now, as with everything else in both product engineering and in running a startup, it’s time to measure, optimise and iterate. Do changes to your product change its conformance with the checklist issues? Are your customers telling you that something else you didn’t think of is important? Are you detecting intrusions that existing countermeasures don’t defend against? Did the law change? Measure those things, change your security posture, iterate: use the metrics to ensure that you’re pulling in the correct direction.

I suppose if I were willing to spend the time, I could package the above up as “Lean Security” and sell a 300-page book. But for now, this blog post will do. Try not to be a dick; check that you’re not being a dick; be less of a dick.

Posted in Business, Crypto, Data Leakage, Privacy, Responsibility | Comments Off on On privacy, hashing, and your customers