On documentation

Over at The Daily WTF, Alex Papadimoulis writes about Documentation Done Right. His conclusion is spot on:

The immediate answer to what’s the right way to do documentation is clear: produce the least amount of documentation needed to facilitate the most understanding, and be very explicit about which documentation is to be maintained and which is to be archived (i.e., read-only and left to rot).

The amount of documentation appropriate to any project depends very much on the project and the people working on it. However, I’ve found that there are some useful rough guides that can generally be applied. Like Alex, I’m ignoring user documentation here and talking about internal project artifacts.

Enterprise project documentation exists to give managers someone to blame.

The reason large projects seem so documentation-heavy is twofold: managers don’t like to think they don’t know what’s going on; and managers like to be able to come down like a ton of shit on someone when they find out that they don’t know what’s going on. That’s where the waterfall model comes into its own. The big up-front planning means that everyone on the project has signed off on the ridiculous unreadable project documentation of doom, so it must describe the best software possible. Whoever does something that isn’t the same as the documentation has fucked up.

You never need the amount of documentation produced by most large-company software projects. Never. It’s usually inaccurate even at the moment it’s signed off, because it took so long to get there that the requirements changed, and because the level of precision required leads authors to make assumptions about how the OS, APIs etc. work. It’s often very hard to work from, because there are multiple documents, poorly cross-referenced using support “tools” like Word, each with its own glossary depending heavily on terminology relevant to the author’s domain. And they weren’t written by the same author.

In order to code up a feature from one of these projects, you need to check the product requirements document to see what it’s supposed to do (and whether it’s the highest priority outstanding work – you also get to find out who asked for the feature, how much money it’s worth, and plenty of other information that’s of no use to a software engineer). You check the functional specification to see how it’s supposed to do it. You check the architecture document to see what classes go in which packages. You raise a change control because the APIs don’t support part of the functional specification. Two weeks later, a fix has been approved. You write the code, ensuring that you use the test plan to find out what the acceptance criteria are, and when they’ll be tested (in fact, I’ve worked on projects where the functional and performance test plans were delivered at different times by different people). Oh, and make sure you’re keeping up to date with the issue tracker, otherwise you might do work that’s been assigned to someone else. Your systems engineer can point to the process documentation to let you know how you do that.

Good documentation doesn’t answer all the questions, but leaves you capable of asking smart questions of the right people.

It’s no secret that I’m a fan of user stories as a form of requirements documentation. A well-written user story tells you what the user expects to be able to do after a feature has been added, and precious little else. There might be questions over terminology, or specifics such as failure cases, but you know who to ask – the person who wrote the user story. There will be nothing about architecture or the GUI layout, because those things aren’t up to the user, customer or product manager (or at least they shouldn’t be).

Similarly, a good architecture diagram is going to tell you something about how the classes in an application fit together, and precious little else. If you can’t work out how to fit your feature in, you can ask the architect, but you’ll both be able to use the diagram as a good place to start. As Alex says in his post, precise documentation will go out of date quickly, so the documentation needs to be good enough to hang discussions on, and no “better”.

UML diagrams are great to describe code before it’s been written; they’re not so great at writing code.

If you’re trying to explain to another engineer how you think a feature should work in code, what design pattern to follow, or what steps will be needed to talk to a server, then a UML diagram is a great way to do it. Many engineers understand UML, and those who don’t can work out what it means (with your help) very quickly. It’s a consistent language for talking about software.

That said, when I draw UML diagrams I tend to prefer whiteboards and agnostic diagramming software like OmniGraffle or Dia over syntax-checking UML tools like Enterprise Architect or ArgoUML. The reason is the one I gave in the last section: the diagram needs to be good enough to explain to someone else what we’re going to do, not a gold-plated example of conformance to the UML specification. I don’t care if my swim-lane is in the wrong place if everybody involved knows what I mean (or can ask).

Code generated from UML tools tends to have readability issues. Whereas you or I might put related methods together, the tool might sort them by visibility or name. If your diagram is rough enough to be useful, then the tool will only generate a few method stubs – and you’ll be tempted to fill them in rather than creating small private methods to do logical units of work. These and other problems (I’ve seen EA generate code that uses UUIDs as class identifiers) can be fixed, but you shouldn’t need to fix them. You should write your application and sell it.

After code has been written, the only accurate documentation is the code itself (or generated from it).

There are other useful forms of documentation – for instance, a whiteboard diagram explains a developer’s understanding of the code. It doesn’t document the code itself – an important distinction.[*] All of those shiny enterprise documents your project manager got you to write went out of date as soon as the first customer reported a bug or feature request. The code, however, documents exactly what the code does – just not necessarily in the most useful way. Javadoc/doxygen comments are more likely than anything else to stay in sync with the code due to their proximity, but even those can be outdated or unhelpful.

This is where UML tools can come in very handy. Those with the ability to generate diagrams based on code (even Xcode does this) can automatically give you a correct view of the application’s behaviour, at a more appropriate level of abstraction than the code itself. If what you need is a package dependency diagram, it’s better to get it from a UML tool than to try and read all of the source.

Unfortunately, Xcode (and Doxygen’s very limited capabilities) are the only games in town for Objective-C. Tool support for Java and C# is way ahead, but for Cocoa developers there’s only MacTranslator (which I’ve never tried). Not that UML maps particularly well onto Objective-C anyway.

[*]Though documenting a developer’s assumptions and comparing them with reality can often explain where some subtle bugs come from, and is of course useful.

The better your prototypes, the worse the feedback.

Back in the 1990s there was an explosion of Rapid Application Development tools, after the rest of the software industry saw Interface Builder and decided it was good. The RAD way (which is, of course, eminently achievable using IB) is to produce an executable prototype for users to give feedback on. Of course, the problem is that you end up shipping the prototype. I’ve actually ended up doing that on a couple of Mac applications.

One problem is that because the app looks complete, users assume it is complete and that they’re being asked to provide polishing details, or spot spelling mistakes and misplaced buttons. The other is that because it looks complete, managers assume it is complete and will tell you to ship it.

Don’t make that mistake. Do UI prototypes as paper-based wireframes, Keynote presentations or Cappuccino apps. Whatever you do, make it look like it’s just a crappy sketch you’re willing to have ripped to shreds. That way, people will rip it to shreds.

There are few document “artifacts” that need to hang around.

If you think about what you’re going to do with an application after you’ve written it, there’s selling it, supporting it, maintaining it, and extending it. Support people might need some high-level architecture knowledge so that they can work out what component a problem is in, or how to diagnose a particular failure.

Such an architecture document can be a good aid for planning new features or bug fixes, because you can quickly see where you need to modify the app to get the desired behaviour (clearly you then need to dive into the code to get a better idea, but you know where to jump). Similarly, architecture rationale documentation (including the threat model, API/library use justification) can be handy so that you don’t need to go through the same debates/research when fixing a bug or adding a feature. Threat models particularly can take a lot of time and expertise to construct from scratch.

Sales people will need to know which features have been delivered, which features are on the roadmap, and can probably find out specific questions from engineers if they get a grilling from an awkward customer. Only in a very limited set of circumstances will sales staff need to give customers security documentation, test coverage information, or anything other than the user manual.

Notice that in each of these cases, one of the main aspects of keeping the document around is so that you can keep it up to date. There’s no point having the 1.0 feature list when you’re selling version 2.5 – in fact it would be positively detrimental. So obviously the fewer documents you keep around, the fewer you need to keep up to date. There’s some trade-off involved (natürlich) – if you need something you didn’t hang on to, you’re going to have to regenerate it.

YOUR development team needs security engineers

It can definitely be tempting, if your engineers don’t have a whole lot of security expertise, to get a consultant in. Indeed, this can be a great way to bootstrap a security process; however, it then needs to be owned and executed inside the team. The reason has mainly to do with the goals of your developers and of the consultant.

Fuzzy logic

The consultant’s high fee is based on being able to come in to a new team and quickly make a high impact. This means, of course, rapidly reporting low-hanging problems. It also means that all of the reported problems are important – it’s a good job you hired this anonymous consultant, having found all of those critical issues, no?

The fact is that most of the security problems found by external security consultants and penetration testers are those that can be found in short order by a script kiddie – in other words, ones it would take very little effort to write automated tests for. In fact that’s pretty much what consultants, pen testers and script kiddies do, and it’s called fuzzing. Now, if you’d given the task of writing the fuzzing tools to one of your developers, you’d be slightly more behind schedule (than you already are because you’ve got to fix those data-handling defects), and you’d have a developer who knows how to write fuzzing tools.
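To give a feel for how little effort is involved, here is a minimal sketch of a mutation fuzzer. Every name in it is hypothetical: GLFragileParser stands in for whatever data-handling code you want to exercise (it contains a deliberate length-checking defect), and GLMutate and GLFuzz are illustrative helpers rather than part of any real tool. It assumes the code under test signals bad input by raising an exception.

```objc
#import <Foundation/Foundation.h>

// Hypothetical code under test: the first byte says how long the payload is.
static BOOL GLFragileParser(NSData *input) {
    if ([input length] < 1) {
        return NO;
    }
    uint8_t len = ((const uint8_t *)[input bytes])[0];
    // Defect: len is trusted without being checked against the data's length,
    // so -subdataWithRange: raises NSRangeException on malformed input.
    NSData *payload = [input subdataWithRange: NSMakeRange(1, len)];
    return [payload length] == len;
}

typedef BOOL (*GLParser)(NSData *input);

// Return a copy of the seed with one byte overwritten. A small linear
// congruential generator keeps the sequence reproducible between runs.
static NSData *GLMutate(NSData *seed, uint32_t *state) {
    NSMutableData *copy = [NSMutableData dataWithData: seed];
    if ([copy length] == 0) {
        return copy;
    }
    *state = *state * 1664525u + 1013904223u;
    NSUInteger index = (*state >> 8) % [copy length];
    *state = *state * 1664525u + 1013904223u;
    ((uint8_t *)[copy mutableBytes])[index] = (uint8_t)(*state >> 24);
    return copy;
}

// Feed mutated inputs to the parser and count how many make it raise.
static NSUInteger GLFuzz(GLParser parser, NSData *seed, NSUInteger iterations) {
    uint32_t state = 12345;
    NSUInteger failures = 0;
    for (NSUInteger i = 0; i < iterations; i++) {
        NSData *input = GLMutate(seed, &state);
        @try {
            parser(input);
        }
        @catch (NSException *e) {
            failures++;
            NSLog(@"input %@ raised %@", input, [e name]);
        }
    }
    return failures;
}
```

Calling GLFuzz(GLFragileParser, seed, 1000) with a well-formed seed logs every mutated input that trips the defect. A real tool would want smarter mutations and crash detection, but the developer who writes even this much now knows how fuzzing works.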

Finding architectural and requirements-level security problems is not impossible for a consultant; however, we only really have second-hand knowledge of the requirements and architecture (and it’s likely that we got that from talking to your developers, unless you have some really shit-hot and surprisingly relevant documentation. Hint: you don’t). What we bring to the party is a thorough knowledge of threat analysis and risk management, which we then get to apply to an unknown project. Your developers have an inside-out knowledge of the project, but probably aren’t as experienced as me at carrying out security assessments.

Don’t get me wrong, the consultant will definitely find real problems and will probably leave you in a position to deliver a more secure product. However, in order to get most value out of the consultant’s input, you need to make sure that your project’s security model and security knowledge remain up to date, and that means bringing it at least partially in house.

Security champions

Make sure that every development team has its own security champion. That person’s responsibilities are to:

  • Define the team’s security process:
    • How is the threat model discovered/updated?
    • How are security problems found and reported?
    • How are they addressed?
  • Ensure the process is followed
  • Define security criteria for each iteration, user story, or whatever makes sense in your project management setup

The champion is authorised to:

  • Delegate responsibility for carrying out tasks in the security process
  • Declare any build/user story/iteration/whatever to be a failure until its security criteria are satisfied

As we know, responsibility without authority just means you have someone to fire. A team can only get things done if they’re actually allowed to ensure they get things done. Sounds like common sense, but many software engineering managers fail to grasp it.

In our team, everybody is the security guy. We share the responsibility.

Bullshit. You get nothing done. If everyone is responsible for a problem, then everyone knows that somebody else can deal with it. I’ve seen software released with known and rather heinous vulnerabilities, all because no-one wanted to be the person who delayed release just before the ship date. Don’t allow your project to get into that state – give someone ownership of the security efforts.

On McAfee

Today, Apple’s CPU/motherboard supplier Intel announced that it will acquire McAfee, in a deal worth nearly $7.7B. While this is definitely big bucks, it doesn’t seem like terrifically big security news.

Intel probably don’t want the technology. McAfee is the world’s biggest security vendor, so there are cheaper ways for Intel to acquire security technology. Intel probably don’t want a fast buck either: or if they do, they’re not about to get it. It would take around a decade for Intel’s new security software division to make its money back, assuming no huge changes in organization.

Intel may want the IP. McAfee has an extensive patent portfolio (as do all the big players in the cold war world of security software), and there are bound to be things that Intel could implement in silico. Jokes have already been doing the rounds on Twitter of a new CPU opcode, SCANAV. Encryption and data tagging seem more likely targets. But couldn’t they just license the patents?

I expect that what Intel are after is to make the company a one-stop IT shop, with security software being just one element in that. Large businesses and government in particular value having a small network of large, stable, boring, trusted partners. We’ve already seen in the last couple of years that the likes of Cisco, HP and Oracle have been shifting toward “vertical” provision of IT services. Intel now have a few different software houses under their wing, and of course McAfee brings a vast collection of juicy business customers. What Paul Otellini is likely hoping is that such customers will start looking to Intel for other services, and maybe hardware too. And that Intel’s existing customers will buy into McAfee’s security offerings.

On voices that matter

In October I’ll be in Philadelphia, PA talking at Voices That Matter: Fall iPhone Developers’ Conference. I’m looking forward to meeting some old friends and new faces, and sucking up a little more of that energy and enthusiasm that pervades all of the Apple-focussed developer events I’ve been to. In comparison with other fields of software engineering, Cocoa and Cocoa Touch development have a sense of community that almost exudes from the very edifices of the conference venues.

But back to the talk. Nay, talks. While Alice was directed by the cake on the other side of the looking glass to “Eat Me”, the label on my slice said “bite off more of me than you can chew”. Therefore I’ll be speaking twice at this event. The first talk, on Saturday, is on Unit Testing, which I’ve taken over just now from Dave Dribin. Having seen Dave talk on software engineering practices before (and had lively discussions regarding coupling and cohesion in Cocoa code in the bar afterwards), I’m fully looking forward to adding his content’s biological and technological distinctiveness to my own. I’ll be covering the why of unit testing in addition to the how and what.

My second talk is on – what else – security and encryption in iOS applications. In this talk I’ll be looking at some of the common features of iOS apps that require security consideration, how to tease security requirements out of your customers and product managers and then looking at the operating system support for satisfying those requirements. This gives me a chance to focus more intently on iOS than I was able to in Professional Cocoa Application Security (Wrox Professional Guides), looking at specific features and gotchas of the SDK and the distinctive environment iOS apps find themselves in.

I haven’t decided what my schedule for the weekend will be, but there are plenty of presentations I’m interested in watching. So look for me on the conference floor (and, of course, in the bar)!

On stopping service management abuse

In chapter 2 of their book The Mac Hacker’s Handbook (is there only one Mac hacker?), Charlie Miller and Dino Dai Zovi note that an attacker playing with a sandboxed process could break out of the sandbox via launchd. The reasoning goes that the attacker just uses the target process to submit a launchd job. Launchd, which is not subject to sandbox restrictions, then loads the job, executing the attacker’s payload in an environment where it will have more capabilities.

This led me to wonder whether I could construct a sandbox profile that would stop a client process from submitting launchd jobs. I have done that, but not in a very satisfying way or even necessarily a particularly complete one. My profile does this:

(version 1)
(deny default)
(debug deny)

(allow process-exec)
(allow file-fsctl)
(allow file-ioctl)
; you can probably restrict access to even more files - don't forget to let dyld link Cocoa though!
(allow file-read* file-write*)
(deny file-read* (regex "^/System/Library/Frameworks/ServiceManagement.framework"))
(deny file-read* (literal "/bin/launchctl" "/sbin/launchd"))
(allow signal (target self))
(allow ipc-posix-shm)
(allow sysctl*)
; this lot was empirically discovered - Cocoa apps need these servers
(allow mach-lookup (global-name "com.apple.system.notification_center" "com.apple.system.DirectoryService.libinfo_v1"
 "com.apple.system.DirectoryService.membership_v1" "com.apple.distributed_notifications.2"
 "com.apple.system.logger" "com.apple.SecurityServer"
 "com.apple.CoreServices.coreservicesd" "com.apple.FontServer"
 "com.apple.dock.server" "com.apple.windowserver.session"
 "com.apple.pasteboard.1" "com.apple.pbs.fetch_services"))

So processes run in the above sandbox profile are not allowed to use the launchd or launchctl processes, nor can they link the ServiceManagement framework that allows Cocoa apps to discover and submit jobs directly.

Unfortunately I wasn’t able to fully meet my goal: a process can still do the same IPC that launchctl et al use directly. I found that if I restricted IPC access to launchd, apps would crash when trying to check-in with the daemon. Of course the IPC protocol is completely documented so it might be possible to do finer-grained restrictions, but I’m not optimistic.

Of course, standard disclaimers apply: the sandbox Scheme environment is supposed to be off-limits to us smelly non-Apple types.

On private methods

Let’s invent a hypothetical situation. You’re the software architect for an Objective-C application framework at a large company. This framework is used by many thousands of developers to create all sorts of applications for a particular platform.

However, you have a problem. Developer Technical Support are reporting that some third-party developers are using a tool called class-dump to discover the undocumented methods on your framework’s classes, and are calling them directly in application code. This is leading to stability and potentially other issues, as the methods are not suitable for calling at arbitrary points in the objects’ life cycles.

You immediately reject the distasteful solution of making the private method issue a policy problem. While you could analyse third-party binaries looking for use of undocumented method selectors, this approach is unscalable and error-prone. Instead you need a technical solution.

The problem in more detail

Consider the following class:

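#import <Foundation/Foundation.h>

// Public interface: this is all that the header documents.
@interface GLStaticMethod : NSObject {
    int a;
}
@property (nonatomic, assign) int a;
- (void)doTheLogThing;
@end

// Class extension: the "private" method is declared only in the .m file.
@interface GLStaticMethod ()
- (void)logThis;
@end

@implementation GLStaticMethod

@synthesize a;

- (void)doTheLogThing {
    [self logThis];
}

- (void)logThis {
    NSLog(@"Inside logThis: %d", self->a);
}

@end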

Clearly this -logThis method would be entirely dangerous if called at unexpected times. Oh OK, it isn’t, but let’s pretend. Well, we haven’t documented it in the header, so no developer will find it, right? Enter class-dump:

/*
 *     Generated by class-dump 3.3.2 (64 bit).
 *     class-dump is Copyright (C) 1997-1998, 2000-2001, 2004-2010 by Steve Nygard.
 */

#pragma mark -

/*
 * File: staticmethod
 * Arch: Intel x86-64 (x86_64)
 *       Objective-C Garbage Collection: Unsupported
 */

@interface GLStaticMethod : NSObject
{
    int a;
}

@property(nonatomic) int a; // @synthesize a;
- (void)logThis;
- (void)doTheLogThing;

@end

OK, that’s not so good. Developers can find our private method, and that means they’ll use the gosh-darned thing! What can we do?

Solution 1: avoid static discovery

We’ll use the dynamic method resolution feature of the new Objective-C runtime to only bind this method when it’s used. We’ll put our secret behaviour into a function that has the same signature as an IMP (Objective-C method implementation), and attach that to the class when the private method is first used. So our class .m file now looks like this:

#import <objc/runtime.h>  // for class_addMethod()

@interface GLStaticMethod ()
void logThis(id self, SEL _cmd);
@end

@implementation GLStaticMethod

@synthesize a;

// Called by the runtime the first time an unimplemented selector is sent.
+ (BOOL)resolveInstanceMethod: (SEL)aSelector {
    if (aSelector == @selector(logThis)) {
        // "v@:" is the type encoding: returns void, takes self and _cmd.
        class_addMethod(self, aSelector, (IMP)logThis, "v@:");
        return YES;
    }
    return [super resolveInstanceMethod: aSelector];
}

- (void)doTheLogThing {
    [self logThis];
}

void logThis(id self, SEL _cmd) {
    NSLog(@"Inside logThis: %d", ((GLStaticMethod *)self)->a);
}

@end

What does that get us? Let’s have another look at class-dump’s output now:

/*
 *     Generated by class-dump 3.3.2 (64 bit).
 *     class-dump is Copyright (C) 1997-1998, 2000-2001, 2004-2010 by Steve Nygard.
 */

#pragma mark -

/*
 * File: staticmethod
 * Arch: Intel x86-64 (x86_64)
 *       Objective-C Garbage Collection: Unsupported
 */

@interface GLStaticMethod : NSObject
{
    int a;
}

+ (BOOL)resolveInstanceMethod:(SEL)arg1;
@property(nonatomic) int a; // @synthesize a;
- (void)doTheLogThing;

@end

OK, so our secret method can’t be found using class-dump any more. There’s a hint that something special is going on because the class provides +resolveInstanceMethod:, and a really dedicated hacker could use otool to disassemble that method and find out what selectors it uses. In fact, they can guess just by looking at the binary:

heimdall:Debug leeg$ strings staticmethod 
Inside logThis: %d

You could mix things up a little more by constructing strings at runtime and using NSSelectorFromString() to generate the selectors to test.
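A minimal sketch of that idea follows. GLHiddenSelector and GLHiddenLog are hypothetical names of my own, and the fragments are joined in the simplest way possible; a determined reader of the disassembly will still work it out. Note that the log message no longer mentions the selector, since that string was what gave the game away.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import <objc/message.h>

@interface GLStaticMethod : NSObject {
    int a;
}
@property (nonatomic, assign) int a;
- (void)doTheLogThing;
@end

// Hypothetical helper: join fragments at run time so the full selector
// name never appears as a single string constant in the binary.
static SEL GLHiddenSelector(void) {
    NSString *name = [NSString stringWithFormat: @"%@%@", @"log", @"This"];
    return NSSelectorFromString(name);
}

@implementation GLStaticMethod

@synthesize a;

// Defined inside @implementation so it can reach the instance variable.
static void GLHiddenLog(id self, SEL _cmd) {
    NSLog(@"value: %d", ((GLStaticMethod *)self)->a);
}

+ (BOOL)resolveInstanceMethod: (SEL)aSelector {
    if (sel_isEqual(aSelector, GLHiddenSelector())) {
        class_addMethod(self, aSelector, (IMP)GLHiddenLog, "v@:");
        return YES;
    }
    return [super resolveInstanceMethod: aSelector];
}

- (void)doTheLogThing {
    // Send the message without ever naming the selector in source.
    ((void (*)(id, SEL))objc_msgSend)(self, GLHiddenSelector());
}

@end
```

The cast on objc_msgSend is there because the message has no compile-time declaration to take its types from.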

Problem extension: runtime hiding

The developers using your framework have discovered that you’re hiding methods from them and found a way to inspect these methods. By injecting an F-Script interpreter into their application, they can see the runtime state of every object including your carefully-hidden instance methods. They know that they can call the methods, and can even declare them in categories to avoid compiler warnings. Where do we go from here?

Solution 2: don’t even add the method

We’ve seen that we can create functions that behave like instance methods – they can get access to the instance variables just as methods can. The only requirement is that they must be defined within the class’s @implementation. So why not just call the functions? That’s the solution proposed in ProCocoaAppSec – it’s a little uglier than dynamically resolving the method, but means that the method never appears in the ObjC runtime and can never be used by external code. It makes our public method look like this:

- (void)doTheLogThing {
    logThis(self, _cmd);
}

Of course, logThis() no longer has an Objective-C selector of its very own – it can only get the selector of the method from which it was called (or whatever other selector you happen to pass in). Most Objective-C code doesn’t ever use the _cmd variable so this isn’t a real drawback. Of course, if you do need to be clever with selectors, you can’t use this solution.
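Put together, the whole class might look something like the sketch below. Marking the function static is my own addition here (an assumption, not something the approach requires): it keeps the symbol out of the linker’s view as well as the runtime’s.

```objc
#import <Foundation/Foundation.h>

@interface GLStaticMethod : NSObject {
    int a;
}
@property (nonatomic, assign) int a;
- (void)doTheLogThing;
@end

@implementation GLStaticMethod

@synthesize a;

// A plain C function with the shape of an IMP. It is defined inside
// @implementation so that it may touch instance variables, but it is
// never registered with the Objective-C runtime.
static void logThis(id self, SEL _cmd) {
    NSLog(@"Inside logThis: %d", ((GLStaticMethod *)self)->a);
}

- (void)doTheLogThing {
    // An ordinary function call: no message send, so nothing to discover.
    logThis(self, _cmd);
}

@end
```

Asking an instance whether it responds to the logThis selector now gets the answer NO, both before and after -doTheLogThing has run.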


Objective-C doesn’t provide language-level support for private methods, but there are technological solutions for framework developers to hide internal code from their clients. Using these methods will be more reliable and easier to support than asking developers nicely not to use those methods, and getting angry when they do.

On authorization proxy objects

Authorization Services is quite a nice way to build in discretionary access controls to a Mac application. There’s a whole chapter in Professional Cocoa Application Security (Chapter 6) dedicated to the topic, if you’re interested in how it works.

The thing is, it’s quite verbose. If you’ve got a number of privileged operations (like, one or more) in an app, then the Auth Services code can get in the way of the real code, making it harder to unpick what a method is up to when you read it again a few months later.

Let’s use some of the nicer features of the Objective-C runtime to solve that problem. Assuming we’ve got an object that actually does the privileged work, we’ll create a façade object GLPrivilegedPerformer that handles the authorization for us. It can distinguish between methods that do or don’t require privileges, and will attempt to gain different rights for different methods on different classes. That allows administrators to configure privileges for the whole app, for a particular class or even for individual tasks. If it can’t get the privilege, it will throw an exception. OK, enough rabbiting. The code:

#import <Foundation/Foundation.h>
#import <Security/Security.h>

@interface GLPrivilegedPerformer : NSObject {
    id actual;
    AuthorizationRef auth;
}
- (id)initWithClass: (Class)cls;
@end

@implementation GLPrivilegedPerformer

- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector {
    NSMethodSignature *sig = [super methodSignatureForSelector: aSelector];
    if (!sig) {
        sig = [actual methodSignatureForSelector: aSelector];
    }
    return sig;
}

- (BOOL)respondsToSelector:(SEL)aSelector {
    if (![super respondsToSelector: aSelector]) {
        return [actual respondsToSelector: aSelector];
    }
    return YES;
}

- (void)forwardInvocation:(NSInvocation *)anInvocation {
    if ([actual respondsToSelector: [anInvocation selector]]) {
        NSString *selName = NSStringFromSelector([anInvocation selector]);
        // Only selectors starting "priv" need a right before being forwarded.
        if ([selName length] > 3 && [[selName substringToIndex: 4] isEqualToString: @"priv"]) {
            NSString *rightName = [NSString stringWithFormat: @"%@.%@.%@",
                                   [[NSBundle mainBundle] bundleIdentifier],
                                   NSStringFromClass([actual class]),
                                   selName];
            AuthorizationItem item = {0};
            item.name = [rightName UTF8String];
            AuthorizationRights requested = {
                .count = 1,
                .items = &item,
            };
            // Assumption: an empty environment, and flags that let the user be prompted.
            OSStatus authResult = AuthorizationCopyRights(auth,
                                                          &requested,
                                                          kAuthorizationEmptyEnvironment,
                                                          kAuthorizationFlagDefaults |
                                                          kAuthorizationFlagExtendRights |
                                                          kAuthorizationFlagInteractionAllowed,
                                                          NULL);
            if (errAuthorizationSuccess != authResult) {
                // Raises, so the invocation below is never reached.
                [self doesNotRecognizeSelector: [anInvocation selector]];
            }
        }
        [anInvocation invokeWithTarget: actual];
    }
    else {
        [super forwardInvocation: anInvocation];
    }
}

- (id)initWithClass: (Class)cls {
    self = [super init];
    if (self) {
        // Assumption: an empty environment and default flags.
        OSStatus authResult = AuthorizationCreate(NULL,
                                                  kAuthorizationEmptyEnvironment,
                                                  kAuthorizationFlagDefaults,
                                                  &auth);
        if (errAuthorizationSuccess != authResult) {
            NSLog(@"couldn't create auth ref");
            [self release];
            return nil;
        }
        actual = [[cls alloc] init];
    }
    return self;
}

- (void)dealloc {
    AuthorizationFree(auth, kAuthorizationFlagDefaults);
    [actual release];
    [super dealloc];
}

@end

Some notes:

  • You may want to raise a custom exception rather than using -doesNotRecognizeSelector: on failure. But you’re going to have to @catch something on failure. That’s consistent with the way Distributed Objects handles authentication failures.
  • The rights it generates will have names of the form com.example.MyApplication.GLActualPerformer.privilegedTask, where GLActualPerformer is the name of the target class and privilegedTask is the method name.
  • There’s an argument for the Objective-C proxying mechanism making code harder to read than putting the code inline. As discussed in Chapter 9, using object-oriented tricks to make code non-linear has been found to make it harder to review the code. However, this proxy object is small enough to be easily-understandable, and just removes authorization as a cross-cutting concern in the style of aspect-oriented programming (AOP). If you think this will make your code too hard to understand, don’t use it. I won’t mind.
  • As mentioned elsewhere, Authorization Services is discretionary. This proxy pattern doesn’t make it impossible for injected code to bypass the authorization request by using the target class directly. Even if the target class has the “hidden” visibility attribute, class-dump can find it and NSClassFromString() can get the Class object.
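To show what this looks like from the calling side, here is a stripped-down sketch of the same forwarding pattern. It is not GLPrivilegedPerformer itself: GLGatedPerformer, GLActualPerformer and GLRunPrivilegedTask are all hypothetical, and a simple BOOL stands in for the Authorization Services check so that the example runs without prompting anyone for a password.

```objc
#import <Foundation/Foundation.h>

// Hypothetical target class that does the real work.
@interface GLActualPerformer : NSObject
- (NSString *)ordinaryTask;
- (NSString *)privilegedTask;
@end

@implementation GLActualPerformer
- (NSString *)ordinaryTask { return @"ordinary"; }
- (NSString *)privilegedTask { return @"privileged"; }
@end

// Hypothetical stand-in for the proxy: a flag replaces the authorization
// request, but the forwarding logic has the same shape.
@interface GLGatedPerformer : NSObject {
    id actual;
    BOOL authorized;
}
- (id)initWithClass: (Class)cls authorized: (BOOL)flag;
@end

@implementation GLGatedPerformer

- (id)initWithClass: (Class)cls authorized: (BOOL)flag {
    self = [super init];
    if (self) {
        actual = [[cls alloc] init];
        authorized = flag;
    }
    return self;
}

#if !__has_feature(objc_arc)
- (void)dealloc {
    [actual release];
    [super dealloc];
}
#endif

- (NSMethodSignature *)methodSignatureForSelector: (SEL)aSelector {
    NSMethodSignature *sig = [super methodSignatureForSelector: aSelector];
    return sig ? sig : [actual methodSignatureForSelector: aSelector];
}

- (void)forwardInvocation: (NSInvocation *)anInvocation {
    NSString *selName = NSStringFromSelector([anInvocation selector]);
    if ([selName hasPrefix: @"priv"] && !authorized) {
        // Raises NSInvalidArgumentException, so nothing is forwarded.
        [self doesNotRecognizeSelector: [anInvocation selector]];
    }
    [anInvocation invokeWithTarget: actual];
}

@end

// Client code: the call reads like an ordinary message send, with a @catch
// for the case where the privilege was not granted.
static NSString *GLRunPrivilegedTask(id performer) {
    @try {
        return [performer privilegedTask];
    }
    @catch (NSException *e) {
        return nil;
    }
}
```

The calling method contains no authorization code at all, which was the point of the exercise. The price is the @try/@catch around anything whose selector starts with “priv”.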

NSConference MINI videos available

During WWDC week I talked at NSConference MINI, a one-day conference organised by Scotty and the MDN. The videos are now available: free to attendees, or $50 for all 10 for non-attendees.

My own talk was on extending the Clang static analyser, to perform your own tests on your code. I’m pleased with the amount I managed to get in, and I like how the talk managed to fit well with the general software-engineering theme of the conference. There’s stuff on bit-level manipulation, eXtreme Programming, continuous integration, product management and more. I’d fully recommend downloading the whole shebang.

On Trashing

Back in the 1980s and 1990s, people who wanted to clandestinely gain information about a company or organisation would go trashing.[*] That just meant diving in the bins to find information about the company structure – who worked there, who reported to whom, what orders or projects were currently in progress etc.

You’d think that these days trashing had been thwarted by the invention of the shredder, but no. While many companies do indeed destroy or shred confidential information, this is not universal. Those venues where shredding is common leave it up to their staff to decide what goes in the bin and what goes in the shredder; these staff do not always get it correct (they’re likely to think about whether a document is secret to them rather than the impact on the company). Nor do they always want to think about which bin to put a worthless sheet of paper in.

Even better: in those places that do shred secret papers, they helpfully collect all of the secrets in big bins marked “To Shred” to help the trashers :). They then collect all of these bins into a big hopper, and leave that around (sometimes outside the building, in a covered or open yard) for the destruction company to come and pick up.

So if an attacker can get entry to the building, he just roots around in the “To Shred” bins. If someone asks, he tells them he put a printout there in the morning but now thinks he needs it again. Even if he can’t get in, he just dives in the hopper outside and gets access to all those juicy secrets (with none of the banana peelings and teabags associated with the non-secret bin).

But attackers who don’t like getting their hands dirty can gain some of the same information by technological means. LinkedIn will helpfully provide a list of employees – including their positions, so the public can find out something of the reporting structure. Some of those employees will be looking for recruitment opportunities – these are great people to phone for more information! So are ex-employees, something LinkedIn will also help you out with.

But the fun doesn’t stop there. Once our attacker has the names, he now goes over to Twitter and Facebook. There he can find people griping about work…or describing what the organisation is up to, to put it another way.

All of the above information about 21st-century trashing comes from real experience with an office I was invited into in the last 12 months. Of course, I will not name the organisation in charge of that office (or their data destruction company). The conclusion is that trashing is alive and well, and that those who participate need no longer root around in, well, in the trash. How does your organisation deal with the problem?

[*] for me, it was mainly the 1990s. I was the perfect size in the 1980s for trashing, but still finding my way around a Dragon 32.

On detecting God Classes

Opinion on Twitter was divided when I suggested the following static analyser behaviour: report on any class that conforms to too many protocols.

Firstly, a warning: “too many” is highly contextual. Almost all objects implement NSObject and you couldn’t do much without it, so it gets a bye. Other protocols, like NSCoding and NSCopying, are little bits of functionality that don’t really dilute a class by their presence. It’s probably harmless for a class to implement those in addition to other protocols. Still others are so commonly implemented together (like UITableViewDataSource and UITableViewDelegate, or WebView’s four delegate protocols) that they probably shouldn’t independently count against a class’s “protocol weight”. On the other hand, a class that conforms to UITableViewDelegate, UIAlertViewDelegate and MKMapViewDelegate might be trying to do too much – of which more after the next paragraph.

Secondly, a truism: the goal of a static analyser is to ignore code that the developer’s happy with, and to warn about code the developer isn’t happy with. If your coding style permits a class to conform to any number of protocols, and you’re happy with that, then you shouldn’t implement this analyser rule. If you would be happy with a maximum of 2, 4, or 1,024 protocols, you would benefit from a rule with that number. As I said in my NSConf MINI talk, the analyser’s results are not as cleanly definable as compiler errors (which indicate code that doesn’t conform to the language definition) or warnings (which indicate code that is very probably incorrect). The analyser is more of a code style and API use checker. Conforming to protocols is a use of the API that can be checked.
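To make the idea of “protocol weight” and a configurable limit concrete, here is a minimal sketch of the rule’s logic. It is written in Python purely as a model and is not the Clang checker itself. The names protocol_weight and god_class_warnings, the default limit of 2 and the two example controller classes are all assumptions made for illustration. A real checker would read each class’s conformances from the compiler’s AST rather than from a dictionary.

```python
# Hypothetical model of the rule; nothing here is part of Clang's API.

# Protocols that never count against a class.
EXEMPT = frozenset({"NSObject", "NSCoding", "NSCopying"})

# Protocols that are commonly adopted together count once as a group.
GROUPS = (
    frozenset({"UITableViewDataSource", "UITableViewDelegate"}),
)


def protocol_weight(protocols, exempt=EXEMPT, groups=GROUPS):
    """Count protocols, skipping exempt ones and counting each group once."""
    remaining = set(protocols) - exempt
    weight = 0
    for group in groups:
        if remaining & group:
            weight += 1
            remaining -= group
    return weight + len(remaining)


def god_class_warnings(classes, max_weight=2):
    """Return (class name, weight) for every class above the chosen limit."""
    warnings = []
    for name, protocols in classes.items():
        weight = protocol_weight(protocols)
        if weight > max_weight:
            warnings.append((name, weight))
    return warnings


# Made-up classes, described only by the protocols they conform to.
classes = {
    "EmployeeListController": [
        "NSObject", "UITableViewDataSource", "UITableViewDelegate",
    ],
    "EverythingController": [
        "NSObject", "UITableViewDelegate", "UIAlertViewDelegate",
        "MKMapViewDelegate",
    ],
}

print(god_class_warnings(classes))  # [('EverythingController', 3)]
```

Raising max_weight to 3 silences the warning, which is the point: the limit is a statement about your own coding style and not a property of the language.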

OK, let’s get on with it. A class that conforms to too many protocols has too many responsibilities – it is a “God Class”. Now how can this come about? A developer who has heard about model-view-controller (MVC) will try to divide classes into three high-level groups, like this:

MVC high-level architecture

The problem comes when the developer fails to take that diagram for what it is: a 50,000-foot overview of the MVC architecture. Many Mac and iOS developers will use Core Data, and will end up with a model composed of multiple different entities. If a particular piece of the app’s workflow needs to operate on multiple entities, they may break that out into “business logic” objects that can be called by the controller. Almost all Mac and iOS developers use the standard controls and views, meaning they have no choice but to break up the view into multiple objects.

But where does that leave the controller? Without any motivation to divide things up, everything else is stuffed into a single object (likely an NSDocument or UIViewController subclass). This is bad. What happens if you need to display the same table in two different places? You have to go through that class, picking out the bits that are relevant, finding out where they’re tied to the bits that aren’t and untangling them. Ugh.

Cocoa developers will probably already be using multiple controller objects if they’re using Cocoa Bindings. Each NSArrayController or similar receives its object or objects, usually from the main controller object, and then gets on with the job of marshalling the interaction between the bound view and the observed model objects. So, if we take the proposed changes so far, our MVC diagram looks like this:

MVC - Slightly closer look

The point of my protocol-checking code is to go the remaining distance, and abstract out the other data sources into their own objects. What we’re left with is a controller that looks after the view’s use case, ensuring that logic actions take place when they ought, that steps in the workflow only become available when their preconditions are met, and so on. Everything related to performing the logic is down in those dynamic model objects, and everything to do with data presentation is in its own controller objects. Well, maybe not everything – a button doesn’t exactly have a complicated API. But if you need a table of employees for this view controller and a table of employees for that view controller, you just take the same table datasource object in both places. You don’t have two different datasource implementations in two view controllers (or even the same one pasted twice). This makes the diagram look like this:

MVC - more separation
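As a rough illustration of that last diagram, the sketch below models one data source object shared by two controllers. It is Python standing in for the Cocoa classes, and every name in it (EmployeeTableDataSource, render_table, DirectoryController, PayrollController) is hypothetical. In a real app the data source would conform to UITableViewDataSource and the table view would do the asking.

```python
class EmployeeTableDataSource:
    """Answers the questions a table asks; knows nothing about workflow."""

    def __init__(self, employees):
        self._employees = list(employees)

    def number_of_rows(self):
        return len(self._employees)

    def text_for_row(self, row):
        return self._employees[row]


def render_table(data_source):
    """Stand-in for a table view: asks the data source for each row."""
    return [data_source.text_for_row(row)
            for row in range(data_source.number_of_rows())]


class DirectoryController:
    """Use case: always show the employee list."""

    def __init__(self, data_source):
        self.data_source = data_source

    def show(self):
        return render_table(self.data_source)


class PayrollController:
    """Use case: show the list only once the payroll run is approved."""

    def __init__(self, data_source):
        self.data_source = data_source
        self.approved = False

    def approve(self):
        self.approved = True

    def show(self):
        # The workflow precondition lives here, not in the data source.
        return render_table(self.data_source) if self.approved else []


# One data source instance serves both controllers.
employees = EmployeeTableDataSource(["Ada", "Grace"])
directory = DirectoryController(employees)
payroll = PayrollController(employees)

print(directory.show())  # ['Ada', 'Grace']
print(payroll.show())    # []
payroll.approve()
print(payroll.show())    # ['Ada', 'Grace']
```

Neither controller contains any row-counting or cell-filling code, so neither would need to adopt the table’s data source protocol itself.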

So to summarise, a class that conforms to too many protocols probably has too many responsibilities. The usual reason for this is that controller objects take on responsibility for managing workflow, providing data to views and handling delegate responsibilities for the views. This leads to code that is not reusable except through the disdainful medium of copy-pasting, as it is harder to define a clean interface between these various concerns. By producing a tool that reports on the existence of such God classes, developers can be alerted to their presence and take steps to fix them.