Structure and Interpretation of Computer Programmers

I make it easier and faster for you to write high-quality software.

Wednesday, May 12, 2010

LLVM projects you may not be aware of

All Mac and iPhone OS developers must by now be familiar with LLVM, the Low-Level Virtual Machine compiler that Apple has backed in preference to GCC (presumably at least partially because GCC 4.5 is now a GPLv3 project, in addition to technical problems with improving the older compiler). You’ll also be familiar with Clang, the modular C/ObjC/C++ lexer/parser that can be used as an LLVM front-end, or as a library for providing static analysis, refactoring and other code comprehension facilities. And of course MacRuby uses LLVM’s optimisation libraries.

The LLVM umbrella also covers a number of other projects that Mac/iPhone developers may not yet have heard about, but which nonetheless are pretty cool. This post is just a little tour of some of those. There are other projects that have made use of LLVM code, but which aren’t part of the compiler project – they are not the subject of this post.

LibC++ is a C++ library, targeting 100% compatibility with the C++0x (draft) standard.

KLEE looks very cool. It’s a “symbolic execution tool”, capable of automatically generating unit tests for software with high degrees of coverage (well over 90%). Additionally, given information about an application’s constraints and requirements it can automatically discover bugs, generating failing tests to demonstrate the bug and become part of the test suite. There’s a paper describing KLEE including a walkthrough of discovering a bug in tr, and tutorials in its use.

vmkit is a substrate layer for running bytecode. It takes high-level bytecode (currently JVM bytecode or IL, the bytecode of the .Net runtime) and translates it to IR, the LLVM intermediate representation. In doing so it can make use of LLVM’s optimisations and make better decisions regarding garbage collection.

posted by Graham Lee at 09:49  

Thursday, April 15, 2010

The difference between NSTableView and UITableView

A number of times, I’ve chased myself down rat holes in iPhone projects because I’ve created a design or implementation that assumes UITableView and NSTableView are similar objects. They aren’t.

The main problem I come across is related to how the cells are treated in Cocoa and in Cocoa Touch. An AppKit table comprises columns, each of which uses a cell to display its content. A cell contains the drawing and event-handling stuff of a view, but nothing to do with its place in the view hierarchy or responder chain. It’s essentially a light-weight view. For each row in the table, NSTableColumn takes its cell, configures it for the content in that row and then draws the cell at its location in the column. No matter how many rows there are, a single cell is used.

UIKit works differently. Of course a UITableView only has one column, but it also displays views rather than cells. This is good, but leads to the key distinction that always trips me up: you can’t use the same view more than once in a table view. Of course, sections in a UITableView will often have more than one row, but each row that is visible on-screen will need its own instance of UITableViewCell (which is a subclass of UIView, and therefore a view in the traditional sense rather than a cell). If you try to re-use the same instance multiple times, the table view will configure each row but only the last one it prepared will be drawn.

So what’s this -reuseIdentifier stuff? That’s related to caching views for scrolling. Imagine a table view with 10 rows, of which 4 can be seen on screen at once. Each uses the same type of cell in this example. When the table view first becomes visible there will be 4 UITableViewCell instances in use, displaying rows 0-3. Now you start to scroll the view. UITableView finds it needs an extra cell to display row 4, which is now partially on-screen, while row 0 is starting to slide off. When row 0 disappears completely, the table view could just delete its cell – but rather than do that, it adds it to a queue of reusable cells. When row 5 starts to appear, the table view can re-use the object it’s already created for row 0, because it’s the same type of cell as the one for row 5 and is currently unused.
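The reuse queue described above is usually driven from the data source. What follows is only a minimal sketch, assuming a hypothetical GLRowDataSource class, a made-up identifier string and code built without ARC; the UIKit calls themselves (-dequeueReusableCellWithIdentifier: and -initWithStyle:reuseIdentifier:) are the real ones.

```objc
#import <UIKit/UIKit.h>

// Hypothetical data source for a one-section, ten-row table.
@interface GLRowDataSource : NSObject <UITableViewDataSource>
@end

@implementation GLRowDataSource

- (NSInteger)tableView: (UITableView *)tableView
 numberOfRowsInSection: (NSInteger)section
{
    return 10;
}

- (UITableViewCell *)tableView: (UITableView *)tableView
         cellForRowAtIndexPath: (NSIndexPath *)indexPath
{
    // The identifier is an arbitrary label for "this type of cell".
    static NSString *GLRowCellIdentifier = @"GLRowCell";

    // Ask the table for a cell that scrolled off-screen and was queued.
    UITableViewCell *cell =
        [tableView dequeueReusableCellWithIdentifier: GLRowCellIdentifier];
    if (cell == nil) {
        // Nothing queued yet (for example on first display): make a new view.
        cell = [[[UITableViewCell alloc]
                    initWithStyle: UITableViewCellStyleDefault
                  reuseIdentifier: GLRowCellIdentifier] autorelease];
    }

    // A dequeued cell still shows its old row's content, so always reconfigure.
    cell.textLabel.text =
        [NSString stringWithFormat: @"Row %ld", (long)indexPath.row];
    return cell;
}

@end
```

The point to notice is that the method never hands back one shared instance for several rows: it either takes an unused cell from the queue or creates a fresh one, and configures it in both cases.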

So, that’s that really. Note to self: don’t treat UIKit like it’s just AppKit, you’ll end up wasting a day of code.

posted by Graham Lee at 13:32  

Tuesday, March 2, 2010

How to hire Graham Lee

There are few people who can say that when it comes to Cocoa application security, they wrote the book. In fact, I can think of only one: me. I’ve just put the final draft together for Professional Cocoa Application Security and it will hit the shops in June: click the link to purchase through my Amazon affiliate programme.

Now that the book’s more-or-less complete, I can turn my attention to other interesting projects: by which I mean yours! If your application could benefit from a developer with plenty of security experience and knowledge to share in a pragmatic fashion, or a software engineer who led development of a complex Cocoa application from its legacy PowerPlant origins through Snow Leopard readiness, or a programmer who has worked on performance enhancement in networking systems and low-level daemon code on Darwin and other UNIX platforms, then your project will benefit from an infusion of the Graham Lee magic. Even if you have some NeXTSTEP or OPENSTEP code that needs maintaining, I can help you out: I’ve been using Cocoa for about as long as Apple has.

Send an email to iamleeg <at> securemacprogramming <dot> com and let’s talk about your project. The good news is that for the moment I am available, you probably can afford me[], and I really want to help make your product better. Want to find out more about my expertise? Check out my section on the MDN show, and the MDN security column.

[] It came up at NSConference that a number of devs thought I carry a premium due to the conference appearances, podcasts and other material I produce. Because I believe that honesty is the best policy, I want to come out and say that I don’t charge any such premium. My rates are consistent with other contractors with my level of experience, and I even provide a discounted rate for NGOs and academic institutions.

posted by Graham Lee at 13:22  

Sunday, January 10, 2010

Unit testing Core Data-driven apps, fit the second

It took longer than I expected to follow up my previous article on unit testing and Core Data, but here it is.

Note that the pattern presented last time, Remove the Core Data Dependence, is by far my preferred option. If a part of your code doesn’t really depend on managed objects and suchlike, it shouldn’t need them to be present just because it works with (or in) classes that do. The following pattern is recommended only when you aren’t able to abstract away the Core Data-ness of the code under test.

Pattern 2: construct an in-memory Core Data stack. The unit test classes you develop ought to have these seemingly contradictory properties:

  • no dependence on external state: the tests must run the same way every time they run. That means that the environment for each test must be controlled exactly; dependence on “live” application support files, document files or the user defaults are all no-nos.
  • close approximation to the application environment: you’re interested in how your app runs, not how nice a unit test suite you can create.

To satisfy both of these properties simultaneously, construct a Core Data stack in the test suite which behaves in the same way but which does not use the persistent store (i.e. document files) used by the real app. My preference is to use the in-memory store type, so that every time it is created it is guaranteed to have no reference to any prior state (unlike a file-backed store type, where you have to rely on unlinking the document files and hoping there are no timing issues in the test framework which might cause two tests simultaneously to use the same file).

My test case class interface looks like this (note that this is for a dependent test case bundle that gets embedded into the app; there’s an important reason for that which I’ll come to later). The managed object context will be needed in the test methods to insert new objects. I don’t (yet) need any of the other objects to be visible inside the tests, but the same objects must be used in -setUp and -tearDown.

#import <SenTestingKit/SenTestingKit.h>

@interface TuneNeedsHighlightingTests : SenTestCase {
    NSPersistentStoreCoordinator *coord;
    NSManagedObjectContext *ctx;
    NSManagedObjectModel *model;
    NSPersistentStore *store;
}

@end

The environment for the tests is configured thus. I would have all of the error reporting done in tests, rather than that one lone assertion in -tearDown, because the SenTest framework doesn’t report properly on assertion failures in that method or in -setUp. So the -testThatEnvironmentWorks test method is a bellwether for the test environment being properly set up, but obviously can’t test the results of tear-down because the environment hasn’t been torn down when it runs.


#import "TuneNeedsHighlightingTests.h"

@implementation TuneNeedsHighlightingTests

- (void)setUp
{
    // Merge every model found in the main (application) bundle.
    model = [[NSManagedObjectModel mergedModelFromBundles: nil] retain];
    NSLog(@"model: %@", model);
    coord = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel: model];
    // The in-memory store starts empty each time it is created.
    store = [coord addPersistentStoreWithType: NSInMemoryStoreType
                                configuration: nil
                                          URL: nil
                                      options: nil
                                        error: NULL];
    ctx = [[NSManagedObjectContext alloc] init];
    [ctx setPersistentStoreCoordinator: coord];
}

- (void)tearDown
{
    [ctx release];
    ctx = nil;
    NSError *error = nil;
    STAssertTrue([coord removePersistentStore: store error: &error],
                 @"couldn't remove persistent store: %@", error);
    store = nil;
    [coord release];
    coord = nil;
    [model release];
    model = nil;
}

- (void)testThatEnvironmentWorks
{
    // If -setUp failed to add the store, this is where it gets reported.
    STAssertNotNil(store, @"no persistent store");
}
@end

The important part is in setting up the managed object model. In using [NSManagedObjectModel mergedModelFromBundles: nil], we get the managed object model derived from loading all MOMs in the main bundle—remembering that this is an injected test framework, that’s the application bundle. In other words the MOM is the same as that created by the app delegate. We get to use the in-memory store as a clean slate every time through, but otherwise the entity definitions and behaviours ought to be identical to those provided by the real app.
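To see the “clean slate” behaviour outside a test bundle, here is a rough, self-contained sketch. It is not the setup used above: instead of merging the app’s models it builds a one-entity model in code, and the Employee entity, its salary attribute and both GL-prefixed functions are hypothetical names chosen for illustration. It assumes a build without ARC, with an autorelease pool in place around the calls.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical one-entity model, built in code so the sketch needs no .mom file.
static NSManagedObjectModel *GLMakeExampleModel(void)
{
    NSAttributeDescription *salary =
        [[[NSAttributeDescription alloc] init] autorelease];
    [salary setName: @"salary"];
    [salary setAttributeType: NSInteger32AttributeType];
    [salary setOptional: NO];

    NSEntityDescription *employee =
        [[[NSEntityDescription alloc] init] autorelease];
    [employee setName: @"Employee"];
    [employee setProperties: [NSArray arrayWithObject: salary]];

    NSManagedObjectModel *model =
        [[[NSManagedObjectModel alloc] init] autorelease];
    [model setEntities: [NSArray arrayWithObject: employee]];
    return model;
}

// Builds a fresh in-memory stack, inserts one Employee, saves, and returns
// how many Employee objects a fetch finds. Returns -1 on any Core Data error.
static NSInteger GLCountAfterInsertingOneEmployee(void)
{
    NSPersistentStoreCoordinator *coord =
        [[[NSPersistentStoreCoordinator alloc]
            initWithManagedObjectModel: GLMakeExampleModel()] autorelease];
    NSError *error = nil;
    if (![coord addPersistentStoreWithType: NSInMemoryStoreType
                             configuration: nil
                                       URL: nil
                                   options: nil
                                     error: &error]) {
        return -1;
    }

    NSManagedObjectContext *ctx =
        [[[NSManagedObjectContext alloc] init] autorelease];
    [ctx setPersistentStoreCoordinator: coord];

    NSManagedObject *employee =
        [NSEntityDescription insertNewObjectForEntityForName: @"Employee"
                                      inManagedObjectContext: ctx];
    [employee setValue: [NSNumber numberWithInt: 30000] forKey: @"salary"];
    if (![ctx save: &error]) {
        return -1;
    }

    NSFetchRequest *request = [[[NSFetchRequest alloc] init] autorelease];
    [request setEntity: [NSEntityDescription entityForName: @"Employee"
                                    inManagedObjectContext: ctx]];
    NSArray *results = [ctx executeFetchRequest: request error: &error];
    return results ? (NSInteger)[results count] : -1;
}
```

Calling GLCountAfterInsertingOneEmployee() repeatedly should give 1 every time, because each call gets its own in-memory store with no trace of the previous one. A file-backed store would keep accumulating objects unless the file were removed between runs.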

posted by Graham Lee at 18:12  

Sunday, September 6, 2009

Unit testing Core Data-driven apps

Needless to say, I’m standing on the shoulders of giants here. Chris Hanson has written a great post on setting up the Core Data “stack” inside unit tests, Bill Bumgarner has written about their experiences unit-testing Core Data itself and PlayTank have an article about introspecting the object tree in a managed object model. I’m not going to rehash any of that, though I will touch on bits and pieces.

In this post, I’m going to look at one of the patterns I’ve employed to create testable code in a Core Data application. I’m pretty sure that none of the patterns I’ll be discussing is novel; however, this series has the usual dual-purpose intention of maybe helping out other developers hoping to improve the coverage of the unit tests in their Core Data apps, and certainly helping me out later when I’ve forgotten what I did and why ;-).

Pattern 1: remove the Core Data dependence. Taking the usual example of a Human Resources application, the code which determines the highest salary in any department cares about employees and their salaries. It does not care about NSManagedObject instances and their values for keys. So stop referring to them! Assuming the following initial, hypothetical code:

- (NSInteger)highestSalaryOfEmployees: (NSSet *)employees {
    NSInteger highestSalary = -1;
    for (NSManagedObject *employee in employees) {
        NSInteger thisSalary = [[employee valueForKey: @"salary"] integerValue];
        if (thisSalary > highestSalary) highestSalary = thisSalary;
    }
    //note that if the set's empty, I'll return -1
    return highestSalary;
}

This is how this pattern works:

  1. Create NSManagedObject subclasses for the entities.
    @interface GLEmployee : NSManagedObject
    {}
    @property (nonatomic, retain) NSString *name;
    @property (nonatomic, retain) NSNumber *salary;
    @property (nonatomic, retain) GLDepartment *department;
    @end

    This step allows us to see that employees are objects (well, they are in many companies anyway) with a set of attributes. Additionally it allows us to use the compile-time checking for properties with the dot syntax, which isn’t available in KVC where we can use any old nonsense as the key name. So go ahead and do that!

    - (NSInteger)highestSalaryOfEmployees: (NSSet *)employees {
        NSInteger highestSalary = -1;
        for (GLEmployee *employee in employees) {
            NSInteger thisSalary = [employee.salary integerValue];
            if (thisSalary > highestSalary) highestSalary = thisSalary;
        }
        //note that if the set's empty, I'll return -1
        return highestSalary;
    }

  2. Abstract out the interface to a protocol.
    @protocol GLEmployeeInterface <NSObject>
    @property (nonatomic, retain) NSNumber *salary;
    @end

    Note that I’ve only added the salary to the protocol definition, as that’s the only property used by the code under test and the principle of YAGNI tells us not to add the other properties (yet). The protocol extends the NSObject protocol as a safety measure; lots of code expects objects which are subclasses of NSObject or adopt the protocol. And the corresponding change to the class definition:

    @interface GLEmployee : NSManagedObject <GLEmployeeInterface>
    {}
    ...
    @end

    Now our code can depend on that interface instead of a particular class:

    - (NSInteger)highestSalaryOfEmployees: (NSSet *)employees {
        NSInteger highestSalary = -1;
        for (id <GLEmployeeInterface> employee in employees) {
            NSInteger thisSalary = [employee.salary integerValue];
            if (thisSalary > highestSalary) highestSalary = thisSalary;
        }
        //note that if the set's empty, I'll return -1
        return highestSalary;
    }

  3. Create a non-Core Data “mock” employee
    Again, YAGNI tells us not to add anything which isn’t going to be used.
    @interface GLMockEmployee : NSObject <GLEmployeeInterface>
    {
    NSNumber *salary;
    }
    @property (nonatomic, retain) NSNumber *salary;
    @end

    @implementation GLMockEmployee
    @synthesize salary;
    @end

    Note that because I refactored the code under test to handle classes which conform to the GLEmployeeInterface protocol rather than any particular class, this mock employee object is just as good as the Core Data entity as far as that method is concerned, so you can write tests using that mock class without needing to rely on a Core Data stack in the test driver. You’ve also separated the logic (“I want to know what the highest salary is”) from the implementation of the model (Core Data).

OK, so now that you’ve written a bunch of tests to exercise that logic, it’s time to safely refactor that for(in) loop to an exciting block implementation :-).
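As a rough illustration of where this ends up, the sketch below puts the protocol and mock from the steps above next to two versions of the salary logic: the for(in) loop and a block-based rewrite. The free functions GLHighestSalaryLoop and GLHighestSalaryBlock stand in for the -highestSalaryOfEmployees: method, and GLMockEmployeesWithSalaries is a hypothetical helper; none of these names come from a real project. It assumes a build without ARC and Mac OS X 10.6 or later for -enumerateObjectsUsingBlock:.

```objc
#import <Foundation/Foundation.h>

@protocol GLEmployeeInterface <NSObject>
@property (nonatomic, retain) NSNumber *salary;
@end

@interface GLMockEmployee : NSObject <GLEmployeeInterface>
{
    NSNumber *salary;
}
@property (nonatomic, retain) NSNumber *salary;
@end

@implementation GLMockEmployee
@synthesize salary;
- (void)dealloc
{
    [salary release];
    [super dealloc];
}
@end

// Hypothetical helper: one mock employee per NSNumber in the array.
static NSSet *GLMockEmployeesWithSalaries(NSArray *salaries)
{
    NSMutableSet *employees = [NSMutableSet set];
    for (NSNumber *salary in salaries) {
        GLMockEmployee *employee = [[[GLMockEmployee alloc] init] autorelease];
        employee.salary = salary;
        [employees addObject: employee];
    }
    return employees;
}

// The loop from step 2, as a free function. Returns -1 for an empty set.
static NSInteger GLHighestSalaryLoop(NSSet *employees)
{
    NSInteger highestSalary = -1;
    for (id <GLEmployeeInterface> employee in employees) {
        NSInteger thisSalary = [employee.salary integerValue];
        if (thisSalary > highestSalary) highestSalary = thisSalary;
    }
    return highestSalary;
}

// The same logic using a block; __block lets the block update the local.
static NSInteger GLHighestSalaryBlock(NSSet *employees)
{
    __block NSInteger highestSalary = -1;
    [employees enumerateObjectsUsingBlock: ^(id obj, BOOL *stop) {
        id <GLEmployeeInterface> employee = obj;
        NSInteger thisSalary = [employee.salary integerValue];
        if (thisSalary > highestSalary) highestSalary = thisSalary;
    }];
    return highestSalary;
}
```

Neither function touches a managed object context, so a test can feed both the same set of mocks and check that they agree, which is the safety net you want before swapping one implementation for the other.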

posted by Graham Lee at 15:36  

Wednesday, July 8, 2009

Refactor your code from the command-line

While the refactoring support in Xcode 3 has been something of a headline feature for the development environment, in fact there’s been a tool for doing Objective-C code refactoring in Mac OS X for a long time. Longer than it’s been called Mac OS X.

tops of the form

My knowledge of the early days is very sketchy, but I believe that tops was first introduced around the time of OPENSTEP (so 1994). Certainly its first headline use was in converting code which used the old NextStep APIs into the new, shiny OpenStep APIs. Not that this was as straightforward as replacing NX with NS in the class names. The original APIs hadn’t had much in the way of foundation classes (the Foundation Kit was part of OpenStep, but had been available on NeXTSTEP for use with EOF), so took char * strings rather than NSStrings, id[]s rather than NSArrays and so on. Also much rationalisation and learning-from-mistakes was done in the Application Kit, parts of which were also pushed down into the Foundation Kit.

All of this meant that a simple search-and-replace tool was not going to cut the mustard. Instead, tops needed to be syntax aware, so that individual tokens in the source could be replaced without any (well, alright, without too much) worry that any of the surrounding expressions would be broken, without too much inappropriate substitution, and without needing to pre-empt every developer’s layout conventions.

before we continue – a warning

tops performs in-place substitution on your source code. So if you don’t like what it did and want to go back to the original… erm, tough. If you’re using SCM, there’s no problem – you can always revert its changes. If you’re not using SCM, then the first thing you absolutely need to do before attempting to try out tops on your real code is to adopt SCM. Xcode project snapshots also work.

replacing deprecated methods

Let’s imagine that, for some perverted reason, I’ve written the following tool. No, scrub that. Let’s say that I find myself having to maintain the following tool :-).

#import <Foundation/Foundation.h>

int main(int argc, char **argv, char **envp)
{
NSAutoreleasePool *arp = [[NSAutoreleasePool alloc] init];
NSString *firstArg = [NSString stringWithCString: argv[1]];
NSLog(@"Argument was %s", [firstArg cString]);
[arp release];
return 0;
}

Pleasant, non? Actually non. What happens when I compile it?

heimdall:Documents leeg$ cc -o printarg printarg.m -framework Foundation
printarg.m: In function ‘main’:
printarg.m:6: warning: ‘stringWithCString:’ is deprecated (declared at /System/Library/Frameworks/Foundation.framework/Headers/NSString.h:386)
printarg.m:7: warning: ‘cString’ is deprecated (declared at /System/Library/Frameworks/Foundation.framework/Headers/NSString.h:367)

OK so we obviously need to do something about this use of ancient NSString API. For no particular reason, let’s start with -cString:

heimdall:Documents leeg$ tops replacemethod cString with UTF8String printarg.m

So what do we have now?

#import <Foundation/Foundation.h>

int main(int argc, char **argv, char **envp)
{
NSAutoreleasePool *arp = [[NSAutoreleasePool alloc] init];
NSString *firstArg = [NSString stringWithCString: argv[1]];
NSLog(@"Argument was %s", [firstArg UTF8String]);
[arp release];
return 0;
}

Looking good. But we still need to fix the -stringWithCString:. That could be just as easy: replacemethod stringWithCString: with stringWithUTF8String: would do the trick. However, let’s be a little different here. Why don’t we use -stringWithCString:encoding:? If we do that, then we’re going to need to take a guess at the second argument, because we’ve got no idea what the encoding should be (that’s why -stringWithCString: is deprecated, after all). However, if we’re happy to assume UTF8 is fine for the output, let’s do that for the input. We’d better let everyone know that’s what happened, though.

So this rule is starting to look quite complex. It says “replace -stringWithCString: with -stringWithCString:encoding:, keeping the C string argument but adding another argument, which should be NSUTF8StringEncoding. While you’re at it, warn the developer that you’ve had to make that assumption”. We also (presumably) want to combine it with the previous rule, so that if we see the original file we’ll catch both of the problems. Luckily tops lets us write scripts, which comprise one or more rule descriptions. Here’s a script which encapsulates both our cString rules:

replacemethod "cString" with "UTF8String"
replacemethod "stringWithCString:<cString>" with "stringWithCString:<cString>encoding:<encoding>" {
replace "<encoding_arg>" with "NSUTF8StringEncoding"
} warning "Assumed input encoding is UTF8"

So why does the <encoding> token become <encoding_arg> in the sub-rule? Well that means “the thing which is passed as the encoding argument”. This avoids confusion with <encoding_param>, the parameter as declared in the class interface (yes, you can run tops on headers as well as implementations).

Now if we save this script as cStringNoMore.tops, we can run it against our source file:

heimdall:Documents leeg$ tops -scriptfile cStringNoMore.tops printarg.m

Which results in the following source:

#import <Foundation/Foundation.h>

int main(int argc, char **argv, char **envp)
{
NSAutoreleasePool *arp = [[NSAutoreleasePool alloc] init];
#warning Assumed input encoding is UTF8
NSString *firstArg = [NSString stringWithCString:argv[1] encoding:NSUTF8StringEncoding];
NSLog(@"Argument was %s", [firstArg UTF8String]);
[arp release];
return 0;
}

Now, when we compile it, we no longer get told about deprecated API. Cool! But it looks like I need to verify that the use of UTF8 is acceptable:

heimdall:Documents leeg$ cc -o printarg printarg.m -framework Foundation
printarg.m:6:2: warning: #warning Assumed input encoding is UTF8

exercises for the reader, and caveats

There’s plenty more to tops than I’ve managed to cover here. You could (and indeed Apple do) use it to 64-bit-cleanify your sources. Performing security audits is another great use – particularly using constructs such as:

replace strcpy with same error "WTF do you think you're doing?!?"

However, notice that tops is a blunter instrument than the Xcode refactoring capability. Its smallest unit of operation is the source file; refactoring only within particular methods is not easily achieved. Also, as I said before, remember to check your source into SCM before running a script! There is a -dont option to make tops output its proposed changes without applying them, too.

Finally, tops shouldn’t be run fully automated. Always assume that you need to inspect the output carefully; don’t just Build and Go.

posted by Graham Lee at 04:33  

Wednesday, July 1, 2009

KVO and +initialize

Got caught by a really hard-to-diagnose issue today, so I decided to write it down in part so that you don’t get bitten by it, and partly so that next time I come across the issue, I’ll remember what it was.

I had a nasty bug in trying to add support for the AppleScript duplicate command to one of my objects. Now duplicate should, in principle, be simple: just conform to NSCopying and implement -copyWithZone:. The default implementation of NSCloneCommand should deal with everything else. But what I found was that there’s a class variable (OK, there isn’t, there’s a static in the class implementation file with accessors) with which the instances must compare some properties. And this was empty by the time the AppleScript ran. Well, that’s odd, thought I, it’s only being emptied once, and that’s when it’s created in +[MyClass initialize]. So what’s going on?

Having set a watchpoint on the static, I now know the answer: the +initialize method was being called twice. Erm, OK…why? It’s only called whenever a class is first used. It turns out that there were two classes with the same IMP for that method. The first was MyClass, and the second? NSKVONotifying_MyClass. Ah, great, Apple are adding a subclass of one of my classes for me!

It turns out that TFM has a solution:


+ (void)initialize
{
    if (self == [MyClass class])
    {
        //real code
    }
}

and I could use that solution here to fix my problem. But finding out that this was the problem was a complete pig.
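The double invocation is easy to reproduce without KVO. In the sketch below an ordinary subclass, GLDerived, stands in for NSKVONotifying_MyClass; it doesn’t override +initialize, so the runtime runs the superclass implementation a second time with self set to the subclass. The class names and the two counters are hypothetical, there purely to make the effect visible.

```objc
#import <Foundation/Foundation.h>

static int GLUnguardedRuns = 0; // bumped every time +initialize is entered
static int GLGuardedRuns = 0;   // bumped only when self is exactly GLBase

@interface GLBase : NSObject
@end

@implementation GLBase
+ (void)initialize
{
    // Entered once for GLBase, and again for each subclass that inherits it.
    GLUnguardedRuns++;
    if (self == [GLBase class])
    {
        // The guard lets this part run once only.
        GLGuardedRuns++;
    }
}
@end

// Stand-in for the KVO-generated subclass: it inherits +initialize.
@interface GLDerived : GLBase
@end

@implementation GLDerived
@end
```

After the first message to GLDerived (for example [GLDerived class]), GLUnguardedRuns is 2 and GLGuardedRuns is 1: anything outside the guard, such as emptying a static, happens once per class rather than once overall.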

posted by Graham Lee at 15:35  

Wednesday, June 10, 2009

Unit testing Cocoa projects in Xcode

Unlike Bill, whose reference to unit testing in Xcode 3.0 is linked at the title, when I started writing unit tests for my Cocoa projects I had no experience of testing in any other environment (well, OK, I’d used OCUnit on GNUstep, but I decline to consider that as a separate environment). However, what I’ve seen of unit testing in Cocoa still makes me think I must be missing something.

The first thing is that when people such as Kent Beck talk about test-driven development, they mention “red-green-refactor”. Well, where’s my huge red bar? Actually, I sometimes write good code so I’d like to see my huge green bar too, but Xcode 3.1 doesn’t have one of those either. You have to grub through the build results window to see what happened.

Sometimes, a test is just so badly broken that rather than just failing, it crashes the test runner. This is a bit unfortunate, because it can be very hard to work out what test broke the harness. That’s especially true if the issue is some surprising concurrency bug and one test breaks a different test, or if the test manages to destroy the assumptions made in -tearDown and crashes the harness after it’s run. Now Chris Hanson has posted a workaround to get the debugger working with a unit test bundle target, but wouldn’t it be nice if that “just worked”, in the same way that breaking into the debugger from Build and Run “just works” in an app target?

posted by Graham Lee at 16:52  

Saturday, April 25, 2009

On dynamic vs. static polymorphism

An interesting juxtaposition in the ACCU 2009 schedule put my talk on “adopting MVC in Objective-C and Cocoa” next to Peter Sommerlad’s talk on “Design patterns with modern C++”. So the subject matter in each case was fairly similar, but then the solutions we came up with were entirely different.

One key factor was that Peter’s solutions try to push all of the “smarts” of a design pattern into the compiler, using templates and metaprogramming to separate implementations from interfaces. On the other hand, my solutions use duck typing and dynamic method resolution to push all of the complexity into the runtime. Both solutions work, of course. It’s also fairly obvious that they’re both chosen based on the limitations and capabilities of the language we were each using. Nonetheless, it was interesting that we both had justifications for our chosen (and thus One True) approach.

In the Stroustrup corner, the justification is this: by making the compiler resolve all of the decisions, any problems in the code are resolved before it ever gets run, let alone before it gets into the hands of a user. Whereas the Cox defence argues that my time as a programmer is too expensive to spend sitting around waiting for g++ to generate metaprogramming code, so replace the compilation with comparatively cheap lookups at runtime – which also allows for classes that couldn’t have possibly existed at compile time, such as those added by the Python or Perl bridge.

This provided concrete evidence of a position that I’ve argued before – namely that Design Patterns are language-dependent. We both implemented Template Method. Peter’s implementation involved a templatized abstract class which took a concrete subclass in the realisation (i.e. as the parameter in the <T>). My implementation is the usual Cocoa delegate pattern – the “abstract” (or more correctly undecorated) class takes any old id as the delegate, then tests whether it implements the delegation sequence points at runtime. Both implement the pattern, and that’s about where the similarities end.
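For readers who haven’t met the Cocoa side of that comparison, here is a minimal sketch of the delegate flavour of Template Method. It is not the code from the talk: GLReportPrinter, GLShoutingDelegate and the -headerForPrinter: hook are all hypothetical names. The delegate is a plain id, and the one customisation point is looked up with -respondsToSelector: at runtime.

```objc
#import <Foundation/Foundation.h>

@interface GLReportPrinter : NSObject
{
    id delegate; // any object at all; not retained, as is usual for delegates
}
- (void)setDelegate: (id)aDelegate;
- (NSString *)report;
@end

// Informal protocol: declares the optional hook so the compiler knows its
// signature, without obliging any class to implement it.
@interface NSObject (GLReportPrinterDelegate)
- (NSString *)headerForPrinter: (GLReportPrinter *)printer;
@end

@implementation GLReportPrinter
- (void)setDelegate: (id)aDelegate
{
    delegate = aDelegate;
}

- (NSString *)report
{
    // The fixed skeleton of the algorithm lives here...
    NSString *header = @"Report";
    // ...and the variable step is resolved at runtime, if the delegate has one.
    if ([delegate respondsToSelector: @selector(headerForPrinter:)]) {
        header = [delegate headerForPrinter: self];
    }
    return [NSString stringWithFormat: @"%@: body", header];
}
@end

// One possible delegate; it shares no superclass or formal protocol
// with the printer.
@interface GLShoutingDelegate : NSObject
@end

@implementation GLShoutingDelegate
- (NSString *)headerForPrinter: (GLReportPrinter *)printer
{
    return @"REPORT";
}
@end
```

With no delegate, or with one that doesn’t implement the hook, -report falls back to the default header; with a GLShoutingDelegate set it returns “REPORT: body”. Nothing about the delegate’s class needs to be known when GLReportPrinter is compiled, which is the trade-off described above.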

posted by Graham Lee at 19:44  

Friday, April 17, 2009

NSConference: the aftermath

So, that’s that then, the first ever NSConference is over. But what a conference! Every session was informative, edumacational and above all enjoyable, including the final session where (and I hate to crow about this) the “American” team, who had a working and well-constructed Core Data based app, were soundly thrashed by the “European” team who had a nob joke and a flashlight app. Seriously, we finally found a reason for doing an iPhone flashlight! Top banana. I met loads of cool people, got to present with some top Cocoa developers (why Scotty got me in from the second division I’ll never know, but I’m very grateful) and really did have a good time talking with everyone and learning new Cocoa skills.

It seems that my presentation and my Xcode top tip[] went down really well, so thanks to all the attendees for being a great audience, asking thoughtful and challenging questions and being really supportive. It’s been a couple of years since I’ve spoken to a sizable conference crowd, and I felt like everyone was on my side and wanted the talk – and indeed the whole conference – to be a success.

So yes, thanks to Scotty and Tim, Dave and Ben, and to all the speakers and attendees for such a fantastic conference. I’m already looking forward to next year’s conference, and slightly saddened by having to come back to the real world over the weekend. I’ll annotate my Keynote presentation and upload it when I can.

[] Xcode “Run Shell Script” build phases get stored on one line in the project.pbxproj file, with all the line breaks replaced by \n. That sucks for version control because any changes by two devs result in a conflict over the whole script. So, have your build phase call an external .sh file where you really keep the shell script. Environment variables will still be available, and now you can work with SCM too :-).

posted by Graham Lee at 18:16  