Working Effectively with Legacy Code

I gave a talk to my team at ARM today on Working Effectively with Legacy Code by Michael Feathers. Here are some notes I made in preparation, which are somewhat related to the talk I gave.

This may be the most important book a software developer can
read. Why? Because if you don’t, then you’re part of the problem.

It’s obviously a lot easier and a lot more enjoyable to work on
greenfield projects all the time. You get to choose this week’s
favourite technologies and tools, put things together in the ways that
suit you now, and make progress because, well anything is progress
when there’s nothing there already. But throwing away an existing
system and starting from scratch makes it easy to throw away the
lessons learned in developing that system. It may be ugly, and patched
up all over the place, but that’s because each of those patches was
needed. They each represent something we learned about the product
after we thought we were done.

The new system is much more likely to look good from the developer’s
perspective
, but what about the users’? Do they want to pay again
for development of a new system when they already have one that mostly
works? Do they want to learn again how to use the software? We have
this strange introspective notion that professionalism in software
development means things that make code look good to other coders:
Clean Code, “well-crafted” code. But we should also have some
responsibility to those people who depend on us and who pay our way,
and that might mean taking the decision to fix the mostly-working
thing.

A digression: Lehman’s Laws

Manny Lehman identified three different categories of software system:
those that are exactly specified, those that implement
well-understood procedures, and those that are influenced by the
environment in which they run. Most software (including ours) comes
into that last category, and as the environment changes so must the
software, even if there were no (known) problems with it at an earlier
point in its evolution.

He expressed
Laws governing the evolution of software systems,
which govern how the requirements for new development are in conflict
with the forces that slow down maintenance of existing systems. I’ll
not reproduce the full list here, but for example on the one hand the
functionality of the system must grow over time to provide user
satisfaction, while at the same time the complexity will increase and
perceived quality will decline unless it is actively maintained.

Legacy Code

Michael Feather’s definition of legacy code is code without tests. I’m
going to be a bit picky here: rather than saying that legacy code is
code with no tests, I’m going to say that it’s code with
insufficient tests
. If I make a change, can I be confident that I’ll
discover the ramifications of that change?

If not, then it’ll slow me down. I even sometimes discard changes
entirely, because I decide the cost of working out whether my change
has broken anything outweighs the interest I have in seeing the change
make it into the codebase.

Feathers refers to the tests as a “software vice”. They clamp the
software into place, so that you can have more control when you’re
working on it. Tests aren’t the only tools that do this: assertions
(and particularly Design by Contract) also help pin down the software.

How do I test untested code?

The apparent way forward then when dealing with legacy code is to
understand its behaviour and encapsulate that in a collection of unit
tests. Unfortunately, it’s likely to be difficult to write unit tests
for legacy code, because it’s all tightly coupled, has weird and
unexpected dependencies, and is hard to understand. So there’s a
catch-22: I need to make tests before I make changes, but I need to
make changes before I can make tests.

Seams

Almost the entire book is about resolving that dilemma, and contains a
collection of patterns and techniques to help you make low-risk
changes to make the code more testable, so you can introduce the tests
that will help you make the high-risk changes. His algorithm is:

  1. identify the “change points”, the things that need modifying to
    make the change you have to make.
  2. find the “test points”, the places around the change points where
    you need to add tests.
  3. break dependencies.
  4. write the tests.
  5. make the changes.

The overarching model for breaking dependencies is the “seam”. It’s a
place where you can change the behaviour of some code you want to
test, without having to change the code under test itself. Some examples:

  • you could introduce a constructor argument to inject an object
    rather than using a global variable
  • you could add a layer of indirection between a method and a
    framework class it uses, to replace that framework class with a
    test double
  • you could use the C preprocessor to redefine a function call to use
    a different function
  • you can break an uncohesive class into two classes that collaborate
    over an interface, to replace one of the classes in your tests

Understanding the code

The important point is that whatever you, or someone else, thinks
the behaviour of the code should be, actually your customers have paid
for the behaviour that’s actually there and so that (modulo bugs) is
the thing you should preserve.

The book contains techniques to help you understand the existing code
so that you can get those tests written in the first place, and even
find the change points. Scratch refactoring is one technique: look
at the code, change it, move bits out that you think represent
cohesive functions, delete code that’s probably unused, make notes in
comments…then just discard all of those changes. This is like Fred
Brooks’s recommendation to “plan to throw one away”, you can take what
you learned from those notes and refactorings and go in again with a
more structured approach.

Sketching is another technique recommended in the book. You can draw
diagrams of how different modules or objects collaborate, and
particularly draw networks of what parts of the system will be
affected by changes in the part you’re looking at.

Build systems are a huge annoyance

Take Smalltalk. Do I have an object in my image? Yes? Well I can use it. Does it need to do some compilation or something? I have no idea, it just runs my Smalltalk.

Take Python. Do I have the python code? Yes? Well I can use it. Does it need to do some compilation or something? I have no idea, it just runs my Python.

Take C.

Oh my God.

C is portable, and there are portable operating system interface specifications for the system behaviour accessible from C, so you need to have C sources that are specific to the platform you’re building for. So you have a tool like autoconf or cmake that tests how to edit your sources to make them actually work on this platform, and performs those changes. The outputs from them are then fed into a thing that takes C sources and constructs the software.

What you want is the ability to take some C and use it on a computer. What C programmers think you want is a graph of the actions that must be taken to get from something that’s nearly C source to a program you can use on a computer. What they’re likely to give you is a collection of things, each of which encapsulates part of the graph, and not necessarily all that well. Like autoconf and cmake, mentioned above, which do some of the transformations, but not all of them, and leave it to some other tool (in the case of cmake, your choice of some other tool) to do the rest.

Or look at make, which is actually entirely capable of doing the graph thing well, but frequently not used as such, so that make all works but making any particular target depends on whether you’ve already done other things.

Now take every other programming language. Thanks to the ubiquity of the C run time and the command shell, every programming language needs its own build system named [a-z]+ake that is written in that language, and supplies a subset of make’s capabilities but makes it easier to do whatever it is needs to be done by that language’s tools.

When all you want is to use the software.

The package management paradox

There was no need to build a package management system since CPAN, and yet npm is the best.
Wait, what?

Every time a new programming language or framework is released, people seem to decide that:

  1. It needs its own package manager.

  2. Simple algorithms need to be rewritten from scratch in “pure” $language/framework and distributed as packages in this package manager.

This is not actually true. Many programming languages – particularly many of the trendy ones – have a way to call C functions, and a way to expose their own routines as C functions. Even C++ has this feature. This means that you don’t need any new packaging system, if you can deploy packages that expose C functions (whatever the implementation language) then you can use existing code, and you don’t need to rewrite everything.

So there hasn’t been a need for a packaging system since at least CPAN, maybe earlier.

On the other hand, npm is the best packaging system ever because people actually consume existing code with it. It’s huge, there are tons of libraries, and so people actually think about whether this thing they’re doing needs new code or the adoption of existing code. It’s the realisation of the OO dream, in which folks like Brad Cox said we’d have data sheets of available components and we’d pull the components we need and bind them together in our applications.

Developers who use npm are just gluing components together into applications, and that’s great for software.

Apple’s Best Programming Language

My talk at App Builders 2016 was on Apple’s best programming language. Spoiler alert: it’s Dylan. Or is it?

I chose a few properties one might wish to find in programming languages, then demonstrated how these were all present in the Dylan language. I also took a dig at certain other languages, which do things in ways that could be seen as less good than Dylan’s. For example, did you know that there’s a programming language out there which distinguishes constant from variable values with the words let and var, rather than using the arguably more readable constant keyword?

Now here’s a thing: of course there are many things iOS app programmers could have been using if we didn’t want to use Objective-C, without the addition of Swift. Indeed, a few months before Swift was introduced I enumerated some of these alternatives on this very blog. However, many of us chose to use Objective-C rather than any of the alternatives, and then chose Swift when that alternative was presented.

Similarly, Apple could have pursued any of those alternatives, and indeed did pursue quite a few of them. What would the world look like if Apple had invested in MacRuby? It would have Swift in it, we know that because they did and it does.

At the end of my talk, I invited the conference to discuss what it was particularly about Swift that led to its brisk success, when it can be considered equivalent to many existing alternatives in numerous ways. Here are some of the suggestions (none of them from me, all from the audience):

  • marketing
  • evangelism
  • LLVM
  • Apple now isn’t the same as Apple in Dylan’s time
  • “Halo effect” from iOS
  • people only want to use first-party tools
  • Objective-C pain provided the opportunity
  • big enough community to reach critical mass

That leads me to wonder how closely related the conditions for “better” and the conditions for “accepted” are, whether there are “better” things out there for programmers that haven’t been adopted, whether those things truly are better, and how aware we all are of the distinction between being better and being popular when we make engineering choices.

Did that work? Maybe.

A limitation with yesterday’s error-preserving approach is that it leaves you on your own to recover from problems. Assuming your error definitions are sufficiently granular, this should be straightforward but tedious. Find out what went wrong, recover from it, then replay everything that happened afterwards.

Recovering from failures automatically is difficult in general, after all, if we could do that there wouldn’t have been a failure in the first place. But there’s no reason to then have a bunch of code that replays the rest of the workflow from the point of recovery, you’ve already written that code on the happy path.

Why can’t we just do that? On an error, why don’t we recover from the immediate error then go back in time to the point where it occurred and start again with our recovered value? Seems straightforward enough, let’s do it. In doing so, let’s borrow (at least in a superficial fashion) the idea of the optional object.

A Maybe object represents the possibility that a value exists, and the ability to find out whether it does. Should it not contain a value, you should be able to find out why not.

@interface Maybe : NSObject

@property (nonatomic, strong, readonly) NSError *error;

- (BOOL)hasValue;
- recoverWithStartingValue:value;

+ just:value;
+ none:aClass error:(NSError *)anError;

@end

I have a value

OK, let’s say that what you wanted to do succeeded, and you want to use the result object. It’d be kindof sucky if you had to test whether the Maybe contained a value at every step of a long operation, and unwrap it to get the object out that you care about. So let’s not make you do that.

@interface Just : Maybe
-initWithValue:value;
@end

@implementation Just
{
    id _value;
}

-initWithValue:value {
    self = [super init];
    if (self) {
        _value = value;
    }
    return self;
}

-(NSError *)error {
    NSAssert(NO, @"No error in success case");
    return nil;
}
-(BOOL)hasValue { return YES; }
-recoverWithStartingValue:value {
    NSAssert(NO, @"Cannot recover from success");
    return nil;
}
-forwardingTargetForSelector:(SEL)aSelector { return _value; }

@end

OK, if everything succeeds you can use the Maybe result (which will be Just the value) as if it is the value itself.

I don’t have a value

The other case is that your operation failed, so we need to represent that. We need to know what type of object you don’t have(!), which will be useful because we can then treat the lack of value as if it is an instance of the value. How will this work? In None, the no-value version of a Maybe, we’ll just absorb every message you send.

@interface None : Maybe
-initWithClass:aClass error:(NSError *)anError;
@end

@implementation None
{
    Class _class;
    NSError *_error;
    NSMutableArray *_invocations;
}

-initWithClass:aClass error:(NSError *)anError {
    self = [super init];
    if (self) {
        _class = aClass;
        _error = anError;
        _invocations = [NSMutableArray array];
    }
    return self;
}

-(NSError *)error { return _error; }
-(BOOL)hasValue { return NO; }

-methodSignatureForSelector:(SEL)aSelector {
    return [_class instanceMethodSignatureForSelector:aSelector];
}

-(void)forwardInvocation:(NSInvocation *)anInvocation {
    id returnValue = self;
    [_invocations addObject:anInvocation];
    [anInvocation setReturnValue:&returnValue];
}

-recoverWithStartingValue:value {
    id nextObject = value;
    while([_invocations count]) {
        id invocation = [_invocations firstObject];
        [_invocations removeObjectAtIndex:0];
        [invocation invokeWithTarget:nextObject];
        [invocation getReturnValue:&nextObject];
    }
    return nextObject;
}

@end

Again, there’s no need to unwrap a None (and that makes no sense, because it doesn’t contain anything). You just use it as if it is the thing that it represents (a lack of).

Recovering

You go through your complex process, and somewhere along the way it failed. At this point your Maybe answer turned into None, because you didn’t get a value at some step. But it carried on paying attention to what you wanted to do.

Now it’s time to turn back the clock. Looking at the error I get from the None, I can see what step failed and what I need to do to be in a position to try again. When I make that happen, I’ll get some object which would’ve been valid at that intermediate point in my operation.

Because the None was paying attention to what I tried to do to it, I can replay the whole process from the point where it failed, using the result from my recovery path.

Worked Example

Here’s an object that requires two steps to use. Either step could fail, and one of them does unless you take action to fix it up.

@interface AnObject : NSObject

@property (nonatomic, assign, getter=isRecovered) BOOL recovered;

-this;
-that;

@end

@implementation AnObject

-this
{
    return [self isRecovered] ? [Maybe just:self] :
        [Maybe none:[self class] error:[NSError errorWithDomain:@"Nope"
                                                           code:23
                                                       userInfo:@{}]];
}

-that
{
    return [Maybe just:@"Winning"];
}

@end

In using this object, I want to compose the two steps. If it goes wrong, I know what to do to recover, but I don’t want to have to explicitly write out the workflow twice if I got it right the first time.

int main(int argc, char *argv[]) {
    @autoreleasepool {
        id anObject = [AnObject new];
        id result = [[anObject this] that];
        if ([result hasValue]) {
            NSLog(@"success: %@", [result lowercaseString]);
        } else {
            NSLog(@"failed with error %@, recovering...", [result error]);
            [anObject setRecovered:YES];
            result = [result recoverWithStartingValue:anObject];
            NSLog(@"ended up with %@", [result uppercaseString]);
        }
    }
}

Conclusion

The NSError-star-star convention lets you compose possibly-failing messages and find out either that it succeeded, or where it went wrong. But it doesn’t encapsulate what would have happened had it gone right, so you can’t just rewind time to where things failed and try again. It is possible to do so, simply by encapsulating the idea that something might work…maybe.

Getting better at doing it wrong

For around a month at the end of last year, I kept a long text note called “doing doing it wrong right”. I was trying to understand error handling in programming, look at some common designs and work out a plan for cleaning up some error-handling code I was working with myself (mercifully someone else, with less analysis paralysis, has taken on that task now).

Deliciously the canonical writing in this field is by an author with the completely apt name Goodenough. His Structured Exception Handling and Exception Handling: Issues and a Proposed Notation describe the problem pretty completely and introduce the idea of exceptions that can be thrown on an interesting condition and caught at the appropriate level of abstraction in the caller.

As an aside, his articles show that exception handling can be used for general control flow. Your long-running download task can throw the “I’m 5% complete now” exception, which is caught to update the UI before asking the download to continue. Programming taste moved away from doing that.

In the Cocoa world, exceptions have never been in favour, probably because they’re too succinct. In their place, multi-if statement complex handling code is introduced:

NSError *error = nil;
id thing = [anObject giveMeAThing:&error];
if (!thing) {
  [self handleError:error];
  return;
}
id otherThing = [thing doYourThing:&error];
if (!otherThing) {
  [self handleError:error];
  return;
}
id anotherThing = [otherThing someSortOfThing:&error];

…and so it goes.

Yesterday in his NSMeetup talk on Swift, Saul Mora reminded me of the nil sink pattern in Objective-C. Removing all the error-handling from the above, a fluent (give or take the names) interface would look like this:

id anotherThing = [[[anObject giveMeAThing] doYourThing] someSortOfThing];

The first method in that chain to fail would return nil, which due to the message-sink behaviour means that everything subsequent to it preserves the nil and that’s what you get out. Saul had built an equivalent thing with option types, and a function Maybe a -> (a -> Maybe b) -> Maybe b to hide all of the option-unwrapping conditionals.

Remembering this pattern, I think it’s possible to go back and tidy up my error cases:

NSError *error = nil;
id anotherThing = [[[anObject giveMeAThing:&error]
                               doYourThing:&error]
                           someSortOfThing:&error];
if (!anotherThing) {
  [self handleError:error];
}

Done. Whichever method goes wrong sets the error and returns nil. Everything after that is sunk, which crucially means that it can’t affect the error. As long as the errors generated are specific enough to indicate what went wrong, whether it’s possible to recover (and if so, how) and whether anything needs cleaning up (and if so, what) then this approach is…good enough.

Hiding behind messages

A problem I think about every so often is how to combine the software design practice of hiding implementations behind interfaces with the engineering practice of parallel execution. What are the trade-offs between making parallelism explicit and information hiding? Where are some lines that can be drawn?

Why do we need abstractions for this stuff, when the libraries we have already work? Those libraries make atomic features like threads and locks explicit, or they make change of control flow explicit, and those are things we can manage in one place for the benefit of the rest of our application. Nobody likes to look at the dispatch terrace:

    });
  });
});

Previous solutions have involved object confinement, where every object has its own context and runs its code there, and the command bus, where you ask for work to be done but don’t get to choose where.

Today’s solution is a bit more explicit, but not much. For every synchronous method, create an asynchronous version:

@interface MyObject : Awaitable

- (NSString *)expensiveCalculation;
- async_expensiveCalculation;

@end

@implementation MyObject

- (NSString *)expensiveCalculation
{
  sleep(5);
  return @"result!";
}

@end

int main(int argc, const char * argv[]) {
  @autoreleasepool {
    MyObject *o = [MyObject new];
    id asyncResult = [o async_expensiveCalculation];
    NSLog(@"working...");
    // you could explicitly wait for the calculation...
    NSLog(@"Result initial: %@", [[asyncResult await]   substringToIndex:1]);
    // ...but why bother?
    NSLog(@"Shouty result: %@", [asyncResult uppercaseString]);
  }
  return 0;
}

This is more of a fork-and-join approach than fire and forget. The calling thread carries on running until it actually needs the result of the calculation, at which point it waits (if necessary) for the callee to complete before continuing. It’ll be familiar to programmers on other platforms as async/await.

The implementation is – blah blah awesome power of the runtime – a forwardInvocation: method that looks for messages with the async marker, patches their selector and creates a proxy object to invoke them in the background. That proxy is then written into the original invocation as its return value. Not shown: a pretty straightforward category implementing -[NSInvocation copyWithZone:].

@implementation Awaitable

- (SEL)suppliedSelectorForMissingSelector:(SEL)aSelector
{
  NSString *selectorName = NSStringFromSelector(aSelector);
  NSString *realSelectorName = nil;
  if ([selectorName hasPrefix:@"async_"]) {
    realSelectorName = [selectorName substringFromIndex:6];
  }
  return NSSelectorFromString(realSelectorName);
}

- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector
{
  NSMethodSignature *aSignature =   [super methodSignatureForSelector:aSelector];
  if (aSignature == nil) {
    aSignature =    [super methodSignatureForSelector:[self suppliedSelectorForMissingSelector:aSelector]];
  }
  return aSignature;
}

- (void)forwardInvocation:(NSInvocation *)anInvocation
{
  SEL trueSelector =    [self suppliedSelectorForMissingSelector:[anInvocation selector]];
  NSInvocation *cachedInvocation = [anInvocation copy];
  [cachedInvocation setSelector:trueSelector];
  CBox *box = [CBox cBoxWithInvocation:cachedInvocation];
  [anInvocation setReturnValue:&box];
}

@end

Why is the proxy object called CBox? No better reason than that I built this while reading a reflection on Concurrent Smalltalk where that’s the name of this object too.

@interface CBox : NSProxy

+ (instancetype)cBoxWithInvocation:(NSInvocation *)inv;
- await;

@end

@implementation CBox
{
  NSInvocation *invocation;
  NSOperationQueue *queue;
}

+ (instancetype)cBoxWithInvocation:(NSInvocation *)inv
{
  CBox *box = [[self alloc] init];
  box->invocation = [inv retain];
  box->queue = [NSOperationQueue new];
  NSInvocationOperation *op = [[NSInvocationOperation alloc]    initWithInvocation:inv];
  [box->queue addOperation:op];
  return [box autorelease];
}

- init
{
  return self;
}

- await
{
  [queue waitUntilAllOperationsAreFinished];
  id returnValue;
  [invocation getReturnValue:&returnValue];
  return [[returnValue retain] autorelease];
}

- (void)dealloc
{
  [queue release];
  [invocation release];
  [super dealloc];
}

- forwardingTargetForSelector:(SEL)aSelector
{
  return [self await];
}

@end

You don’t always need some huge library to clean things up. Here are about 70 lines of Objective-C that abstract an implementation of concurrent programming and stop my application code from having to interweave the distinct responsibilities of what it’s trying to do and how it’s trying to do it.

Fun and games (with rewritten rules) in Objective-C

An object-oriented programming environment is not a set of rules. Programs do not need to be constructed according to the rules supplied by the environment. An object-oriented environment includes tools for constructing new rules, and programs can use these to good effect. Let’s build multiple method inheritance for Objective-C, to see a new set of rules.

The goal

Given three classes, A, B, and C:

@interface A : NSObject

-(NSInteger)a;
-(NSInteger)double:(NSInteger)n;

@end

@implementation A

-(NSInteger)a { return [self double:6]; }

@end

@interface B : NSObject

-(NSInteger)b;

@end

@implementation B

-(NSInteger)b { return 30; }

@end

@interface C : NSObject

-c;

@end

@implementation C

-(NSInteger)double:(NSInteger)a
{
  return a*2;
}

-c { return @([self a] + [self b]); }

@end

We want to find the value of a C instance’s c property. That depends on its values of a and b, but that class doesn’t have those methods. We should add them.

int main(int argc, const char * argv[]) {
  @autoreleasepool {
    C *c = [C new];
    [c addSuperclass:[A class]];
    [c addSuperclass:[B class]];
    NSLog(@"The answer is %@", c.c);
  }
}

We want the following output:

2015-02-17 20:23:36.810 Mixins[59019:418112] The answer is 42

Find a method from any superclass

Clearly there’s some chicanery going on here. I’ve changed the rules: methods are no longer simply being looked up in a single class. My instance of C has three superclasses: A, B and NSObject.

@interface NSObject (Mixable)

- (void)addSuperclass:(Class)aSuperclass;

@end

@implementation NSObject (Mixable)

-superclasses
{
  return objc_getAssociatedObject(self, "superclasses");
}

- (void)addSuperclass:(Class)aSuperclass
{
  id superclasses = [self superclasses]?:@[];
  id newSupers = [superclasses arrayByAddingObject:aSuperclass];
  objc_setAssociatedObject(self, "superclasses", newSupers, OBJC_ASSOCIATION_RETAIN_NONATOMIC);
}

- (Class)superclassForSelector:(SEL)aSelector
{
  __block Class potentialSuperclass = Nil;
  [[self superclasses] enumerateObjectsUsingBlock:^(Class aClass, NSUInteger idx, BOOL *stop) {
    if ([aClass instancesRespondToSelector:aSelector])
    {
      potentialSuperclass = aClass;
      *stop = YES;
    }
  }];
  return potentialSuperclass;
}

- (NSMethodSignature *)original_methodSignatureForSelector:(SEL)aSelector
{
  NSMethodSignature *signature = [self original_methodSignatureForSelector:aSelector];
  if (signature)
  {
    return signature;
  }
  Class potentialSuperclass = [self superclassForSelector:aSelector];
  return [potentialSuperclass instanceMethodSignatureForSelector:aSelector];
}

- (void)forwardInvocation:(NSInvocation *)anInvocation
{
  SEL aSelector = [anInvocation selector];
  Class potentialSuperclass = [self superclassForSelector:aSelector];
  [anInvocation invokeSuperImplementation:potentialSuperclass];
}

+ (void)load
{
  if (self == [NSObject class])
  {
    method_exchangeImplementations(class_getInstanceMethod(self, @selector(original_methodSignatureForSelector:)),
                                   class_getInstanceMethod(self, @selector(methodSignatureForSelector:)));
  }
}

@end

Now invoke that method.

When you write [super foo], the Objective-C runtime needs to send a message to your object but tell the resolution machinery to look at the superclass for the method implementation, not at the current class. It uses a function objc_msgSendSuper to do this. In this case, I don’t have the superclass: I have a superclass, one of potentially many. So what I need to do is more general than what messaging super does.

Luckily for me, objc_msgSendSuper is already sufficiently general. It receives a pointer to self, just like the usual objc_msgSend, but in addition it receives a pointer to the class to be used as the superclass. By controlling that class pointer, I can tell the system which superclass to use.

A category on NSInvocation calls objc_msgSendSuper with the appropriate arguments to get the correct method from the correct class. But how can it call the function correctly? Objective-C messages could receive any number of arguments of any type, and return a value of any type. Constructing a function call when the parameters are discovered at runtime is the job of libffi, which is used here (not shown: a simple, if boring, map of Objective-C value encodings to libffi type descriptions).

@interface NSInvocation (SuperInvoke)

-(void)invokeSuperImplementation:(Class)superclass;

@end

@implementation NSInvocation (SuperInvoke)

- (BOOL)isVoidReturn
{
  return (strcmp([[self methodSignature] methodReturnType], "v") == 0);
}

-(void)invokeSuperImplementation:(Class)superclass
{
  NSMethodSignature *signature = [self methodSignature];
  if (superclass)
  {
    struct objc_super super_class = { .receiver = [self target],
      .super_class = superclass };
    struct objc_super *superPointer = &super_class;
    ffi_cif callInterface;
    NSUInteger argsCount = [signature numberOfArguments];
    ffi_type **args = malloc(sizeof(ffi_type *) * argsCount);
    for (int i = 0; i < argsCount; i++) {
      args[i] = [self ffiTypeForObjCType:[signature getArgumentTypeAtIndex:i]];
    }
    ffi_type *returnType;
    if ([self isVoidReturn]) {
      returnType = &ffi_type_void;
    }
    else {
      returnType = [self ffiTypeForObjCType:[signature methodReturnType]];
    }
    ffi_status status = ffi_prep_cif(&callInterface,
                                     FFI_DEFAULT_ABI,
                                     (unsigned int)[signature numberOfArguments],
                                     returnType,
                                     args);
    if (status != FFI_OK) {
      NSLog(@"I can't make an FFI frame");
      free(args);
      return;
    }

    void *argsBuffer = malloc([signature frameLength]);
    int cursor = 0;
    cursor += args[0]->size;
    void **values = malloc([signature numberOfArguments] * sizeof(void  *));
    values[0] = &superPointer;

    for (int i = 1; i < [signature numberOfArguments]; i++) {
      values[i] = (argsBuffer + cursor);
      [self getArgument:values[i] atIndex:i];
      cursor += args[i]->size;
    }
    if ([self isVoidReturn]) {
      ffi_call(&callInterface, objc_msgSendSuper, NULL, values);
    } else {
      void *result = malloc(returnType->size);
      ffi_call(&callInterface, objc_msgSendSuper, result, values);
      [self setReturnValue:result];
      free(result);
    }
    free(args);
    free(values);
    free(argsBuffer);
  }
}

@end

Conclusion

You’ve seen this conclusion before: blah blah awesome power of the runtime. It doesn’t just let you do expressive things in the Objective-C game, it lets you define a better game.

Objective-Curry

Sadly it’s not called Schoenfinkeling, but that’s the name of the
person who noticed that there’s no reason to ever have a function with
more than one argument. What you think is a function with two
arguments is actually a function with one argument that returns a
function with one argument, this second function closing over the
first argument and acting perhaps on both.

This is, of course, famed for its application (pardon the pun) to Object-Oriented
Programming. We can take a class like this:

@interface NamePrinter : Curryable

- (void)printFirstName:(NSString *)first surname:(NSString *)last;
- (void)printFirstName:(NSString *)first age:(NSInteger)age;
- (void)printFirstName:(NSString *)first middle:(NSString *)middle surname:(NSString *)last;

@end

and use it like this:

int main(int argc, char *argv[]) {
  @autoreleasepool {
    id printer = [[NamePrinter new] autorelease];
    id curried = [printer printFirstName:@"Graham"];
    [curried surname:@"Lee"];
    [curried surname:@"Greene"];
    [curried surname:@"Garden"];
    [curried age:18];

    id alex = [printer printFirstName:@"Alexander"];
    id alexG = [alex middle:@"Graham"];
    [alexG surname:@"Bell"];
  }
}

(your compiler probably complains at this point that it doesn’t know
what you’re up to. Of course, you know better.)

We’ll get results like this:

2015-02-13 00:57:57.421 CurrySauce[41877:134228] Graham Lee
2015-02-13 00:57:57.421 CurrySauce[41877:134228] Graham Greene
2015-02-13 00:57:57.421 CurrySauce[41877:134228] Graham Garden
2015-02-13 00:57:57.422 CurrySauce[41877:134228] Graham is 18 years old
2015-02-13 00:57:57.422 CurrySauce[41877:134228] Alexander Graham Bell

OK, we don’t actually get that for free. There’s some secret sauce I
haven’t shown you: secret curry sauce.

Here be dragons

As with
all the best bits of Objective-C,
you need to turn off the automatic reference counting to do what
follows (you’ll have already seen a call to -autorelease above). ARC
works wonders when you try to write Java in Objective-C, give or take
the strong-weak-strong tango. It isn’t so helpful when you try to
write Objective-C in Objective-C.

Partial application

The NamePrinter class knows that you might call it with a partial
application, so it checks whether the selector you sent in your
message matches the prefix of any method it knows. If it does, and the
return type and argument types in this prefix can be uniquely
determined, then it creates a proxy to listen for the rest of the
application.

The restrictions on types are there due to the way that Objective-C
deals with C types in message sending. It’d be a lot easier to do this
in Ruby or Smalltalk, where everything is an object reference: as
Objective-C needs to deal with C types of different sizes, you must be
able to construct a single invocation from the prefix selector.

int argumentsForSelector(SEL aSelector)
{
  const char *name = sel_getName(aSelector);
  int count = 0;
  while(*name != '\0') {
    if (*name++ == ':') {
      count++;
    }
  }
  return count;
}

@implementation Curryable

- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector
{
  NSMethodSignature *superSignature = [super methodSignatureForSelector:aSelector];
  if (superSignature) {
    return superSignature;
  }
  NSString *thisSelectorName = NSStringFromSelector(aSelector);
  NSMutableSet *signatures = [NSMutableSet set];
  int argCount = argumentsForSelector(aSelector);
  Class currentClass = [self class];
  while (currentClass != Nil) {
    unsigned int methodCount = 0;
    Method *methodList = class_copyMethodList(currentClass, &methodCount);
    for (int i = 0; i < methodCount; i++) {
      Method method = methodList[i];
      SEL anotherSelector = method_getName(method);
      NSString *selectorName = NSStringFromSelector(anotherSelector);
      if ([selectorName hasPrefix:thisSelectorName]) {
        NSMethodSignature *fullSignature = [self methodSignatureForSelector:anotherSelector];
        NSMutableString *constructedTypeSignature = [[@"@@:" mutableCopy] autorelease];
        for (int j = 2; j < argCount + 2; j++) {
          [constructedTypeSignature appendString:@([fullSignature getArgumentTypeAtIndex:j])];
        }
        [signatures addObject:[[constructedTypeSignature copy] autorelease]];
      }
    }
    free(methodList);
    currentClass = class_getSuperclass(currentClass);
  }
  if ([signatures count] != 1) {
    NSLog(@"curried selector does not uniquely match the type of any full selector prefix");
    return nil;
  }
  return [NSMethodSignature signatureWithObjCTypes:[[signatures anyObject] UTF8String]];
}

- (void)forwardInvocation:(NSInvocation *)anInvocation
{
  id curry = [CurrySauce curryTarget:self invocation:anInvocation];
  [anInvocation setReturnValue:&curry];
}

@end

All of that, just to return a proxy object.

Finish the application

This proxy also responds to unknown selectors, by trying to tack
them onto the end of the partially-applied selector. If that works,
and it ends up with a selector that the target recognises, then it
constructs the combined invocation, copies the arguments from the two
partial invocations, and invokes it on the target. If the combined
selector is supposed to return something then this proxy unpacks the
return value and puts it into the invocation it received, to ensure
that the caller picks it up.

SEL concatenateSelectors(SEL firstSelector, SEL secondSelector)
{
  NSString *firstPart = NSStringFromSelector(firstSelector);
  NSString *selectorName = [firstPart stringByAppendingString:NSStringFromSelector(secondSelector)];
  return NSSelectorFromString(selectorName);
}

@implementation CurrySauce
{
  id target;
  NSInvocation *invocation;
}

-initWithTarget:object invocation:(NSInvocation *)partialApplication
{
  target = [object retain];
  invocation = [partialApplication retain];
  return self;
}

+curryTarget:object invocation:(NSInvocation *)partialApplication
{
  return [[[self alloc] initWithTarget:object invocation:partialApplication] autorelease];
}

- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector
{
  //special case for respondsToSelector
  if (aSelector == @selector(respondsToSelector:)) {
    return [target methodSignatureForSelector:aSelector];
  }
  SEL combinedSelector = concatenateSelectors([invocation selector], aSelector);
  NSMethodSignature *combinedSignature = [target methodSignatureForSelector:combinedSelector];
  if (combinedSignature != nil) {
    NSMutableString *completionType = [NSMutableString stringWithFormat:@"%s@:",[combinedSignature methodReturnType]];
    for (int i = argumentsForSelector([invocation selector]) + 2; i < argumentsForSelector(combinedSelector) + 2; i++) {
      [completionType appendFormat:@"%s",[combinedSignature getArgumentTypeAtIndex:i]];
    }
    return [NSMethodSignature signatureWithObjCTypes:[completionType UTF8String]];
  } else {
    return nil;
  }
}

- (void)forwardInvocation:(NSInvocation *)anInvocation
{
  if ([anInvocation selector] == @selector(respondsToSelector:)) {
    [anInvocation invokeWithTarget:target];
    return;
  }
  SEL realSelector = concatenateSelectors([invocation selector], [anInvocation selector]);
  NSMethodSignature *signature = [target methodSignatureForSelector:realSelector];
  NSInvocation *combined = [NSInvocation invocationWithMethodSignature:signature];
  int argumentToSet = 2;
  void *argBuffer = malloc([[invocation methodSignature] frameLength]);
  for (int i = 2; i < [[invocation methodSignature] numberOfArguments]; i++) {
    [invocation getArgument:argBuffer atIndex:i];
    [combined setArgument:argBuffer atIndex:argumentToSet++];
  }
  free(argBuffer);
  argBuffer = malloc([[anInvocation methodSignature] frameLength]);
  for (int i = 2; i < [[anInvocation methodSignature] numberOfArguments]; i++) {
    [anInvocation getArgument:argBuffer atIndex:i];
    [combined setArgument:argBuffer atIndex:argumentToSet++];
  }
  free(argBuffer);

  [combined setTarget:target];
  [combined setSelector:realSelector];
  [combined invoke];
  if (strcmp([[combined methodSignature] methodReturnType], "v")) {
    void *returnBuffer = malloc([[combined methodSignature] methodReturnLength]);
    [combined getReturnValue:returnBuffer];
    [anInvocation setReturnValue:returnBuffer];
    free(returnBuffer);
  }
}

- (void)dealloc
{
  [target release];
  [invocation release];
  [super dealloc];
}

@end

Conclusion

Blah blah awesome power of the runtime, I suppose. It’s pretty cool
that you can do this sort of thing in 150 lines of ObjC. I doubt many
people would want to use it for reals though.