In defence of assertions

The year is 2017 and people are still recommending stripping assertions out of release builds. Here are a few reasons to leave them in:

  1. Many assertions are short tests (whether or not that’s a good thing: this variable now has a value, this number is now greater than zero), which won’t cost a lot in production. Or at least, let me phrase this another way: many assertions are too cheap to affect performance metrics in many apps. Or, let me phrase that another way: most production software probably doesn’t have good enough performance monitoring to see a result, or constrained enough performance goals to care about the result. (A minimal example of this kind of cheap check appears after the list.)

  2. The program counter has absolutely no business executing the instruction that follows a failed assertion, because the programmer wrote the subsequent instructions with the assumption that this would never happen. Yes, your program will terminate, leading to a 500 error/unfortunate stop dialog/guru meditation screen/other thing, but the alternative is to run…something that apparently shouldn’t ever occur. Far better to stop at the point of problem detection than to try to re-detect it later from a surprising and unhelpful problem report.

  3. Assertions are things that programmers believe to always hold, and it’s sensible to take issue with the words always and believe. There’s an argument that goes:

    1. I have never seen this situation happen in development or staging.
    2. I got this job by reversing a linked list on a whiteboard.
    3. Therefore, this situation cannot happen in production.

    but unfortunately, there’s a flaw between the axioms and the conclusion. For example, I have seen the argument “items are added to this list as they are received, therefore these items are in chronological order” multiple times, and have seen items in another order just as often. Assertions that never fire on programmer input give false assurance.
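
To make point 1 concrete, here is a hedged sketch of the kind of cheap check being described (the function and its precondition are invented for illustration, not taken from any particular codebase):

#import <Foundation/Foundation.h>

// A short, cheap assertion: the caller has promised a positive count, and the
// code that follows is written on the assumption that the promise holds.
NSUInteger averageItemSize(NSUInteger totalSize, NSUInteger itemCount)
{
    NSCAssert(itemCount > 0, @"itemCount must be greater than zero");
    return totalSize / itemCount; // divides by zero if the assertion is compiled out and the promise is broken
}

A check like this is dominated by the cost of the surrounding work, which is why stripping it is unlikely to show up in any performance metric you’re actually tracking.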

Your build needs to be better

I’ve said it before: build systems are a huge annoyance. If your build is anything other than seemingly instantaneous, it’s costing you serious money.

Your developers are probably off reading HN, or writing blog posts about how much slow builds cost them, while the build is running. When they finish doing that, which may be some time after the build completes, they’ll have forgotten some of what they were doing and will need to spend some time getting back up to speed.

Your developers are probably suspicious of any build failure, thinking that “the build is flaky” rather than “I made a mistake”. They’ll press the button again and go back to HN. When the same error occurs twice, they might look into it.

Your developers probably know that the build is slow, but not which bit of the build is slow. And they don’t have time to investigate that, when it takes so long to get any work done anyway. So everyone will agree that “there is a problem”, but nothing will get done. Or maybe cargo-cult things will get done: things that speed up “builds” in general but don’t address the problem with your build.

The Joel test asks whether you can make a build in one step. Insufficient. If you notice when you’re making a build, you’re slowing your developers down.

…which is not always the worst thing, of course. Sometimes a lengthy translation step from some source language to some optimised form of a machine language program yields better results for your customers, because they get to use a faster program and don’t need to care about the time taken to prepare that program. But let’s be clear: that’s part of the release, and your developers don’t always need to be working from the released product (in fact, they’re usually not). Releases should be asynchronous, and the latency between having something ready to be released and having released it can be fairly high, compared with the latency between having created some source and being able to investigate its utility.

Nonetheless, that should all go off in the background. So really, builds and releases should both be non-events to the developers.

Working Effectively with Legacy Code

I gave a talk to my team at ARM today on Working Effectively with Legacy Code by Michael Feathers. Here are some notes I made in preparation, which are somewhat related to the talk I gave.

This may be the most important book a software developer can
read. Why? Because if you don’t, then you’re part of the problem.

It’s obviously a lot easier and a lot more enjoyable to work on
greenfield projects all the time. You get to choose this week’s
favourite technologies and tools, put things together in the ways that
suit you now, and make progress because, well, anything is progress
when there’s nothing there already. But throwing away an existing
system and starting from scratch makes it easy to throw away the
lessons learned in developing that system. It may be ugly, and patched
up all over the place, but that’s because each of those patches was
needed. They each represent something we learned about the product
after we thought we were done.

The new system is much more likely to look good from the developer’s
perspective, but what about the users’? Do they want to pay again
for development of a new system when they already have one that mostly
works? Do they want to learn again how to use the software? We have
this strange introspective notion that professionalism in software
development means things that make code look good to other coders:
Clean Code, “well-crafted” code. But we should also have some
responsibility to those people who depend on us and who pay our way,
and that might mean taking the decision to fix the mostly-working
thing.

A digression: Lehman’s Laws

Manny Lehman identified three different categories of software system:
those that are exactly specified, those that implement
well-understood procedures, and those that are influenced by the
environment in which they run. Most software (including ours) comes
into that last category, and as the environment changes so must the
software, even if there were no (known) problems with it at an earlier
point in its evolution.

He expressed Laws governing the evolution of software systems, which
describe how the requirements for new development are in conflict
with the forces that slow down maintenance of existing systems. I’ll
not reproduce the full list here, but for example on the one hand the
functionality of the system must grow over time to provide user
satisfaction, while at the same time the complexity will increase and
perceived quality will decline unless it is actively maintained.

Legacy Code

Michael Feathers’s definition of legacy code is code without tests. I’m
going to be a bit picky here: rather than saying that legacy code is
code with no tests, I’m going to say that it’s code with insufficient
tests. If I make a change, can I be confident that I’ll
discover the ramifications of that change?

If not, then it’ll slow me down. I even sometimes discard changes
entirely, because I decide the cost of working out whether my change
has broken anything outweighs the interest I have in seeing the change
make it into the codebase.

Feathers refers to the tests as a “software vice”. They clamp the
software into place, so that you can have more control when you’re
working on it. Tests aren’t the only tools that do this: assertions
(and particularly Design by Contract) also help pin down the software.

How do I test untested code?

The apparent way forward then when dealing with legacy code is to
understand its behaviour and encapsulate that in a collection of unit
tests. Unfortunately, it’s likely to be difficult to write unit tests
for legacy code, because it’s all tightly coupled, has weird and
unexpected dependencies, and is hard to understand. So there’s a
catch-22: I need to make tests before I make changes, but I need to
make changes before I can make tests.

Seams

Almost the entire book is about resolving that dilemma, and contains a
collection of patterns and techniques to help you make low-risk
changes to make the code more testable, so you can introduce the tests
that will help you make the high-risk changes. His algorithm is:

  1. identify the “change points”, the things that need modifying to
    make the change you have to make.
  2. find the “test points”, the places around the change points where
    you need to add tests.
  3. break dependencies.
  4. write the tests.
  5. make the changes.

The overarching model for breaking dependencies is the “seam”. It’s a
place where you can change the behaviour of some code you want to
test, without having to change the code under test itself. Some examples (the first is sketched in code after this list):

  • you could introduce a constructor argument to inject an object
    rather than using a global variable
  • you could add a layer of indirection between a method and a
    framework class it uses, to replace that framework class with a
    test double
  • you could use the C preprocessor to redefine a function call to use
    a different function
  • you can break an uncohesive class into two classes that collaborate
    over an interface, to replace one of the classes in your tests
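
As a hedged sketch of the first of those seams (the class and protocol here are invented for illustration, not taken from the book): a class that reaches into a global or singleton can instead accept its collaborator at construction time, so a test can substitute a double without touching the method under test.

#import <Foundation/Foundation.h>

// A hypothetical collaborator; a test double just conforms to the same protocol.
@protocol Clock <NSObject>
- (NSDate *)now;
@end

@interface Timestamper : NSObject
// The seam: the clock is injected rather than fetched from a global.
- (instancetype)initWithClock:(id <Clock>)clock;
- (NSString *)stampedMessage:(NSString *)message;
@end

@implementation Timestamper
{
    id <Clock> _clock;
}

- (instancetype)initWithClock:(id <Clock>)clock {
    self = [super init];
    if (self) {
        _clock = clock;
    }
    return self;
}

- (NSString *)stampedMessage:(NSString *)message {
    // A test passes a clock that returns a fixed date, making this deterministic.
    return [NSString stringWithFormat:@"%@ %@", [_clock now], message];
}

@end

The production code passes a real clock; the test passes a fake that answers a known date, and nothing in Timestamper itself had to change to make it testable.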

Understanding the code

The important point is that whatever you, or someone else, think the
behaviour of the code should be, your customers have paid for the
behaviour that’s actually there, and so that (modulo bugs) is the
thing you should preserve.

The book contains techniques to help you understand the existing code
so that you can get those tests written in the first place, and even
find the change points. Scratch refactoring is one technique: look
at the code, change it, move bits out that you think represent
cohesive functions, delete code that’s probably unused, make notes in
comments…then just discard all of those changes. This is like Fred
Brooks’s recommendation to “plan to throw one away”: you can take what
you learned from those notes and refactorings and go in again with a
more structured approach.

Sketching is another technique recommended in the book. You can draw
diagrams of how different modules or objects collaborate, and
particularly draw networks of what parts of the system will be
affected by changes in the part you’re looking at.

Build systems are a huge annoyance

Take Smalltalk. Do I have an object in my image? Yes? Well I can use it. Does it need to do some compilation or something? I have no idea, it just runs my Smalltalk.

Take Python. Do I have the python code? Yes? Well I can use it. Does it need to do some compilation or something? I have no idea, it just runs my Python.

Take C.

Oh my God.

C is supposedly portable, and there are portable operating system interface specifications for the system behaviour accessible from C, and yet you still need C sources that are specific to the platform you’re building for. So you have a tool like autoconf or cmake that probes the platform to work out how your sources need editing to actually work there, and performs those changes. The outputs from those tools are then fed into a thing that takes C sources and constructs the software.

What you want is the ability to take some C and use it on a computer. What C programmers think you want is a graph of the actions that must be taken to get from something that’s nearly C source to a program you can use on a computer. What they’re likely to give you is a collection of things, each of which encapsulates part of the graph, and not necessarily all that well. Like autoconf and cmake, mentioned above, which do some of the transformations, but not all of them, and leave it to some other tool (in the case of cmake, your choice of some other tool) to do the rest.

Or look at make, which is actually entirely capable of doing the graph thing well, but frequently not used as such, so that make all works but making any particular target depends on whether you’ve already done other things.

Now take every other programming language. Thanks to the ubiquity of the C run time and the command shell, every programming language needs its own build system named [a-z]+ake that is written in that language, and supplies a subset of make’s capabilities but makes it easier to do whatever it is that needs to be done by that language’s tools.

When all you want is to use the software.

The package management paradox

There has been no need to build a new package management system since CPAN, and yet npm is the best.
Wait, what?

Every time a new programming language or framework is released, people seem to decide that:

  1. It needs its own package manager.

  2. Simple algorithms need to be rewritten from scratch in “pure” $language/framework and distributed as packages in this package manager.

This is not actually true. Many programming languages – particularly many of the trendy ones – have a way to call C functions, and a way to expose their own routines as C functions. Even C++ has this feature. This means that you don’t need any new packaging system: if you can deploy packages that expose C functions (whatever the implementation language), then you can use existing code and you don’t need to rewrite everything.

So there hasn’t been a need for a packaging system since at least CPAN, maybe earlier.
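
As a hedged sketch of the C-function route just described (the function name and behaviour are invented for illustration), here is an Objective-C implementation hidden behind a plain C entry point that any language with a C foreign function interface can call:

#import <Foundation/Foundation.h>
#include <string.h>

// The C interface other languages see; no Objective-C leaks through it.
char *shouted_copy(const char *input);

// The implementation happens to use Foundation, but callers needn't know or care.
char *shouted_copy(const char *input) {
    @autoreleasepool {
        NSString *string = [NSString stringWithUTF8String:input];
        return strdup([[string uppercaseString] UTF8String]); // caller frees with free()
    }
}

The packaging problem then reduces to shipping a compiled library and a C header, which every platform already knows how to consume.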

On the other hand, npm is the best packaging system ever because people actually consume existing code with it. It’s huge, there are tons of libraries, and so people actually think about whether this thing they’re doing needs new code or the adoption of existing code. It’s the realisation of the OO dream, in which folks like Brad Cox said we’d have data sheets of available components and we’d pull the components we need and bind them together in our applications.

Developers who use npm are just gluing components together into applications, and that’s great for software.

Apple’s Best Programming Language

My talk at App Builders 2016 was on Apple’s best programming language. Spoiler alert: it’s Dylan. Or is it?

I chose a few properties one might wish to find in programming languages, then demonstrated how these were all present in the Dylan language. I also took a dig at certain other languages, which do things in ways that could be seen as less good than Dylan’s. For example, did you know that there’s a programming language out there which distinguishes constant from variable values with the words let and var, rather than using the arguably more readable constant keyword?

Now here’s a thing: of course there are many things iOS app programmers could have been using if we didn’t want to use Objective-C, even without the addition of Swift. Indeed, a few months before Swift was introduced I enumerated some of these alternatives on this very blog. However, many of us chose to use Objective-C rather than any of the alternatives, and then chose Swift when that alternative was presented.

Similarly, Apple could have pursued any of those alternatives, and indeed did pursue quite a few of them. What would the world look like if Apple had invested in MacRuby? It would have Swift in it; we know that because they did and it does.

At the end of my talk, I invited the conference to discuss what it was particularly about Swift that led to its brisk success, when it can be considered equivalent to many existing alternatives in numerous ways. Here are some of the suggestions (none of them from me, all from the audience):

  • marketing
  • evangelism
  • LLVM
  • Apple now isn’t the same as Apple in Dylan’s time
  • “Halo effect” from iOS
  • people only want to use first-party tools
  • Objective-C pain provided the opportunity
  • big enough community to reach critical mass

That leads me to wonder how closely related the conditions for “better” and the conditions for “accepted” are, whether there are “better” things out there for programmers that haven’t been adopted, whether those things truly are better, and how aware we all are of the distinction between being better and being popular when we make engineering choices.

Did that work? Maybe.

A limitation with yesterday’s error-preserving approach is that it leaves you on your own to recover from problems. Assuming your error definitions are sufficiently granular, this should be straightforward but tedious. Find out what went wrong, recover from it, then replay everything that happened afterwards.

Recovering from failures automatically is difficult in general; after all, if we could do that, there wouldn’t have been a failure in the first place. But there’s no reason to then have a bunch of code that replays the rest of the workflow from the point of recovery: you’ve already written that code on the happy path.

Why can’t we just do that? On an error, why don’t we recover from the immediate error, then go back in time to the point where it occurred and start again with our recovered value? Seems straightforward enough; let’s do it. In doing so, let’s borrow (at least in a superficial fashion) the idea of the optional object.

A Maybe object represents the possibility that a value exists, and the ability to find out whether it does. Should it not contain a value, you should be able to find out why not.

@interface Maybe : NSObject

@property (nonatomic, strong, readonly) NSError *error;

- (BOOL)hasValue;
- recoverWithStartingValue:value;

+ just:value;
+ none:aClass error:(NSError *)anError;

@end

I have a value

OK, let’s say that what you wanted to do succeeded, and you want to use the result object. It’d be kind of sucky if you had to test whether the Maybe contained a value at every step of a long operation, and unwrap it to get out the object you care about. So let’s not make you do that.

@interface Just : Maybe
-initWithValue:value;
@end

@implementation Just
{
    id _value;
}

-initWithValue:value {
    self = [super init];
    if (self) {
        _value = value;
    }
    return self;
}

-(NSError *)error {
    NSAssert(NO, @"No error in success case");
    return nil;
}
-(BOOL)hasValue { return YES; }
-recoverWithStartingValue:value {
    NSAssert(NO, @"Cannot recover from success");
    return nil;
}
-forwardingTargetForSelector:(SEL)aSelector { return _value; }

@end

OK, if everything succeeds you can use the Maybe result (which will be Just the value) as if it is the value itself.

I don’t have a value

The other case is that your operation failed, so we need to represent that. We need to know what type of object you don’t have(!), which will be useful because we can then treat the lack of value as if it is an instance of the value. How will this work? In None, the no-value version of a Maybe, we’ll just absorb every message you send.

@interface None : Maybe
-initWithClass:aClass error:(NSError *)anError;
@end

@implementation None
{
    Class _class;
    NSError *_error;
    NSMutableArray *_invocations;
}

-initWithClass:aClass error:(NSError *)anError {
    self = [super init];
    if (self) {
        _class = aClass;
        _error = anError;
        _invocations = [NSMutableArray array];
    }
    return self;
}

-(NSError *)error { return _error; }
-(BOOL)hasValue { return NO; }

-(NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector {
    return [_class instanceMethodSignatureForSelector:aSelector];
}

-(void)forwardInvocation:(NSInvocation *)anInvocation {
    id returnValue = self;
    [anInvocation retainArguments]; // keep any object arguments alive until we replay
    [_invocations addObject:anInvocation];
    [anInvocation setReturnValue:&returnValue];
}

-recoverWithStartingValue:value {
    id nextObject = value;
    while([_invocations count]) {
        NSInvocation *invocation = [_invocations firstObject];
        [_invocations removeObjectAtIndex:0];
        [invocation invokeWithTarget:nextObject];
        // -getReturnValue: doesn't retain what it writes, so read into an
        // unretained temporary rather than straight into the strong local.
        __unsafe_unretained id returnedObject = nil;
        [invocation getReturnValue:&returnedObject];
        nextObject = returnedObject;
    }
    return nextObject;
}

@end

Again, there’s no need to unwrap a None (and that makes no sense, because it doesn’t contain anything). You just use it as if it is the thing that it represents (a lack of).
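
One thing the listings don’t show is how +just: and +none:error: hand out instances of these subclasses. As a minimal sketch of how the Maybe base class might tie the cluster together (my assumption; the original implementation isn’t shown):

@implementation Maybe

@dynamic error; // the subclasses supply the accessor

+ just:value {
    return [[Just alloc] initWithValue:value];
}

+ none:aClass error:(NSError *)anError {
    return [[None alloc] initWithClass:aClass error:anError];
}

@end

In a single file this would need to appear after Just and None are declared, which is why it’s shown here rather than next to the Maybe interface.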

Recovering

You go through your complex process, and somewhere along the way it failed. At this point your Maybe answer turned into None, because you didn’t get a value at some step. But it carried on paying attention to what you wanted to do.

Now it’s time to turn back the clock. Looking at the error I get from the None, I can see what step failed and what I need to do to be in a position to try again. When I make that happen, I’ll get some object which would’ve been valid at that intermediate point in my operation.

Because the None was paying attention to what I tried to do to it, I can replay the whole process from the point where it failed, using the result from my recovery path.

Worked Example

Here’s an object that requires two steps to use. Either step could fail, and one of them does unless you take action to fix it up.

@interface AnObject : NSObject

@property (nonatomic, assign, getter=isRecovered) BOOL recovered;

-this;
-that;

@end

@implementation AnObject

-this
{
    return [self isRecovered] ? [Maybe just:self] :
        [Maybe none:[self class] error:[NSError errorWithDomain:@"Nope"
                                                           code:23
                                                       userInfo:@{}]];
}

-that
{
    return [Maybe just:@"Winning"];
}

@end

In using this object, I want to compose the two steps. If it goes wrong, I know what to do to recover, but I don’t want to have to explicitly write out the workflow twice if I got it right the first time.

int main(int argc, char *argv[]) {
    @autoreleasepool {
        id anObject = [AnObject new];
        id result = [[anObject this] that];
        if ([result hasValue]) {
            NSLog(@"success: %@", [result lowercaseString]);
        } else {
            NSLog(@"failed with error %@, recovering...", [result error]);
            [anObject setRecovered:YES];
            result = [result recoverWithStartingValue:anObject];
            NSLog(@"ended up with %@", [result uppercaseString]);
        }
    }
}

Conclusion

The NSError-star-star convention lets you compose possibly-failing messages and find out either that it succeeded, or where it went wrong. But it doesn’t encapsulate what would have happened had it gone right, so you can’t just rewind time to where things failed and try again. It is possible to do so, simply by encapsulating the idea that something might work…maybe.

Getting better at doing it wrong

For around a month at the end of last year, I kept a long text note called “doing doing it wrong right”. I was trying to understand error handling in programming, look at some common designs and work out a plan for cleaning up some error-handling code I was working with myself (mercifully someone else, with less analysis paralysis, has taken on that task now).

Deliciously, the canonical writing in this field is by an author with the completely apt name Goodenough. His “Structured Exception Handling” and “Exception Handling: Issues and a Proposed Notation” describe the problem pretty completely and introduce the idea of exceptions that can be thrown on an interesting condition and caught at the appropriate level of abstraction in the caller.

As an aside, his articles show that exception handling can be used for general control flow. Your long-running download task can throw the “I’m 5% complete now” exception, which is caught to update the UI before asking the download to continue. Programming taste moved away from doing that.

In the Cocoa world, exceptions have never been in favour, probably because they’re too succinct. In their place, complex multi-if-statement handling code is introduced:

NSError *error = nil;
id thing = [anObject giveMeAThing:&error];
if (!thing) {
  [self handleError:error];
  return;
}
id otherThing = [thing doYourThing:&error];
if (!otherThing) {
  [self handleError:error];
  return;
}
id anotherThing = [otherThing someSortOfThing:&error];

…and so it goes.

Yesterday in his NSMeetup talk on Swift, Saul Mora reminded me of the nil sink pattern in Objective-C. Removing all the error-handling from the above, a fluent (give or take the names) interface would look like this:

id anotherThing = [[[anObject giveMeAThing] doYourThing] someSortOfThing];

The first method in that chain to fail would return nil, which due to the message-sink behaviour means that everything subsequent to it preserves the nil and that’s what you get out. Saul had built an equivalent thing with option types, and a function Maybe a -> (a -> Maybe b) -> Maybe b to hide all of the option-unwrapping conditionals.
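
In Objective-C terms, a helper along the lines of that bind function might look something like the sketch below (my own illustration of the idea, using nil in place of a real option type; not Saul’s code). It applies the next step only when the previous one produced a value, so the unwrapping conditionals live in one place.

#import <Foundation/Foundation.h>

// A step takes the previous value and reports failure through the usual
// NSError ** convention, returning nil when it fails.
typedef id (^Step)(id value, NSError **error);

// Apply the next step only if the previous one produced a value; once a step
// has failed, the nil (and the error already recorded) is preserved.
static id then(id value, NSError **error, Step step) {
    if (value == nil) return nil;
    return step(value, error);
}

// Usage, with the hypothetical messages from the earlier listing:
//   NSError *error = nil;
//   id thing      = then(anObject, &error, ^id(id o, NSError **e) { return [o giveMeAThing:e]; });
//   id otherThing = then(thing, &error, ^id(id o, NSError **e) { return [o doYourThing:e]; });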

Remembering this pattern, I think it’s possible to go back and tidy up my error cases:

NSError *error = nil;
id anotherThing = [[[anObject giveMeAThing:&error]
                               doYourThing:&error]
                           someSortOfThing:&error];
if (!anotherThing) {
  [self handleError:error];
}

Done. Whichever method goes wrong sets the error and returns nil. Everything after that is sunk, which crucially means that it can’t affect the error. As long as the errors generated are specific enough to indicate what went wrong, whether it’s possible to recover (and if so, how) and whether anything needs cleaning up (and if so, what) then this approach is…good enough.