To become a beginner, first become an expert

We have a whole load of practices in programming that only really work well if you’re already good at whatever the practice is supposed to help with.

Scrum is a process improvement framework, but only if you already know how to do process improvement. If you don’t, then Scrum is just the baseline mini-waterfall process with a chance to air your dirty laundry every fortnight.

Agile is good at helping you embrace change, but only if you’re already good enough at managing change to understand which changes should be embraced.

#NoEstimates helps you avoid the overhead of estimates, but only if you’re already good enough at estimates to know that you always write user stories that take 0.5-2 days to implement.

TDD helps you design your APIs, but only if you’re already good enough at API design to understand things like dependency injection and loose coupling.

Microservices help you isolate modules, but only if you’re already good enough at modularity not to get swamped in HTTP calls.

This is all very well for selling consultancy (“if your [agile] isn’t working, then you aren’t [agiling] hard enough, let me [agile] you some more”) but where’s the on-ramp?

Working Effectively with Legacy Code

I gave a talk to my team at ARM today on Working Effectively with Legacy Code by Michael Feathers. Here are some notes I made in preparation, which are somewhat related to the talk I gave.

This may be the most important book a software developer can
read. Why? Because if you don’t, then you’re part of the problem.

It’s obviously a lot easier and a lot more enjoyable to work on
greenfield projects all the time. You get to choose this week’s
favourite technologies and tools, put things together in the ways that
suit you now, and make progress because, well anything is progress
when there’s nothing there already. But throwing away an existing
system and starting from scratch makes it easy to throw away the
lessons learned in developing that system. It may be ugly, and patched
up all over the place, but that’s because each of those patches was
needed. They each represent something we learned about the product
after we thought we were done.

The new system is much more likely to look good from the developer’s
perspective, but what about the users’? Do they want to pay again
for development of a new system when they already have one that mostly
works? Do they want to learn again how to use the software? We have
this strange introspective notion that professionalism in software
development means things that make code look good to other coders:
Clean Code, “well-crafted” code. But we should also have some
responsibility to those people who depend on us and who pay our way,
and that might mean taking the decision to fix the mostly-working
thing.

A digression: Lehman’s Laws

Manny Lehman identified three different categories of software system:
those that are exactly specified, those that implement
well-understood procedures, and those that are influenced by the
environment in which they run. Most software (including ours) comes
into that last category, and as the environment changes so must the
software, even if there were no (known) problems with it at an earlier
point in its evolution.

He expressed Laws governing the evolution of software systems, which
describe how the requirements for new development come into conflict
with the forces that slow down maintenance of existing systems. I’ll
not reproduce the full list here, but for example: on the one hand the
functionality of the system must grow over time to provide user
satisfaction, while at the same time its complexity will increase and
its perceived quality will decline unless it is actively maintained.

Legacy Code

Michael Feathers’s definition of legacy code is code without tests. I’m
going to be a bit picky here: rather than saying that legacy code is
code with no tests, I’m going to say that it’s code with insufficient
tests. If I make a change, can I be confident that I’ll discover the
ramifications of that change?

If not, then it’ll slow me down. I even sometimes discard changes
entirely, because I decide the cost of working out whether my change
has broken anything outweighs the interest I have in seeing the change
make it into the codebase.

Feathers refers to the tests as a “software vice”. They clamp the
software into place, so that you can have more control when you’re
working on it. Tests aren’t the only tools that do this: assertions
(and particularly Design by Contract) also help pin down the software.
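
As a small illustration (mine, not from the book; the JobQueue class and its method are invented for the example), here’s the kind of assertion-as-vice I mean: if a later change breaks either stated assumption, the assertion fails straight away.

#import <Foundation/Foundation.h>

// Invented example: assertions clamping down the behaviour of a tiny queue.
// NSParameterAssert and NSAssert raise (in debug builds) when their
// condition is false, so changes that violate the assumptions get caught
// at the moment they happen.
@interface JobQueue : NSObject
- (void)enqueue:(id)job;
@end

@implementation JobQueue
{
    NSMutableArray *_jobs;
}

- (instancetype)init
{
    if ((self = [super init])) {
        _jobs = [NSMutableArray array];
    }
    return self;
}

- (void)enqueue:(id)job
{
    NSParameterAssert(job != nil);                // precondition: no nil jobs
    NSUInteger countBefore = [_jobs count];
    [_jobs addObject:job];
    NSAssert([_jobs count] == countBefore + 1,    // postcondition: grew by one
             @"enqueue: must grow the queue by exactly one");
}

@end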

How do I test untested code?

The apparent way forward then when dealing with legacy code is to
understand its behaviour and encapsulate that in a collection of unit
tests. Unfortunately, it’s likely to be difficult to write unit tests
for legacy code, because it’s all tightly coupled, has weird and
unexpected dependencies, and is hard to understand. So there’s a
catch-22: I need to make tests before I make changes, but I need to
make changes before I can make tests.

Seams

Almost the entire book is about resolving that dilemma, and contains a
collection of patterns and techniques to help you make low-risk
changes to make the code more testable, so you can introduce the tests
that will help you make the high-risk changes. His algorithm is:

  1. identify the “change points”, the things that need modifying to
    make the change you have to make.
  2. find the “test points”, the places around the change points where
    you need to add tests.
  3. break dependencies.
  4. write the tests.
  5. make the changes.

The overarching model for breaking dependencies is the “seam”. It’s a
place where you can change the behaviour of some code you want to
test, without having to change the code under test itself. Some examples:

  • you could introduce a constructor argument to inject an object
    rather than using a global variable (see the sketch after this list)
  • you could add a layer of indirection between a method and a
    framework class it uses, to replace that framework class with a
    test double
  • you could use the C preprocessor to redefine a function call to use
    a different function
  • you can break an uncohesive class into two classes that collaborate
    over an interface, to replace one of the classes in your tests
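
For instance, the first of those, an object seam introduced through an initialiser argument, might look something like this sketch (the OrderProcessor and PaymentGateway names are invented for illustration, not taken from the book):

#import <Foundation/Foundation.h>

// Before: the method reaches out to a shared singleton, so a test cannot
// intercept the collaboration.
//
//   - (void)placeOrderForTotal:(NSDecimalNumber *)total
//   {
//       [[RealPaymentGateway sharedGateway] charge:total];
//   }
//
// After: the collaborator is injected, creating a seam. Production code
// passes the real gateway; a test passes a double.
@protocol PaymentGateway <NSObject>
- (void)charge:(NSDecimalNumber *)amount;
@end

@interface OrderProcessor : NSObject
- (instancetype)initWithGateway:(id <PaymentGateway>)gateway;
- (void)placeOrderForTotal:(NSDecimalNumber *)total;
@end

@implementation OrderProcessor
{
    id <PaymentGateway> _gateway;
}

- (instancetype)initWithGateway:(id <PaymentGateway>)gateway
{
    if ((self = [super init])) {
        _gateway = gateway;
    }
    return self;
}

- (void)placeOrderForTotal:(NSDecimalNumber *)total
{
    [_gateway charge:total];
}

@end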

Understanding the code

The important point is that whatever you, or someone else, think the
behaviour of the code should be, your customers have paid for the
behaviour that’s actually there, and so that (modulo bugs) is the
thing you should preserve.

The book contains techniques to help you understand the existing code
so that you can get those tests written in the first place, and even
find the change points. Scratch refactoring is one technique: look
at the code, change it, move bits out that you think represent
cohesive functions, delete code that’s probably unused, make notes in
comments…then just discard all of those changes. This is like Fred
Brooks’s recommendation to “plan to throw one away”: you can take what
you learned from those notes and refactorings and go in again with a
more structured approach.

Sketching is another technique recommended in the book. You can draw
diagrams of how different modules or objects collaborate, and
particularly draw networks of what parts of the system will be
affected by changes in the part you’re looking at.

Yes, you may delete tests

A frequently-presented objection to the concept of writing automated tests is that it ossifies the implementation of the system under test. “If I’ve got all the tests you’re proposing,” I hear, “then I won’t be able to make any changes without breaking the tests.” There are two aspects to this problem.

Tests should make assertions about the externally-observable behaviour of the system under test. If you have tests that assert that a particular private method does its thing in a particular way, then yes, you are baking in the assumption that this private method works in a particular way. And yes, that makes it harder to change things. But it isn’t necessary.

Any test that is no longer useful is no longer useful: delete it. Those tests that helped you to build the thing were useful while you were building the thing: if they’re not helping you now then just get rid of them.

When you’re building a component, you’re probably thinking quite deeply about the algorithm it will implement, and how it will communicate with its collaborators. Thinking about those will mean writing tests about them, and while those tests are in the suite they’ll require that the algorithm be implemented in a particular way, and that the module talk to its collaborators in a particular way. Do you no longer care about those details, as long as it does the right thing? Delete the tests, and just keep the ones that make sure it does the right thing.
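
As a concrete (and entirely invented) sketch of the difference, using a SortedList class that exists only for this example: the first assertion below is about observable behaviour and is worth keeping; the commented-out one pins an implementation detail and is the kind to delete once it has served its purpose.

#import <Foundation/Foundation.h>

// Invented example: a list that hands its contents back in sorted order.
@interface SortedList : NSObject
- (void)insert:(NSNumber *)number;
- (NSArray *)allObjects;
@end

@implementation SortedList
{
    NSMutableArray *_storage;
}

- (instancetype)init
{
    if ((self = [super init])) {
        _storage = [NSMutableArray array];
    }
    return self;
}

- (void)insert:(NSNumber *)number
{
    [_storage addObject:number];
    [_storage sortUsingSelector:@selector(compare:)];
}

- (NSArray *)allObjects
{
    return [_storage copy];
}

@end

int main(int argc, char *argv[])
{
    @autoreleasepool {
        SortedList *list = [SortedList new];
        [list insert:@3];
        [list insert:@1];

        // Behavioural test: keep this. It survives a rewrite of the insertion
        // algorithm or a change of backing store.
        NSCAssert(([[list allObjects] isEqualToArray:@[@1, @3]]),
                  @"objects come back in sorted order");

        // Implementation-coupled test: delete this kind once the component
        // works. It bakes in the private choice of an NSMutableArray ivar.
        // NSCAssert([[list valueForKey:@"storage"] isKindOfClass:
        //            [NSMutableArray class]], @"uses an array internally");
    }
    return 0;
}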

Contractually-obligated testing

About a billion years ago, Bertrand Meyer (he of Open-Closed Principle fame) introduced a programming language called Eiffel. It had a feature called Design by Contract, which let you define constraints that your program had to adhere to during execution. It’s a bit like convincing a C compiler to emit checks for rules like integer underflow everywhere in your code, except that you get to write your own rules.

To see what that’s like, here’s a little Objective-C (I suppose I could use Eiffel, as Eiffel Studio is in homebrew, but I didn’t). Here’s my untested, un-contractual Objective-C Stack class.

@interface Stack : NSObject

- (void)push:(id)object;
- (id)pop;

@property (nonatomic, readonly) NSInteger count;

@end

static const int kMaximumStackSize = 4;

@implementation Stack
{
    __strong id buffer[4];
    NSInteger _count;
}

- (void)push:(id)object
{
    buffer[_count++] = object;
}

- (id)pop
{
    id object = buffer[--_count];
    buffer[_count] = nil;
    return object;
}

@end

Seems pretty legit. But I’ll write out the contract, the rules to which this class will adhere provided its users do too. Firstly, some invariants: the count will never go below 0 or above the maximum number of objects. Objective-C doesn’t have Eiffel’s syntax for this, so it looks just a little bit messy.

@interface Stack : ContractObject

- (void)push:(id)object;
- (id)pop;

@property (nonatomic, readonly) NSInteger count;

@end

static const int kMaximumStackSize = 4;

@implementation Stack
{
    __strong id buffer[4];
    NSInteger _count;
}

- (NSDictionary *)contract
{
    NSPredicate *countBoundaries = [NSPredicate predicateWithFormat: @"count BETWEEN %@",
                                    @[@0, @(kMaximumStackSize)]];
    NSMutableDictionary *contract = [@{@"invariant" : countBoundaries} mutableCopy];
    [contract addEntriesFromDictionary:[super contract]];
    return contract;
}

- (void)in_push:(id)object
{
    buffer[_count++] = object;
}

- (id)in_pop
{
    id object = buffer[--_count];
    buffer[_count] = nil;
    return object;
}

@end

I said the count must never go outside of this range. In fact, the invariant must only hold before and after calls to public methods: it’s allowed to be broken during execution. If you’re wondering how this interacts with threading: confine ALL the things! Anyway, let’s see whether the contract is adhered to.

int main(int argc, char *argv[]) {
    @autoreleasepool {
        Stack *stack = [Stack new];
        for (int i = 0; i < 10; i++) {
            [stack push:@(i)];
            NSLog(@"stack size: %ld", (long)[stack count]);
        }
    }
}

2014-08-11 22:41:48.074 ContractStack[2295:507] stack size: 1
2014-08-11 22:41:48.076 ContractStack[2295:507] stack size: 2
2014-08-11 22:41:48.076 ContractStack[2295:507] stack size: 3
2014-08-11 22:41:48.076 ContractStack[2295:507] stack size: 4
2014-08-11 22:41:48.076 ContractStack[2295:507] *** Assertion failure in -[Stack forwardInvocation:], ContractStack.m:40
2014-08-11 22:41:48.077 ContractStack[2295:507] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException',
 reason: 'invariant count BETWEEN {0, 4} violated after call to push:'

Erm, oops. OK, this looks pretty useful. I’ll add another clause: the caller isn’t allowed to call -pop unless there are objects on the stack.

- (NSDictionary *)contract
{
    NSPredicate *countBoundaries = [NSPredicate predicateWithFormat: @"count BETWEEN %@",
                                    @[@0, @(kMaximumStackSize)]];
    NSPredicate *containsObjects = [NSPredicate predicateWithFormat: @"count > 0"];
    NSMutableDictionary *contract = [@{@"invariant" : countBoundaries,
             @"pre_pop" : containsObjects} mutableCopy];
    [contract addEntriesFromDictionary:[super contract]];
    return contract;
}

So I’m not allowed to hold it wrong in this way, either?

int main(int argc, char *argv[]) {
    @autoreleasepool {
        Stack *stack = [Stack new];
        id foo = [stack pop];
    }
}

2014-08-11 22:46:12.473 ContractStack[2386:507] *** Assertion failure in -[Stack forwardInvocation:], ContractStack.m:35
2014-08-11 22:46:12.475 ContractStack[2386:507] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException',
 reason: 'precondition count > 0 violated before call to pop'

No, good. Having a contract is a bit like having unit tests, except that the unit tests are always running whenever your object is being used. Try out Eiffel; it’s pleasant to have real syntax for this, though really the Objective-C version isn’t so bad.

Finally, the contract is implemented by some simple message interception (try doing that in your favourite modern programming language, non-Rubyists!).

@interface ContractObject : NSObject
- (NSDictionary *)contract;
@end

static SEL internalSelector(SEL aSelector);

@implementation ContractObject

- (NSDictionary *)contract { return @{}; }

- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector
{
    NSMethodSignature *sig = [super methodSignatureForSelector:aSelector];
    if (!sig) {
        sig = [super methodSignatureForSelector:internalSelector(aSelector)];
    }
    return sig;
}

- (void)forwardInvocation:(NSInvocation *)inv
{
    SEL realSelector = internalSelector([inv selector]);
    if ([self respondsToSelector:realSelector]) {
        NSDictionary *contract = [self contract];
        NSPredicate *alwaysTrue = [NSPredicate predicateWithValue:YES];
        NSString *calledSelectorName = NSStringFromSelector([inv selector]);
        inv.selector = realSelector;
        NSPredicate *invariant = contract[@"invariant"]?:alwaysTrue;
        NSAssert([invariant evaluateWithObject:self],
            @"invariant %@ violated before call to %@", invariant, calledSelectorName);
        NSString *preconditionKey = [@"pre_" stringByAppendingString:calledSelectorName];
        NSPredicate *precondition = contract[preconditionKey]?:alwaysTrue;
        NSAssert([precondition evaluateWithObject:self],
            @"precondition %@ violated before call to %@", precondition, calledSelectorName);
        [inv invoke];
        NSString *postconditionKey = [@"post_" stringByAppendingString:calledSelectorName];
        NSPredicate *postcondition = contract[postconditionKey]?:alwaysTrue;
        NSAssert([postcondition evaluateWithObject:self],
            @"postcondition %@ violated after call to %@", postcondition, calledSelectorName);
        NSAssert([invariant evaluateWithObject:self],
            @"invariant %@ violated after call to %@", invariant, calledSelectorName);
    }
}

@end

SEL internalSelector(SEL aSelector)
{
    return NSSelectorFromString([@"in_" stringByAppendingString:NSStringFromSelector(aSelector)]);
}

Reflections on “Is TDD Dead”

The first thing I noticed that I needed to change as a result of watching the Is TDD Dead? series is that I started out with a defensive mindset. If I believe in the dogma of a rule, then presumably I’m still a beginner who hasn’t yet learned that there are other rules, each of which has limited applicability. So lesson one is to practise more, to find the limitations of the techniques I use, and to understand what to do when those limits are met.

An unsurprising aspect of the discussion was that the thing I call TDD is not the thing that others call TDD. At some point recently I got converted to the Mockist school. Part of me (the part that doesn’t value my social life) is inclined to write a second edition of Test-Driven iOS Development, but then keep both editions available, because the style will have changed so much between the two.

Anyway it is more germane to note that none of the three speakers has a lot of time for mocks. I need to understand this, the experiences they have had that I have not, and the things that I do not know about mocks. Perhaps they view the sorts of tests that I build as the structural tests that they would delete once the system is working.

Is TDD Dead? My questions

These are my questions for parts 5 and 6 of Is TDD Dead?. I’d like to start by thanking the panellists for publishing their discussions.

TDD the Principle

Kent and Martin, why is it that you practise test-driven development? What do you get from it? David, how do you get those same things?

What could change about the way we write software to make TDD redundant or obsolete for Kent and Martin? What could change about the way that TDD is performed to make it useful or beneficial for David?

TDD the Practice

David, is there a way in which we could retain the test-first idea of TDD, but avoid the design problems you encounter in the production code? Are you able to distil a short list of design guidelines that avoid your “design damage”? Kent and Martin, how could TDDers follow those guidelines in their work?

TDD the Community

How have all three of you felt about the community reaction to this discussion? What has been good? What has been bad? What has been missing? Is there information the community could supply to help developers evaluate and choose or discard TDD for their work that is missing? Are there questions developers should ask before choosing whether to adopt or avoid TDD that you believe are not being raised?

I use mocks and I’m happy with that

Both Kent Beck and Martin Fowler have said that they don’t use mock objects in their test-driven development. I do. I use them mostly for the sense described first in my BNR blog post on Mock Objects, namely to stand in for a thing that can receive messages I want to send, but that does not yet exist.

If you look at the code in Test-Driven iOS Development, you’ll find that it uses plenty of test doubles but none of them is a mock object. What has changed in my worldview to move from not-mocking to mocking in that time?

The key information that gave me the insight was this message (pardon the pun) from Alan Kay on object-oriented programming:

The big idea is “messaging” – that is what the kernal[sic] of Smalltalk/Squeak
is all about (and it’s something that was never quite completed in our
Xerox PARC phase). The Japanese have a small word – ma – for “that which
is in between” – perhaps the nearest English equivalent is “interstitial”.
The key in making great and growable systems is much more to design how its
modules communicate rather than what their internal properties and
behaviors should be.

What I’m really trying to do is to define the network of objects connected by message sending, but the tool I have makes me think about objects and what they’re doing. To me, mock objects are the ability to subvert the tool, and force it to let me focus on the ma.
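
Here’s roughly what that looks like in practice, as a hand-rolled mock with no framework involved (the WeatherPresenter and WeatherView names are invented for this sketch): the test cares only about the message sent from one object to the other, not about either object’s internals.

#import <Foundation/Foundation.h>

// The conversation under test: when the presenter hears about a temperature,
// it should tell its view to show it.
@protocol WeatherView <NSObject>
- (void)showTemperature:(float)celsius;
@end

@interface WeatherPresenter : NSObject
- (instancetype)initWithView:(id <WeatherView>)view;
- (void)receivedTemperature:(float)celsius;
@end

@implementation WeatherPresenter
{
    id <WeatherView> _view;
}

- (instancetype)initWithView:(id <WeatherView>)view
{
    if ((self = [super init])) {
        _view = view;
    }
    return self;
}

- (void)receivedTemperature:(float)celsius
{
    [_view showTemperature:celsius];
}

@end

// The mock: it records the message so the test can assert on the
// collaboration. It could stand in for a view that hasn't been written yet.
@interface MockWeatherView : NSObject <WeatherView>
@property (nonatomic) BOOL didShowTemperature;
@property (nonatomic) float shownTemperature;
@end

@implementation MockWeatherView

- (void)showTemperature:(float)celsius
{
    self.didShowTemperature = YES;
    self.shownTemperature = celsius;
}

@end

int main(int argc, char *argv[])
{
    @autoreleasepool {
        MockWeatherView *view = [MockWeatherView new];
        WeatherPresenter *presenter = [[WeatherPresenter alloc] initWithView:view];
        [presenter receivedTemperature:21.0f];
        NSCAssert(view.didShowTemperature && view.shownTemperature == 21.0f,
                  @"the presenter tells its view about the new temperature");
    }
    return 0;
}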

It’s about solving problems

As ever, there’s a touchstone issue on the programmers’ corner of the intarwebs (the programmers’ corner is actually the same intarwebs everyone else is using, just we model it with geometry so it can have a corner). Here it is:

Allan Kelly predicted that by 2022, TDD will become a prerequisite for employment as a programmer. Let’s leave aside for a moment the issue discussed here and in APPropriate Behaviour, that there is no arbiter of programmer employability.

Responses did not take long in coming. Some TDDers cannot believe that TDD is not already required. Some non-TDDers are incensed that non-TDD is not considered valid. I particularly like this tweet about TDD’s place:

I like to explain that a lot of [TDD is] just a substitute for a decent type system ;).

You know what? That might be true, I’m not sure. Functional programming education has, so far, been to me like a pure map of me onto a slightly older version of me without the side effect of imparting knowledge of functional programming. Bear in mind that I’ve been paid to write LISP in the past, and I struggle with FP. I just can’t get from knowing what functions are to big-picture views of functional programming.

I wouldn’t know the mathematics of a type system if it came up and quacked at me. As far as I know, Hindley is the name of a serial killer and now I’m looking warily at Milner. But I’m fine with that, and I’m fine with you not being fine with that. It’s just that we think about things in different ways.

Other people think about things in other different ways too. And I’m fine with that, too. Let’s look some more at tests.

Many forms of automated test follow a common pattern: Assemble, Act, Assert or Given, When, Then. Let’s talk about Eiffel, a language invented by the only person I know whose opinions have a stronger identity than the person who holds them.

An Eiffel programmer would look at Given and see that this is describing a particular state of the preconditions of the system under test. Or, using theories, a range of preconditions. And they would look at Then, and observe that this is making statements about the postconditions of the system under test.

This Eiffel programmer would probably then ask why you expect them to do all of this, when they already generalised this out in designing the system’s contract. Why write all these tests when the program itself can evaluate its own preconditions and postconditions as it’s chugging along, and you can design to those?
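
To make the parallel concrete, here’s a rough sketch that reuses the Stack and contract classes from the “Contractually-obligated testing” section above; the test function and the extra contract keys are mine, added for illustration:

// Given/When/Then as a hand-written test: one scenario, checked when the
// test runs.
static void testPoppingTheOnlyObjectEmptiesTheStack(void)
{
    Stack *stack = [Stack new];     // Given: a stack with one object on it
    [stack push:@"an object"];
    [stack pop];                    // When: I pop
    NSCAssert([stack count] == 0,   // Then: the count returns to zero
              @"popping the only object empties the stack");
}

// The contract view of the same property: the Given becomes a precondition,
// the Then a postcondition, and both are checked on every call the running
// program ever makes, not just the scenarios somebody remembered to write.
//
//     @"pre_pop"  : [NSPredicate predicateWithFormat:@"count > 0"],
//     @"post_pop" : [NSPredicate predicateWithFormat:@"count >= 0"],
//
// (The postcondition here is weaker than the test because this little
// contract system has no equivalent of Eiffel's `old' for referring to the
// count before the call.)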

Just as this starts to sound reasonable, a Prolog programmer asks why you’d ever write the When bit at all. Surely you just tell the computer what you’ve got and what you want, and it works out how to get there? (I tried, and it said “NO.” Oh well.)

There are loads of different ways to write software. Many of these are neither better nor worse than others. In fact, to even have that discussion you’d need to be able to express and agree upon what “better” means, and I’m not sure we’re there. Some are prevalent because they let people reason about their problems in a way that feels comfortable, or efficient. Others are prevalent because a vendor threw lots of money into marketing. Some of them feel comfortable or efficient because we’ve become used to thinking in those ways, rather than because they were a natural fit onto our ab initio thoughts.

But remember, your goal is not to write software. Your goal is to solve problems, and often software is part of the solution. The tools, practices and principles we have now may be adequate, but they may not be best. They might suit some of us, and not others.

That’s fine. I know how I like to work, and I’m happy to help other people find out about it and try it out. But I’m also happy to find out about and try out other things. I’ll champion what I do, while learning about what others do. And all along I’ll champion the higher-level goal of solving problems, and the professional standard of being able to demonstrate confidence in our solutions.

Maybe I should go and read about type systems now.

ClassBrowser: warts and all

I previously gave a sneak peek of ClassBrowser, a dynamic execution environment for Objective-C. It’s not anything like ready for general use (in fact it can’t really do ObjC very well at all), but it’s at the point where you can kick the tyres and contribute pull requests. Here’s what you need to know:

Have a lot of fun!

ClassBrowser is distributed under the terms of the University of Illinois/NCSA licence (because it is based partially on code distributed with clang, which is itself under that licence).

Automated tests with the GNUstep test framework

Setup

Of course, it’d be rude not to use a temperature converter as the sample project in a testing blog post. The only permitted alternative is a flawed bank account model.

I’ll create a folder for my project, then inside it a folder for the tests:

$ mkdir -p TemperatureConverter/test
$ cd TemperatureConverter

The test runner, gnustep-tests, is a shell script that looks for tests in subfolders of the current folder. If I run it now, nothing will happen because there aren’t any tests. I’ll tell it that the test folder will contain tests by creating an empty marker file that the script looks for.

$ touch test/TestInfo

Of course, there still aren’t any tests, so I should give it something to build and run. The test fixture files themselves can be Objective-C or Objective-C++ source files.

$ cat > converter.m
#include "Testing.h"

int main(int argc, char **argv)
{
}
^D

Now the test runner has something to do, though not very much. Any of the invocations below will cause the runner to find this new file, compile and run it. It’ll also look for test successes and failures, but of course there aren’t any yet. Still, these invocations show how large test suites could be split up to let developers only run the parts relevant to their immediate work.

$ gnustep-tests #tests everything
$ gnustep-tests test #tests everything in the test/ folder
$ gnustep-tests test/converter.m #just the tests in the specified file

The first test

Following the standard practice of red-green-refactor, I’ll write the test that I want to be able to write and watch it fail. This is it:

#include "Testing.h"

int main(int argc, char **argv)
{
  TemperatureConverter *converter = [TemperatureConverter new];
  float minusFortyF = [converter convertToFahrenheit:-40.0];
  PASS(minusFortyF == -40.0, "Minus forty is the same on both scales");
  return 0;
}

The output from that:

$ gnustep-tests
Checking for presence of test subdirectories ...
--- Running tests in test ---

test/converter.m:
Failed build:     

      1 Failed build


Unfortunately we could not even compile all the test programs.
This means that the test could not be run properly, and you need
to try to figure out why and fix it or ask for help.
Please see /home/leeg/GNUstep/TemperatureConverter/tests.log for more detail.

Unsurprisingly, it doesn’t work. Perhaps I should write some code. This can go at the top of the converter.m test file for now.

#import <Foundation/Foundation.h>

@interface TemperatureConverter : NSObject
- (float)convertToFahrenheit:(float)celsius;
@end

@implementation TemperatureConverter
- (float)convertToFahrenheit:(float)celsius;
{
  return -10.0; //WAT
}
@end

I’m reasonably confident that’s correct. I’ll try it.

$ gnustep-tests
Checking for presence of test subdirectories ...
--- Running tests in test ---

test/converter.m:
Failed test:     converter.m:19 ... Minus forty is the same on both scales

      1 Failed test


One or more tests failed.  None of them should have.
Please submit a patch to fix the problem or send a bug report to
the package maintainer.
Please see /home/leeg/GNUstep/TemperatureConverter/tests.log for more detail.

Oops, I seem to have a typo which should be easy enough to correct. Here’s proof that it now works:

$ gnustep-tests
Checking for presence of test subdirectories ...
--- Running tests in test ---

      1 Passed test

All OK!

Second point, first set

If every temperature in Celsius were equivalent to -40F, then the two scales would not be measuring the same thing. It’s time to discover whether this class is useful for a larger range of inputs.

All of the tests I’m about to add are related to the same feature, so it makes sense to document these tests as a group. The suite calls these groups “sets”, and it works like this:

int main(int argc, char **argv)
{
  TemperatureConverter *converter = [TemperatureConverter new];
  START_SET("celsius to fahrenheit");
  float minusFortyF = [converter convertToFahrenheit:-40.0];
  PASS(minusFortyF == -40.0, "Minus forty is the same on both scales");
  float freezingPoint = [converter convertToFahrenheit:0.0];
  PASS(freezingPoint == 32.0, "Water freezes at 32F");
  float boilingPoint = [converter convertToFahrenheit:100.0];
  PASS(boilingPoint == 212.0, "Water boils at 212F");
  END_SET("celsius to fahrenheit");
  return 0;
}

Now at this point I could build a look-up table to map inputs onto outputs in my converter method, or I could choose a linear equation.

- (float)convertToFahrenheit:(float)celsius;
{
  return (9.0/5.0)*celsius + 32.0;
}

Even tests have aspirations

Aside from documentation, test sets have some useful properties. Imagine I’m going to add a feature to the app: the ability to convert from Fahrenheit to Celsius. This is the killer feature, clearly, but I still need to tread carefully.

While I’m developing this feature, I want to integrate it with everything that’s in production so that I know I’m not breaking everything else. I want to make sure my existing tests don’t start failing as a result of this work. However, I’m not exposing it for public use until it’s ready, so I don’t mind so much if tests for the new feature fail: I’d like them to pass, but it’s not going to break the world for anyone else if they don’t.

Test sets in the GNUstep test suite can be hopeful, which represents this middle ground. Failures of tests in hopeful sets are still reported, but as “dashed hopes” rather than failures. You can easily separate out the case “everything that should work does work” from broken code under development.

  START_SET("fahrenheit to celsius");
  testHopeful = YES;
  float minusFortyC = [converter convertToCelsius:-40.0];
  PASS(minusFortyC == -40.0, "Minus forty is the same on both scales");
  END_SET("fahrenheit to celsius");

The report of dashed hopes looks like this:

$ gnustep-tests 
Checking for presence of test subdirectories ...
--- Running tests in test ---

      3 Passed tests
      1 Dashed hope

All OK!

But we were hoping that even more tests might have passed if
someone had added support for them to the package.  If you
would like to help, please contact the package maintainer.

Promotion to production

OK, well one feature in my temperature converter is working so it’s time to integrate it into my app. How do I tell the gnustep-tests script where to find my class if I remove it from the test file?

I move the classes under test not into an application target, but a library target (a shared library, static library or framework). Then I arrange for the tests to link that library and use its headers. How you do that depends on your build system and the arrangement of your source code. In the GNUstep world it’s conventional to define a target called “check” so developers can write make check to run the tests. I also add an optional argument to choose a subset of tests, so the three examples of running the suite at the beginning of this post become:

$ make check
$ make check suite=test
$ make check suite=test/converter.m

I also arrange for the app to link the same library and use its headers, so the tests and the application use the same logic compiled with the same tools and settings.

Here’s how I arranged for the TemperatureConverter to be in its own library, using gnustep-make. Firstly, I broke the class out of test/converter.m and into a pair of files at the top level, TemperatureConverter.[hm]. Then I created this GNUmakefile at the same level:

include $(GNUSTEP_MAKEFILES)/common.make

LIBRARY_NAME=TemperatureConverter
TemperatureConverter_OBJC_FILES=TemperatureConverter.m
TemperatureConverter_HEADER_FILES=TemperatureConverter.h

-include GNUmakefile.preamble

include $(GNUSTEP_MAKEFILES)/library.make

-include GNUmakefile.postamble

Now my tests can’t find the headers or the library, so the suite doesn’t build again. In GNUmakefile.postamble I’ll create the “check” target described above to run the test suite with the correct environment. GNUmakefile.postamble is included (if present) after all of GNUstep-make’s rules, so it’s a good place to define custom targets while ensuring that your main target (the library in this case) is still the default.

TOP_DIR := $(CURDIR)

check::
    @(\
    ADDITIONAL_INCLUDE_DIRS="-I$(TOP_DIR)";\
    ADDITIONAL_LIB_DIRS="-L$(TOP_DIR)/$(GNUSTEP_OBJ_DIR)";\
    ADDITIONAL_OBJC_LIBS=-lTemperatureConverter;\
    LD_LIBRARY_PATH="$(TOP_DIR)/$(GNUSTEP_OBJ_DIR):${LD_LIBRARY_PATH}";\
    export ADDITIONAL_INCLUDE_DIRS;\
    export ADDITIONAL_LIB_DIRS;\
    export ADDITIONAL_OBJC_LIBS;\
    export LD_LIBRARY_PATH;\
    gnustep-tests $(suite);\
    grep -q "Failed test" tests.sum; if [ $$? -eq 0 ]; then exit 1; fi\
    )

The change to LD_LIBRARY_PATH is required to ensure that the tests can load the build version of the library. This must come first in the library path so that the tests are definitely investigating the code in the latest version of the library, not some other version that might be installed elsewhere in the system. The last line fails the build if any tests failed (meaning we can use this check as part of a continuous integration system).

More information

The GNUstep test framework is part of gnustep-make, and documentation can be found in its README. Nicola Pero has some useful tutorials about the rest of the make system, having written most of it himself.