On Singleton(s)

I woke up this morning to a discussion on Twitter over how different implementations of the Singleton pattern compare. This is like comparing your herpes: no matter whose is better or more efficient, you still have unsightly blisters.

Overview: wtf is a Singleton?

Singleton is one of the software design patterns originally collected in the famous Design Patterns: Elements of Reusable Object-Oriented Software by the “Gang of Four”. They describe the intent of the pattern thus:

Ensure a class only has one instance, and provide a global point of access to it.

The other useful source of design pattern information is Cocoa Design Patterns by Erik Buck and Don Yacktman. In addition to giving examples of Singleton found in the Foundation and AppKit frameworks, this book shows how to implement a Singleton class in Objective-C. The key features are:

  • A single access point to retrieve the single instance.
  • Initialization of the single instance.
  • Protection against accidentally creating another instance or deleting the single instance.
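
To make that concrete, here is a rough sketch of the sort of implementation those features lead to, along the lines Apple used to document (the FileSystem class name is mine, and manual reference counting is assumed). The single +sharedFileSystem access point lazily initialises the one instance, and the overrides below stop anyone else creating or releasing it:

#import <Foundation/Foundation.h>

@interface FileSystem : NSObject
+ (FileSystem *)sharedFileSystem;
@end

@implementation FileSystem

static FileSystem *sharedInstance = nil;

// The single access point, which also initialises the single instance.
+ (FileSystem *)sharedFileSystem
{
    if (sharedInstance == nil) {
        sharedInstance = [[super allocWithZone: NULL] init];
    }
    return sharedInstance;
}

// The "protection" part: alloc and copy hand back the shared instance, and
// the reference-counting methods are neutered so nobody can release it.
+ (id)allocWithZone: (NSZone *)zone { return [[self sharedFileSystem] retain]; }
- (id)copyWithZone: (NSZone *)zone { return self; }
- (id)retain { return self; }
- (NSUInteger)retainCount { return NSUIntegerMax; }
- (oneway void)release { }
- (id)autorelease { return self; }

@end

Keep that lot in mind: it comes up again below as the “refcount-avoiding cruft”.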

When would you use that?

Well here’s the thing: I won’t say that I “never would” use Singleton, but I will certainly say that it isn’t the most reached-for tool in my belt.

The usual reason is “this class models something of which there is only one thing”. This is the most absurd thing you’ll ever hear. There’s only one print spooler, so surely it must be a singleton. Right: that works, right up until the point where you need two print spoolers. When do you need a second print spooler? I’ll come onto that in the next section.

Similarly, just because there’s one filesystem doesn’t necessarily mean that you only need one filesystem object. It certainly doesn’t mean you need to enforce that there’s only one filesystem object.

Go on then, wise guy, when do you need the second singleton?

When you test the code that uses the first one. You don’t want your unit tests to talk to the real filesystem, or the real database, or the real print spool. You want your unit tests to use a Mock Object, which means they need the second, fake, instance of your filesystem object or whatever.

That’s the real problem with Singleton: often, in your app, it makes sense to provide a shared instance of an object (particularly one that has state relevant to the entire app). However, that’s not the same as requiring that no-one ever create another instance of the object.

So how would you do it, then?

Let’s assume I have a need for an application-wide Framistan instance. I have a couple of options:

  • Create the usual +[Framistan sharedFramistan] class method, as Singleton implementers would, but without all that refcount-avoiding cruft that normally goes with it.
  • Put a -sharedFramistan method on the application delegate, so clients call [[NSApp delegate] sharedFramistan]. After all, I have a need for an application-wide instance, and that’s where stuff associated with the application lives.

The option I go for depends on whether I think a process needs a shared instance of the class, or whether I think this app does. Usually, the latter option gets implemented first, and I change it later when I come to write the second app that uses the same class.
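
For completeness, here is a minimal sketch of the first option: a +sharedFramistan that lazily creates the shared instance (dispatch_once keeps it thread-safe, though a simple nil check would do) and nothing more. There is no allocWithZone:, retain or release cruft, so nothing stops a test creating its own Framistan:

+ (Framistan *)sharedFramistan
{
    static Framistan *sharedFramistan = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        sharedFramistan = [[Framistan alloc] init];
    });
    return sharedFramistan;
}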

Either way, when I need to use the shared instance, I use it like this:

[someFrobnicator abdjulateWithFramistan: [Framistan sharedFramistan]];

Passing the shared instance in means that I still get to pass in other instances: for example, in a test I can do [testFrobnicator abdjulateWithFramistan: mockFramistan];.
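
In case it isn’t obvious what that buys you, here is a sketch of such a test using OCUnit (SenTestingKit) and a hand-rolled fake; the -engage method, and the idea that abdjulating involves sending it, are invented for the example:

#import <SenTestingKit/SenTestingKit.h>

// A fake Framistan that just records whether the frobnicator talked to it.
@interface FakeFramistan : Framistan
{
    BOOL wasEngaged;
}
@property (nonatomic, readonly) BOOL wasEngaged;
@end

@implementation FakeFramistan
@synthesize wasEngaged;
- (void)engage { wasEngaged = YES; }   // hypothetical Framistan method
@end

@interface FrobnicatorTests : SenTestCase
@end

@implementation FrobnicatorTests

- (void)testAbdjulationTalksToTheFramistanItWasGiven
{
    Frobnicator *testFrobnicator = [[[Frobnicator alloc] init] autorelease];
    FakeFramistan *mockFramistan = [[[FakeFramistan alloc] init] autorelease];

    [testFrobnicator abdjulateWithFramistan: mockFramistan];

    STAssertTrue([mockFramistan wasEngaged],
                 @"the frobnicator should use the framistan it was given");
}

@end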

I don’t like that. Does anyone do something different?

Yes, some people solve Singleton…with another Singleton. The mind boggles. Anyway, what these people do is to create a FramistanFacade Singleton whose job is to manage access to the Framistan Singleton. Now your real Framistan can be an honest Singleton class, but clients talk to the (also-Singleton) FramistanFacade class, which decides for itself what instance of what class it wants to talk to.

In Objective-C, the FramistanFacade can use message forwarding to act as a Proxy object, hiding the interaction between Façade and real object. Of course, now that you’re managing the real instance behind the Façade, the real class doesn’t need to be a Singleton because a different class is already managing how its instances are used.
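
Here is a sketch of that forwarding trick, sticking with the made-up Framistan; the -useFramistan: method is just one way of letting the façade’s owner, or a test, swap the real object out:

@interface FramistanFacade : NSObject
{
    Framistan *realFramistan;
}
+ (FramistanFacade *)sharedFacade;
- (void)useFramistan: (Framistan *)aFramistan;
@end

@implementation FramistanFacade

+ (FramistanFacade *)sharedFacade
{
    static FramistanFacade *sharedFacade = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        sharedFacade = [[FramistanFacade alloc] init];
    });
    return sharedFacade;
}

- (void)useFramistan: (Framistan *)aFramistan
{
    [aFramistan retain];
    [realFramistan release];
    realFramistan = aFramistan;
}

// Any message the facade doesn't understand gets handed to the real object,
// which is lazily created if nobody supplied one.
- (id)forwardingTargetForSelector: (SEL)aSelector
{
    if (realFramistan == nil) {
        realFramistan = [[Framistan alloc] init];
    }
    return realFramistan;
}

@end

Clients will need to cast the façade to id (or to Framistan *) to keep the compiler quiet about the messages being forwarded.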

Conclusion

The debates over how best to implement Singleton in Objective-C are redundant, because the Singleton pattern is never what you need. Often you do need a shared instance of a class, but enforcing that no other instance ever gets used is detrimental to testing.

When you do need a shared instance of a class in your app, ensure that you do not close the door to using alternate instances. The need comes up more often than you might expect.

On, or rather in, Seattle

I’ve never been to Washington before, so I’m looking forward to Voices That Matter: iPhone Developers Conference in April. Of course, you know I like the sound of my own voice enough to be speaking: my talk this year will be about Test-Driven Development of iOS apps. I heard from attendees in Philly that they wanted my unit testing talk to be more meat and less waffle, and that’s the talk I’ll be giving this time.

I also like the sound of other people’s voices, and the schedule this time looks spot on: fellow sideburn-wielders Mike Lee and Andy Ihnatko are talking, as is Aaron Hillegass: all worth listening to. I’m a bit nervous about talking to an empty room as I’m scheduled up against Cat Shive who gives a good talk too.

If you’re interested in attending, now is the time to register. Early bird pricing ends Friday, but whether you sign up now or next week you can use promo code SEASPK2 to get an extra $100 off. Right now, that means attendance is only $395.

By the way, I’m sounding out things to do while I’m in Seattle, so any locals who have expert knowledge on where to eat and what to see please feel welcome to comment. I’m definitely up for visiting 1, Microsoft Way in nearby Redmond. If you work for Microsoft, offer to run a tour ;-).

On repeatable builds

One of the key features of software engineering, as distinct from cowboy coding or hacking, is that it should be repeatable. That doesn’t mean that you should do the same project twice in identical ways from beginning to end: that would be a waste of time (and you can bet the requirements have moved the second time around). But it does mean that two people, investigating the same project at the same point in the project, should be able to reproduce the same results.

What does that mean for builds?

If you ask me to investigate a bug a customer reported with version 1.5.3 of your app, then I should be able to build version 1.5.3 of your app even if you’re now working on version 3.0.2. The product I end up with by building version 1.5.3 should be exactly the same as the one you’d end up with, and also exactly the thing the customer was using.

In practice, this means that your build should be as simple as possible. Given access to your source repository, any developer should be able to see what they need to check out, to correspond to a particular version of your app (a released version, a development branch, whatever). They should then hit the one button that gets that product built.

If there are any dependencies in the build process, these should be handled by the build process. If a particular version of your app uses a particular version of a framework or plug-in, that should be codified in the source for that version of your app. Leave nothing to chance, external configuration or intelligent interaction. Which are basically the same thing.

Why should I care?

Well, for a start, if you don’t know that the thing you’re debugging and the thing your customer’s using are the same, then you don’t know whether you can even find the customer’s bug, let alone fix it. There are ancillary benefits too.

Get new developers up to speed quickly

One of the things that I do to differentiate from other security consultants is offer a half-day “I just want you to fix this problem” service, which is a bit like Apple DTS or MSDN Technical Incidents for security (but with a security boffin on tap, and also available on Android ;->).

Now, if you hire me for half a day, you probably want me to spend that half-day solving your problem. I certainly want to spend my time that way. Neither of us wants me to spend a couple of hours trying to work out how your product’s built. Complicated build procedures are the top reason for failed/unreproducible builds. Every time a developer has to think about how to build the product, that’s a point where he can introduce a mistake.

Anyone can turn out a new release

You’ve just found out about a crashing bug in your million-dollar app, but you’re already on that Caribbean cruise you’ve bought pre-emptively with the profits. No problem, you use the satellite phone to call a junior developer, and tell him to fix the bug and submit a new build to iTunes Connect.

…can he do it? Will he get it right? If not, can your company last the two weeks until you get back from burning the profits?

You can migrate to new hardware

Maybe you finally got that new MacBook Air you’ve been lusting after, or the Mac Pro to go in your evil lair. Or your dev computer just died, and you need to get a new one up and running. And your backups failed.

If you forget how to set up your dev environment, or you get it a bit wrong, you’ll waste time and get angry. Then you’ll get it more wrong.

You can automate your builds

You’ve seen those open source projects like WebKit that do nightly builds? I do that too. In fact, I have a build triggered whenever I commit source code to version control. If the build succeeds, it runs some tests. If any of that fails, I get an e-mail.

It’s called Continuous Integration, and it provides very rapid feedback on the “health” of a project. But to make it work, you need your builds to be unattended.

OK, I’m convinced. How do I do it?

Let’s take a look at the independent parts of your typical iOS app, and see what it would take to ensure that we always get the same version of them for every build.

Source code

You already use version control, right? Right. Do you tag your releases?

A tag in git or subversion (or any other version control system that supports them, but life’s too short) is just a way to name a particular commit. In git, it really is just a name (and optionally some other info) attached to a certain commit. In subversion, it’s actually a new branch, but one that you shouldn’t commit any changes to.

So let’s say you’ve just finished version 1.0, it doesn’t contain any known bugs, and it’s time to submit it to the store. You should tag the version of the source that corresponds to what you submit as the version 1.0 candidate.

Now, if you ever need to go back to version 1.0, to address an issue reported by the store reviewers, or to investigate a bug reported by a customer, you just check out that tagged version of your source. If you need another developer to look into something related to version 1.0, then she checks out that tagged version.

Of course, checking out a particular tag only gets you the same built product if the source under version control represents a complete definition of the product. Let’s see a few cases where that might not be true.

Subprojects

Apps sometimes contain classes that have been developed as a separate project, particularly collections of helper or utility classes.

Building version 1.0 of the app requires building whatever version of the helpers was used in version 1.0. If you change the two independently, and don’t have a way to keep both in sync, then you don’t have a way to build the same product again. That sucks.

Building a particular version of your app should automatically bring along the correct version of a subproject. The easiest way to do that is with git submodules or its equivalent in your preferred SCM. If that’s not possible for whatever reason, then you can use a shell script as part of the build process. Keep the shell script in version control so you get the correct version of the script that checks out the correct version of the subproject.

Building your app should automatically build the subproject. There is nothing worse than sitting and waiting for a build, only to find random link errors at the end. You send an email to the project maintainer, then after the weekend’s over she sends a reply saying “yeah, you need to build this project first”. Gah!

In Xcode, adding a subproject is as simple as dragging the subproject’s file into the list of files in your main project. You then edit your app target, and add the target for the subproject as a dependency for the app target. Now Xcode checks when you build the app whether it needs to build the subproject first. You’re guaranteed to link against an up-to-date version, not whatever old cruft Xcode found in your search paths.

Third-party frameworks/libraries

Sometimes, your project depends on someone else’s framework or library. You have that library, but you don’t have the source to it.

Check the headers and the binary for the library into version control. Yes, it’s icky. But now, when someone else checks out a particular revision of your source, they automatically get the correct revision of the library too, and compile with the correct headers.

Set up the header/library search paths for your Xcode target such that Xcode searches the folder containing the revision-controlled library, not whatever nasty old version you, I or someone else happens to have knocking around /usr/local/. This path should be relative to $(SRCROOT), i.e. it should say how to get from where your project is to where that library is.

It should not be an absolute path. Other developers don’t have the same path structure as you. They probably have a different username, for a start, so /Users/leeg/Library/Frameworks doesn’t exist for them. They may keep your source in a disk image. If you’re using continuous integration, the source probably gets checked out into a temporary workspace that moves every time. The key thing is: don’t rely on the environment being the same for every build; rely on the checked-out source being sufficient to completely describe the build.

Interface Builder plug-ins

More common in the Mac world than iOS, IB plugins provide access to custom objects in Interface Builder and allow you to add them to your XIBs, inspect their properties and so on. If Xcode can’t find an ibplugin needed to compile a XIB, then builds start failing.

Again, check the IB plugin (or the source needed to build it) into version control. If you commit the source, don’t forget to make it a dependency of your app target so it definitely gets built when people try to build your app.

You don’t need an IB plug-in installed to build an Xcode project that relies on it! In your app’s target editor, you can set the paths to the required plug-ins, or the search paths for ibtool to find plug-ins when it needs them. Remember to make these paths relative.

A developer trying to edit your XIBs will need to install the plug-in, but now he knows where to find it: he takes the version that was checked out when he checked out your app.

That all sounds hard to set up.

Not really. Create a new user account on your Mac, check out your app, and build it. Fix the problems until there are none. If you can, find a Mac you’ve never coded on before and repeat the same process.

Conclusion

You can avoid wasting a lot of time, and having failed or incorrect builds, by instituting a simple, repeatable build process. There should be exactly one thing to check out of version control, and important versions of that thing should be tagged. Opening the Xcode project that was in version control and hitting ‘build’ should be all that’s needed to get your product built, no matter how complicated the project and how many libraries it depends on. Hitting that ‘build’ button on one version of the source should always build the same product, on anybody’s computer.

On squeezing out that last ounce of performance

As I sit here, confused by a component of an application that should be network-bound but is actually limited by CPU availability, I am reminded of the times in my career when I’ve dealt with application performance.

I used to work on a platform for distributing MMS and SMS messages, written using GNUstep, Linux and PostgreSQL. I had a duplicate of the production hardware stack sitting in a data centre in our office, which ran its own copy of the production software and even had a copy of the production data. I could use this to run simulations or even replays of real events, finding the locations of the slowdowns and trying various hypotheses to remove the bottlenecks.

My next job was working on antivirus. The world of antivirus evaluations is dominated by independent testers, who produce huge bake-off articles comparing the various products. For at least a decade the business of actually detecting the stuff has been routine, so awards like VB100 are meaningless. In addition to detection stats, analysts like Virus Bulletin and AV-Comparatives measure resource consumption, and readers take those measurements seriously.

That’s because they don’t want to use anti-virus, and they didn’t pay for their RAM and CPUs to waste them on software they don’t want to use. So given a bunch of apps that all do the same thing, they’ll look at which does it with the least impact on everything else. This means that performance is an important requirement of new projects in AV software: on the product I worked on, we had a defined set of performance tests, and a new release could not ship, regardless of how shiny the new features were, if those tests took 5% or more extra time or RAM than the current shipping version on like hardware. What that really means is that, due to developments in hardware, AV software was getting monotonically faster up until a few years ago.

Since then, my relationship with performance optimisation has been more sporadic. I’ve worked on contracts to speed up iOS apps, and even almost took a performance analysis and improvement job on a mobile phone operating system team. But what I usually do is make software work (and make it secure), and making it work in such a way that people can actually use it is part of that. Here, then, are my Reflections On Making Efficient Software™.

Start at the beginning.

You may have heard the joke about a man on a driving holiday who gets lost and asks a local for directions. The local thinks for a bit, and says “well to get to where you’re going, I wouldn’t start from here if I were you”.

Performance analysis can be like that, too. If you build up all of the functionality first, and optimise it later, you will almost certainly not get a well-performing product. Furthermore, fixing it will be very expensive. You may be able to squeeze a few kilobytes out here, or get a couple of percent speed increase there, but basically the resources used by your app will not change much.

The reason is simple: most of the performance characteristics of your app are baked into the top-level architecture. You need to start thinking about how your app will perform when you start to design what the various parts do and how they fit together. If you don’t design out the architectural bottlenecks, you’ll be stuck with them no matter how good your implementation is.

I’ve been involved with projects like this. It gets to near the ship date, and the app works but is useless because it consumes all of the RAM and/or takes too long to do any work. The project manager gets a developer in (hello!) to address the performance issues. After a few weeks, the developer has managed to understand the code, do some analysis, and improved things by a couple of percent. It’s gone from “sucky” to “quite sucky”, and the ship date isn’t any further away.

An example: If you build a component that can process, say, ten objects per second, then hands them on to another component that can display results at 100 objects per second, you’re always limited to 10 Hz give or take. You might get 12, you’ll never get 100 without replacing the first component or the whole app. Both of these options are more expensive after you’ve written the component than before. Which leads me on to the next top tip…

Simulate, simulate, simulate

So how are you supposed to know, on paper, how fast your putative architecture is going to run? That’s easy: simulate each component. If you believe that you’ll usually get data from the network at, say, 100 objects/sec, then write a driver that sends fake objects at about 100 Hz. If you think that might spike at 10,000 objects/sec, then simulate that spike too. You’ll be able to see what it takes to develop an app that can respond to those demands.
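
Such a driver needn’t be anything clever. Here is a sketch in which an NSTimer firing every 10 ms hands out roughly 100 fake objects per second; all the names are invented, and I’m delivering the objects through a notification purely for convenience, so substitute whatever interface your real network layer will use:

#import <Foundation/Foundation.h>

@interface FakeObjectSource : NSObject
{
    NSTimer *timer;
}
- (void)start;
- (void)stop;
@end

@implementation FakeObjectSource

- (void)start
{
    // Fire every 10 ms: roughly 100 fake objects per second.
    timer = [[NSTimer scheduledTimerWithTimeInterval: 0.01
                                              target: self
                                            selector: @selector(emitObject:)
                                            userInfo: nil
                                             repeats: YES] retain];
}

- (void)stop
{
    [timer invalidate];
    [timer release];
    timer = nil;
}

- (void)emitObject: (NSTimer *)aTimer
{
    // The payload only has to look enough like the real thing to exercise
    // the component under test.
    NSDictionary *fakeObject = [NSDictionary dictionaryWithObject: [NSDate date]
                                                           forKey: @"timestamp"];
    [[NSNotificationCenter defaultCenter] postNotificationName: @"FakeObjectArrived"
                                                        object: self
                                                      userInfo: fakeObject];
}

@end

To simulate the 10,000 objects/sec spike, emit a batch of objects on each timer tick rather than relying on the timer to fire that quickly.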

What’s more, you’ll be able to drop your real components into the simulated environment, and see how they really handle the situations you cook up. You can even use these harnesses as an integration test framework at the intermediate level (i.e. larger than classes, smaller than the whole app).

Your simulated components should use the same interfaces to the filesystem and each other that the real code will use, and the same frameworks or libraries. But they shouldn’t do any real work. E.g. if you’ve got a component that should read a JSON stream from the network, break it into objects, do around 10 ms of work on each object and post a notification after each one is finished, you can write a simulation to do that using the JSON library you plan on deploying, the sleep() function and NSNotificationCenter. Then you can play around with its innards, trying out operation queues, dispatch queues, caching and other techniques to see how the system responds.
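
The skeleton of that simulated component might look something like this, with usleep() standing in for the 10 ms of work per object (the names, again, are made up):

#import <Foundation/Foundation.h>
#include <unistd.h>

@interface SimulatedProcessor : NSObject
- (void)processObjects: (NSArray *)objects;
@end

@implementation SimulatedProcessor

- (void)processObjects: (NSArray *)objects
{
    for (id object in objects) {
        usleep(10000);   // stand-in for roughly 10 ms of real work per object
        [[NSNotificationCenter defaultCenter] postNotificationName: @"ObjectProcessed"
                                                            object: object];
    }
}

@end

Swap the serial loop for an NSOperationQueue or a dispatch queue and you can see what concurrency does to the end-to-end numbers before a line of the real parsing code exists.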

Performance isn’t all about threads

Yes, Apple has a good concurrency programming guide. Yes, dispatch queues are new and cool. But no, not everything is sped up by addition of threads. I’m not going to do the usual meaningless micrometrics of adding ten million objects to a set, because no-one ever does that in real code.

The point is that doing stuff in the background is great for exactly one thing: getting that stuff off of the UI thread. For any other putative benefits, you need to measure. Perhaps threading will speed it up. Perhaps there aren’t any good concurrent algorithms. Perhaps the scheduling overhead will get in the way.
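
The guaranteed win looks like this sketch (both methods are hypothetical): push the work onto a background queue, then hop back to the main queue to update the interface. Whether the work itself finishes any sooner is precisely the thing you have to measure.

dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    NSArray *results = [self doExpensiveWork];       // hypothetical slow work
    dispatch_async(dispatch_get_main_queue(), ^{
        [self updateUIWithResults: results];         // hypothetical UI update
    });
});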

And of course you need to measure performance in an approximation to the customer’s environment. Your 12-core Mac Pro probably runs your multithreaded code in a different way than your user’s MacBook Air (especially if the Pro has a spinning disk). And the iPad is nothing like the simulator, of course.

Speed, memory, work: choose any two

You can make it faster and use less RAM by not doing as much. You can make it faster and do the same amount of work by caching. You can use less RAM and do the same amount of work by reducing the working set. Only very broken code can gain advances in speed and memory use while not changing the outcome.
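
As an illustration of the “faster by caching” trade, here is a sketch that spends memory on recently generated thumbnails rather than regenerating them every time; all the names are invented, and I’d reach for NSCache over a plain dictionary because it evicts entries under memory pressure:

#import <UIKit/UIKit.h>

@interface ThumbnailStore : NSObject
{
    NSCache *thumbnailCache;
}
- (UIImage *)thumbnailForKey: (NSString *)key;
@end

@implementation ThumbnailStore

- (id)init
{
    if ((self = [super init])) {
        thumbnailCache = [[NSCache alloc] init];
    }
    return self;
}

- (UIImage *)generateThumbnailForKey: (NSString *)key
{
    // Stand-in for the expensive work being traded against memory.
    return nil;
}

- (UIImage *)thumbnailForKey: (NSString *)key
{
    UIImage *thumbnail = [thumbnailCache objectForKey: key];
    if (thumbnail == nil) {
        thumbnail = [self generateThumbnailForKey: key];
        if (thumbnail != nil) {
            [thumbnailCache setObject: thumbnail forKey: key];
        }
    }
    return thumbnail;
}

- (void)dealloc
{
    [thumbnailCache release];
    [super dealloc];
}

@end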

Measure early and measure often

As I said at the top, there’s no point baking the app and then sprinkling it with performance fairy dust at the end. You need to know what you’re aiming for, whether it’s achievable, and how it’s going throughout development.

You must have some idea of what constitutes acceptable performance, so devise tests that discover whether the app is meeting those performance requirements. Run these tests periodically throughout development. If you find that a recent change slowed things down, or caused too much memory to be used, now is a good time to fix that. This is where that simulation comes in useful, so you can get an idea about the system’s overall performance even before you’ve written it all.
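
One way to keep yourself honest is to express those requirements as tests that run alongside the functional ones. Here is a sketch using OCUnit; the DataImporter class, its -importSampleData method and the half-second budget are all invented for the example:

#import <SenTestingKit/SenTestingKit.h>

@interface ImportPerformanceTests : SenTestCase
@end

@implementation ImportPerformanceTests

- (void)testImportStaysWithinItsTimeBudget
{
    DataImporter *importer = [[[DataImporter alloc] init] autorelease];
    NSDate *start = [NSDate date];

    [importer importSampleData];

    NSTimeInterval elapsed = -[start timeIntervalSinceNow];
    STAssertTrue(elapsed < 0.5, @"import took %f seconds against a budget of 0.5", elapsed);
}

@end

Run it on hardware representative of what your customers actually use, for the reasons above, and treat a failure like any other broken test.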