given-when-then in XCTest

I started writing a new Mac app, and I started doing it by driving the implementation through Xcode UI Automation tests. But then it turned out I was driving the test infrastructure as much as the tests, and it’s that I want to talk about.

Given, When, Then

My (complete, Xcode UI Automation) test looks like this:

func testAddingANoteResultsInANoteBeingAdded() {
    given("An empty notebook")
    when("I add a note to the notebook")
    then("There is a note in the notebook")

The test case class has an object called a World, which holds, well, the test’s world. There are two parts to this.

The World holds regular expressions associated with blocks, where each block does some part of the test if its associated regular expression matched the description of the test. As an example, my test fixture sets up this association:

try world.then(matchingExpectation: "^There is a note in the notebook$",
                work: { _, world in
                guard let notebook:LabraryNotebook
                  = world.getFromState("TheNotebook") as? LabraryNotebook else {
                    XCTFail("No notebook to test")
                XCTAssertEqual(notebook.countOfNotes(), 1,
                  "There should be one row in the notes table")

We’ll get back to how that block is implemented later. For the moment, I want to make it clear that this is a way to organise a UI test (or, indeed, any other functional test) using XCTest: it is not a new test framework. The test case class still subclasses XCTestCase, and assertions are still made with the XCTAssert* macros/functions. That’s just all wrapped up in this given/when/then structure.

Let’s look at the block’s two parameters: the first is an array of the regular expression’s capture groups so that you can find out information about the test specification, should you want.

The other argument is a reference to the World, which enables the second feature of the World: as state storage so that each part of the test can communicate with later parts. Notice that the when clause in my test says it adds a note to “the notebook”, and the then clause checks that there is a note in “the notebook”. How do they both use the same notebook object? The when clause stores it on the World using world.storeInState(), and the then clause retrieves it with world.getFromState().

Page Objects

Rather than putting XCUIElement goop directly in my test blocks, I use an abstraction called the Page Object pattern, popular among people writing browser tests in Selenium. This puts an adapter between my tests and my UI controls, so the test says (for example) app.newDocument() and the Application page object knows that that means finding the “File” menu, clicking it, then clicking the “New” menu item.

The way to create a new document in a Cocoa app has not changed since 1987 and may not change soon. But the details of my own UI surely will, and will change at a different rate than the goals of the people using it. While someone may want to add a note for the rest of time, there may not always be an “Add Note” button. So my test can continue to say:

when("I add a note to the notebook")

but the page object for a document can change from:

func addANote() {
    let app = XCUIApplication()
    let window =[documentName]
    let control = window.buttons["Add Note"]

to whatever will find and drive the interface in my redesigned application.

Would you like this?

I’m happy to package the given/when/then organisation up and release it under an open source licence so that you can use it in your own apps. As I’ve only just written the code, I’ve yet to do that, but it’s coming! I’m aware that there are multiple ways of getting/using Swift libraries, so if you’re interested please let me know whether you would expect to use an Xcode project that builds a framework, a Swift PM package, a CocoaPod or a Carthage…cart… so I can support you using the software in your way.

Subatomic Chocolate

This started out as a toot thread, but “threaded tooting is tedious for everybody involved” so here’s the single post that thread should have been.

The “Electron vs. native” debate doesn’t make much sense. I feel like I’ve been here before:

Somehow those of us who had chosen a different programming language knew that we were better at writing software; much better than those clowns who just made the most successful office suite ever, the most successful picture editing app ever, or the most successful video player ever. Because we’d taken advice on how to write software from a company that was 90 days away from bankruptcy and had proven incapable of executing on software development, we were awesome and the people who were making the shittons of money on the most popular software of all time were clueless idiots.

Some things to ponder but avoid for the moment:

  • why are those the only choices? If I write a Java SWT app with Windows native components on Windows, and Mac native components on Mac, is that native because I’m using the native widget toolkit or not, because I’m using Java? If it is not, is it “Electron”?
  • where is the boundary of native? AppKit is written in Objective-C, so am I using some unholy abomination of an RMI bridge if I write AppKit software using AppKit APIs but a different programming language, like Swift?

It seems clear that people who believe there is a correct answer to “Electron vs. native” are either native app developers or Electron app developers. We can therefore expect them to have some emotional investment (I have decided to build my career doing this, please do not tell me that I’m wrong) and to be seeking truths that support their existing positions. Indeed, if you are on one side of this debate then the other side is not even wrong because the two positions compare incompatible facts.

Pro-Electron: the tools are better/easier/more familiar/JavaScript

Most to all of these things are true, in a lot of cases. As a seasoned “native” app developer, with some tiny amount of JavaScript experience, I can build a thing very quickly in JS (with a GUI in React or React Native, I haven’t tried Electron) that still takes me a long time in a “native” toolkit, both the ones I’m comfortable with and the ones I’m unfamiliar with but download and try things out in.

Now that should be disturbing to any company who builds a “native” platform, and who thinks that developers are key to their success. If someone with nearly two decades of using your thing can be faster at using someone else’s thing within under a year of learning that thing, there is something you need to be learning very quickly about the way the other thing works and how to bring that advantage to your thing, otherwise everything will be made out of the other thing soon and you’d better hope they keep making it work on your thing.

Actually, having said that this argument is true, it’s not true at all. The tools in JS-land are execrable. Bear in mind that the JSVM (we used to call it a “browser”) is a high-performance code environment with live code loading, reflection and self-modifying capabilities; it’s disappointing that the popular developer environments are text editors with syntax highlighting and an integrated terminal window. “Live” code loading is replaced with using Watchman to wait for some files to change, then kicking off some baroque house of cards that turns those files from my house blend of JS into your house blend of JS, then reloading the whole shebang.

Actually, having said that this argument is true and false, it’s not even relevant at all. The developers are the highly-paid people whose job it is to solve the problems for everybody else, why are we making their lives easier, not everybody else’s?

Pro-“native”: the apps are more efficient/consistent

Both of these things are true, in a lot of cases. A “native” application just needs to link the system widget set (which, if your platform supports efficient memory management, is loaded anyway by some first-party tool) and run its code. It will automatically get things that look and behave like the rest of the applications on the platform.

Actually, having said that this argument is true, it’s not true at all. The “native” tools are based on a lot of low-level abstractions (like threads or operations), that are hard to use correctly; rather than rely on an existing solution (remember there’s no npm for “native”, and the supposed equivalent has nowhere near as much coverage) developers are likely to try building their own use of these primitives, with inefficiencies resulting. The “native” look and feel of the components can and will be readily customised to fit branding guidelines, and besides as the look and feel is the platform vendor’s key differentiator they’ve moved things around every release so an app that behaved “consistently” on the last version looks out of place (deliberately, so that developers are “encouraged” to adopt the new platform features) this year.

Actually, having said that this argument is true and false, it’s not even relevant at all. The computer is there as a substrate for a thing that solves somebody’s problem, so as long as the problem is solved and the solution fits on their computer, isn’t the problem solved? And as for “consistency”, the basic tenets of these desktop “native” experiences were carved out three decades ago, before almost all experience with and research into desktop computer interaction. Why aim for consistency with an approach that was decided before we knew what did or didn’t work properly?

Withholding the Four Freedoms

Having downsized my rather over-enthusiastic computer collection (thanks, eBay!), I was down to one computer. Unfortunately, as a rather long in the tooth MacBook Air, it’s no longer suited to my needs and neither is it upgradeable. I got all of the files I care about off of its disk and set out to look for a replacement, meanwhile setting the MacBook aside to one day wipe its e4fs storage and re-install Mac OS X/OS X/macOS/whatever we call it this week.

I chose the sort of spec computer I wanted, then carefully researched various vendors to see what components they used and whether there was reported driver support in the Linux kernel. Eventually, dude, I got a Dell. The Alienware 15 R3 is made of bits that are supported in Linux, with the most complex piece being that the network adapter requires binary firmware. The manufacturer includes links to the firmware blobs on their website though, so this can’t be hard…can it?

I fell straight to running the Debian Jessie installer, put the firmware blobs in place, and…the wi-fi isn’t detected. Oh well, it’s quite an old kernel, maybe I could switch to Testing? No, that doesn’t boot at all: some error about not getting valid cache information from the SSD. Neither does the Ubuntu 16.10 kernel boot, for the same reason. 16.04 boots just fine, and it detects the wireless and connects to the network…and then the installer crashes.

One thing I’m not looking forward to is the onslaught of replies to this post from people who want to help, but ultimately won’t help. “You should try Arch Linux, I expect that works,” or “maybe you’ll have better luck with Fedora.” Why do you expect that? What specific knowledge about my problem makes you think that your specific choice of distribution will work better? And why can’t I just take that knowledge and apply it to Debian, or Ubuntu? They’re just distributions, they’re all made of the same bits.

I admit, I’m frustrated. I want to be a proud advocate for Free Software and for totally free computing environments, but being unable to even run some of the flagship software makes me reticent to recommend it to others. I was having these problems back in 2001, and I’m having these problems now. And it’s not like I’m incompetent when it comes to *nix administration or to Linux driver configuration, and nor did I just buy the prettiest computer I could find and hope that it would work, I _did my research_. I spent hours making sure that there were drivers for the various components in this system, reading reviews, QA forum posts, and kernel mailing list messages. Unfortunately my willingness to screw about with configuring my computer just so that I can use it has waned over the last couple of decades, faster than the necessity to screw about with it has decreased.

For the moment, I’m running Windows 10 as a hardware abstraction layer, and have a full-screen VirtualBox VM to do my work in. It works, it makes me die a little inside but it works. But I’m worried that for all the crowing that open source is eating the world, it’s still too hard to jump in, even for a diligent and experienced user. Until we can give someone a computer that they turn on and start working in, and that runs free software, this will all remain the preserve of professionals and committed hobbyists. The Four Freedoms will effectively be restricted to those in the know and with time to dedicate to obtaining them: all users are equal, but some are more equal than others.

Apple’s Watch and Jony’s Compelling Beginning

There are a whole lot of constraints that go into designing something. Here are the few I could think of in a couple of minutes:

  • what people already understand about their interactions with things
  • what people will discover about their interactions with things
  • what people want to do
  • what people need to do
  • what people understand their wants and needs
  • what people understand about their wants and needs
  • how people want to be perceived by other people
  • which things that people need or want to do you want to help with
  • which people you want to help
  • what is happening around the people who will be doing the things
  • what you can make
  • what you can afford
  • what you’re willing to pay
  • what materials exist

Some of those seem to be more internal than others, but there’s more of a continuum between “external” and “internal” than a switch. In fact “us” and “them” are really part of the same system, so it’s probably better to divide them into constraints focussing on people and constraints focussing on industry, process and company politics.

Each of Apple’s new device categories moves the designs further from limitations of the internal constraints toward the people-centric limitations. Of course they’re not alone in choosing the problems they solve in the industry, but their path is a particular example to consider. They’re not exclusively considering the people-focussed constraints with the watch, there still are clear manufacturing/process constraints: battery life, radio efficiency are obvious examples.

There are conflicts between some of the people-focussed constraints. You might have a good idea for how a watch UI should work, but it has to be tempered by what people will expect to do which makes new user interface designs an evolutionary process. So you have to take people from what they know to what they can now do.

That’s a slow game, that Apple appear to have been playing very quickly of late.

  • 1984: WIMP GUI, but don’t worry there’s still a typewriter too.

There’s a big gap here, in which the technical constraints made the world adapt to the computer, rather than the computer adapt to the world. Compare desks from the 1980s, 1990s and 2000s, and indeed the coming and going of the diskette box and the mousemat.

  • 2007: touchscreen, but we’ve made things look like the old WIMP GUI from the 1980s a bit, and there’s still a bit of a virtual typewriter thing going on.

  • 2010: maybe this whole touchscreen thing can be used back where we were previously using the WIMP thing.

  • 2013: we can make this touchscreen thing better if we remove some bits that were left over from the WIMP thing.

  • 2014: we need to do a new thing to make the watch work, but there’s a load of stuff you’ll recognise from (i) watches, (ii) the touchscreen thing.

Now that particular path through the tangle of design constraints is far from unique. Compare the iPad to the DynaBook and you’ll find that Alan Kay solved many of the same problems, but only for people who are willing to overlook the fact that what he proposed couldn’t be built. Compare the iPhone to the pocket calculator, and you find that it was possible to portable computing many decades earlier but with reduced functionality. Apple’s products are somewhere in between these two extremes: balancing what can be done now and what could possibly be desired.

For me, the “compelling beginning” is a point along Apple’s (partly deliberate, and partly accidental) continuum, rather than a particular watershed. They’re at a point where they can introduce products that are sufficiently removed from computeriness that people are even willing to discuss them as fashion objects. Yes, it’s still evidently the same grey-and-black glass square that the last few years of devices have been. Yes, it’s still got a shrunk-down springboard list of apps like the earlier devices did.

The Apple Watch (and contemporary equivalents) are not amazing because the bear dances well, they’re amazing because the bear dances at all. The possibility of thinking about a computer as an aesthetic object, one that solves your problems and expresses your identity, rather than a box that does computer things and comes in a small range of colours, is new. The ability to consider a computer more as an object in its environment than as a collection of technical and political constraints changes how they interact with us and us with them. That is why it’s compelling.

And of course the current watch borrows cues from the phone that came before it, to increase familiarity. Future ones, and other things that come after it, will be able to jettison those affordances as expectations and comfort change. That is why it’s a beginning.

PADDs, not the iPad

Alan Kay says that Xerox PARC bought its way into the future by paying lots of money for each computer. Today, you can (almost) buy your way into the future of mobile computers by paying small amounts of money for lots of computers. This is the story of the other things that need to happen to get into the future.

I own rather a few portable computers.

Macs, iPads, Androids and more.

Such a photo certainly puts me deep into a long tail of interwebbedness. However, let me say this: within a couple of decades this photo will look reasonable, for the number of portable computers in one office room. Faster networks will enable more computers in a single location, and that will be an interesting source of future applications.

Here’s the distant future. The people in this are doing something that today’s people do not do so readily.


One of them is giving his PADD to the other. Is that because PADDs are super-expensive, so everybody has to share them? No: it’s because they’re so cheap, it’s easy to give one to someone who needs the information on it and to print/replicate/fax/magic a new one.

We can’t do that today. Partly it’s because the devices are pretty expensive. But their value isn’t just associated with their monetary worth: they’ve got so many personalisations, account settings and stored credentials on them that the idea of giving an unlocked device to someone else is unconscionable to many people. The trick is not merely to make them cheap, but to make them disposable.

Disposable pad computers also solve another problem: that of how to display multiple views simultaneously on the pad screen.


You can go from rearranging metaphorical documents on a metaphorical desktop, back to actually arranging documents on a desktop. Rather than fighting with admittedly ingenious split-screen UI, you can just put multiple screens side by side.

The cheapest tablet computer I’m aware of that’s for sale near me is around £30, but that’s still too expensive. When they’re effectively free, and when it’s as easy to give them away as it is to use them, then we’ll really be living in the future.

Just as it started, this post ends with Xerox PARC, and the inspiration for this post:

Pads are intended to be “scrap computers” (analogous to scrap paper) that can be grabbed and used anywhere; they have no individualized identity or importance.

Intuitive is the Enemy of Good

In the previous instalment, I discussed an interview in which Alan Kay maligned growth-restricted user interfaces. Here’s the quote again:

There is the desire of a consumer society to have no learning curves. This tends to result in very dumbed-down products that are easy to get started on, but are generally worthless and/or debilitating. We can contrast this with technologies that do have learning curves, but pay off well and allow users to become experts (for example, musical instruments, writing, bicycles, etc. and to a lesser extent automobiles).

This is nowhere more evident than in the world of the mobile app. Any one app comprises a very small number of very focussed, very easy to use features. This has a couple of different effects. One is that my phone as a whole is an incredibly broad, incredibly shallow experience. For example, one goal I want help with from technology is:

As an obese programmer, I want to understand how I can improve my lifestyle in order to live longer and be healthier.

Is there an app for that? No; I have six apps that kindof together provide an OK, but pretty disjointed experience that gets me some dissatisfying way toward my goal. I can tell three of these apps how much I run, but I have to remember that some subset can feed information to the others but the remainder cannot. I can tell a couple of them how much I ate, but if I do it in one of them then another won’t count it correctly. Putting enough software to fulfil my goal into one app presumably breaks the cardinal rule of making every feature available within two gestures of the app’s launch screen. Therefore every feature is instead hidden behind the externalised myriad gestures required to navigate my home screens and their folders to get to the disparate subsets of utility.

The second observable effect is that there is a lot of wasted potential in both the device, and the person operating that device. You have never met an expert iPhone user, for the simple reason that someone who’s been using an iPhone for six years is no more capable than someone who has spent a week with their new device diligently investigating. There is no continued novelty, there are no undiscovered experiences. There is no expertise. Welcome to the land of the perpetual beginner.

Thankfully, marketing provided us with a thought-terminating cliché, to help us in our discomfort with this situation. They gave us the cars and trucks analogy. Don’t worry that you can’t do everything you’d expect with this device. You shouldn’t expect to do absolutely everything with this device. Notice the sleight of brain?

Let us pause for a paragraph to notice that even if making the most simple, dumbed-down (wait, sorry, intuitive) experience were our goal, we use techniques that keep that from within our grasp. An A/B test will tell you whether this version is incrementally “better” than that version, but will not tell you whether the peak you are approaching is the tallest mountain in the range. Just as with evolution, valley crossing is hard without a monumental shake-up or an interminable period of neutral drift.

Desktop environments didn’t usually get this any better. The learning path for most WIMP interfaces can be listed thus:

  1. cannot use mouse.
  2. can use mouse, cannot remember command locations.
  3. can remember command locations.
  4. can remember keyboard shortcuts.
  5. ???
  6. programming.

A near-perfect example of this would be emacs. You start off with a straightforward modeless editor window, but you don’t know how to save, quit, load a file, or anything. So you find yourself some cheat-sheet, and pretty soon you know those things, and start to find other things like swapping buffers, opening multiple windows, and navigating around a buffer. Then you want to compose a couple of commands, and suddenly you need to learn LISP. Many people will cap out at level 4, or even somewhere between 3 and 4 (which is where I am with most IDEs unless I use them day-in, day-out for months).

The lost magic is in level 5. Tools that do a good job of enabling improvement without requiring that you adopt a skill you don’t identify with (i.e. programming, learning the innards of a computer) invite greater investment over time, rewarding you with greater results. Photoshop gets this right. Automator gets it right. AppleScript gets it wrong; that’s just programming (in fact it’s all the hard bits from Smalltalk with none of the easy or welcoming bits). Yahoo! Pipes gets it right but markets it wrong. Quartz Composer nearly gets it right. Excel is, well, a bit of a boundary case.

The really sneaky bit is that level 5 is programming, just with none of the trappings associated with the legacy way of programming that professionals do it. No code (usually), no expressing your complex graphical problem as text, no expectation that you understand git, no philosophical wrangling over whether squares are rectangles or not. It’s programming, but with a closer affinity with the problem domain than bashing out semicolons and braces. Level 5 is where we can enable people to get the most out of their computers, without making them think that they’re computering.

HATEOAS app structure explained through some flimsy analogy

You are in a tall, narrow view. A vibrant, neon sign overhead tells you that this is the entrance to “Stocks” – below it is one of those scrolling news tickers you might see on Times Square. In front of you lies a panel of buttons. Before you can do anything, an automatic droid introduces itself as “Searchfield”. It invites you to tell it the name of a stock you’d like to know the price of, and tells you that it will then GET the price.

You can see:

A button marked “Price”<br/>
A button marked “Market Cap”<br/>
A box marked “PUT trades here”<br/>

There is an exit going Back. Additionally you can go Home.


In addition to being a mildly accomplished software engineer, I’ve done some studying and armchair research in the field of ancient languages and palaeography. What happens if we smoosh those fields together?

In a very slight way, art historian and fellow Oxenafordisc Dr. Janina Ramirez did that in her series on Illuminations: the Private Lives of Medieval Kings (erm, Kings and Ælfgifu). In the series she showed off many manuscripts in the British Library collection, but when she went out in the field she took an iPad. It turns out that the BL isn’t too hot on letting you run around with their thousand-year-old kidskin.

You already know my opinion on our digital heritage. This puts it into stark relief: in one hundred years’ time, barring some epic fire in London (those never happen), the BL and its collection will still be there. Will it still be possible to even launch the iPad app she was using? I very much doubt it.

How about if we put the same effort into storing our source code as the scriptoria did into storing their indentures and gospels? Well, I sharpened a goose feather and had a go at just that (warning: very much draft document impending).

Example 2-1 from PCAS

What you see up there is the first sample code in Professional Cocoa Application Security – Listing 2-1. Ignore the fact that you don’t recognise all the letter shapes: things have changed over the centuries. There were a few contortions required to get the source code to work in manuscript form: let me show you them.

First is that in the hand in which I wrote the source, some of the characters needed for Objective-C source code don’t exist. Like ‘v’. I used the fact that u and v are actually the same letter to get around that. Punctuation was harder: I went for roughly accurate rendering, with a single misplaced comma to suggest that the scribe didn’t really understand punctuation.

When it came to comments, I decided they have the same meaning as the gloss in the Lindisfarne Gospels – rendering the difficult language required by the church^Wcompiler into plain English. I therefore roughly scratched them in smaller text with different ink, letting them flow around the code as if they’d been written later. I also put in a few Old English spellings – though again not consistently[*].

The return value posed some difficulty, because we didn’t borrow 0 from the Middle East until a few centuries after the time this script is mimicking. I realised that if a scribe were to illuminate any part of a C function, it’d probably be the return value because that’s the consistent and – from the perspective of the rest of the code – important part. Thus the 0 is highly decorated, with six legs in the fashion of a bug :-).

Bugs hark back to the days of illuminated manuscripts anyway. Any good scribe would know that a mistake in the text was the fault of Titivillus, not of the scribe. Just as those bugs aren’t my fault. Honest.

[*] Next time you want to get angry at a teenager, remember that the work “ask” was once “acsian” with the s on the end, and think about which one of you is bastardising our language.

Android: the missed opportunities

There are a few Android devices I have respect for: the Amazon Kindle Fire is one, the B&N Nook another, and the Cisco Cius is the third. To a lesser extent, the Sony tablet also fits this category.

I don’t tend to like the Galaxy Tab, the Xoom and similar tablets so much, and most of the phones are lacklustre too. So what’s the difference? Why should some of these products be good and some bad, when they’re all just Android touch screens?

Android is a technology, not a product.

A term that appears in Ars Technica’s review of Ice Cream Sandwich really brings home my argument:

Ice Cream Sandwich is the closest to a “finished” version of Android I’ve ever seen.

Reasoning by inference we can conclude that all versions of Android appear unfinished. And this is the nub of the issue. The products I’ve listed as respectable above—the Kindle Fire, Cius and so on—are not Android tablets. They are eBook readers, business communication tablets, media controllers. The ones I don’t like are Android tablets.

Using Android in a touchscreen device is an extension of using Linux in any other computing scenario: I might like the solution, but I don’t want to see your working. I own countless Linux boxes: a couple of network routers, some eInk readers, a Drobo, that sort of thing. They all have in common that they take Linux and then build something on top of it that solves a problem I have. While all of these devices are successful products that all use Linux, Linux itself as a standalone thing is not successful among the same people who buy Linux routers, Linux NAS and so on.

Indeed even Linux desktop OS distributions have failed to take over the world, but don’t they solve the problem of “I want a desktop OS”? There are two issues here: the first being that no-one has that problem. People have problems like “I want to e-mail my aunt” or “I need to write a report on deforestation in the Amazon”, and it turns out that PCs can support those people, and that a desktop OS can be part of the supporting solution. The second issue is that Linux distributions have traditionally given you all the bits of a desktop OS without helping you put them together: you wanted a sports car, well here’s enough Meccano that you can build one.

Colour in up to the lines

Let’s return to the issue of why the not-so-good Android products are not so good. They are to smartphones or tablets as Linux distros are to PCs: some assembly is required to get a functioning product. By and large, these things give you Android itself, a slathering of “differentiating” interface tweaks and access to the Android Market. Where you go from there is up to you.

Of course, fixing the user interface is a necessary (dear God is it necessary) part of producing a complete Android product, but it’s far from sufficient. Where the respectable products like the Kindle Fire work is that a problem has been identified (for example: I want to be able to read a wide range of books without having to carry a whole library of paper around) and they solve that problem. The engineers use Android to help arrive at that solution, but they colour in all the way up to the lines: they don’t just give you the line drawing and let you buy crayons from the marketplace.

So where are the opportunities?

Well, where are the problems? What is there that can’t be done, and how would you design a product to do it? If the answer involves a touchscreen, then there are a couple of ways to do it: either with an app running on top of a touchscreen platform, or with a touchscreen device. Which is appropriate to the problem you’re looking at?

It seems that in most cases, the app is the better approach. It’s cheaper to engineer and distribute, and allows customers to take advantage of their existing touchscreen products and potentially integrate your app into workflows involving their other applications.

One situation in which an app is not a sufficient solution is where you need hardware capabilities that don’t exist in off-the-shelf touchscreen kit: maybe ambient light sensors, motion sensors, thermometers, robotic arms and the like. In that case it might still be possible to adopt an existing platform to support the software side, and provide a peripheral using USB or Apple’s 30-pin connector for the custom hardware requirements. That could work, but it might not be the best experience, so perhaps a complete integrated hardware+software product is the better option.

A place where you will definitely need to consider building a complete hardware+software product is where the existing software platforms don’t provide the capabilities you need. For example, you require the ability to share data between components in a way that isn’t supported by apps on phones. Of course, the hardware could still be largely off-the-shelf running a custom software platform.

In either of these last two scenarios, building something custom on top of Android is a good solution. Giving your customers Android with the app on top is not a good solution.

These scenarios represent the missed opportunities. Start-ups could be taking Android and using it as the core of some great integrated products: not as the “app for that” but as the “Kindle for that”. A single device that solves the bejeesus out of a particular problem.

Worked example: there’s a Kindle for cars.

Modern cars suffer from a lack of convergence. There’s a whole bunch of electronics that control the car itself, or provide telemetry, and feed back information to either the dashboard or a special connector accessible to service technicians. There’s the sat nav, and then they often also have a separate entertainment system (and possibly even a dock for a separate media player). Then there’s the immobiliser, which may be remotely controllable by something like Viper smart start.

A company decides that they want to exploit this opportunity for an integrated car management system. The central part of the user experience would be a removable touch-screen display that allows drivers to plan routes and mark waypoints when they’re out of the car. They can subscribe to feeds from other organisations that tell them about interesting places, which can all be mashed up in the navigation display. For example, a driver who subscribes to English Heritage in the navigation system can ask where heritage sites within one hour’s drive are. Having seen pictures and descriptions of the sites, she could request a route from the current location to Stonehenge, where the return journey should pass by a service station (locations of which are supplied by her Welcome Break subscription).

The display also integrates with the car’s locking system, so the doors automatically lock when the display is taken outside the car. It must be connected to the car’s power/data port to enable the engine to start. While the car is in motion the display provides navigational information, and allows control of the entertainment system. Where notifications are sufficiently important, for example warnings from the car’s diagnostic system, they are announced over the display and the car’s speakers. Where they’re unimportant they’re left in the background to be reviewed once the journey is finished.

In this situation, we need both custom hardware (the interfaces to the immobiliser and the car telemetry, which for a good user experience should be a single connector that also provides power and sound output to the car speakers) and a custom software experience (the route-planning part, integrating with feeds provided by the third parties which are also available in the viewer “apps”, and the variable-severity notification system that off-the-shelf touchscreen platforms don’t provide).

Getting this product to market quickly would be benefitted by taking Android and customising it to fit the usage scenario, so that the core parts of the touchscreen software platform don’t need to be written from scratch. However providing a vanilla tablet experience with the various components implemented as apps would not make a good product.


Android can be a great part of a whole range of different touchscreen products. However it is not, in itself, a product.

Why your security UI sucks

The principle recurring problem in user experience is creating a user interface that supports the user’s mental model of how an app works, while simultaneously enabling the actions that are actually supported by the implementation’s model of the problem domain. Make the interface too much like the app internals, and the user won’t be able to make it fit their view of how things work.

A useful technique in this field is to present the user with a metaphor, to say “the thing you’re using now is a bit like this thing you’ve used before”. That worked much better in the old days when computers were new and people actually had used those things before. Now, metaphors like address books, landline handsets and envelopes are getting tired, and a generation of people have grown up without using the things these metaphors are based on.

So how do security metaphors stack up? Well, here the situation is even worse. In these situations even though people do use the real-world devices, the metaphors are strained beyond the point of usefulness.

Locks and Keys

You’d be hard pushed to find a security vendor or industry blog that doesn’t include a picture of a padlock on it somewhere. There, mine does too! A lock, using a particular mental model, is something that keeps you from using the door to a place or container (or keeps you in and other people out). The key permits you to act almost as if the lock weren’t there when you want, and almost as if the door didn’t exist (i.e. that the wall is solid) when you want.

There are few places in security UIs where locks exist (ignoring Keychain, which is already a response to a security/usability fail in itself), but plenty of keys. Keys that don’t interact with locks, but instead turn readable messages into unreadable messages. In the real world, that is (or was) done by an envelope, or by choosing carefully who you tell your message to.

Keys in software can be magically created from passwords. So hang on, does this door need a password or a key to get through?

Keys in software can validate the integrity of data, showing that it cannot have changed since the keyholder saw it. How does that map onto real world keys? Oh, and we call this a signature. You know, exactly unlike how you sign your name.

Finally, there are some types of key that you can give to anyone, called public keys. These can be used to lock things but not unlock them. WTF?



While we’re talking about keys, we should remember that they can be stored on keychains, keyrings or certificates. Wait, what?

In the real world, certificates that I own tell me that I am a motorcycle’s registered keeper, that I have completed a variety of different academic and training courses, and related information. The certificate is usually issued by the body that’s best placed to assert the truthiness of the certified statement: the DVLA, the academic institution and so on.

A digital certificate is more like a notary statement of identity than any of those certificates. Except that, far from being issued by a notary, it’s issued by anybody. Some certificates are issued by organisations who pay attention to other people’s notary statements about the holder’s identity. Sometimes, those certificates tell you (though not in plain language, of course) about how seriously they checked the identity of the holder.

How do you find out the identity of the organisation that issued the certificate? Why, with another certificate, of course! Imagine that my degree certificate was stapled to a certificate saying that the signer worked at the tutorial office, which was stapled to another saying that the tutorial office was part of the administration department, was stapled to…

So we need better metaphors?

No, we need to acknowledge that most security behaviour is internal to the software system that’s being secured, and keep it out of the user interface. A field worker in sales thinks “I’ve got an e-mail from my boss, I’ll read that”, so let her read that. Do the signature/certificate stuff internally. If that e-mail turns out not to be from her boss then rather than showing the certificate, don’t let her think that it is from the certificate.

No UI is good UI, right?

Here’s the security UI Apple added to Mac OS X in Lion.

So if Apple is no longer putting security user interface in their product, that must make it the universally blessed approach, right?

Not so fast! One limitation of many security systems is that they do rely on some user involvement. It’d be great to make it all entirely invisible, but some of that user involvement is necessary: software is used in an environment that is only partially technological, but is partially social. Requirements for things like confidentiality and integrity that are born of meatspace needs still must be supported by user involvement.

And that’s where we meet another problem. If we completely remove security UI from our products, we lose any chance to remind people that they too have skin in this game. Where data is confidential, if it doesn’t look confidential it doesn’t switch the reader into “remember to keep this secret” mode. Switching the software into that mode is almost trivial in comparison.

As an example, both Mac OS X and iOS have shipped forever without a user password and without requiring the user to set a password. Great, no security UI. But do the people using Macs and iPhones understand the difference in security posture—particularly on iOS with data protection—implied by setting a password? It doesn’t fit into the password metaphor, where a password just lets me in and doesn’t do anything about encryption.


There are two problems when it comes to security UI. Traditional UI in this arena exposes too many details of the software internals, and relies on flakey metaphors that do not translate either to the user’s mental model nor the software’s implementation model of the security feature. On the other hand, if we just remove this UI then users aren’t reminded of their part in the security of the overall, socio-technical system.

Thankfully solving these problems is not mutually exclusive. By communicating with users on their own terms at relevant times, an application’s interface can encourage users to take part in the overall security of the system without either confusing them or turning them off of engaging by using overly technical language and concepts.