Software, Science?

Is there any science in software making? Does it make sense to think of software making as scientific? Would it help if we could?

Hold on, just what is science anyway?

Good question. The medieval French philosopher-monk Buridan said that the source of all knowledge is experience, and Richard Feynman paraphrased this as “the test of all knowledge is experiment”.

If we accept that science involves some experimental test of knowledge, then some of our friends who think of themselves as scientists will find themselves excluded. Consider the astrophysicists and cosmologists. It’s all very well having a theory about how stars form, but if you’re not willing to swirl a big cloud of hydrogen, helium, lithium, carbon etc. about and watch what happens to it for a few billion years then you’re not doing the experiment. If you’re not doing the experiment then you’re not a scientist.

Our friends in what are called the biological and medical sciences also have some difficulty now. A lot of what they do is tested by experiment, but some of the experiments are not permitted on ethical grounds. If you’re not allowed to do the experiment, maybe you’re not a real scientist.

Another formulation (OK, I got this from the wikipedia entry on Science) sees science as a sort of systematic storytelling: the making of “testable explanations and predictions about the universe”.

Under this definition, there’s no problem with calling astronomy a science: you think this is how things work, then you sit, and watch, and see whether that happens.

Of course a lot of other work fits into the category now, too. There’s no problem with seeing the “social sciences” as branches of science: if you can explain how people work, and make predictions that can (in principle, even if not in practice) be tested, then you’re doing science. Psychology, sociology, economics: these are all sciences now.

Speaking of the social sciences, we must remember that science itself is a social activity, and that the way it’s performed is really defined as the explicit and implicit rules and boundaries created by all the people who are doing it. As an example, consider falsificationism. This is the idea that a good scientific hypothesis is one that can be rejected, rather than confirmed, by an appropriately-designed experiment.

Sounds pretty good, right? Here’s the interesting thing: it’s also pretty new. It was mostly popularised by Karl Popper in the 20th Century. So if falsificationism is the hallmark of good science, then Einstein didn’t do good science, nor did Marie Curie, nor Galileo, or a whole load of other people who didn’t share the philosophy. Just like Dante’s Virgil was not permitted into heaven because he’d been born before Christ and therefore could not be a Christian, so all of the good souls of classical science are not permitted to be scientists because they did not benefit from Popper’s good message.

So what is science today is not necessarily science tomorrow, and there’s a sort of self-perpetuation of a culture of science that defines what it is. And of course that culture can be critiqued. Why is peer review appropriate? Why do the benefits outweigh the politics, the gazumping, the gender bias? Why should it be that if falsification is important, negative results are less likely to be published?

Let’s talk about Physics

Around a decade ago I was studying particle physics pretty hard. Now there are plenty of interesting aspects to particle physics. Firstly that it’s a statistics-heavy discipline, and that results in statistics are defined by how happy you are with them, not by some binary right/wrong criterion.

It turns out that particle physicists are a pretty conservative bunch. They’ll only accept a particle as “discovered” if the signal indicating its existence is measured as a five-sigma confidence: i.e. if there’s under a one-on-a-million chance that the signal arose randomly in the absence of the particle’s existence. Why five sigma? Why not three (a 99.7% confidence) or six (to keep Motorola happy)? Why not repeat it three times and call it good, like we did in middle school science classes?

Also, it’s quite a specialised discipline, with a clear split between theory and practice and other divisions inside those fields. It’s been a long time since you could be a general particle physicist, and even longer since you could be simply a “physicist”. The split leads to some interesting questions about the definition of science again: if you make a prediction which you know can’t be verified during your lifetime due to the lag between theory and experimental capability, are you still doing science? Does it matter whether you are or not? Is the science in the theory (the verifiable, or falsifiable, prediction) or in the experiment? Or in both?

And how about Psychology, too

Physicists are among the most rational people I’ve worked with, but psychologists up the game by bringing their own feature to the mix: hypercriticality. And I mean that in the technical sense of criticism, not in the programmer “you’re grammar sucks” sense.

You see, psychology is hard, because people are messy. Physics is easy: the apple either fell to earth or it didn’t. Granted, quantum gets a bit weird, but it generally (probably) does its repeatable thing. We saw above that particle physics is based on statistics (as is semiconductor physics, as it happens); but you can still say that you expect some particular outcome or distribution of outcomes with some level of confidence. People aren’t so friendly. I mean, they’re friendly, but not in a scientific sense. You can do a nice repeatable psychology experiment in the lab, but only by removing so many confounding variables that it’s doubtful the results would carry over into the real world. And the experiment only really told you anything about local first year psychology undergraduates, because they’re the only people who:

  1. walked past the sign in the psychology department advertising the need for participants;
  2. need the ten dollars on offer for participation desperately enough to turn up.

In fact, you only really know about how local first year psychology undergraduates who know they’re participating in a psychology experiment behave. The ethics rules require informed consent which is a good thing because otherwise it’s hard to tell the difference between a psychology lab and a Channel 4 game show. But it means you have to say things like “hey this is totally an experiment and there’ll be counselling afterward if you’re disturbed by not really electrocuting the fake person behind the wall” which might affect how people react, except we don’t really know because we’re not allowed to do that experiment.

On the other hand, you can make observations about the real world, and draw conclusions from them, but it’s hard to know what caused what you saw because there are so many things going on. It’s pretty hard to re-run the entire of a society with just one thing changed, like “maybe if we just made Hitler an inch taller then Americans would like him, or perhaps try the exact same thing as prohibition again but in Indonesia” and other ideas that belong in Philip K. Dick novels.

So there’s this tension: repeatable results that might not apply to the real world (a lack of “ecological validity”), and real-world phenomena that might not be possible to explain (a lack of “internal validity”). And then there are all sorts of other problems too, so that psychologists know that for a study to hold water they need to surround what they say with caveats and limitations. Thus is born the “threats to validity” section on any paper, where the authors themselves describe the generality (or otherwise) of their results, knowing that such threats will be a hot topic of discussion.

But all of this—the physics, the psychology, and the other sciences—is basically a systematised story-telling exercise, in which the story is “this is why the universe is as it is” and the system is the collection of (time-and-space-dependent) rules that govern what stories may be told. It’s like religion, but with more maths (unless your religion is one of those ones that assigns numbers to each letter in a really long book then notices that your phone number appears about as many times as a Poisson distribution would suggest).

Wait, I think you were talking about software

Oh yeah, thanks. So, what science, if any, is there in making software? Does there exist a systematic approach to storytelling? First, let’s look at the kinds of stories we need to tell.

The first are the stories about the social system in which the software finds itself: the story of the users, their need (or otherwise) for a software system, their reactions (or otherwise) to the system introduced, how their interactions with each other change as a result of introducing the system, and so on. Call this requirements engineering, or human-computer interaction, or user experience; it’s one collection of stories.

You can see these kinds of stories emerging from the work of Manny Lehman. He identifies three types of software:

  • an S-system is exactly specified.
  • a P-system executes some known procedure.
  • an E-system must evolve to meet the needs of its environment.

It may seem that E-type software is the type in which our stories about society are relevant, but think again: why do we need software to match a specification, or to follow a procedure? Because automating that specification or procedure is of value to someone. Why, or to what end? Why that procedure? What is the impact of automating it? We’re back to telling stories about society. All of these software systems, according to Lehman, arise from discovery of a problem in the universe of discourse, and provide a solution that is of possible interest in the universe of discourse.

The second kind are the stories about how we worked together to build the software we thought was needed in the first stories. The practices we use to design, build and test our software are all tools for facilitating the interaction between the people who work together to make the things that come out. The things we learned about our own society, and that we hope we can repeat (or avoid) in the future, become our design, architecture, development, testing, deployment, maintenance and support practices. We even create our own tools—software for software’s sake—to automate, ease or disrupt our own interactions with each other.

You’ll mostly hear the second kind of story at most developer conferences. I believe that’s because the people who have most time and inclination to speak at most developer conferences are consultants, and they understand the second stories to a greater extent than the first because they don’t spend too long solving any particular problem. It’s also because most developer conferences are generally about making software, not about whatever problem it is that each of the attendees is trying to solve out in the world.

I’m going to borrow a convention that Rob Rix told me in an email, of labelling the first type of story as being about “external quality” and the second type about “internal quality”. I went through a few stages of acceptance of this taxonomy:

  1. Sounds like a great idea! There really are two different things we have to worry about.
  2. Hold on, this doesn’t sounds like such a good thing. Are we really dividing our work into things we do for “us” and things we do for “them”? Labelling the non-technical identity? That sounds like a recipe for outgroup homogeneity effect.
  3. No, wait, I’m thinking about it wrong. The people who make software are not the in-group. They are the mediators: it’s just the computers and the tools on one side of the boundary, and all of society on the other side. We have, in effect, the Janus Thinker: looking on the one hand toward the external stories, on the other toward the internal stories, and providing a portal allowing flow between the two.

JANUS (from Vatican collection) by Flickr user jinnrouge

So, um, science?

What we’re actually looking at is a potential social science: there are internal stories about our interactions with each other and external stories about our interactions with society and of society’s interactions with the things we create, and those stories could potentially be systematised and we’d have a social science of sorts.

Particularly, I want to make the point that we don’t have a clinical science, an analogy drawn by those who investigate evidence-based software engineering (which has included me, in my armchair way, in the past). You can usefully give half of your patients a medicine and half a placebo, then measure survival or recovery rates after that intervention. You cannot meaningfully treat a software practice, like TDD as an example, as a clinical intervention. How do you give half of your participants a placebo TDD? How much training will you give your ‘treatment’ group, and how will you organise placebo training for the ‘control’ group? [Actually I think I’ve been on some placebo training courses.]

In constructing our own scientific stories about the world of making software, we would run into the same problems that social scientists do in finding useful compromises between internal and ecological validity. For example, the oft-cited Exploratory experimental studies comparing online and offline programming performance (by Sackman et al., 1968) is frequently used to support the notion that there are “10x programmers”, that some people who write software just do it ten times faster than others.

However, this study does not have much ecological validity. It measures debugging performance, using either an offline process (submitting jobs to a batch system) or an online debugger called TSS, which probably isn’t a lot like the tools used in debugging today. The problems were well-specified, thus removing many of the real problems programmers face in designing software. Participants were expected to code a complete solution with no compiler errors, then debug it: not all programmers work like that. And where did they get their participants from? Did they have a diverse range of backgrounds, cultures, education, experience? It does not seem that any results from that study could necessarily apply to modern software development situated in a modern environment, nor could the claim of “10x programmers” necessarily generalise as we don’t know who is 10x better than whom, even at this one restricted task.

In fact, I’m also not convinced of its internal validity. There were four conditions (two programming problems and two debugging setups), each of which was assigned to six participants. Variance is so large that most of the variables are independent of each other (the independent variables are the programming problem and the debugging mode, and the dependent variables are the amount of person-time and CPU-time), unless the authors correlate them with “programming skill”. How is this skill defined? How is it measured? Why, when the individual scores are compared, is “programming skill” not again taken into consideration? What confounding variables might also affect the wide variation in scored reported? Is it possible that the fastest programmers had simply seen the problem and solved it before? We don’t know. What we do know is that the reported 28:1 ratio between best and worst performers is across both online and offline conditions (as pointed out in, e.g., The Leprechauns of Software Engineering, so that’s definitely a confounding factor. If we just looked at two programmers using the same environment, what difference would be found?

We had the problem that “programming skill” is not well-defined when examining the Sackman et al. study, and we’ll find that this problem is one we need to overcome more generally before we can make the “testable explanations and predictions” that we seek. Let’s revisit the TDD example from earlier: my hypothesis is that a team that adopts the test-driven development practice will be more productive some time later (we’ll defer a discussion of how long) than the null condition.

OK, so what do we mean by “productive”? Lines of code delivered? Probably not, their amount varies with representation. OK, number of machine instructions delivered? Not all of those would be considered useful. Amount of ‘customer value’? What does the customer consider valuable, and how do we ensure a fair measurement of that across the two conditions? Do bugs count as a reduction in value, or a need to do additional work? Or both? How long do we wait for a bug to not be found before we declare that it doesn’t exist? How is that discovery done? Does the expense related to finding bugs stay the same in both cases, or is that a confounding variable? Is the cost associated with finding bugs counted against the value delivered? And so on.

Software dogma

Because our stories are not currently very testable, many of them arise from a dogmatic belief that some tool, or process, or mode of thought, is superior to the alternatives, and that there can be no critical debate. For example, from the Clean Coder:

The bottom line is that TDD works, and everybody needs to get over it.

No room for alternatives or improvement, just get over it. If you’re having trouble defending it, apply a liberal sprinkle of argumentum ab auctoritate and tell everyone: Robert C. Martin says you should get over it!

You’ll also find any number of applications of the thought-terminating cliché, a rhetorical technique used to stop cognitive dissonance by allowing one side of the issue to go unchallenged. Some examples:

  • “I just use the right tool for the job”—OK, I’m done defending this tool in the face of evidence. It’s just clearly the correct one. You may go about your business. Move along.
  • “My approach is pragmatic”—It may look like I’m doing the opposite of what I told you earlier, but that’s because I always do the best thing to do, so I don’t need to explain the gap.
  • “I’m passionate about [X]”—yeah, I mean your argument might look valid, I suppose, if you’re the kind of person who doesn’t love this stuff as much I do. Real craftsmen would get what I’m saying.
  • and more.

The good news is that out of such religious foundations spring the shoots of scientific thought, as people seek to find a solid justification for their dogma. So just as physics has strong spiritual connections, with Steven Hawking concluding in A Brief History of Time:

However, if we discover a complete theory, it should in time be understandable by everyone, not just by a few scientists. Then we shall all, philosophers, scientists and just ordinary people, be able to take part in the discussion of the question of why it is that we and the universe exist. If we find the answer to that, it would be the ultimate triumph of human reason — for then we should know the mind of God.

and Einstein debating whether quantum physics represented a kind of deific Dungeons and Dragons:

[…] an inner voice tells me that it is not yet the real thing. The theory says a lot, but does not really bring us any closer to the secret of the “old one.” I, at any rate, am convinced that He does not throw dice.

so a (social) science of software could arise as an analogous form of experimental theology. I think the analogy could be drawn too far: the context is not similar enough to the ages of Islamic Science or of the Enlightenment to claim that similar shifts to those would occur. You already need a fairly stable base of rational science (and its application via engineering) to even have a computer at all upon which to run software, so there’s a larger base of scientific practice and philosophy to draw on.

It’s useful, though, when talking to a programmer who presents themselves as hyper-rational, to remember to dig in and to see just where the emotions, fallacious arguments and dogmatic reasoning are presenting themselves, and to wonder about what would have to change to turn any such discussion into a verifiable prediction. And, of course, to think about whether that would necessarily be a beneficial change. Just as we can critique scientific culture, so should we critique software culture.

Inside-Out Apps

This article is based on a talk I gave at mdevcon 2014. The talk also included a specific example to demonstrate the approach, but was otherwise a presentation of the following argument.

You probably read this blog because you write apps. Which is kind of cool, because I have been known to use apps. I’d be interested to hear what yours is about.

Not so fast! Let me make this clear first: I only buy page-based apps, they must use Core Data, and I automatically give one star to any app that uses storyboards.

OK, that didn’t sound like it made any sense. Nobody actually chooses which apps to buy based on what technologies or frameworks are used, they choose which apps to buy based on what problems those apps solve. On the experience they derive from using the software.

When we build our apps, the problem we’re solving and the experience we’re providing needs to be at the uppermost of our thoughts. You’re probably already used to doing this in designing an application: Apple’s Human Interface Guidelines describe the creation of an App Definition Statement to guide thinking about what goes into an app and how people will use it:

An app definition statement is a concise, concrete declaration of an app’s main purpose and its intended audience.

Create an app definition statement early in your development effort to help you turn an idea and a list of features into a coherent product that people want to own. Throughout development, use the definition statement to decide if potential features and behaviors make sense.

My suggestion is that you should use this idea to guide your app’s architecture and your class design too. Start from the problem, then work through solving that problem to building your application. I have two reasons: the first will help you, and the second will help me to help you.

The first reason is to promote a decoupling between the problem you’re trying to solve, the design you present for interacting with that solution, and the technologies you choose to implement the solution and its design. Your problem is not “I need a Master-Detail application”, which means that your solution may not be that. In fact, your problem is not that, and it may not make sense to present it that way. Or if it does now, it might not next week.

You see, designers are fickle beasts, and for all their feel-good bloviation about psychology and user experience, most are actually just operating on a combination of trend and whimsy. Last week’s refresh button is this week’s pull gesture is next week’s interaction-free event. Yesterday’s tab bar is today’s hamburger menu is tomorrow’s swipe-in drawer. Last decade’s mouse is this decade’s finger is next decade’s eye motion. Unless your problem is Corinthian leather, that’ll be gone soon. Whatever you’re doing for iOS 7 will change for iOS 8.

So it’s best to decouple your solution from your design, and the easiest way to do that is to solve the problem first and then design a presentation for it. Think about it. If you try to design a train timetable, then you’ll end up with a timetable that happens to contain train details. If you try to solve the problem “how do I know at what time to be on which platform to catch the train to London?”, then you might end up with a timetable, but you might not. And however the design of the solution changes, the problem itself will not: just as the problem of knowing where to catch a train has not changed in over a century.

The same problem that affects design-driven development also affects technology-driven development. This month you want to use Core Data. Then next month, you wish you hadn’t. The following month, you kind of want to again, then later you realise you needed a document database after all and go down that route. Solve the problem without depending on particular libraries, then changing libraries is no big deal, and neither is changing how you deal with those libraries.

It’s starting with the technology that leads to Massive View Controller. If you start by knowing that you need to glue some data to some view via a View Controller, then that’s what you end up with.

This problem is exacerbated, I believe, by a religious adherence to Model-View-Controller. My job here is not to destroy MVC, I am neither an iconoclast nor a sacrificer of sacred cattle. But when you get overly attached to MVC, then you look at every class you create and ask the question “is this a model, a view, or a controller?”. Because this question makes no sense, the answer doesn’t either: anything that isn’t evidently data or evidently graphics gets put into the amorphous “controller” collection, which eventually sucks your entire codebase into its innards like a black hole collapsing under its own weight.

Let’s stick with this “everything is MVC” difficulty for another paragraph, and possibly a bulleted list thereafter. Here are some helpful answers to the “which layer does this go in” questions:

  • does my Core Data stack belong in the model, the view, or the controller? No. Core Data is a persistence service, which your app can call on to save or retrieve data. Often the data will come from the model, but saving and retrieving that data is not itself part of your model.
  • does my networking code belong in the model, the view, or the controller? No. Networking is a communications service, which your app can call on to send or retrieve data. Often the data will come from the model, but sending and retrieving that data is not itself part of your model.
  • is Core Graphics part of the model, the view, or the controller? No. Core Graphics is a display primitive that helps objects represent themselves on the display. Often those objects will be views, but the means by which they represent themselves are part of an external service.

So building an app in a solution-first approach can help focus on what the app does, removing any unfortunate coupling between that and what the app looks like or what the app uses. That’s the bit that helps you. Now, about the other reason for doing this, the reason that makes it easier for me to help you.

When I come to look at your code, and this happens fairly often, I need to work out quickly what it does and what it should do, so that I can work out why there’s a difference between those two things and what I need to do about it. If your app is organised in such a way that I can see how each class contributes to the problem being solved, then I can readily tell where I go for everything I need. If, on the other hand, your project looks like this:

MVC organisation

Then the only thing I can tell is that your app is entirely interchangeable with every other app that claims to be nothing more than MVC. This includes every Rails app, ever. Here’s the thing. I know what MVC is, and how it works. I know what UIKit is, and why Apple thinks everything is a view controller. I get those things, your app doesn’t need to tell me those things again. It needs to reflect not the similarities, which I’ve seen every time I’ve launched Project Builder since around 2000, but the differences, which are the things that make your app special.

OK, so that’s the theory. We should start from the problem, and move to the solution, then adapt the solution onto the presentation and other technology we need to use to get a product we can sell this week. When the technologies and the presentations change, we can adapt onto those new things, to get the product we can sell next week, without having to worry about whether we broke solving the problem. But what’s the practice? How do we do that?

Start with the model.

Remember that, in Apple’s words:

model objects represent knowledge and expertise related to a specific problem domain

so solving the problem first means modelling the problem first. Now you can do this without regard to any particular libraries or technology, although it helps to pick a programming language so that you can actually write something down. In fact, you can start here:

Command line tool

A Foundation command-line tool has everything you need to solve your problem (in fact it contains a few more things than that, to make up for erstwhile deficiencies in the link editor, but we’ll ignore those things). It lets you make objects, and it lets you use all those boring things that were solved by computer scientists back in the 1760s like strings, collections and memory allocation.

So with a combination of the subset of Objective-C, the bits of Foundation that should really be in Foundation, and unit tests to drive the design of the solution, we can solve whatever problem it is that the app needs to solve. There’s just one difficulty, and that is that the problem is only solved for people who know how to send messages to objects. Now we can worry about those fast-moving targets of presentation and technology choice, knowing that the core of our app is a stable, well-tested collection of objects that solve our customers’ problem. We expose aspects of the solution by adapting them onto our choice of user interface, and similarly any other technology dependencies we need to introduce are held at arm’s length. We test that we integrate with them correctly, but not that using them ends up solving the problem.

If something must go, then we drop it, without worrying whether we’ve broken our ability to solve the problem. The libraries and frameworks are just services that we can switch between as we see fit. They help us solve our problem, which is to help everyone else to solve their problem.

And yes, when you come to build the user interface, then model-view-controller will be important. But only in adapting the solution onto the user interface, not as a strategy for stuffing an infinite number of coats onto three coat hooks.

References

None of the above is new, it’s just how Object-Oriented Programming is supposed to work. In the first part of my MVC series, I investigated Thing-Model-View-Editor and the progress from the “Thing” (a problem in the real world that must be solved) to a “Model” (a representation of that problem and its solution in the computer). That article relied on sources from Trygve Reenskaug, who described (in 1979) the process by which he moved from a Thing to a Model and then to a user interface.

In his 1992 book Object-Oriented Software Engineering: A Use-Case Driven Approach, Ivar Jacobson describes a formalised version of the same motion, based on documented use cases. Some teams replaced use cases with user stories, which look a lot like Reenskaug’s user goals:

An example: To get better control over my finances, I would need to set up a budget; to keep account
of all income and expenditure; and to keep a running comparison between budget and accounts.

Alastair Cockburn described the Ports and Adapters Architecture (earlier known as the hexagonal architecture) in which the application’s use cases (i.e. the ways in which it solves problems) are at the core, and everything else is kept at a distance through adapters which can easily be replaced.

ClassBrowser’s public face

I made a couple of things:

I should’ve done both of these things at the beginning of the project. I believe that the fact I opened the source really early, when it barely did one thing and then only on my machine™, was a good thing. It let people find out about the project, have a look, and even send changes. But not giving anywhere to discuss ClassBrowser was a mistake. It meant I answered questions in private messages that would’ve been better archived, and it probably turned some people away who couldn’t immediately see what the point was.

Hopefully it’s not too late to fix that.

ClassBrowser: warts and all

I previously gave a sneak peak of ClassBrowser, a dynamic execution environment for Objective-C. It’s not anything like ready for general use (in fact it can’t really do ObjC very well at all), but it’s at the point where you can kick the tyres and contribute pull requests. Here’s what you need to know:

Have a lot of fun!

ClassBrowser is distributed under the terms of the University of Illinois/NCSA licence (because it is based partially on code distributed with clang, which is itself under that licence).

A sneaky preview of ClassBrowser

Let me start with a few admissions. Firstly, I have been computering for a good long time now, and I still don’t really understand compilers. Secondly, work on my GNUstep Web side-project has tailed off for a while, because I decided I wanted to try something out to learn about the compiler before carrying on with that work. This post will mostly be about that something.

My final admission: if you saw my presentation on the ObjC runtime you will have seen an app called “ClassBrowser” where I showed that some of the Foundation classes are really in the CoreFoundation library. Well, there were two halves to the ClassBrowser window, and I only showed you the top half that looked like the Smalltalk class browser. I’m sorry.

So what’s the bottom half?

This is what the bottom half gives me. It lets me go from this:

ClassBrowser before

via this:

Who are you calling a doIt?

to this:

ClassBrowser after

What just happened?

You just saw some C source being compiled to LLVM bit code, which is compiled just-in-time to native code and executed, all inside that browser app.

Why?

Well why not? Less facetiously:

  • I’m a fan of Smalltalk. I want to build a thing that’s sort of a Smalltalk, except that rather than being the Smalltalk language on the Objective-C runtime (like F-Script or objective-smalltalk), it’ll be the Objective-C(++) language on the Objective-C runtime. So really, a different way of writing Objective-C.
  • I want to know more about how clang and LLVM work, and this is as good a way as any.
  • I think, when it actually gets off the ground, this will be a faster way of writing test-first Objective-C than anything Xcode can do. I like Xcode 5, I think it’s the better Xcode 4 that I always wanted, but there are gains to be had by going in a completely different direction. I just think that whoever strikes out in such a direction should not release their research project as a new version of Xcode :-).

Where can I get me a ClassBrowser?

You can’t, yet. There’s some necessary housekeeping that needs to be done before a first release, replacing some “research” hacks with good old-fashioned tested code and ensuring GNUstep-GUI compatibility. Once that’s done, a rough-and-nearly-ready first open source release will ensue.

Then there’s more work to be done before it’s anything like useful. Particularly while it’s possible to use it to run Objective-C, it’s far from pleasant. I’ve had some great advice from LLVM IRC on how to address that and will try to get it to happen soon.

Story points: because I don’t know what I’m doing

The scenario

[Int. developer’s office. Developer sits at a desk that faces the wall. Two of the monitors on Developer’s desk are on stands, if you look closely you see that the third is balanced on the box set of The Art of Computer Programming, which is still in its shrink-wrap. Developer notices you and identifies an opportunity to opine about why the world is wrong, as ever.]

Every so often, people who deal with the real world instead of the computer world ask us developers annoying questions about how our work interacts with so-called reality. You’re probably thinking the same thing I do: who cares, right? I’m right in the middle of a totally cool abstraction layer on top of the operating system’s abstraction layer that abstracts their abstraction so I can interface it to my abstraction and abstract all the abstracts, what’s that got to do with reality and customers and my employer and stuff?

Ugh, damn, turning up my headphones and staring pointedly at the screen hasn’t helped, they’re still asking this question. OK, what is it?

Apparently they want to know when some feature will be done. Look, I’m a programmer, I’m absolutely the worst person to ask about time. OK, I believe that you might want to know whether this development effort is going to deliver value to the customers any time soon, and whether we’re still going to be ahead financially when we’re done, or whether it’d be better to take on some other work. And really I’d love to answer this question, except for one thing:

I have absolutely no idea what I’m doing. Seriously, don’t you remember all the other times that I gave you estimates and they were way off? The problem isn’t some systematic error in the way I think about how long it’ll take me to do stuff, it’s that while I can build abstractions on top of other abstractions I’m not so great at going the other way. Give me a short description of a task, I’ll try and work out what’s involved but I’m likely to miss something that will become important when I go to do it. It’s these missed details that add time, and I don’t know how many of those there will be until I get started.

The proposed solution

[Developer appears to have a brainwave]

Wait, remember how my superpower is adding layers of abstraction? Well your problem of estimation looks quite a lot like a nail to me, so I’ll apply my hammer! Let’s add a layer of abstraction on top of time!

Now you wanted to know how long it’ll take to finish some feature. Well I’ll tell you, but I won’t tell you in units of hours or days, I’ll use BTUs (Bullshit Time Units) instead. So this thing I’m working on will be about five BTUs. What do you mean, that doesn’t tell you when I’ll be done? It’s simple, duh! Just wait a couple of months, and measure how many BTUs we actually managed to complete. Now you know how many BTUs per day we can do, and you know how long everything takes!

[Developer puts their headphones back in, and turns to face the monitor. The curtain closes on the scene, and the Humble(-ish) Narrator takes the stage.]

The observed problem

Did you notice that the BTU doesn’t actually solve the stated problem? If it’s possible to track BTU completion over time until we know how many BTUs get completed in an iteration, then we are making the assumption that there is a linear relationship between BTUs and units of time. Just as there are 40 (or 90, if you picked the wrong recruiter) hours to the work week, so there are N BTUs to the work week. A BTU is worth x hours, and we just need to measure for a bit until we find the value of x.

But Developer’s problem was not a failure to understand how many hours there are in an hour. Developer’s problem was a failure to know what work is outstanding. An inability to foresee what work needs to be done cannot be corrected by any change to the way in which work to be done is mapped onto time. It is, to wear out even further an already tired saw, an unknown unknown.

What to do about it

We’re kindof stuck, really. We can’t tell how long something will take until we do it, not because we’re bad at estimating how long it’ll take to do something but because we’re bad at knowing what it is we need to do.

The little bit there about “until we do it” is, I think, what we need to focus on. I can’t tell you how long something I haven’t done will take, but I can probably tell you what problems are outstanding on the thing I’m doing now. I can tell you whether it’s ready now, or whether I think it’ll be ready “soon” or “not soon”.

So here’s the opportunity: we’ll keep whatever we’ve already got ready for immediate release. We’ll share information about which of the acceptance tests are passing, and if we were to release right now you’d know what customers will get from that. Whatever the thing we’re working on now is, we’ll be in a position to decide whether to switch away if we can do some more valuable work instead.

Conflicts in my mental model of Objective-C

My worldview as it relates to the writing of software in Objective-C contains many items that are at odds with one another. I either need to resolve them or to live with the cognitive dissonance, gradually becoming more insane as the conflicting items hurl one another at my cortex.

Of the programming environments I’ve worked with, I believe that Objective-C and its frameworks are the most pleasant. On the other hand, I think that Objective-C was a hack, and that the frameworks are not without their design mistakes, regressions and inconsistencies.

I believe that Objective-C programmers are correct to side with Alan Kay in saying that the designers of C++ and Java missed out on the crucial part of object-oriented programming, which is message passing. However I also believe that ObjC missed out on a crucial part of object-oriented programming, which is the compiler as an object. Decades spent optimising the compile-link-debug-edit cycle have been spent on solving the wrong problem. On which topic, I feel conflicted by the fact that we’ve got this Smalltalk-like dynamic language support but can have our products canned for picking the same selector name as some internal secret stuff in someone else’s code.

I feel disappointed that in the last decade, we’ve just got tools that can do the same thing but in more places. On the other hand, I don’t think it’s Apple’s responsibility to break the world; their mission should be to make existing workflows faster, with new excitement being optional or third-party. It is both amazing and slightly saddening that if you defrosted a cryogenically-preserved NeXT application programmer, they would just need to learn reference counting, blocks and a little new syntax and style before they’d be up to speed with iOS apps (and maybe protocols, depending on when you threw them in the cooler).

Ah, yes, Apple. The problem with a single vendor driving the whole community around a language or other technology is that the successes or failures of the technology inevitably get caught up in the marketing messages of that vendor, and the values and attitudes ascribed to that vendor. The problem with a community-driven technology is that it can take you longer than the life of the Sun just to agree how lambdas should work. It’d be healthy for there to be other popular platforms for ObjC programming, except for the inconsistencies and conflicts that would produce. It’s great that GNUstep, Cocotron and Apportable exist and are as mature as they are, but “popular” is not quite the correct adjective for them.

Fundamentally I fear a world in which programmers think JavaScript is acceptable. Partly because JavaScript, but mostly because when a language is introduced and people avoid it for ages, then just because some CEO says all future websites must use it they start using it, that’s not healthy. Objective-C was introduced and people avoided it for ages, then just because some CEO said all future apps must use it they started using it.

I feel like I ought to do something about some of that. I haven’t, and perhaps that makes me the guy who comes up to a bunch of developers, says “I’ve got a great idea” and expects them to make it.

Representativeness in Software Engineering Research

The first paragraph describes the context of this post in relation to the blog on which it originally appeared, not blog.securemacprogramming.com.

For this post, I wanted to go a little bit meta. One focus of this blog will be on whether results from academic software engineering are applicable to the work I do as a commercial software developer, so it was hard to pass up this Microsoft Research paper on representativeness of research.

In a nutshell, the problem is this: imagine that you read some study that shows, for example, that schedule slippage on a software project is significantly lessened if developers are given two digestive biscuits and a cup of tea at 4pm on working days. The study examined the breaktime habits of developers on 500 open source projects. This sounds quite convincing. If this thing with the tea and biscuits is true across five hundred projects, it must be applicable to my project, right?

That doesn’t follow. There are many reasons why it might not follow: the study may be biased. The analysis may be wrong. The biscuit thing may be significant but not the root cause of success. The authors may have selected projects that would demonstrate the biscuit outcome. Projects that had initially signed up but got delayed might have dropped out. This paper evaluates one cause of bias: the projects used in the study aren’t representative of the project you’re doing.

It’s a fallacy to assume that just because a study has a large sample size, its results can be generalised to the population. This only applies in the case that the sample represents an even slice of the population. Imagine a world in which all software projects are either written in Java or LISP. Now it doesn’t matter whether I select 10 projects or 10,000 projects: a sample of LISP practices will not necessarily tell us anything about how to conduct a Java project.

Conversely a study that investigates both Java and LISP projects can—in this restricted universe, and with the usual “all other things being equal” caveat—tell us something generally about software projects independent of language. However, choice of language is only one dimension in which software can be measured: the size, activity, number of developers, licence and other factors can all be relevant. Therefore the possible phase space of important factors can be multidimensional.

In this paper the authors develop a framework, based on work in medicine and other fields, for measuring and maximising representativeness of a sample by appropriate selection of projects along the dimensions of the problem space. They apply the framework to recent research.

What they discovered, tabulated in Table II of the paper, is that while a very small, carefully-selected sample can be surprisingly representative (50 out of ~20k projects represented ~15% of their problem space), the ~200 projects they could find analysed in recent research only scored around 9% on their representativeness metric. However in certain dimensions the studies were highly representative, many covering 100% of the phase space in specific dimensions.

Conclusions

A fact that jumped out at me, because of the field I work in, is that there are 245 Objective-C projects in the universe studied by this paper (the projects indexed on Ohloh) and that not one of these is covered by any of the studies they analysed. That could mean that my own back yard is ripe for the picking: that there are interesting results to be determined by analysing Objective-C projects and comparing those results with received wisdom.

In the discussion section, the authors point out that just because a study is not general, does not mean it is not useful. You may not be able to generalise a result gleaned from analysing (say) Java developer tools to all software development, but if what you’re interested in is Java developer tools then that is not a problem.

What this paper gives us, then, is not necessarily a tool that commercial developers can run out and use. It gives us some important quantitative context on evaluating the research that we do read. And, should we want to analyse our own work and investigate hypotheses about factors affecting our projects, it gives us a framework to understand just how representative those analyses would be.

At the old/new interface: jQuery in WebObjects

It turns out to be really easy to incorporate jQuery into an Objective-C WebObjects app (targeting GNUstep Web). In fact, it doesn’t really touch the Objective-C source at all. I defined a WOJavascript object that loads jQuery itself from the application’s web server resources folder, so it can be reused across multiple components:

jquery_script:WOJavaScript {scriptFile="jquery-2.0.2.js"}

Then in the components where it’s used, any field that needs uniquely identifying should have a CSS identifier, which can be bound via WebObjects’s id binding. In this example, a text field for entering an email address in a form will only be enabled if the user has checked a “please contact me” checkbox.

email_field:WOTextField {value=email; id="emailField"}
contact_boolean:WOCheckBox {checked=shouldContact; id="shouldContact"}

The script itself can reside in the component’s HTML template, or in a WOJavascript that looks in the app’s resources folder or returns javascript that’s been prepared by the Objective-C code.

    <script>
function toggleEmail() {
    var emailField = $("#emailField");
    var isChecked = $("#shouldContact").prop("checked");
    emailField.prop("disabled", !isChecked);
    if (!isChecked) {
        emailField.val("");
    }
}
$(document).ready(function() {
    toggleEmail();
    $("#shouldContact").click(toggleEmail);
});
    </script>

I’m a complete newbie at jQuery, but even so that was easier than expected. I suppose the lesson to learn is that old technology isn’t necessarily incapable technology. People like replacing their web backend frameworks every year or so; whether there’s a reason (beyond caprice) warrants investigation.

The code you wrote six months ago

We have this trope in programming that you should hate the code you wrote six months ago. This is a figurative way of saying that you should be constantly learning and assimilating new ideas, so that you can look at what you were doing earlier this year and have new ways of doing it.

It would be more accurate, though less visceral, to say “you should be proud that the code you wrote six months ago was the best you could do with the knowledge you then had, and should be able to ways to improve upon it with the learning you’ve accomplished since then”. If you actually hate the code, well, that suggests that you think anyone who doesn’t have the knowledge you have now is an idiot. That kind of mentality is actually deleterious to learning, because you’re not going to listen to anyone for whom you have Set the Bozo Bit, including your younger self.

I wrote a lot about learning and teaching in APPropriate Behaviour, and thinking about that motivates me to scale this question up a bit. Never mind my code, how can we ensure that any programmer working today can look at the code I was writing six months ago and identify points for improvement? How can we ensure that I can look at the code any other programmer was working on six months ago, and identify points for improvement?

My suggestion is that programmers should know (or, given the existence of the internet, know how to use the index of) the problems that have already come before, how we solved them, and why particular solutions were taken. Reflecting back on my own career I find a lot of problems I introduced by not knowing things that had already been solved: it wasn’t until about 2008 that I really understood automated testing, a topic that was already being discussed back in 1968. Object-oriented analysis didn’t really click for me until later, even though Alan Kay and a lot of really other clever people had been working on it for decades. We’ll leave discussion of parallel programming aside for the moment.

So perhaps I’m talking about building, disseminating and updating a shared body of knowledge. The building part already been done, but I’m not sure I’ve ever met anyone who’s read the whole SWEBOK or referred to any part of it in their own writing or presentations so we’ll call the dissemination part a failure.

Actually, as I said we only really need an index, not the whole BOK itself: these do exist for various parts of the programming endeavour. Well, maybe not indices so much as catalogues; summaries of the state of the art occasionally with helpful references back to the primary material. Some of them are even considered “standards”, in that they are the go-to places for the information they catalogue:

  • If you want an algorithm, you probably want The Art of Computer Programming or Numerical Recipes. Difficulties: you probably won’t understand what’s written in there (the latter book in particular assumes a bunch of degree-level maths).
  • If you want idioms for your language, look for a catalogue called “Effective <name of your language>”. Difficulty: some people will disagree with the content here just to be contrary.
  • If you want a pattern, well! Have we got a catalogue for you! In fact, have we got more catalogues than distinct patterns! There’s the Gang of Four book, the PloP series, and more. If you want a catalogue that looks like it’s about patterns but is actually comprised of random internet commentators trying to prove they know more than Alastair Cockburn, you could try out the Portland Pattern Repository. Difficulty: you probably won’t know what you’re looking for until you’ve already read it—and a load of other stuff.

I’ve already discussed how conference talks are a double-edged sword when it comes to knowledge sharing: they reach a small fraction of the practitioners, take information from an even smaller fraction, and typically set up a subculture with its own values distinct from programming in the large. The same goes for company-internal knowledge sharing programs. I know a few companies that run such programs (we do where I work, and Etsy publish the talks from theirs). They’re great for promoting research, learning and sharing within the company, but you’re always aware that you’re not necessarily discovering things from without.

So I consider this one of the great unsolved problems in programming at the moment. In fact, let me express it as two distinct questions:

  1. How do I make sure that I am not reinventing wheels, solving problems that no longer need solving or making mistakes that have already been fixed?
  2. A new (and for sake of this discussion) inexperienced programmer joins my team. How do I help this person understand the problems that have already been solved, the mistakes that have already been made, and the wheels that have already been invented?

Solve this, and there are only two things left to do: fix concurrency, name things, and improve bounds checking.