ClassBrowser’s public face

I made a couple of things:

I should’ve done both of these things at the beginning of the project. I believe that the fact I opened the source really early, when it barely did one thing and then only on my machine™, was a good thing. It let people find out about the project, have a look, and even send changes. But not giving anywhere to discuss ClassBrowser was a mistake. It meant I answered questions in private messages that would’ve been better archived, and it probably turned some people away who couldn’t immediately see what the point was.

Hopefully it’s not too late to fix that.

ClassBrowser: warts and all

I previously gave a sneak peak of ClassBrowser, a dynamic execution environment for Objective-C. It’s not anything like ready for general use (in fact it can’t really do ObjC very well at all), but it’s at the point where you can kick the tyres and contribute pull requests. Here’s what you need to know:

Have a lot of fun!

ClassBrowser is distributed under the terms of the University of Illinois/NCSA licence (because it is based partially on code distributed with clang, which is itself under that licence).

A sneaky preview of ClassBrowser

Let me start with a few admissions. Firstly, I have been computering for a good long time now, and I still don’t really understand compilers. Secondly, work on my GNUstep Web side-project has tailed off for a while, because I decided I wanted to try something out to learn about the compiler before carrying on with that work. This post will mostly be about that something.

My final admission: if you saw my presentation on the ObjC runtime you will have seen an app called “ClassBrowser” where I showed that some of the Foundation classes are really in the CoreFoundation library. Well, there were two halves to the ClassBrowser window, and I only showed you the top half that looked like the Smalltalk class browser. I’m sorry.

So what’s the bottom half?

This is what the bottom half gives me. It lets me go from this:

ClassBrowser before

via this:

Who are you calling a doIt?

to this:

ClassBrowser after

What just happened?

You just saw some C source being compiled to LLVM bit code, which is compiled just-in-time to native code and executed, all inside that browser app.

Why?

Well why not? Less facetiously:

  • I’m a fan of Smalltalk. I want to build a thing that’s sort of a Smalltalk, except that rather than being the Smalltalk language on the Objective-C runtime (like F-Script or objective-smalltalk), it’ll be the Objective-C(++) language on the Objective-C runtime. So really, a different way of writing Objective-C.
  • I want to know more about how clang and LLVM work, and this is as good a way as any.
  • I think, when it actually gets off the ground, this will be a faster way of writing test-first Objective-C than anything Xcode can do. I like Xcode 5, I think it’s the better Xcode 4 that I always wanted, but there are gains to be had by going in a completely different direction. I just think that whoever strikes out in such a direction should not release their research project as a new version of Xcode :-).

Where can I get me a ClassBrowser?

You can’t, yet. There’s some necessary housekeeping that needs to be done before a first release, replacing some “research” hacks with good old-fashioned tested code and ensuring GNUstep-GUI compatibility. Once that’s done, a rough-and-nearly-ready first open source release will ensue.

Then there’s more work to be done before it’s anything like useful. Particularly while it’s possible to use it to run Objective-C, it’s far from pleasant. I’ve had some great advice from LLVM IRC on how to address that and will try to get it to happen soon.

Story points: because I don’t know what I’m doing

The scenario

[Int. developer’s office. Developer sits at a desk that faces the wall. Two of the monitors on Developer’s desk are on stands, if you look closely you see that the third is balanced on the box set of The Art of Computer Programming, which is still in its shrink-wrap. Developer notices you and identifies an opportunity to opine about why the world is wrong, as ever.]

Every so often, people who deal with the real world instead of the computer world ask us developers annoying questions about how our work interacts with so-called reality. You’re probably thinking the same thing I do: who cares, right? I’m right in the middle of a totally cool abstraction layer on top of the operating system’s abstraction layer that abstracts their abstraction so I can interface it to my abstraction and abstract all the abstracts, what’s that got to do with reality and customers and my employer and stuff?

Ugh, damn, turning up my headphones and staring pointedly at the screen hasn’t helped, they’re still asking this question. OK, what is it?

Apparently they want to know when some feature will be done. Look, I’m a programmer, I’m absolutely the worst person to ask about time. OK, I believe that you might want to know whether this development effort is going to deliver value to the customers any time soon, and whether we’re still going to be ahead financially when we’re done, or whether it’d be better to take on some other work. And really I’d love to answer this question, except for one thing:

I have absolutely no idea what I’m doing. Seriously, don’t you remember all the other times that I gave you estimates and they were way off? The problem isn’t some systematic error in the way I think about how long it’ll take me to do stuff, it’s that while I can build abstractions on top of other abstractions I’m not so great at going the other way. Give me a short description of a task, I’ll try and work out what’s involved but I’m likely to miss something that will become important when I go to do it. It’s these missed details that add time, and I don’t know how many of those there will be until I get started.

The proposed solution

[Developer appears to have a brainwave]

Wait, remember how my superpower is adding layers of abstraction? Well your problem of estimation looks quite a lot like a nail to me, so I’ll apply my hammer! Let’s add a layer of abstraction on top of time!

Now you wanted to know how long it’ll take to finish some feature. Well I’ll tell you, but I won’t tell you in units of hours or days, I’ll use BTUs (Bullshit Time Units) instead. So this thing I’m working on will be about five BTUs. What do you mean, that doesn’t tell you when I’ll be done? It’s simple, duh! Just wait a couple of months, and measure how many BTUs we actually managed to complete. Now you know how many BTUs per day we can do, and you know how long everything takes!

[Developer puts their headphones back in, and turns to face the monitor. The curtain closes on the scene, and the Humble(-ish) Narrator takes the stage.]

The observed problem

Did you notice that the BTU doesn’t actually solve the stated problem? If it’s possible to track BTU completion over time until we know how many BTUs get completed in an iteration, then we are making the assumption that there is a linear relationship between BTUs and units of time. Just as there are 40 (or 90, if you picked the wrong recruiter) hours to the work week, so there are N BTUs to the work week. A BTU is worth x hours, and we just need to measure for a bit until we find the value of x.

But Developer’s problem was not a failure to understand how many hours there are in an hour. Developer’s problem was a failure to know what work is outstanding. An inability to foresee what work needs to be done cannot be corrected by any change to the way in which work to be done is mapped onto time. It is, to wear out even further an already tired saw, an unknown unknown.

What to do about it

We’re kindof stuck, really. We can’t tell how long something will take until we do it, not because we’re bad at estimating how long it’ll take to do something but because we’re bad at knowing what it is we need to do.

The little bit there about “until we do it” is, I think, what we need to focus on. I can’t tell you how long something I haven’t done will take, but I can probably tell you what problems are outstanding on the thing I’m doing now. I can tell you whether it’s ready now, or whether I think it’ll be ready “soon” or “not soon”.

So here’s the opportunity: we’ll keep whatever we’ve already got ready for immediate release. We’ll share information about which of the acceptance tests are passing, and if we were to release right now you’d know what customers will get from that. Whatever the thing we’re working on now is, we’ll be in a position to decide whether to switch away if we can do some more valuable work instead.

Conflicts in my mental model of Objective-C

My worldview as it relates to the writing of software in Objective-C contains many items that are at odds with one another. I either need to resolve them or to live with the cognitive dissonance, gradually becoming more insane as the conflicting items hurl one another at my cortex.

Of the programming environments I’ve worked with, I believe that Objective-C and its frameworks are the most pleasant. On the other hand, I think that Objective-C was a hack, and that the frameworks are not without their design mistakes, regressions and inconsistencies.

I believe that Objective-C programmers are correct to side with Alan Kay in saying that the designers of C++ and Java missed out on the crucial part of object-oriented programming, which is message passing. However I also believe that ObjC missed out on a crucial part of object-oriented programming, which is the compiler as an object. Decades spent optimising the compile-link-debug-edit cycle have been spent on solving the wrong problem. On which topic, I feel conflicted by the fact that we’ve got this Smalltalk-like dynamic language support but can have our products canned for picking the same selector name as some internal secret stuff in someone else’s code.

I feel disappointed that in the last decade, we’ve just got tools that can do the same thing but in more places. On the other hand, I don’t think it’s Apple’s responsibility to break the world; their mission should be to make existing workflows faster, with new excitement being optional or third-party. It is both amazing and slightly saddening that if you defrosted a cryogenically-preserved NeXT application programmer, they would just need to learn reference counting, blocks and a little new syntax and style before they’d be up to speed with iOS apps (and maybe protocols, depending on when you threw them in the cooler).

Ah, yes, Apple. The problem with a single vendor driving the whole community around a language or other technology is that the successes or failures of the technology inevitably get caught up in the marketing messages of that vendor, and the values and attitudes ascribed to that vendor. The problem with a community-driven technology is that it can take you longer than the life of the Sun just to agree how lambdas should work. It’d be healthy for there to be other popular platforms for ObjC programming, except for the inconsistencies and conflicts that would produce. It’s great that GNUstep, Cocotron and Apportable exist and are as mature as they are, but “popular” is not quite the correct adjective for them.

Fundamentally I fear a world in which programmers think JavaScript is acceptable. Partly because JavaScript, but mostly because when a language is introduced and people avoid it for ages, then just because some CEO says all future websites must use it they start using it, that’s not healthy. Objective-C was introduced and people avoided it for ages, then just because some CEO said all future apps must use it they started using it.

I feel like I ought to do something about some of that. I haven’t, and perhaps that makes me the guy who comes up to a bunch of developers, says “I’ve got a great idea” and expects them to make it.

Representativeness in Software Engineering Research

The first paragraph describes the context of this post in relation to the blog on which it originally appeared, not blog.securemacprogramming.com.

For this post, I wanted to go a little bit meta. One focus of this blog will be on whether results from academic software engineering are applicable to the work I do as a commercial software developer, so it was hard to pass up this Microsoft Research paper on representativeness of research.

In a nutshell, the problem is this: imagine that you read some study that shows, for example, that schedule slippage on a software project is significantly lessened if developers are given two digestive biscuits and a cup of tea at 4pm on working days. The study examined the breaktime habits of developers on 500 open source projects. This sounds quite convincing. If this thing with the tea and biscuits is true across five hundred projects, it must be applicable to my project, right?

That doesn’t follow. There are many reasons why it might not follow: the study may be biased. The analysis may be wrong. The biscuit thing may be significant but not the root cause of success. The authors may have selected projects that would demonstrate the biscuit outcome. Projects that had initially signed up but got delayed might have dropped out. This paper evaluates one cause of bias: the projects used in the study aren’t representative of the project you’re doing.

It’s a fallacy to assume that just because a study has a large sample size, its results can be generalised to the population. This only applies in the case that the sample represents an even slice of the population. Imagine a world in which all software projects are either written in Java or LISP. Now it doesn’t matter whether I select 10 projects or 10,000 projects: a sample of LISP practices will not necessarily tell us anything about how to conduct a Java project.

Conversely a study that investigates both Java and LISP projects can—in this restricted universe, and with the usual “all other things being equal” caveat—tell us something generally about software projects independent of language. However, choice of language is only one dimension in which software can be measured: the size, activity, number of developers, licence and other factors can all be relevant. Therefore the possible phase space of important factors can be multidimensional.

In this paper the authors develop a framework, based on work in medicine and other fields, for measuring and maximising representativeness of a sample by appropriate selection of projects along the dimensions of the problem space. They apply the framework to recent research.

What they discovered, tabulated in Table II of the paper, is that while a very small, carefully-selected sample can be surprisingly representative (50 out of ~20k projects represented ~15% of their problem space), the ~200 projects they could find analysed in recent research only scored around 9% on their representativeness metric. However in certain dimensions the studies were highly representative, many covering 100% of the phase space in specific dimensions.

Conclusions

A fact that jumped out at me, because of the field I work in, is that there are 245 Objective-C projects in the universe studied by this paper (the projects indexed on Ohloh) and that not one of these is covered by any of the studies they analysed. That could mean that my own back yard is ripe for the picking: that there are interesting results to be determined by analysing Objective-C projects and comparing those results with received wisdom.

In the discussion section, the authors point out that just because a study is not general, does not mean it is not useful. You may not be able to generalise a result gleaned from analysing (say) Java developer tools to all software development, but if what you’re interested in is Java developer tools then that is not a problem.

What this paper gives us, then, is not necessarily a tool that commercial developers can run out and use. It gives us some important quantitative context on evaluating the research that we do read. And, should we want to analyse our own work and investigate hypotheses about factors affecting our projects, it gives us a framework to understand just how representative those analyses would be.

At the old/new interface: jQuery in WebObjects

It turns out to be really easy to incorporate jQuery into an Objective-C WebObjects app (targeting GNUstep Web). In fact, it doesn’t really touch the Objective-C source at all. I defined a WOJavascript object that loads jQuery itself from the application’s web server resources folder, so it can be reused across multiple components:

jquery_script:WOJavaScript {scriptFile="jquery-2.0.2.js"}

Then in the components where it’s used, any field that needs uniquely identifying should have a CSS identifier, which can be bound via WebObjects’s id binding. In this example, a text field for entering an email address in a form will only be enabled if the user has checked a “please contact me” checkbox.

email_field:WOTextField {value=email; id="emailField"}
contact_boolean:WOCheckBox {checked=shouldContact; id="shouldContact"}

The script itself can reside in the component’s HTML template, or in a WOJavascript that looks in the app’s resources folder or returns javascript that’s been prepared by the Objective-C code.

    <script>
function toggleEmail() {
    var emailField = $("#emailField");
    var isChecked = $("#shouldContact").prop("checked");
    emailField.prop("disabled", !isChecked);
    if (!isChecked) {
        emailField.val("");
    }
}
$(document).ready(function() {
    toggleEmail();
    $("#shouldContact").click(toggleEmail);
});
    </script>

I’m a complete newbie at jQuery, but even so that was easier than expected. I suppose the lesson to learn is that old technology isn’t necessarily incapable technology. People like replacing their web backend frameworks every year or so; whether there’s a reason (beyond caprice) warrants investigation.

The code you wrote six months ago

We have this trope in programming that you should hate the code you wrote six months ago. This is a figurative way of saying that you should be constantly learning and assimilating new ideas, so that you can look at what you were doing earlier this year and have new ways of doing it.

It would be more accurate, though less visceral, to say “you should be proud that the code you wrote six months ago was the best you could do with the knowledge you then had, and should be able to ways to improve upon it with the learning you’ve accomplished since then”. If you actually hate the code, well, that suggests that you think anyone who doesn’t have the knowledge you have now is an idiot. That kind of mentality is actually deleterious to learning, because you’re not going to listen to anyone for whom you have Set the Bozo Bit, including your younger self.

I wrote a lot about learning and teaching in APPropriate Behaviour, and thinking about that motivates me to scale this question up a bit. Never mind my code, how can we ensure that any programmer working today can look at the code I was writing six months ago and identify points for improvement? How can we ensure that I can look at the code any other programmer was working on six months ago, and identify points for improvement?

My suggestion is that programmers should know (or, given the existence of the internet, know how to use the index of) the problems that have already come before, how we solved them, and why particular solutions were taken. Reflecting back on my own career I find a lot of problems I introduced by not knowing things that had already been solved: it wasn’t until about 2008 that I really understood automated testing, a topic that was already being discussed back in 1968. Object-oriented analysis didn’t really click for me until later, even though Alan Kay and a lot of really other clever people had been working on it for decades. We’ll leave discussion of parallel programming aside for the moment.

So perhaps I’m talking about building, disseminating and updating a shared body of knowledge. The building part already been done, but I’m not sure I’ve ever met anyone who’s read the whole SWEBOK or referred to any part of it in their own writing or presentations so we’ll call the dissemination part a failure.

Actually, as I said we only really need an index, not the whole BOK itself: these do exist for various parts of the programming endeavour. Well, maybe not indices so much as catalogues; summaries of the state of the art occasionally with helpful references back to the primary material. Some of them are even considered “standards”, in that they are the go-to places for the information they catalogue:

  • If you want an algorithm, you probably want The Art of Computer Programming or Numerical Recipes. Difficulties: you probably won’t understand what’s written in there (the latter book in particular assumes a bunch of degree-level maths).
  • If you want idioms for your language, look for a catalogue called “Effective <name of your language>”. Difficulty: some people will disagree with the content here just to be contrary.
  • If you want a pattern, well! Have we got a catalogue for you! In fact, have we got more catalogues than distinct patterns! There’s the Gang of Four book, the PloP series, and more. If you want a catalogue that looks like it’s about patterns but is actually comprised of random internet commentators trying to prove they know more than Alastair Cockburn, you could try out the Portland Pattern Repository. Difficulty: you probably won’t know what you’re looking for until you’ve already read it—and a load of other stuff.

I’ve already discussed how conference talks are a double-edged sword when it comes to knowledge sharing: they reach a small fraction of the practitioners, take information from an even smaller fraction, and typically set up a subculture with its own values distinct from programming in the large. The same goes for company-internal knowledge sharing programs. I know a few companies that run such programs (we do where I work, and Etsy publish the talks from theirs). They’re great for promoting research, learning and sharing within the company, but you’re always aware that you’re not necessarily discovering things from without.

So I consider this one of the great unsolved problems in programming at the moment. In fact, let me express it as two distinct questions:

  1. How do I make sure that I am not reinventing wheels, solving problems that no longer need solving or making mistakes that have already been fixed?
  2. A new (and for sake of this discussion) inexperienced programmer joins my team. How do I help this person understand the problems that have already been solved, the mistakes that have already been made, and the wheels that have already been invented?

Solve this, and there are only two things left to do: fix concurrency, name things, and improve bounds checking.

Lighter UIViewControllers

The first issue of Objective-C periodical objc.io has just been announced:

Issue #1 is about lighter view controllers. The introduction tells you a bit more about this issue and us. First, Chris writes about lighter view controllers. Florian expands on this topic with clean table view code. Then Daniel explores view controller testing. Finally, in our first guest article, Ricki explains view controller containment.

Each of the articles is of a high quality, I thoroughly recommend that you read this. I powered through it this morning and have already put myself on the mailing list for the next issue. I think it’s worth quite a few dollars more than the $0 they’re asking.

On a more self-focussed note, I’m pleased to note that two of the articles cite my work. This isn’t out of some arrogant pleasure at seeing my name on someone else’s website. One of my personal goals has been to teach the people who teach the other people: to ensure that what I learn becomes a part of the memeplex and isn’t forgotten and reinvented a few years later. In APPropriate Behaviour a few of the chapters discuss the learning and teaching of software making. I consider it one of the great responsibilities of our discipline, to ensure the mistakes we had to make are not made by those who come after us.

The obj.io team have done a great job of researching their articles, taking the knowledge that has been found and stories that have been told and synthesising new knowledge and new stories from them. I’m both proud and humble about my small, indirect role in this effort.

APPropriate Behaviour is complete!

APPropriate Behaviour, the book on things programmers do that aren’t programming, is now complete! The final chapter – a philosophy of software making – has been added, concluding the book.

Just because it’s complete, doesn’t mean it’s finished: as my understanding of what we do develops I’ll probably want to correct things, or add new anecdotes or ideas. Readers of the book automatically get free updates whenever I create them in the future, so I hope that this is a book that grows with us.

As ever, the introduction to the book has instructions on joining the book’s Glassboard to discuss the content or omissions from the content. I look forward to reading what you have to say about the book in the Glassboard.

While the recommended purchase price of APPropriate Behaviour is $20, the minimum price now that it’s complete is just $10. Looking at the prices paid by the 107 readers who bought it while it was still being written, $10 is below the median price (so most people chose to pay more than $10) and the modal price (so the most common price chosen by readers was higher than $10).

A little about writing the book: I had created the outline of the book last Summer, while thinking about the things I believed should’ve been mentioned in Code Complete but were missing. I finally decided that it actually deserved to be written toward the end of the year, and used National Novel Writing Month as an excuse to start on the draft. A sizeable portion of the draft typescript was created in that month; enough to upload to LeanPub and start getting feedback on from early readers. I really appreciate the help and input those early readers, along with other people I’ve talked to the material about, have given both in preparing APPropriate Behaviour and in understanding my career and our industry.

Over the next few months, I tidied up that first draft, added new chapters, and extended the existing material. The end result – the 11th release including that first draft – is 141 pages of reflection over the decade in which I’ve been paid to make software: not a long time, but still nearly 15% of the sector’s total lifespan. I invite you to grab a copy from LeanPub and share in my reflections on that decade, and consider what should happen in the next.