The balloon goes up

To this day, many Smalltalk projects have a hot air balloon in their logo. These reference the cover of the issue of Byte Magazine in which Smalltalk-80 was shared with the wider programming community.

A hot air balloon bearing the word "Smalltalk" sails over a castle on a small island.

Modern Smalltalks all have a lot in common with Smalltalk-80. Why? If you compare Smalltalk-72 with Smalltalk-80 there’s a huge amount of evolution. So why does Cincom Smalltalk or Amber Smalltalk or Squeak or even Pharo still look quite a lot like Smalltalk-80?

My answer is because they are used. Actually, Alan’s answer too:

Basically what happened is this vehicle became more and more a programmer’s vehicle and less and less a children’s vehicle—the version that got put out, Smalltalk ’80, I don’t think it was ever programmed by a child. I don’t think it could have been programmed by a child because it had lost some of its amenities, even as it gained pragmatic power.

So the death of Smalltalk in a way came as soon as it got recognized by real programmers as being something useful; they made it into more of their own image, and it started losing its nice end-user features.

I think there are two different things you want from a programming language (well, programming environment, but let’s not split tree trunks). Referencing the ivory tower on the Byte cover, let’s call them “academic” and “industrial”, these two schools.

The industrial ones are out there, being used to solve problems. They need to be stable (some of these problems haven’t changed much in decades), easy to understand (the people have changed), and they don’t need to be exciting, they just need to work. Cobol and Fortran are great in this regard, as is C and to some extent C++: you take code written a bajillion years ago, build it, and it works.

The academic ones are where the new ideas get tried out. They should enable experiment and excitement first, and maybe be easy to understand (but if you need to be an expert in the idea you’re trying out, that’s not so bad).

So the industrial and academic languages have conflicting goals. There’s going to be bad feedback set up if we try to achieve both goals in one place:

  • the people who have used the language as a tool to solve problems won’t appreciate it if new ideas come along that mean they have to work to get their solution building or running correctly, again.
  • the people who have used the language as a tool to explore new ideas won’t appreciate it if backwards compatibility hamstrings the ability to extend in new directions.

Unfortunately at the moment a lot of languages are used for both, which leads to them being mediocre at both. The new “we’ve done C but betterer” languages like Go, Rust etc. feature people wanting to add new features, and people wanting not to have existing stuff broken. JavaScript is a mess of transpilation, shims, polyfills, and other words that mean “try to use a language, bearing in mind that nobody agrees how it’s supposed to work”.

Here are some patterns for managing the distinction that have been tried in the past:

  • metaprogramming. Lisp in particular is great at having a little language that you can use to solve your problems, and that you can also use to make new languages or make the world work differently to see how that would work. Of course, if you can change the world then you can break the world, and Lisp isn’t great at making it clear that there’s a line between following the rules and writing new ones.
  • pragmas. Haskell in particular is great at having a core language that people understand and use to write software, and a zillion flags that enable different features that one person pursued in their PhD that one time. Not all of the flag combinations may be that great, and it might be hard to know which things work well and which worked well enough to get a dissertation out of. But these are basically the “enable academic mode” settings, anyway.
  • versions. Perl and Python both ran for years with two versions side by side: version x was the safe, stable, industrial language, and version y (it’s not x+1: Python’s parallel versions were 2 and 3000) was where people could explore extensions, removals, or other changes in potentially breaking ways. At some point, each project got to the point where they were happy with the choices, and declared the new version “ready” and available for industrial use. This involved some translation from version x, which wasn’t necessarily straightforward (though in the case of Python the difficulty was commonly overblown, so people avoided going from 2 to 3 even when it was easy). People being what they are, they put a lot of store in version numbers. So some people didn’t like that folks were recommending x when there was this clearly newer y available.
  • FFIs. You can call industrial C89 code (which still works after three decades) from pretty much any academic language you care to invent. If you build a JVM language, it can do what it wants, and still call Java code.

Anyway, I wonder whether that distinction between academic and industrial might be a good one to strengthen. If you make a new programming language project and try to get “users” too soon, you could lose the ability to take the language where you want it to go. And based on the experience of Smalltalk, too soon might be within the first decade.

Image

I love my Testsphere deck, from Ministry of Testing. I’ve twice seen Riskstorming in action, and the first time that I took part I bought a deck of these cards as soon as I got back to my desk.

I’m not really a tester, though I have really been a tester in the past. I still fall into the trap of thinking that I set out to make this thing do a thing, I have made it do a thing, therefore I am done. I’m painfully aware when metacognating that I am definitely not done at that point, but back “in the zone” I get carried away by success.

One of the reasons I got interested in Design by Contract was the false sense of “done” I feel when TDDing. I thought of a test that this thing works. I made it pass the test. Therefore this thing works? Well, no: how can I keep the same workflow, and speed of progress but improve the confidence in the statement?

The Testsphere cards are like a collection of mnemonics for testers, and for people who otherwise find themselves wondering whether this software really works. Sometimes I cut the deck, look at the card I’ve found, and think about what it means for my software. It might make me think about new ways to test the code. It might make me think about criticising the design. It might make me question the whole approach I’m taking. This is all good: I need these cues.

I just cut the deck and found the “Image” card, which is in the Heuristics section of the deck. It says that it’s a consistency heuristic:

Is your product true to the image and reputation you or your app’s company wishes to project?

That’s really interesting. How would I test for that? OK, I need to know what success is, which means I need to know “the image and reputation [we wish] to project”. That sounds very much like a marketing thing. Back when I ran the mobile track at QCon London, Jaimee Newberry gave a great talk about finding the voice for your product. She suggested identifying a celebrity whose personality embodies the values you want to project, then thinking about your interactions with your customers as if that personality were speaking to them.

It also sounds like there’s a significant user or customer experience part to this definition. Maybe marketing can tell me what voice, tone, or image we want to suggest to our customers, but what does it mean to say that a touchscreen interface works like Lady Gaga? Is that swipe gesture the correct amount of quirky, unexpected, subversive, yet still accessible? Do the features we have built shout “Poker Face”?

We’ll be looking at user interface design, too. Graphic design. Sound design. Copyediting. The frequency of posts on the email list, and the level of engagement needed. Pricing, too: it’s no good the brochure projecting Fortnum & Mason if the menu says Five Guys.

This doesn’t seem like something I’m going to get from red to green in a few minutes in Emacs. And it’s one of a hundred cards.

Why 80?

80 characters per line is a standard worth sticking to, even today. OK, why?

Well, back up. Let’s examine the axioms. Is 80 characters per line a standard? Not really, it’s a convention. IBM cards (which weren’t just made by IBM or read by IBM machines) were certainly 80 characters wide, as were DEC video terminals, which Macs etc. emulate. Actually, that’s not even true. The DEC VT-05 could display 72 characters per line; the later VT-50 and its successors introduced 80 characters. The VT-100 could display 132 characters per line, the same quantity as a line printer (including the ones made by IBM). Other video terminals had 40 or 64 character lines. Teletypewriters typically had shorter lines, like 70 characters.

Typewriters were typically limited to \((\mathrm{width\ of\ page} - 2 \times \mathrm{margin\ width}) \times \mathrm{character\ density}\) characters per line. With wide margins and narrow US paper, you might get 50 characters: with narrow margins and wide A4 paper, maybe 100.

IBM were not the only people to make cards, punches, and readers. Other manufacturers did, with other numbers of characters per card. IBM themselves made 40, 45 and 96 column cards. Remington Rand made cards with 45 or 90 columns.

So, axiom one modified, “80 characters per line is a particular convention out of many worth sticking to, even today.” Is it worth sticking to?

Hints are that it isn’t. “The effects of line length on reading online news” explored screen-reading with different line lengths: 35, 55, 75 and 95 cpl. The authors found, from the abstract:

Results showed that passages formatted with 95 cpl resulted in faster reading speed. No effects of line length were found for comprehension or satisfaction, however, users indicated a strong preference for either the short or long line lengths.

However that isn’t a clear slam dunk. Quoting their reference to prior work:

Research investigating line length for online text has been inconclusive. Several studies found that longer line lengths (80 – 100 cpl) were read faster than short line lengths (Duchnicky and Kolers, 1983; Dyson and Kipping, 1998). Contrary to these findings, other research suggests the use of shorter line lengths. Dyson and Haselgrove (2001) found that 55 characters per line were read faster than either 100 cpl or 25 cpl conditions. Similarly, a line length of 45-60 characters was recommended by Grabinger and Osman-Jouchoux (1996) based on user preferences. Bernard, Fernandez, Hull, and Chaparro (2003) found that adults preferred medium line length (76 cpl) and children preferred shorter line lengths (45 cpl) when compared to 132 characters per line.

So, long lines are read faster than short lines, except when they aren’t. They also found that most people preferred the longest or shortest lines the most, but also that everybody preferred the shortest or longest lines the least.

But is 95cpl a magic number? What about 105cpl, or 115cpl? What about 273cpl, which is what I get if I leave my Terminal font settings alone and maximise the window in my larger monitor? Does it even make sense for programmers who don’t have to line up the comment markers in Fortran-77 code to be using monospaced fonts, or would we be better off with proportional fonts?

And that article was about online news articles, a particular and terse form of prose, being read by Americans. Does it generalise to code? How about the observation that children and adults prefer different lengths, what causes that change? Does this apply to people from other countries? Well, who knows?

Buse and Weimer found that “average line length” was “strongly negatively correlated” with perceived readability. So maybe we should be aiming for one-character lines! Or we can offset the occasional 1,000 character line by having lots and lots of one-character lines:

}
}
}
}
}
}

It sounds like there’s information missing from their analysis. What was the actual shape of the data? What were the maximum and minimum line lengths considered, what distribution of line lengths was there?
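To make the point about the missing shape of the data concrete, here is a small sketch of my own (not anything from Buse and Weimer’s analysis) that summarises line lengths for a made-up file: one 1,000-character line followed by 99 closing braces. The class name, method names and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.IntSummaryStatistics;
import java.util.List;

final class LineLengthStats {
    /** Summarises line lengths: count, minimum, maximum and mean. */
    static IntSummaryStatistics summarise(List<String> lines) {
        return lines.stream().mapToInt(String::length).summaryStatistics();
    }

    /** Median line length; for an even count, the mean of the two middle values. */
    static double median(List<String> lines) {
        int[] sorted = lines.stream().mapToInt(String::length).sorted().toArray();
        if (sorted.length == 0) {
            throw new IllegalArgumentException("no lines to summarise");
        }
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        // Hypothetical sample: one very long line, then 99 one-character lines.
        List<String> lines = new ArrayList<>();
        lines.add("x".repeat(1000));
        for (int i = 0; i < 99; i++) {
            lines.add("}");
        }
        IntSummaryStatistics stats = summarise(lines);
        System.out.printf("mean=%.2f min=%d max=%d median=%.1f%n",
                stats.getAverage(), stats.getMin(), stats.getMax(), median(lines));
    }
}
```

For that sample the mean is a respectable-looking 10.99, while the median is 1 and the maximum is 1,000. An average on its own hides exactly the things I’d want to know.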

We’re in a good place to rewrite the title from the beginning of the post: 80 characters per line is a particular convention out of many that we know literally nothing about the benefit or cost of, even today. Maybe our developer environments need a bit of that UX thing we keep imposing on everybody else.

Ultimate Programmer Super Stack Reloaded

Remember remember the cough 6th of November, when APPropriate Behaviour joined a wealth of other learning material for software engineers in a super-discounted bundle called the Ultimate Programmer Super Stack?

It’s happening again! This is a five-day flash sale, with all the same material on levelling up as a programmer, running a startup, and learning new technologies like Aurelia, Node, Python and more. The link at the top of this paragraph goes to the sales page, and you’ve got until Monday, when it’s gone for good.

The Fragile Manifesto

A lot of what I’ve been reading and thinking about of late is about the agile backlash. “More speed, lower velocity” reflects on IT teams pursuing “deliver more/newer IT” at the cost of “help the company achieve its mission”. “Grooming the Backfog” is about one dysfunction that arises as a result: (mis)managing a never-ending road of small changes rather than looking at the big picture and finding a path toward the destination. “Our products are not our products” attempts to address this problem by recasting teams not as makers of product, but as solvers of problems.

Here’s the latest: “UK wasting £37 billion a year on failed agile IT projects”. Some people will say that this is a result of not Agiling enough: if you were all Lean and MVP and whatever you’d not get to waste all of that money. I don’t necessarily agree with that: I think there are actually things to learn by, y’know, reading the article.

The truth is that, despite the hype, Agile development doesn’t always work in practice.

True enough, but not a helpful statement, because “Agile” now means a lot of different things to different people. If we take it to mean the values, principles and practices written by the people who came up with the term, then I can readily believe that it wouldn’t work in practice for people whose context is different from those who came up with the ideas in 2001. Which may well be everyone.

I’m also very confident that it doesn’t mean that. I met a team recently who said they did “Agile”, and discussed their standups and two-week iterations. They also described how they were considering whether to go from an annual to biannual release.

Almost three quarters (73%) of CIOs think Agile IT has now become an industry in its own right while half (50%) say they now think of Agile as “an IT fad”.

The Agile-Industrial Complex is well-documented. You know what isn’t well-documented? Your software.

The report revealed 44% of Agile IT projects that fail, do so because of a failure to produce enough (or any) documentation.

The survey found that 34% of failed Agile projects failed because of a lack of upfront and ongoing planning. Planning is a casualty of today’s interpretation of the Agile Manifesto[…]

68% of CIOs agree that agile teams require more Architects. From defining strategy, to championing technical requirements (such as performance and security) to ensuring development teams stick to the rules of the game, the role of the Architect is sorely missed in the agile space. It must be reintroduced.

A bit near the top of the front page of the manifesto for agile software development is a sentence fragment that says:

Working software over comprehensive documentation

Before we discuss that fragment, I’d just like to quote the end of the sentence. It’s a long way further down the page, so it’s possible that some readers have missed it.

That is, while there is value in the items on the right, we value the items on the left more.

Refactor -> Inline Reference:

That is, while there is value in comprehensive documentation, we value working software more.

Refactor -> Extract Statement:

There is value in comprehensive documentation.

Now I want to apply the same set of transforms to another of the sentence fragments:

There is value in following a plan.

Nobody ever said don’t have a plan. You should have a plan. You should be willing to amend the plan. I was recently asked what I’d do if I found that my understanding of the “requirements” of a system differ from the customer’s understanding. It depends a lot on context but if there truly is a “the customer” and they want something that I’m not expecting to offer them, it’s time for me to either throw away my version or find a different customer.

Similarly, nobody said don’t have comprehensive documentation. I have been on a very “by-the-book” Agile team, where a developer team lead gave feedback that they couldn’t work out where a change would go to enable a particular feature. That’s architecture! What they wanted was an architectural plan of the system. Except that they couldn’t explicitly want that, because software architecture is so, ugh, 1990s and Rational Rose. Wanting an architecture diagram is like wanting to use CORBA, urrr.

Once you get past that bizarre emotional response, give me a call.

Grooming the Backfog

This is “Pub Walks in Warwickshire”. NEW EDITION, it tells me! This particular EDITION was actually NEW back in 2008. It’s no longer in print.

Pub Walks in Warwickshire

Each chapter is a separate short walk, starting and finishing at a pub with a map and instructions to find your way around the walk. Some of the instructions are broken: a farmer has put a barbed wire fence across a field, or a gate has been replaced or removed. You find when you get there that it’s impossible to follow the instructions, and you have to invent a new route to get back on track. You did bring a different map, didn’t you? If not, you’ll be relying on good old-fashioned trial and error.

Other problems are more catastrophic. The Crown at Napton-on-the-hill seems to have closed in about 2013, so an attempt to do a circular walk ending with a pint there is going to run into significant difficulties, and come to an unsatisfactory conclusion. The world has moved on, and those directions are no longer relevant. You might want to start/end at the Folly, but you’ll have to make up a route that joins to the bits described here.

This morning, a friend told me of a team that he’d heard of who were pulling 25 people in to a three-hour backlog grooming session. That sounds like they’re going to write the NEW EDITION of “Pub Walks in Warwickshire” for their software, and that by the time they come around to walking the route they’ll find some of the paths are fenced over and the pubs closed.

Decomposing the Analogy

A lengthy, detailed backlog is not any different from having a complete project plan in advance of starting work, and comes with the same problems. Just like the pub walks book, you may find that some details need to change when you get to tackling them, therefore there was no value in spending the time constructing all of those details in the first place. These sorts of changes happen when assumptions about the organisation or architecture of the system are invalidated. Yes, you want this feature, but you can no longer put it in the Accounts module because you found that customers think about that when they’re sorting their bills, not their accounts. Or you need to put more effort into handling input from an external data source, because the way it really works isn’t quite the same as the documentation.

Or you find that a part of the landscape is no longer present and there’s no value in being over there. This happens when the introduction of your system, or a competitors’, means that people no longer worry about the problem they had back at the start. Or when changes in what people are trying to do mean they no longer want or need to solve that problem at all.

A book of maps and directions is a snapshot in time of ways to navigate the landscape. If it takes long enough to follow all of the directions, you will find that the details on the ground no longer match the approximation provided by the book.

A backlog of product features and stories is a snapshot in time of ways to develop the product. If it takes long enough to implement all of the features, you will find that the details in the environment no longer match the approximation provided by the backlog.

A Feeling of Confidence

We need to accept that people are probably producing this hefty backlog because they feel good about doing it, and replace it with something else to feel good about. Otherwise, we’re just making people feel bad about what they’re doing, or making them feel bad by no longer doing it.

What people seem to get from detailed plans is confidence. If what they’re confident in is “the process as documented says I need a backlog, and I feel confident that I have done that” then there’s not much we can do other than try to change the process documentation. But reality probably isn’t that facile. The confidence comes from knowing where they’re trying to go, and having a plan to get there.

We can substitute that confidence with frequent feedback: confidence that the direction they’re going in now is the best one given current knowledge, and that it’s really easy to get updates and course corrections. Replace the confidence of a detailed map with the confidence of live navigation.

On the Backfog

A software team should still have an idea of where it’s going. It helps to situate today’s development in the context of where we think (but do not know) we will be soon, to organise the system into a logical architecture, to see which bits of flexibility Ya [Probably] Ain’t Gonna Need and which bits Ya [Probably] Are. It also helps to have the discussion with people who might buy our stuff, because we can say “we think we’re going to do these things in the coming months” and they can say “I will give you a wheelbarrow full of money if you do this one first” or “actually I don’t need that thing so I hope it doesn’t get in my way”.

But we don’t need to know the detailed steps and directions to get there, because building those details now will be wasted effort if things change by the time we are ready to tackle all of the pieces. Those discussions we’re having with the people who might buy our stuff? They might, and indeed probably should, change that high-level direction.

Think of it like trying to navigate an unknown landscape in fog. You know that where you’re trying to get to is over there somewhere, but you can’t clearly see the whole path from here. You probably wouldn’t just take a compass bearing and head toward the destination. You’d look at what you can see around, and what paths there are. You’d check a map, sure, but you’d probably compare it with what you can see. You’d phone ahead to the destination, and check that they expect to be open when you expect to get there. You’d find out if there are any fruitful places to stop along the way.

So yes, share the high-level direction, it’s helpful. But share the uncertainty too. The thing we’re doing next should definitely be known, the thing we’re doing later should definitely be guesswork. Get confidence not from colouring in the plan all the way up to the edges, but by knowing how ready and able you are to update the plan.

On the continuous history of approximation

The Difference Engine – the Charles Babbage machine, not the steampunk novel – is a device for tabulating successive values of a polynomial by adding up the differences between its values at successive inputs.
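The method of differences is easier to see in code than in a sentence. This is a minimal sketch of my own, not a model of Babbage’s hardware: the class name, the method name and the example polynomial p(x) = x^2 + x + 41 are all illustrative choices. You seed it with p(0) and the forward differences at 0, and every later value comes from additions alone.

```java
final class DifferenceTable {
    /**
     * Tabulates a polynomial at x = 0, 1, 2, ... using additions only.
     * initial[0] is p(0); initial[k] is the k-th forward difference at 0.
     * For a polynomial of degree d, supply d + 1 entries.
     */
    static long[] tabulate(long[] initial, int count) {
        long[] column = initial.clone();
        long[] values = new long[count];
        for (int n = 0; n < count; n++) {
            values[n] = column[0];
            // Walk up the column: each entry absorbs the next-higher difference.
            // Going in increasing k means column[k + 1] is still the old value.
            for (int k = 0; k < column.length - 1; k++) {
                column[k] += column[k + 1];
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // p(x) = x^2 + x + 41: p(0) = 41, first difference 2, second difference 2.
        long[] values = tabulate(new long[] {41, 2, 2}, 6);
        for (long v : values) {
            System.out.println(v);
        }
    }
}
```

The second difference of a quadratic is constant, so the last entry in the column never changes. That is why no multiplication is needed once the machine is set up.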

This sounds like a fairly niche market, but in fact it’s quite useful because there are a whole lot of other functions that can be approximated by polynomials. The approach, which is based in calculus, generates a Taylor series (or a Maclaurin series, if the approximation is for input values near zero).

Now, it happens that this collection of other functions includes logarithms:

\(\ln(1+x) \approx x - x^2/2 + x^3/3 - x^4/4 + \ldots\)

and exponents:

\(e^x \approx 1 + x + x^2/2! + x^3/3! + x^4/4! + \ldots\)

and so, given a difference engine, you can make tables of logarithms and exponents.
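As a rough check on those two series, here is a sketch that sums the first few terms and compares the result with the library functions. The class and method names are my own, and the number of terms is an arbitrary choice: the logarithm series only behaves for |x| < 1, and converges quickly only when x is small.

```java
final class MaclaurinSeries {
    /** Partial sum x - x^2/2 + x^3/3 - ... with the given number of terms. */
    static double log1pSeries(double x, int terms) {
        double sum = 0.0;
        double power = x; // holds x^k
        for (int k = 1; k <= terms; k++) {
            sum += (k % 2 == 1 ? power : -power) / k;
            power *= x;
        }
        return sum;
    }

    /** Partial sum 1 + x + x^2/2! + x^3/3! + ... with the given number of terms. */
    static double expSeries(double x, int terms) {
        double sum = 0.0;
        double term = 1.0; // holds x^k / k!
        for (int k = 0; k < terms; k++) {
            sum += term;
            term *= x / (k + 1);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.printf("ln(1.1): series=%.15f library=%.15f%n",
                log1pSeries(0.1, 12), Math.log1p(0.1));
        System.out.printf("e^0.5:   series=%.15f library=%.15f%n",
                expSeries(0.5, 15), Math.exp(0.5));
    }
}
```

Each partial sum is itself a polynomial in x, which is what makes it something a difference engine could tabulate.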

In fact, your computer is probably using exactly this approach to calculate those functions. Here’s how glibc calculates ln(x) for x roughly equal to 1:

  r = x - 1.0;
  r2 = r * r;
  r3 = r * r2;
  y = r3 * (B[1] + r * B[2] + r2 * B[3]
    + r3 * (B[4] + r * B[5] + r2 * B[6]
        + r3 * (B[7] + r * B[8] + r2 * B[9] + r3 * B[10])));
  // some more twiddling that adds terms in r and r*r, then return y

In other words, it works out r so that it is calculating ln(1+r), instead of ln(x). Then it adds together r + a*r^2 + b*r^3 + c*r^4 + d*r^5 + ... + k*r^12…it does the Taylor series for ln(1+r)!

Now given these approximations, we can combine numbers into probabilities (using the sigmoid function, which is in terms of e^x) and find the errors on those probabilities (using the cross entropy, which is in terms of ln(x)). We can build a learning neural network!
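For the record, these are the two functions I mean, sketched for a single output and a single training example. I’m assuming the usual binary form of cross entropy here; the names are mine, and `Math.exp` and `Math.log` stand in for whatever series approximation you have to hand.

```java
final class SigmoidCrossEntropy {
    /** Squashes any real number into a probability in (0, 1). */
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /**
     * Binary cross entropy for one example.
     * y is the true label (0 or 1); p is the predicted probability, strictly between 0 and 1.
     */
    static double crossEntropy(double y, double p) {
        return -(y * Math.log(p) + (1.0 - y) * Math.log(1.0 - p));
    }

    public static void main(String[] args) {
        double undecided = sigmoid(0.0);
        double confident = sigmoid(3.0);
        // A confident, correct prediction should be penalised less than a coin flip.
        System.out.println(crossEntropy(1.0, undecided));
        System.out.println(crossEntropy(1.0, confident));
    }
}
```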

And, more than a century after it was designed, our technique could still do it using the Difference Engine.

Java By Contract: a Worked Example

Java by Contract is an implementation of Design by Contract, as promoted by Bertrand Meyer and the Eiffel Software company, for the Java programming language. The contract is specified using standard Java methods and annotations, making it a more reliable tool than earlier work which used javadoc comments and rewrote the Java source code to include the relevant tests.

Which is all well and good, but how do you use it? Here’s an example.

The problem

There is a whole class of algorithms to approximately find roots to a function, using an iterative technique. Given, in Java syntax, the abstract type MathFunction that implements a function over the double type:

interface MathFunction {
    double f(double x);
}

Define the abstract interface that exposes such an iterative solution, including the details of its contract.

The solution

The interface is designed using the Command-Query Separation Principle. Given access to the function f(), the interface has a command findRoot(seed1, seed2) which locates the root between those two values, and a query root() which returns that root. Additionally, a boolean query exhaustedIterations() reports whether the search used up its iteration allowance, i.e. whether it stopped without converging.

Both of the queries have the precondition that the command must previously have successfully run; i.e. you cannot ask what the answer was without requesting that the answer be discovered.

The contract on the command is more interesting. The precondition is that for the two seed values seed1 and seed2, one of them must correspond to a point f(x) > 0 and the other to a point f(x) < 0 (it does not matter which). This guarantees an odd, and therefore non-zero, number of roots[*] to f(x) between the two, and the method will iterate toward one of them. If the precondition does not hold, then an even number of roots (including possibly zero) lies between the seed values, so it cannot be guaranteed that a solution exists to find.

In return for satisfying the precondition, the command guarantees that it either finds a root or exhausts its iteration allowance looking. Another way of putting that: if the method exits early, it is because it has already found a convergent solution.

Many, but not all, of these contract details can be provided as default method implementations in the interface. The remainder must be supplied by the implementing class.

/**
 * Given a mathematical function f over the doubles, and two bounds for a root to that function,
 * find the root using an (unspecified) iterative approach.
 * A root is an input value x such that f(x)=0.
 */
public interface RootFinder {
    /**
     * @return The function that this object is finding a root for.
     */
    MathFunction f();
    /**
     * A root to the function f() is thought to lie between seed1 and seed2. Find it.
     * @param seed1 One boundary for the root to f().
     * @param seed2 Another boundary for the root to f().
     */
    @Precondition(name = "seedGuessesStraddleRoot")
    @Postcondition(name = "earlyExitImpliesConvergence")
    void findRoot(double seed1, double seed2);
    /**
     * @return The root to the function f() that was discovered.
     */
    @Precondition(name = "guessWasCalculated")
    Double root();
    /**
     * @return Whether the iterative solution used the maximum number of iterations.
     */
    @Precondition(name = "guessWasCalculated")
    boolean exhaustedIterations();

    default Boolean guessWasCalculated() {
        return this.root() != null;
    }
    default Boolean seedGuessesStraddleRoot(Double seed1, Double seed2) {
        double r1 = f().f(seed1);
        double r2 = f().f(seed2);
        return ((r1 > 0 && r2 < 0) || (r1 < 0 && r2 > 0));
    }
    Boolean earlyExitImpliesConvergence(Double seed1, Double seed2, Void result);
}

Example usage

There are swaths of algorithms to implement this interface. See, for example, the book Numerical Recipes. Given a particular implementation, we can look for roots of a simple function, for example f(x) = x^2 - 2:

    RootFinder squareRootOfTwo = SecantRootFinder.finderForFunction((double x) -> x*x - 2);
    squareRootOfTwo.findRoot(1.0, 2.0);
    System.out.println(String.format("Root: %f", squareRootOfTwo.root()));
    System.out.println(String.format("The solution did%s converge before hitting the iteration limit",
            squareRootOfTwo.exhaustedIterations()?"n't":""));

This suggests that a root exists at x~=1.414214, and that it converged on the solution before running out of goes. Let’s see if there’s another root between 2 and 3:

Exception in thread "main" online.labrary.javaByContract.ContractViolationException:
  online.labrary.javaByContract.Precondition seedGuessesStraddleRoot had unexpected value false on object
  online.labrary.jbcTests.TestGeneratorTests$SecantRootFinder@29774679
    at javaByContract/online.labrary.javaByContract.ContractEnforcer.invoke(ContractEnforcer.java:92)
    at jdk.proxy1/com.sun.proxy.jdk.proxy1.$Proxy4.findRoot(Unknown Source)
    at javaByContract/online.labrary.rootFinder.RootFinder.main(RootFinder.java:10)

Whoops! I’m holding it wrong: the function doesn’t change sign between x=2 and x=3. I shouldn’t expect the tool to work, and indeed it’s been designed to communicate that expectation by failing a precondition.
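If you want to see the shape of that contract without pulling in the library, here is a self-contained sketch with the checks written out by hand. It is not how JavaByContract or my SecantRootFinder works: it uses bisection instead of the secant method, plain exceptions instead of annotations, and every name in it (BisectionSketch, the constructor parameters, the exception messages) is made up for illustration.

```java
import java.util.function.DoubleUnaryOperator;

final class BisectionSketch {
    private final DoubleUnaryOperator f;
    private final int maxIterations;
    private final double tolerance;
    private Double root;        // null until findRoot has run successfully
    private boolean exhausted;

    BisectionSketch(DoubleUnaryOperator f, int maxIterations, double tolerance) {
        this.f = f;
        this.maxIterations = maxIterations;
        this.tolerance = tolerance;
    }

    /** Hand-written stand-in for the seedGuessesStraddleRoot precondition. */
    boolean seedGuessesStraddleRoot(double seed1, double seed2) {
        double r1 = f.applyAsDouble(seed1);
        double r2 = f.applyAsDouble(seed2);
        return (r1 > 0 && r2 < 0) || (r1 < 0 && r2 > 0);
    }

    /** Command: look for a root between the seeds, or give up after maxIterations. */
    void findRoot(double seed1, double seed2) {
        if (!seedGuessesStraddleRoot(seed1, seed2)) {
            throw new IllegalArgumentException("precondition seedGuessesStraddleRoot failed");
        }
        double lo = seed1;
        double hi = seed2;
        double fLo = f.applyAsDouble(lo);
        exhausted = true; // cleared only if we exit early, i.e. on convergence
        for (int i = 0; i < maxIterations; i++) {
            double mid = (lo + hi) / 2.0;
            double fMid = f.applyAsDouble(mid);
            if (fMid == 0.0 || Math.abs(hi - lo) / 2.0 < tolerance) {
                exhausted = false;
                lo = mid;
                hi = mid;
                break;
            }
            if ((fMid > 0) == (fLo > 0)) {
                lo = mid;   // same sign as the lo end: the root is in the other half
                fLo = fMid;
            } else {
                hi = mid;
            }
        }
        root = (lo + hi) / 2.0;
    }

    /** Query: only valid once findRoot has run. */
    double root() {
        if (root == null) {
            throw new IllegalStateException("precondition guessWasCalculated failed");
        }
        return root;
    }

    /** Query: only valid once findRoot has run. */
    boolean exhaustedIterations() {
        if (root == null) {
            throw new IllegalStateException("precondition guessWasCalculated failed");
        }
        return exhausted;
    }

    public static void main(String[] args) {
        BisectionSketch finder = new BisectionSketch(x -> x * x - 2, 100, 1e-9);
        finder.findRoot(1.0, 2.0);
        System.out.printf("Root: %f%n", finder.root());
        finder.findRoot(2.0, 3.0); // throws: f does not change sign on [2, 3]
    }
}
```

The trade-off is visible straight away: with the checks inlined, the contract is tangled up with the algorithm, and nothing about it shows up in the interface or its documentation.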

[*] Nitpick: roots _or singularities_.

Updates to JavaByContract

Some improvements to JavaByContract, the design-by-contract tool for Java:

  • Preconditions, Postconditions and Invariants now appear in the Javadoc for types that use JavaByContract. While this is only a small source change, it’s a huge usability improvement, as programmers using your types can now read the contracts for those types in their documentation.
  • There is Javadoc for the JavaByContract package.
  • The error message on contract violation distinguishes between precondition, postcondition and invariant violation.

I’m speaking generally about moving beyond TDD, using JavaByContract as a specific example, at Coventry Tech Meetup next week. See you there!