Free Software and LLM Contribution Policies

Multiple free software (or open source) projects have policies that forbid, or in some cases allow with extra scrutiny and scepticism, contributions that are supported by AI-augmented tools. I believe that this is a poor decision for many reasons, which fall under these categories:

  1. The Four Freedoms
  2. Copyleft
  3. Freedom to Fork
  4. Historical Discontinuity
  5. Unintended Consequences
  6. Mischaracterized Assumptions

I will present my argument on each point, then conclude with the policy that I believe would serve these projects better. This is just my suggestion, of course: I’m not in a leadership position on any of the projects and I’ve only contributed to them in minor ways.

1. The Four Freedoms

Central to the philosophy of free software – and transitively to the open source philosophy – are the four freedoms. The GNU project website spells them out in full, but I like the pithy summary from FSF Europe: ‘use, study, share, improve’.

Given these freedoms as axiomatic, it seems perverse to introduce a policy that restricts someone’s freedom regarding the way they use their computer to work with the software, at the point of contributing to the software.

Imagine a contributor policy that says ‘you can’t submit patches to this project that you edit with vim’, or ‘we reject submissions if we find that you used Windows to test them’. These seem absurd, but they’re consistent with what’s happening with LLMs: the project team doesn’t like the tool you used to prepare the software change, so it rejects the change regardless of the consequences of doing so.

2. Copyleft

In Free as in Freedom (2.0), Richard Stallman observes that “use of copyright was not necessarily unethical. What was bad about software copyright was the way it was typically used, and designed to be used: to deny the user essential freedoms.”

In “What is copyleft?”, he writes, ‘proprietary software developers use copyright to take away the user’s freedom; we use copyright to guarantee their freedom. That’s why we reverse the name, changing “copyright” to “copyleft”.’

One of the concerns people have with LLM-authored contributions – a subset of the types of contribution these policies ban – is that the copyright status is unclear in many places, with one early indicator being that LLM-authored contributions might not be copyrightable.

If this is the case, then nobody can remove the freedom of people who use that contribution. If that isn’t the case, and the work is the creation of the person who used the AI tool, then they can use a freedom-preserving license.

If, instead, we enter a new era of copyright … well, anything could happen, but the way to have a say is to build competence, authenticity, and respect in society by engaging with the problem, not by withdrawing from it.

3. Freedom to Fork

The freedom to distribute your modifications and to distribute copies of the software explicitly doesn’t require people to ‘upstream’ their modifications; that is, to contribute them back to the place where they originally got the software. In fact, licenses with clauses that mandate upstreaming are non-free, for example, the earliest versions of the APSL.

Someone who modifies your software using an LLM, then has their upstream patch rejected, is free to distribute it anyway, creating a fork in your project. They might choose to track and apply changes in your project – not too much work, after all they can use an LLM to do it – so that your version of the project becomes the one with the recognizable name, but a subset of the features. At one extreme, this means fracturing the project’s community, along tool-use lines. At another extreme, it means the original project becomes irrelevant and the replacement takes over, as happened to GCC and EGCS.

4. Historical Discontinuity

Free software has always coexisted with and even used non-free software. GNU Emacs was only one of about 30 emacs implementations. GNU itself uses Unix as its design document, and the original GNU components ran on proprietary UNIX distributions, because there was no fully-free environment available – so people used proprietary development tools, libraries, shells, and kernels. Even today, many free software components are portable to proprietary environments like Windows or macOS, and you can use proprietary tools like Microsoft’s compiler or NotePad++ to work on them.

Anti-LLM policy muddles the software freedom message by making the community values more about position on LLMs than about software freedom. This risks making it easier to dismiss genuine concerns about software freedom, because the people involved are seen as opportunists riding a temporary wave of situational sentiment, rather than supporters of a strong principled position that they defend in all circumstances.

Bradley M. Kuhn of the Software Freedom Conservancy wrote of the Challenges in Maintaining a Big Tent for Software Freedom – the LLM moment is one of those situations where we should keep the big tent open.

5. Unintended Consequences

It’s already the situation that a well-resourced proprietary software vendor who disagrees with the license of a free software component can staff up a team to reimplement a proprietary version. If the no-LLM policymakers get their way, and all free software is either LLM-free or fractured into irrelevance, then it becomes supremely inexpensive to spin up proprietary versions of free software components – and ridiculously expensive to maintain free software versions of proprietary components. Software freedom would lose the significant (but already precarious) foothold it gained in computing over the last few decades.

As the LLM tools evolve and improve, the gap would become wider. Free Software risks becoming a historical reenactment activity, in which people type in code the old-fashioned way, and upon sharing it immediately gets cloned by a hundred LLM agents.

I’m not saying that’s a necessary conclusion, and it’s certainly an undesirable one, but I do see it as a real risk.

6. Mischaracterized Assumptions

Reading Stallman’s position on LLMs, one sees that he’s mostly concerned about the non-free, cloud-hosted partner models that send all of the user’s data to the model provider. That’s a genuine and valid concern, one that’s consistent with his long-standing views on hosted software and software freedom. But it’s an incomplete picture.

At the opposite end of the spectrum is Apertus, a model for LLMs which applies an open training process to open data to produce an open-weights model that you can host in a free software harness, and use from a free software UI.

A ‘no-LLM’ policy that forbids Apertus shoots software freedom in the feet – and prevents software freedom advocates from evangelising the benefits we’d see if more LLMs were like Apertus.

Free Software projects used to advocate for software freedom, while using proprietary compilers to build their free software until GCC came along and could support their needs. We can do the same with other tools, including LLMs.

7. A Way Forward

LLM-augmented coding tools empower people without traditional programming backgrounds to modify software to suit their needs, and to share their modified version.

Maintainers of popular projects are rightly concerned that rather than ‘fostering collaboration and improvement’, this can lead to hard-to-maintain projects that buckle under the weight of low-quality, poorly thought-out contributions that take time to interact with but don’t add value to the project.

This situation gets to the core of a hypocrisy in the ‘Cathedral and the Bazaar’ model of free software communities – the true bazaar model is difficult to navigate, so instead the free software world organizes itself into various unorthodox cathedrals, with their hierarchies and bylaws. As the bazaar increases in size, the choices available get harder to navigate, and the people who put themselves in the position of mediators, the clergy, get more and more work. Improving the access to tools that enable software freedom has the perverse effect of making maintainers want to keep people away from contributing.

The quality / anti-slop concern is easy to address by having quality criteria on patch submissions, with automated checks. Don’t tell people they can’t submit patches if they use particular tools; tell them their patches are only considered for acceptance when they meet the quality criteria. This clears out the frustration matrix that comes from confusing tool use with quality (the submissions that are low quality & produced without LLM, and the submissions that are high quality & produced with LLM). It also allows anyone who wants to contribute – using whatever tools – to understand and adopt the quality rules of the upstream project, ‘fostering collaboration and improvement’ as stated in the Four Freedoms.
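As a minimal sketch of what tool-agnostic acceptance criteria could look like, here is a hypothetical gate in Python. The `Patch` type and the three criteria are assumptions made up for illustration, not any project's real policy. The point is only that nothing in the check asks which tools produced the patch.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Patch:
    """Hypothetical summary of a submission, as automated checks might report it."""
    tests_pass: bool
    lint_clean: bool
    has_description: bool


def failed_criteria(patch: Patch) -> list[str]:
    """Return the names of the quality criteria the patch does not meet."""
    checks = {
        "tests pass": patch.tests_pass,
        "lint clean": patch.lint_clean,
        "change is described": patch.has_description,
    }
    return [name for name, ok in checks.items() if not ok]


def is_considered(patch: Patch) -> bool:
    """A patch is considered for acceptance only when every criterion is met."""
    return not failed_criteria(patch)
```

A project would swap in its own criteria; returning the list of failed criteria gives a contributor something concrete to fix, whatever tools they used.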

The non-free concern is addressed by advocating for software freedom in LLMs – the same way we’ve been advocating for software freedom in web browsers, office suites, and other applications for decades.

The copyright concern is addressed by representing our position on software freedom strongly, consistently, and authoritatively, so that we earn the right and respect to influence the people who make those decisions. If we do not, then only the people who run the LLM companies – along with traditional anti-freedom advocates like record and motion picture industry associations – will be in the room, and we will not.

It might be that we need to identify new freedoms and new principles to uphold in the LLM age – Matthew Skala has written his 11 freedoms for free AI, for example. What we definitely don’t need to do is to abandon our existing principles in favor of opportunistic positions in the debates of the day. That is a recipe for being sidelined in all debates, and for watching software freedom become irrelevant.

Episode 59: The NATO Software Engineering conferences, part 3

We’re closing in on the end of the 1968 conference report, in this section discussing software service, maintenance, and other “special topics” including educating software engineers, and whether it’s reasonable to pay for software at all. Along the way, we discover that there’s no silver bullet 18 years before Fred Brooks told us; decide whether 1968 software needed more blockchain; and find the horrific truth behind beta testing.

The episode is supported by members of the Chiron Codex Patreon (use this gift link for your first month free), so please do join the community or hit the Ko-Fi button to make a one-off donation.

Transcript

Hello, and welcome to episode 59 of the Structure and Interpretation of Computer Programmers podcast. I’m Graham Lee, and this episode is the third part of a mini-series discussing the 1968 and 1969 NATO Conferences on Software Engineering. It’s sponsored by the members of my Patreon, which can include you.

We’re up to sections 6 and 7 of the report, which cover software service, that is, the business of satisfying customers by delivering software, particularly maintenance, and special topics, which don’t fit into the subjects of the three workgroups. Service was one of the workgroups, with the others being design and production, the topics of the previous episode of this podcast.

We join the service section with a subsection that has the provocative title, The Virtue of Realistic Goals. Essentially, nobody at the conference blames the programmers for failed projects. Either the customers or the users had unrealistic expectations, or the manufacturers made unrealistic claims about their system’s capabilities.

Klaus Samelson is particularly harsh in blaming the users, saying it’s their fault for accepting a system before they’ve satisfied themselves of its correctness. Brian Randell, who edited the conference report, agrees, saying, “the users are as much to blame for premature acceptance of systems as the manufacturers for premature release”. But what choice do customers have?

Indeed, even in this day and age, we only have a partial solution to this problem. There’s free software, where you can inspect the code, or get someone else to do it, and know it’s correct, then choose whether you pay the creators, and of course many people don’t. Or there’s shareware, where you can use the software and check it roughly works the way you want, or at least the trial features that you have access to do, and then pay to unlock the rest, which you haven’t been able to try.

Every other distribution mechanism or purchase model for software is either buy now and regret later, or buy now and hope that every back-end update keeps the bits you need working the way that you need them to work.

This, by the way, also came up in the report, where d’Agapeyeff says it’s generally a problem that we haven’t worked out how to make sure that any release of software is a strict subset of later releases. In other words, that newer versions can do more than previous versions, and do all of the existing things in the same way that the previous versions did.

I wrote a blog post back in 2018, which is linked in the show notes, about the way in which semantic versioning encodes the intention to introduce breaking changes between versions. I proposed meaningful versioning in which the smallest increment number, the z in x.y.z, applies to additional features, the middle increment, the y, applies to behaviour-preserving refactorings, and the largest increment, the x, to bug fixes. There’s no room for backwards incompatible changes in this scheme, unlike in semantic versioning. If you want to do that, you should release a different product.
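The meaningful versioning scheme described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the scheme as originally published: the names `Change` and `bump` are made up here, and the sketch assumes that incrementing one component leaves the others alone, which the description doesn't specify.

```python
from enum import Enum


class Change(Enum):
    """The only kinds of change the scheme has room for."""
    BUG_FIX = "x"       # largest increment
    REFACTORING = "y"   # middle increment, behaviour-preserving
    FEATURE = "z"       # smallest increment, additive only
    # Deliberately no BREAKING member: that would be a different product.


def bump(version: str, change: Change) -> str:
    """Return the next x.y.z version for the given kind of change."""
    x, y, z = (int(part) for part in version.split("."))
    if change is Change.BUG_FIX:
        x += 1
    elif change is Change.REFACTORING:
        y += 1
    elif change is Change.FEATURE:
        z += 1
    # Assumption: the other components are not reset.
    return f"{x}.{y}.{z}"
```

For example, `bump("1.2.3", Change.FEATURE)` gives `"1.2.4"`, and there is no argument you can pass that records a backwards-incompatible change.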

This idea is related to the discussion of the open-closed principle that we had in episode 58. The whole software system should, according to d’Agapeyeff, be open to extension: you add capabilities in subsequent issues, the word he uses for releases of the software. And it should be closed to modification: the bit you’ve already released ought to work well enough that you don’t need to change it. I should insert a placeholder call-out here to a subsequent episode of this podcast that I haven’t planned or recorded yet on Bertrand Meyer’s book, Object-Oriented Software Construction, because the idea of open and closed is all over this 1968 conference.

Next up are discussions on the state of the initial release of software and the frequency and nature of subsequent releases. The general sentiment is that the initial release should work well, even if it doesn’t have all of the planned capabilities. Growing outward from a high-quality core, preferably by adopting a modular design, is better than releasing a poor-quality version of everything.

However, François Genuys points out that people need to access pre-release versions of the software for training. On the one hand, I thought that this could refer to the interim systems that the conference discussed in the design and production sections, which we talked about in episode 58 of the podcast, where some of the components are real and the others are simulations. In this way, customers or support staff train with working systems, just not completely working. In the same way that the subsequent initial release is a high-quality subset of the total system behavior, so the pre-release training versions would be high-quality subsets of the initial system behavior.

Then I remembered that Genuys was at IBM, and that the idea of alpha and beta tests supposedly comes from IBM, so I wondered about the timing of that. Wikipedia suggests that the terminology came from the 1950s, but it offers two citations, neither of which actually backs up that claim. Jeff Atwood, at his Coding Horror blog, corroborates the IBM origin of these terms, but he does so by citing the Wikipedia article. Everybody else either cites the Wikipedia article, or plagiarizes it, or plagiarizes Jeff’s post, or plagiarizes both of them. So I think we’re stuck here. Unless anyone who’s listening has a contemporary source, in which case please do send it in and let me know in the comments, we don’t actually know where the terms alpha and beta testing come from.

Not that it’s important. The idea that you have pre-release alpha or beta tests doesn’t mean that you let people use the versions that fail those tests, nor necessarily that your testing criteria at those times are any less stringent than those for the initial release or for subsequent system releases.

Speaking of the subsequent system releases, there’s a bit of tension over how frequently these should appear. Ascher Opler sums up the tension well. More releases means more churn, but it also means getting corrections into the hands of customers sooner.

Generally, people at the conference are in favour of fast corrections and infrequent major upheaval updates. This puts me in mind of my experience managing Debian systems, where it’s easy to accept the in-release updates, that is apt-get update and apt-get upgrade, without worrying that anything will break. And infrequently, you have to cross your fingers and do a dist-upgrade.

Or even the times when I’ve managed Solaris systems, you take the stream of minor updates forever and never reinstall the major version of the operating environment. It also calls to mind Microsoft’s Patch Tuesday approach, releasing interim updates at predictable times, so that administrators can prepare to deal with their installation.

This section on release frequency ends with an extract from Control Data Corporation’s H.R. Gillette on defining metrics for release quality. Here’s the quote.

“Below, I have written a copy of one of the paragraphs which has been put into a product objectives document. We struggled a great deal to define measurable objectives in the document, and this is an example. The numbers used do have relevance historically, and that is all that need be said about them. Finally, our objectives may not have been high enough in this particular area. We tried to push our luck while at the same time being realistic.

The total number of unique bugs reported for all releases in one year on ECS SCOPE will not be greater than the number given by the following formula. Number of bugs is less than or equal to 500 minus 45 divided by in brackets I plus 10 close brackets, where I is the number of installations using ECS SCOPE. 85% of the reported PSRs will be corrected within 30 days, and 50% of these will be corrected within 15 days. All PSRs will be corrected within 60 days.”

In that quote, a PSR is a bug report. What this is trying to get at is laudable. There won’t be many bugs, even if we have lots of different customers, and we’ll fix the bugs quickly. Unfortunately, measuring bugs reported is one of those meaningless metrics that’s easy to parody, and indeed Scott Adams covered this one in a Dilbert comic back in the 1990s, when the software team write themselves a minivan.

You game this metric by choosing quiescent customers who don’t report bugs, or by including obvious defects like typos that are quick to fix, so that you achieve your turnaround time goals, and customers don’t have time to fill out their PSR forms with more meaningful bug reports.
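As a rough sketch, the objective in Gillette’s quote can be written out in Python. The spoken formula is ambiguous, so this assumes it reads as 500 − 45 / (I + 10), and it assumes that “50% of these” means half of the PSRs corrected within 30 days. Both readings are assumptions, and the function names are invented for illustration.

```python
from __future__ import annotations


def max_bugs_per_year(installations: int) -> float:
    """Upper bound on unique bugs per year, under the assumed reading 500 - 45 / (I + 10)."""
    return 500 - 45 / (installations + 10)


def meets_turnaround(days_to_fix: list[int]) -> bool:
    """Check the three turnaround targets against a list of fix times in days."""
    total = len(days_to_fix)
    if total == 0:
        return True  # nothing reported, so no target can be missed
    within_30 = [d for d in days_to_fix if d <= 30]
    within_15 = [d for d in within_30 if d <= 15]
    return (
        len(within_30) >= 0.85 * total               # 85% corrected within 30 days
        and len(within_15) >= 0.5 * len(within_30)   # half of those within 15 days
        and max(days_to_fix) <= 60                   # everything within 60 days
    )
```

Under these assumptions the gaming strategy is easy to see: two PSRs that each take 40 days fail the targets on their own, but the same two PSRs padded with a dozen one-day typo fixes pass.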

More recently, software engineers have finally read a 2001 article from Dr. Dobb’s Journal, and are shifting tests left, incorporating test design and implementation throughout the development process, and particularly from the beginning. This brings me on to the money quote from this section of the conference report, from Alick Glennie, who’s the inventor of Autocode, an early family of programming languages and compilers, which may even have included the world’s first compiler, though that is a contested claim.

The quote is, “Software manufacturers should desist from using customers as their means of testing systems.”

Okay, me speaking again now. It’s better to see customer bug reports as a feedback mechanism than as a goal. You’ll always get them, as long as you have customers, and you have a reporting channel. But you don’t need to optimise for or control them. If you get a lot of bug reports for some module, that might indicate quality issues, or that it’s really popular. If you get not many bug reports for some module, that might indicate high quality, or that nobody uses it. And perhaps the reason that nobody uses it is because it’s too buggy.

As with the rest of this section of the report, the important thing to worry about is how you get working software delivered to your customers. Back in 2001, some people even said that this should be our highest priority.

I originally skim read the next sections, which were on replication and distribution of software, because they’re problems that don’t exist anymore. You put your software on the internet and distribute it for somewhere between zero and near zero cost, or you don’t distribute it at all and let people access it on your computer via their browsers. Gone are the days of the hologrammatic Windows XP CD, with its licence key that’s long enough to identify each atom in the universe uniquely.

But even back in 1968, duplicating software was so much cheaper than duplicating hardware, that Brian Randell recognised that economics is a reason that software quality was given less consideration by manufacturers than hardware quality. It’s that much more expensive to fix a hardware problem in the field.

And I have a vague recollection that at some point, Sun Microsystems swapped SCSI IDs zero and three, which determine which drive is addressed by which number, between machine architectures. And that might have been between the Motorola 68000 and the SPARC. And while it was possible for customers to switch some jumpers around to get back to the original behaviour, they did offer to send field engineers to customers to make the changes for them. But I can’t find documentary evidence for them doing so, so maybe it was just a scary story told at sysadmin camp when I was a young initiate.

A particular problem that distributing software on physical media had was ensuring correctness. If a tape even had one bit flipped or removed, it was useless and potentially dangerous if the software still ran, even though it was incorrect. In principle, most software that’s distributed electronically now has all sorts of digital signatures and checksums. And in practice, that’s wisely hidden from the customer. So you kind of have to trust the vendor that everything checks out.

Moving on to maintenance. And again, the blame for getting it wrong falls squarely with the customer. “Each maintenance depends upon the proper recording of programming errors by the user and upon the quality of such records,” says Mr. H. Köhler of AEG Telefunken. We know this not to be reliable, and so now we use telemetry and automated diagnostics to get the information we need in the form we need. It turns out that each maintenance depends upon the proper recording of programming errors by the programmer, and upon the quality of such records, but also upon the programmers choosing to act upon the records.

And also to a large extent, it depends upon the error reports actually going to the right place. In the 1960s, this would have been a simple problem. The customer blamed IBM for any bug. IBM blamed the customer’s local applications or modifications. Eventually, one or other of them either fixed a problem or worked around it. IBM didn’t unbundle their software from their hardware until 1969. We’ll see this discussion play out in real time later in the episode.

And there were only a handful of ISVs in the 1960s, mostly concentrating on filling in gaps in hardware vendor offerings. For example, Ken Kolence, one of the attendees at the conference, founded a company called Boole & Babbage, which created profiling software.

Once you got more integrated software from more vendors on a computer, it became harder to decide who to blame for a problem. Is it Valve’s fault that Steam crashes on Windows or Microsoft’s? Or is it the fault of some third-party vendor because the customer installed a haxie that loads code into the app? In the 2000s, I worked for an antivirus software company, and we got all of the bug reports from all of the software. Either the customer blamed the antivirus company because they didn’t like our software, or the software vendor blamed the antivirus company because they saw our kernel extension was loaded and used that as an excuse not to investigate their own customers’ problems.

I therefore spent a lot of my time on support demonstrating that other people’s software crashed without the antivirus software installed, in the same way that it crashed with the antivirus software installed, so that I could send the crash report back to the company whose software crashed. In one instance, we had a report that Excel on Mac OS X crashed with our antivirus installed. Eventually, I, along with two people from Apple, a file systems engineer and a technical support professional, showed that Excel crashed on Mac OS X without antivirus installed because a rarely used code path in the spreadsheet application caused it to try to use an unimplemented feature in the HFS+ file system. Now, is it Microsoft’s problem that Excel does something that doesn’t work, or is it Apple’s problem that they exposed an unimplemented API? Thankfully, answering that question became somebody else’s problem about 20 years ago.

Just before we move on to the section on special topics, there are a couple of interesting talking points in part of the report that’s on acceptance testing. One is a suggestion from James Babcock, who ran a time-sharing services company, that we need software meters analogous to the present hardware meters so that our rental costs can be adjusted to allow for time lost through software errors as well as hardware errors. I would certainly welcome the rebates I’d get if some of the cloud computing services I use multiplied their subscription fee by their uptime ratio.

I’m going to mention Brad Cox’s 1996 book, Superdistribution, here as well. He invented a pay per use model for software pricing, and it’s a transitive pricing model. I pay for the application software I use as I use it. The application vendors pay for the library calls their software makes whenever it makes them, and so on. This model required special hardware to do the digital rights management in the 1990s, but I think now it could actually be a reasonable application of a smart contracts blockchain like Ethereum.

In the Superdistribution model, I would automatically get money off during an outage, because I wouldn’t be able to use the software at all, so I wouldn’t be able to get charged for anything.
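To make the transitive part concrete, here is a minimal Python sketch of pay per use flowing down a dependency chain. Everything in it is hypothetical: the component names, the prices in whole credits, and the `charge` and `bill` functions are invented for illustration and aren’t taken from Cox’s book or from any real metering system.

```python
from __future__ import annotations

from collections import defaultdict

# Hypothetical price per use, in whole credits, and what each component calls.
PRICE = {"app": 10, "ui_library": 2, "crypto_library": 1}
USES = {
    "app": ["ui_library", "crypto_library"],
    "ui_library": ["crypto_library"],
    "crypto_library": [],
}


def charge(payer: str, component: str, balances: dict[str, int]) -> None:
    """Move one use's price from the payer to the component, then charge onward."""
    balances[payer] -= PRICE[component]
    balances[component] += PRICE[component]
    # The component now pays for each of the calls it makes in turn.
    for dependency in USES[component]:
        charge(component, dependency, balances)


def bill(uses_of_app: int) -> dict[str, int]:
    """Net balances after the user runs the application the given number of times."""
    balances: dict[str, int] = defaultdict(int)
    for _ in range(uses_of_app):
        charge("user", "app", balances)
    return dict(balances)
```

With these made-up prices, one use leaves the user ten credits down, the app vendor seven up, the UI library one up and the crypto library two up. During an outage there are no uses, so `bill(0)` is empty and nobody is charged anything.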

The other point on acceptance testing is a difference of opinion between Edsger Dijkstra, “testing is a very inefficient way of convincing oneself of the correctness of a program”, and Mr A. I. Llewellyn from the British Government’s Ministry of Technology: “Testing is one of the foundations of all scientific enterprise. In fact, it would be good to have independent tests of system function and performance published.”

This is the advert break. It starts now.

This episode is brought to you by me, Graham Lee. But really, by you. Chiron Codex is a community of people who are learning how to become better software engineers by adopting AI augmentation in a thoughtful way. We aren’t outsourcing our understanding to coding assistants like Claude or Codex, but becoming software engineering centaurs by using AI tools to improve our knowledge and the quality of our work. Join the community over on Patreon to find out about interaction patterns that improve your work with AI coding tools, running LLMs for software development locally, discussions of recent research in the field, and more. If you’re a software engineer who’s interested in the promise of AI tools, but sceptical about handing your skills over to the computer, this is the community for you. Go to patreon.com slash Chiron Codex, that’s C-H-I-R-O-N-C-O-D-E-X, now for more information and to join. Use the gift link in the show notes to get your first month of insider access completely free. Alternatively, you can show your appreciation by donating at Ko-fi, that’s ko-fi.com slash Chiron Codex, K-O-F-I dot com. Direct support by my audience is the only revenue I get for my work as a software engineer and communicator, so your support really means a lot to me and makes it possible for me to produce this podcast. Thank you so much.

That was the advert break. It’s over now.

OK, we’re on to the section on special topics, which opens with software, the state of the art. We’ve actually already encountered most of the discussion points here, particularly the idea that most software works very well, that people are doing what they need to at a much lower cost than ever before, and it’s only at the edges of the field’s capability, both in terms of scale and novelty, for example, time sharing, where the problems arise. These were all in the executive overview at the start of the report that we discussed in episode 57.

This idea that people are doing what they need to do at a lower cost than ever before, though, is hard to square with Robert McClure’s assertion that “it seems almost automatic that software is never produced on time, never meets specification, and always exceeds its estimated cost.” He describes the causes as coming from “the refusal of industry to re-engineer last year’s model, from the inability of industry to allow personnel to accumulate applicable experience, and from emotional management.”

And that certainly all aligns with my own experience, having seen the phrase “rewritten from the ground up” used as if it’s a good thing, having experienced layoffs and limited career development prospects that limit retention, and management fads that come and go like high street clothing collections. However, this position is out of alignment with the rest of the conference report.

I think we see here the division that Thomas Haigh identified between the industry software engineers who think things are going well and would like them to go a bit better, and the academics who think that all of industry software is on a hiding to nothing until it adopts current academic practices.

Whatever the magnitude of the problem, somewhere between 1 and 100% of software projects running into difficulties, possible solutions were discussed. Ascher Opler suggested two approaches, either stealth mode, where the manufacturer doesn’t say anything about the capabilities of the system until they finish developing it, or loose promises mode, where they say what they’re doing but give a really long lead time and are honest about the uncertainty involved.

The subsequent third way that modern software engineers use is the lean startup approach, where the manufacturer says what it’s doing and then gets early feedback before it even starts building anything. It avoids both the risk of stealth mode, which is building something that nobody wants, and the risk of loose promises mode, which is getting pre-empted by someone who implements your plans faster than you do.

Going back to the theme of things that were subsequently rediscovered by somebody else but that already existed at the time of the NATO conference, Doug Ross is the only person in the conference report who actually refers to the contemporary state of affairs as a crisis. He warns against people who promise a breakthrough, a mere 18 years before Fred Brooks agreed that there is no silver bullet in software engineering.

A second special topic in section 7 of the report is education, and Alan Perlis sets out criteria to define a curriculum in software engineering education. It’s useful to note his point that this is distinct from computer science education, as, according to him, “most of the computer science programs are producing faculty for other computer science departments”. This is actually a deliberate choice that the curriculum committee at the ACM made, choosing to focus on computer science as an academic and mathematical pursuit, rather than on software as a practical industry. But Perlis is damning in his assessment. “You have to look hard in a computer science department to find anything that is dedicated to utility as a goal”. Ouch.

Dijkstra produces the money quote for section 7 in this discussion on education. “You are right in saying a lot of systems really work, that is our glimmer of hope. But there is a profound difference between observing that apparently some people are able to do something, and being able to teach that ability”.

We could imagine this as being his way of digging in when the crisis narrative got debunked. Yes, everybody can make working software, but maybe they’re doing it wrong anyway because they don’t do it the way that I like.

One of the questions that managed to keep op-ed writers employed for decades after this conference was the extent to which software engineering and computer engineering share commonalities with, well, with engineering. This is a topic that I read deeply for my PhD thesis background, so I could go into way too much depth here. But suffice it to say that people are still discussing whether software engineers should be licensed engineers. And in fact, there are some places where they do need to be, and so in those places, people who write software just don’t use the word engineer.

One argument that was made in 2002, and seems particularly weak, is that engineering licensing would cover non-software disciplines, and it would be unfair to stop someone practicing software just because they don’t understand fluid dynamics.

The final topic we’re going to consider in this episode is the question of software pricing, i.e. whether software should be unbundled from hardware and sold as a separate product. We all know how this played out. Software was unbundled from hardware, became a huge economic engine in its own right, and even ended up eating its own tail when cloud computing changed the economic calculus so that hardware needs are factored into the software costs.

It seems like most of the attendees at the conference were in favour of software pricing, but the section in the report is presented neutrally with equal weight given to both sides. Tellingly, this is also the only section in the report that uses the Chatham House rule, where no quotes are attributed to named speakers. So, while we do know that most people at the conference were in favour of separate software pricing, we don’t know who or how many people were making the argument against.

If I had to guess, I would say that IBM representatives were against unbundling software, and everybody else was in favour of it, and that IBM lost the argument very shortly after. This was partly the work of a company called ADR, who have the distinction of being the first company ever to file a software patent. ADR brought an anti-monopoly case against IBM, saying that providing their software for free was stifling the market. This is, of course, an argument that came up again in the 1990s, with Microsoft bundling their browser and media player with Windows, and again very recently with the European Union’s Digital Markets Act and its definition of some services provided by large, typically American companies as gatekeepers. But let me know what you think. And also, your perspectives as people who rely on software being a commercial commodity, what do you think of the way that software is priced? You can email me at grahamlee at acm.org, or you can comment on this post, the post for this episode, over at sicpers.info slash podcast. That’s s-i-c-p-e-r-s dot info slash podcast.

The next episode will conclude the reading of the 1968 conference report by covering the keynote address and the working papers that are included in the report. Only a fraction of the submitted papers actually appear in the report. There’s no full proceedings, so a lot of the information that went into the conference is sadly lost forever, unless some attendee happened to file away their copies of the papers that they received. Until the next time, take care, and I’ll talk to you soon.


Episode 58: The NATO Software Engineering conferences, part 2

This episode digs into the problems of software design and software production as perceived in the 1968 conference, most urgently: just what are software design and software production? The episode is supported by members of the Chiron Codex Patreon (use this gift link for your first month free), so please do join the community or hit the Ko-Fi button to make a one-off donation.

Transcript

Hello and welcome to episode 58 of the Structure and Interpretation of Computer Programmers podcast. I’m Graham Lee and this episode is the second part of a mini-series discussing the 1968 and 1969 NATO conferences. It’s sponsored by the members of my Patreon, which could include you.

In part one I reviewed the context of the first NATO software engineering conference in Garmisch which is in Bavaria in Germany, and approached the end of section three of the conference report with no clear idea—because the people in the room hadn’t agreed on one—which activities comprise design of software and which comprise the production of software.

Well, as this episode focuses on sections four and five which are about design and production it’s time for me to confidently tell you that I still don’t know what those terms mean. There’s a large extract in section 3.2 from a Mr. J. Harr of Bell Labs, which is the place in which a year later Unix would be invented. It comes from his paper “The design and production of real-time software for Electronic Switching Systems”.

In this paper the design process covers everything from specifying the overall hardware software system through division of the software into precisely defined blocks with defined interfaces and data structures, the compilation, simulation and testing of those blocks, integration into a software product and final load testing.

Of interest is that about 13% of the effort on the ESS project is on assemblers, compilers and translation. What they call translation is now what we would call a compiler for a high-level language. The compiler of 1968 was a combination of a translator from a lower level language like Fortran to the machine language, maybe with some helpful macros, and also link editing or patching abilities that allow different program blocks to reference each other. So here was a project noteworthy for inclusion in the report (but hopefully that noteworthiness came from the fact that it was an ordinary project) where a significant number of staff and amount of effort were focused on creating the tools that create the product.

So because every activity in that report is a design activity I’ve tried to skip through to one of the working papers in section 9, the Classification of Subject Matter from the software product working group or production working group, because that paper categorizes production activities. These include training, indoctrination in conventions, determining and imposing productivity metrics as improvement, acquiring support staff and facilities, setting a budget, hiring staff, negotiating with customers, and design activities like specification, designing software units, creating test plans.

An amusing point from the Classification of Subject Matter is the inclusion of the entry “control of innovation and reinvention” in the list which, on the one hand, makes me think of the choose boring technology article, but on the other leads me to picture a manager who’s incensed that their staff has blasphemed by using their noodles in making the software.

If you were to press me to produce definitions of software design and software production (which implicitly you are by listening to a podcast in which I claim to discuss those topics) software design as a phrase used in the 1968 conference is the activity of understanding the system requirements and producing a collection of computer instructions that satisfy those requirements. Software production is doing that in a way that the customer wants to pay for, that you can afford, that the customer wants to use the output of, and is capable of using the output of, and preferably that the customer is happy with.

Okay, so pretending now that we know what software design is let’s look at the section of the report on software design. It’s here that we find what I currently believe to be the earliest reference in the software field to the architect of real world buildings Christopher Alexander, he of the pattern language fame, as Peter Naur describes software designers as analogous to civil engineers or architects working in large heterogeneous environments.

Alexander d’Agapeyeff, who we met last time wishing we could do more to teach the design and testing of testable software, argued for designing a machine that was capable of running a high-level intermediate code translatable from high-level languages which is something that we might now recognise as Pascal p-code, the Lisp machine, JVM bytecode and so on.

A historian would probably find fault with my applying such modern ideas to these statements, the retroactive claim that conference attendees were prescient in defining the future with intermediate languages, OOP (as we shall see shortly), and TDD (as I expect to encounter multiple times in this series), and then the implication that the rest of the industry was too ignorant or too stubborn to notice for a number of decades. Certainly the NATO conferences have achieved a near mythical status now that they probably just didn’t get during the 1970s, and the reports are both incomplete and focused on points the editors considered interesting, whether because they were representative or provocative, but without telling us which. So the observations might not have landed with contemporary readers, and in fact people who were in the room at the conference may not have noticed some of these statements at all until they were typed up into the report.

Nonetheless, hardware that runs an intermediate language is a natural extension of the contemporary goals of closing the gap between large system design and implementation, so I feel like I’m on fairly stable ground making the association here. Similarly, two quotes seem to presage Bertrand Meyer’s open-closed principle with some precision. Letellier says a software package must be thought of as open-ended, and Gillette says generality is essential to satisfy the requirement for extensibility, and that the key to production success of any module construct is the rigid specification of the interfaces. In other words, you’re not allowed to modify the interfaces, they’re closed, but you do need to design the modules to be extensible, they’re open.

Anyway, back to the spooky foreshadowing of bytecode, d’Agapeyeff gives four reasons for intermediate languages to be executed on the computer. They are to increase the runtime checks a computer can make, thereby increasing program safety, provide more development facilities, increase portability, and to allow all communication with the programmer to be in source language. This last point is now also achieved for compiled languages using debugging data, supported by formats like STABS, COFF, and DWARF, which were all invented and introduced in the 1980s.

As projects, and I’m using my scare quote fingers here, as projects “scale”, which didn’t just mean the size of the software went up, it also referred to the expectations of the customers, or the situations in which they used the software, which might grow beyond those foreseen by the designers. As projects scale, application software might grow beyond the expressiveness of the design language used to describe it. Kolence blamed this on a lack of universal notation for software, which would do for programming what George Boole’s notation of logic does for electronic hardware design.

He suggested that Ken Iverson’s notation is the solution. Ken Iverson’s notation is the APL programming language, which actually grew out of a specification language used for a formal specification of parts of IBM’s System 360, among other things. APL was very popular through the 70s and 80s, and still has a hardcore following, but it never displaced the Algol-derived languages as a universal lingua franca for expressing computation. And it’s the context of a specification language in which to view the suggestion here. Not necessarily as an implementation language, not everyone who used APL even had an interpreter that would run on their computer, but as a specification language, in a continuity that includes Z, TLA, and the Unified Modelling Language, as other examples.

Dijkstra submitted a paper that proposes a hierarchical, or at least a layered design approach, which led to the discussion over the extent to which a specification should be complete. Willem van der Poel says that a complete specification is a working solution, i.e. if you describe your problem in enough detail you end up solving the problem. But Dijkstra says that an incomplete specification allows for useful flexibility. There’s an analogy here with the concept of undefined behaviour in the C programming language, which allows for a portable specification of the language that behaves in whatever way is most efficient on the host hardware. And, despite what detractors claim, a C compiler has never led to demons flying out of a programmer’s nostrils.

But, what does completeness mean? If a design can be complete, we need a definition of a complete design or an incomplete design. And, the idea of a logical closure was suggested by analogy to group theory, where a group is complete if it has a certain collection of operations. So, for example, a system that lets you write files and doesn’t have a facility to read them is clearly incomplete, because you have an operation that doesn’t have a corresponding logical extension operation. But then, what about one that can read and write files but can’t delete them? Is that complete? So, while the idea of closure was introduced, it wasn’t very deeply pursued, or at least not in the conference report.

A particular problem with software designs is the issue of detecting and handling errors. Indeed, there is sentiment in the report that if you aren’t considering resilience and fault tolerance in your design, then you aren’t actually doing design. What makes errors difficult to design for is that they tend to cut across all of your nice layers and modules, so that a failure of the storage drum to be ready means that you can’t complete a tax calculation.

To consider a more modern example, think about the Java null pointer exception. Java doesn’t even have pointers, and yet here we are, dealing with an error that is caused by one.

So, a lot of discussion took place on the directionality of design, whether that be top-down, meaning to start with the interface and requirements and work towards implementation on the computer, or bottom-up, meaning to start with reusable modules and combine them until you satisfy the requirements, or whether to do something else, because both top-down and bottom-up design have risks.

Your top-down design might paint you into a corner where you need to implement a module that you can’t actually build. Your bottom-up design might create a lot of reusable modules that don’t actually have any use at all in your system.

Naur describes a concept called design trees, where you build dependency graphs of the decisions that influence other decisions, so that you know which problems you need to solve first. Ed David, another employee at Bell Labs, suggested a skeletal coding approach, in which you actually build the whole system first, admittedly using stubs, simulations, and other doubles for modules that aren’t yet complete. Then you explore the aptness of that skeleton to your needs, tweak it, and progressively fill in the details.

Going back to our retroactive futurology, this is a process that eventually became popular as Boehm’s spiral model, which we mentioned in the previous episode, and the Rapid Application Development movement of the 1980s and 1990s. This iterative approach also addresses one of the big design drawbacks discussed in the report, which is that users and customers can’t clearly express what they want, but they can tell you when you’ve got it wrong.

Speaking of communication, Conway’s Law, which was brand new, having been published in April 68, makes several special guest appearances, as does the obvious corollary. If your organisation is going to make software that models its org chart, set up your org chart so that it models the software that you want to build.

This is the advert break. It starts now.

This episode is brought to you by me, Graham Lee.
But really, by you.

Chiron Codex is a community of people who are learning how to become better software engineers by adopting AI augmentation in a thoughtful way. We aren’t outsourcing our understanding to coding assistants like Claude or Codex, but becoming software engineering centaurs by using AI tools to improve our knowledge and the quality of our work.

Join the community over on Patreon to find out about interaction patterns that improve your work with AI coding tools, running LLMs for software development locally, discussions of recent research in the field, and more.

If you’re a software engineer who’s interested in the promise of AI tools, but sceptical about handing your skills over to the computer, this is the community for you. Go to patreon.com slash chironcodex, that’s C-H-I-R-O-N-C-O-D-E-X, now for more information and to join. Use the gift link in the show notes to get your first month of insider access completely free.

Alternatively, you can show your appreciation by donating at Ko-fi, that’s ko-fi.com slash chironcodex, K-O-F-I dot com. Direct support by my audience is the only revenue I get for my work as a software engineer and communicator, so your support really means a lot to me and makes it possible for me to produce this podcast.
Thank you so much.

That was the advert break. It’s over now.

From design then to production, and the big problem facing 1968 software people was being able to deliver large systems, both in terms of the amount of software and the amount of novelty introduced. The need to always chase the latest advances made, or should I say still makes, every project into part research, part development, and part implementation, even though it’s costed, presented to the customer, and charged for as a pure implementation project.

A Fortran compiler team will, by the time it writes its third Fortran compiler, be pretty good at writing Fortran compilers and at estimating how long it takes and how many resources they need to write a Fortran compiler. But most teams aren’t doing the same thing three times, they’re doing whatever it is for the first time, or for their first time anyway.

Your second Fortran compiler isn’t a Fortran compiler. It’s a Fortran compiler that works at an online terminal on a time-sharing computer, or in the cloud, or with blockchain, or AI assistance, or whatever’s new this week in the Datamation magazine.

The problem of scaling software production is so acute that there’s an argument over whether to just use a small team of people who know each other well for all software projects, or whether that limit would actually be the end of the software game altogether. Given the current Bot Farm amplified memes about a real-world Butlerian jihad, the event in Frank Herbert’s Dune chronology where humankind turned against artificial intelligence, it’s kind of fun to imagine an alternate reality where the greatest computer scientists and electronic engineers in the world came together in 1968 and went, “no, this doesn’t actually work. Let’s just shut it all down.”

Digression. I said greatest in the world there, even though this podcast episode is about a NATO conference. Much as I’m not convinced the Western hegemony is the best way to organise society that one could invent, the truth is that communist bloc computing was on the back foot in 1968.

Under Stalin, cybernetics had been declared unsocialist as a tool for managerial control of the workers, so research into computing wasn’t easy to undertake, promote or secure resources for. This changed after Khrushchev’s Thaw, but it wasn’t until the beginning of the 1960s that the Soviet government started sponsoring computing factories.

Competing interests and misaligned incentives meant that the dream of a centralised computer-controlled economy, a dream that Salvador Allende rediscovered for Chile in the 1970s, never came to fruition. At the beginning of the 1970s, Soviet computing policy turned to duplicating successful Western designs to the extent that the most popular microcomputer in the Eastern Bloc was actually a PDP-11 compatible.

The USSR undoubtedly had some very capable computing experts. Think of Ekaterina Shkabara or Lev Dashevskii, Viktor Glushkov or Sergei Lebedev. And 1968 saw the release of the BESM-6, a machine with comparable capabilities to common American hardware. But the fact is that the Soviet Union was late to seeing value in computers and was relegated to copying Western innovations in both software (Algol, Fortran and Pascal were all popular compilers on BESM series computers) and in hardware. They typically designed integrated circuits by duplicating old designs from Texas Instruments.

Anyway, back to scaling software projects. And the conference perceived one of the biggest problems to be estimation in terms of both time and costs. If you could tell someone what they’d spend and how long they’d wait to get a working system and actually be correct about it, then you’d immediately make your endeavour more professional-seeming. Getting faster or cheaper at doing it or getting better at doing it could take a backseat to being reliable about doing the things that you claim you’re capable of doing.

The problem was nobody knew what they should be measuring which meant that they all ended up measuring the one thing that was actually countable: the number of instruction words produced. Everyone agreed that this was wrong but everyone agreed that there was no other game in town.

A couple of speakers suggested what would eventually become a decade later the “function point”: a measure of the amount of software requirements that you delivered. This was even presented in the context of measuring burndown in terms of test coverage. The amount of system you have done is the amount of software that actually does what was requested. This still suffers from the problem that we described in the previous episode where the requirements describe what the customers thought they wanted not what they actually need.

Harr listed 10 reasons that projects fail. Eight of these are the inability to estimate. They’re just the inability to estimate different things. One is a change management issue, which is not keeping the project documentation in sync with the reality, and the tenth is the one that Fred Brooks would seven years later name the mythical man month problem: trying to bring a project under control by throwing more people at it. The “human wave” approach was broadly derided at the conference even among those who didn’t think it realistic to keep software teams small.

Notice that modern software methodologies “solve” (again I’ve used my scare quote fingers) they “solve” these problems by backing away from them. We advocate for two pizza teams so that we don’t need to deal with solving communications problems. We heap scorn on people who try to solve those problems with ideas like SAFe or scrum of scrums. We advocate for short iterations so that we don’t have to do any estimation, beyond answering the question “do you think this will be ready within the next fortnight?”

We advocate for cross-functional on-site teams so that we can let gossip take the place of formal communication, or we drown remote workers under Slack messages, emails, and wiki updates. In this sense, modern software engineering is more of a coping strategy than an answer to the challenges identified in 1968.

A section on performance monitoring in software production is mostly about testing, both of performance and of logical correctness. The section mentions automated suites of tests at both the unit and integration level, written in the same language as the implementation, and checked in to the same configuration management system.

This brings me to what I consider to be the money quote from the report for this episode, and it’s from Alan Perlis:

A software system can be best designed if the testing is interlaced with the designing instead of being used after the design.

It turns out that there have been people advocating for test driven development in software longer than there have been people walking on the moon.

I want to end by coming back to this question of what constitutes design and what production, because there’s a section in the part of the report on production called Concepts which is about software paradigms.

Doug Ross advocates for “plexes”. Those are modules that combine data structure and algorithm very much like objects, in fact he references Simula as a good system for modelling these plexes. And Perlis observes that all of those abstractions exist in the Lisp programming language and they each have the name “function”.

It might seem like creating objects or functions is a design issue, but it influences so much of how you talk about and make software that it’s correct to consider it a management thing, a budgetary thing, and generally a production issue.

I’d love to hear your thoughts on this episode, or your reflections on the NATO conference report. You can comment on the blog post for this episode or you can email me at grahamlee at acm.org. Next time we’ll take a look at software support and see what the luminaries of 1968 made of helping their customers use their software.

Thank you very much for listening and we’ll talk later.


Episode 57: The NATO Software Engineering conferences, part 1

This episode contextualises the 1968 NATO Science Committee conference on Software Engineering, and explains what we learn through the executive summary, preface, and first three sections of the conference report. Upcoming episodes will cover the rest of the 1968 conference, the change in attitude shortly thereafter, and the entirely different report from the 1969 conference.

The episode is supported by members of the Chiron Codex Patreon (use this gift link for your first month free), so please do join the community or hit the Ko-Fi button to make a one-off donation.

Transcript

Welcome to episode 57 of the Structure and Interpretation of Computer
Programmers podcast. I’m Graham Lee, and this episode is the first
part of a mini-series discussing the 1968 and 1969 NATO conferences on
software engineering. It’s sponsored by the members of my Patreon,
which can include you.

Software engineering is commonly thought to have had its genesis at
the NATO Science Committee Conference on Software Engineering, held in
Garmisch, Germany, in the week of October 7th to 11th, 1968.

Certainly, the phrase software engineering was coined for the
title of that conference, and NATO didn’t already do software
engineering. The conference was initiated by a Science Committee
working group on computer science.

Computer science itself was a new idea, having been named by an
independent consultant, Louis Fein, in 1959. The ACM first put
together a preliminary CS curriculum in 1962 to 1965, and eventually
ratified it as Curriculum 68, the same year as the first of the NATO
conferences on software engineering.

This conference report has mostly gone down in history as a broadly
cited starting point for the so-called software crisis. But what does
it actually say? Before answering that, we need to contextualise the
conference, beginning, I suppose, by addressing the elephant in the
room. Why NATO? The answer is simply that NATO represented the largest
customer and a good chunk of the supply chain of computers and their
applications at the time. Electronic computers had been invented a
little more than 20 years earlier, and had found their first
applications in the military. In Britain, the Colossus system provided
brute force cryptanalysis to the Government Code and Cypher School,
and in the United States, the ENIAC was funded to calculate artillery
tables, and applied by John von Neumann to thermonuclear reaction
calculations used to design the hydrogen bomb.

By the end of the 1950s, the United States’ Semi-Automatic Ground
Environment (SAGE) defence system employed 800 to 900 programmers,
more than half of the total programming workforce in the country. The
project would grow to
about 2,000 programmers over its lifetime. Many of the ideas of
division of labour between hardware and software people, and between
different software people, came from military projects. Computers are
one of the ideal examples of military technology that becomes dual use
through serendipity. NATO had the most to gain if people found better
ways to make software more efficiently and quickly.

Software engineering, the phrase, was according to the conference
report’s preface, and to the reminiscences of one of its editors,
Brian Randell, a name chosen provocatively to suggest that software
needed to be constructed with the same rigour as found in established
engineering disciplines. This conference brought together people from
academia and industry, about half were academics, nearly half from
computing companies or consultancies, and a few government employees
from computing-using departments, and people from North America and
Europe, but mostly Europe. I count 37 European attendees or observers,
and 24 from the United States and Canada.

And the conference was organised into three work groups, software
design, software production, and software service, in which they would
discuss this notion of software engineering. Now software engineering
as a field almost implies the absence of hardware, or at least the
absence of hardware as an important constraint on the design of
software. This is certainly a political move in a field of
professional boundaries, in which programmers and analysts try to
assert their importance in the computing world as peers or even
superiors to the electrical and electronics engineers by describing
their own work as an independent engineering discipline in its own
right.

This move mirrors the slightly earlier development of academic
computer science by minimising the contribution of the computer. The
argument goes that as hardware gets more capable and flexible, the
specific limitations of any one device become unimportant, and
software designers can concentrate wholly on the problem domain. At
the outset of the integrated circuit era, this might have seemed a
reasonable bet, but in practice, there are few domains where it’s
true even now.

Bob Barton made the opposite argument. He said, “In design, we should
start by designing hardware and software together. This will require a
kind of general purpose person, a computer engineer.” It’s unclear to
what extent the Software Engineering Conference, at which Barton made
that comment, actually served to widen the professional gap between
hardware and software, or whether the existing Taylorist fad for
subdividing knowledge work in the mid-20th century had already made
that split absolute. What we do know is that other than some
hobbyists and brief flurries at the beginning of the microcomputing
and Internet of Things eras, computer engineers haven’t existed, and
most organisations have separated their hardware and their software
divisions, assuming they even designed both at all.

From the very start of the report, in the highlights section that
serves as an executive summary, Randell’s recollection and the
conference as presented in the report that he edited (including,
presumably, the executive summary that as editor he would have
co-written) diverge immensely.

Randell recalls the conference as being the place where the software
crisis was named and acknowledged, and a field of software engineering
bent towards its resolution. In fact, it seems that the word crisis
hardly appears in the report at all, that conference attendee Edsger
Dijkstra popularised the software crisis myth in the 1970s, and that
the editors of the report were aware that it “did not attempt to
provide a balanced review of the total state of software, and tends
to under-stress the achievements of the field”.

Indeed, in another direct quote in the report from John Buxton, we
find that 99% of computers work tolerably satisfactorily, and Ken
Kolence says, “there are many areas where there is no such thing as a
crisis”, although the wording here implies that the idea of a crisis
was being discussed at the conference, at least.

So what are the problems that the conference addressed?
Interestingly, the highlights describe the problem crucial to the use
of computers as being the “so-called software or programs developed to
control their action”.

I wonder what this means. I initially interpreted it as suggesting
that the idea of software as a distinct entity was not yet settled.
Perhaps some people thought of a computer as a general-purpose device
that you add software to for a particular application, while others
thought of a computer as a component of a system that needs to be
programmed to fulfil its role in that system. Subsequently, I changed
my mind, and I think the editors might just mean to say that software
is a technical term that the broader reaches of their audience won’t
know the meaning of in 1968. But I’m interested to hear how you
interpret the idea of so-called software.

They describe specific problems as being relevant to a broader
audience beyond the people who directly work on software engineering:
that’s academics, policy makers, civil servants, and people who
market computers. These are direct quotes from the highlights section
of the report.

First, the problems of achieving sufficient reliability in the data
systems which are becoming increasingly integrated into the central
activities of modern society. I interpret this problem as one of the
earliest examples of the idea that software is eating the world.
Second, the difficulties of meeting schedules and specifications on
large software projects. Third, the education of software or data
systems engineers. And lastly, the highly controversial question of
whether software should be priced separately from hardware.

This is the advert break. It starts now.

This episode is brought to you by me, Graham Lee. But really, by you.
Chiron Codex is a community of people who are learning how to become
better software engineers by adopting AI augmentation in a thoughtful
way. We aren’t outsourcing our understanding to coding assistants
like Claude or Codex, but becoming software engineering centaurs by
using AI tools to improve our knowledge and the quality of our work.
Join the community over on Patreon to find out about interaction
patterns that improve your work with AI coding tools, running LLMs for
software development locally, discussions of recent research in the
field, and more.

If you’re a software engineer who’s interested in the promise of AI
tools, but sceptical about handing your skills over to the computer,
this is the community for you. Go to https://patreon.com/chironcodex,
that’s C-H-I-R-O-N-C-O-D-E-X, now for more information and to join.
Use the gift link in the show notes to get your first month of insider
access completely free. Alternatively, you can show your appreciation
by donating at ko-fi, that’s https://ko-fi.com/chironcodex, K-O-F-I
dot com.

Direct support by my audience is the only revenue I get for my work as
a software engineer and communicator, so your support really means a
lot to me, and makes it possible for me to produce this podcast.
Thank you so much.

That was the advert break. It’s over now.

Remember that in 1968, a lot of software programs were batch jobs that
ran on a whole machine, with no timesharing. There were already a
total of two computers at MIT that ran the CTSS timesharing system.
Development of Multics, the predecessor of Unix, was underway, and
Dijkstra’s team had been working on the THE multiprogramming system for
a while.

But, for the most part, while a computer ran your program, it did
nothing else. That also meant that it wasn’t running your compiler or
your assembler. Programmers had to wait in line for computer time,
just like everybody else. So, programs were written by hand, often
with flowcharts as design aids, and a lot of debugging occurred in
vivo, with programmers emulating the computer state in their head, and
checking that algorithms yielded the expected results. As we’ll see
in later parts of the conference, automated testing did exist, both at
the unit and system level.

Computer hardware had already adopted transistors, and even some early
integrated circuits. But, in 1968, there wasn’t the aggressive
upgrade cycle that we see today, and it’s likely that almost every
computer that had ever been built by the time of the conference was
either still in use, or had had its parts cannibalised for another
computer that was still in use. This includes computers based on
thermionic valves, among them valve-based computers that used
non-binary storage, such as valves that stored octal and decimal
digits.

Many early computers were one-offs, designed to support the
applications they were commissioned for, but there were some standard
designs, and even one example of a family of compatible computers that
could all, well, almost all, run the same software, while offering
different specifications or capabilities. This was the IBM System/360.

Its operating system, OS/360, was released in 1965, and it
required 44 kilobytes of memory, when the System/360 family offered
between 8 kilobytes and 4 megabytes. The conference report makes a
note of this as a massive, expensive, staff-heavy project, as
expensive to IBM as the project to develop the System/360 hardware
that it ran on. But the world would have to wait until 1975 for Fred
Brooks’ detailed post-mortem in The Mythical Man-Month.

To give some idea of the scale of software production at the time of
the conference, co-chair Dr. H.J. Helms estimates that there were
10,000 installed computers in Europe, a number that grew by 25% to 50%
per year, with more than a quarter of a million analysts and
programmers affected by the quality of software that manufacturers
distributed for those computers.

Alexander d’Agapeyeff reports that a decade earlier, in 1958, a
European general-purpose computer manufacturer often had fewer than 50
software programmers. Now, 1968, they probably number 1,000 to 2,000
people. What would be needed in 1978? he asked.

Well, fast-forwarding further than that, there are now big tech
companies with tens of thousands of software programmers who don’t
manufacture any computers at all.

As noted in the highlights, it’s large systems, where ambition
outstrips capability, in which the attendees saw a problem. Two
attendees, Ascher Opler and Stanley Gill, the latter being one of the
co-inventors of the subroutine, questioned whether customers should
even be allowed to request computer systems whose complexity outstrips
the capabilities of software creators.

As the complexity of a system grows, the number of errors introduced
grows even faster. Doug McIlroy and Ken Kolence both noted that the
process by which software is created uses backward techniques and has
a deservedly poor reputation. But why?

The report proposes two underlying causes in the section on Software
Engineering and Society, which was written for a more general
policy-making audience than the technical sections later. The first
cause is, according to Cambridge University’s Sandy Fraser, that
software production isn’t a linear path in which every activity takes
a step towards working software, and that managers don’t know what to
measure or how to measure it.

This is still a problem in 2026, as we saw with managers leaping on
the tokens-consumed metric without connecting that to working software
produced by their organisations.

The second cause, expressed by Robert Graham of MIT’s Project MAC,
which spawned the MIT AI lab, is that projects go on for years using
their initial poor understanding of the system, then deliver something
that doesn’t work as needed. Then they have to go back and start
again.

So even in 1968, it was seen that software construction needed more
feedback than projects were accepting from customers. And indeed,
that’s a core topic in Section 3 of the report, a discussion on the
nature of software engineering.

Two papers, one by a Mr. J. Nash of IBM UK and the other by
Dr. F. Selig of oil company Mobil, give schematic outlines of the
software engineering process, moving linearly from analysis to design
to implementation to deployment to maintenance. Both show activities
occurring in parallel, unlike the phased approach that became popular
among people who misread the Royce paper, with Nash’s diagram in
particular showing that technical support, documentation, test
development, and control and administration (i.e. project management)
occur throughout the project lifetime.

Multiple attendees noted the lack of feedback in both diagrams and the
necessity to get feedback throughout the project. Bernard Galler,
then president of the ACM, recounted stories of projects delivering
poor quality results because of the lack of user feedback into the
designs and asked the question, why do these things happen? Why
indeed?

Selig himself points to feedback within the project, with external
requirements informing software design and internal design constraints
informing the requirements. Sandy Fraser’s own description of the
progress of a software activity presages iterative and incremental
approaches like Barry Boehm’s 1988 Spiral model, in which, to quote
Sandy Fraser, “each stage produced a usable product and the period
between the end of one stage and the start of the next provided the
operational experience upon which the next design was based.”

With the benefit of hindsight, this sounds a lot like proceeding in
short iterations with time for retrospection in between them. In
practice, we don’t have access to the whole paper: the conference
report is made up of working papers that were discussed at the
conference but never published as proceedings. So we don’t know if
these iterations were weeks or months long, or who found the products
to be usable. It could be that the output of an early iteration was a
system requirements specification that was usable by a software
designer, for example.

d’Agapeyeff described an inverted pyramid model in which a large
number of application programs depend on a smaller number of service
routines that sit on an even smaller base of control programs
buttressed by compilers and assemblers. Due to the lack of feedback
between applications programmers and hardware vendors who wrote the
control programs and the service routines, there was a necessary
middleware layer that adapted the service routines onto the
application’s needs but which couldn’t do anything to address
performance issues.

He described programming as still too much of an artistic endeavour
and suggested that more teaching was needed in structuring programs,
designing and testing modules and simulating runtime conditions. In
other words, in designing testable software and in testing it.

Assuming you listened from the start of the podcast and didn’t just
skip to here on the basis that I tend to take a long time getting
warmed up to a topic, you will remember that there were three
workgroups at the conference: design, production and service. At the
actual conference, attendees disagreed that design and production of
software were distinct activities.

Report editor Peter Naur says that the distinction is arbitrary and
only exists to support the division of labour in software projects.
Dijkstra says that we can’t separate the two if we are going to do a
decent job. And a consultant by the name of Kinslow says that design
is necessarily iterative. He describes the failure on large projects
as rushing to get the specification done, so skipping bits which you
expect to be able to fill in later, but which are then incorrectly
coded by 200 people. And then it’s too late to correct the damage
that’s been done to the project.

I’ve seen that failure mode on software projects in my career, which
started in 2004, but probably, at least hopefully, much less
frequently than the people in 1968 saw it.

The money quote from the first part of the 1968 report is, to my mind,
this from Doug Ross of MIT, who went on to invent the Structured
Analysis and Design Technique.

“The most deadly thing in software is the concept, which almost
universally seems to be followed, that you are going to specify what
you are going to do and then do it. And that is where most of our
troubles come from. The projects that are called successful have met
their specifications, but those specifications were based upon the
designer’s ignorance before they started the job.”

Think about this quote the next time you read a LinkedIn post on the
benefits of spec-driven development.

In this episode, we’ve covered the first 33 pages of a 226-page
report, one of two reports from the NATO conferences on software
engineering, and found that even then, software design was understood
to need iterative feedback from users, integrators, and producers, and
that everybody involved in the project had to share their knowledge
and build the software based on the latest knowledge integrated from
everybody, not on the designer’s initial feels.

The good news about the rest of this series is that the next page, page
34, is blank. But next time, we’ll start to look at the output of
some of the working groups and dig into the state of the design and
production of software in 1968.

Until then, remember that you can contact me with your feedback on
this episode. You can go to the page on the Structure and
Interpretation of Computer Programmers podcast where the post for this
episode is hosted. That’s at https://sicpers.info/podcast.

You can email me grahamlee at acm.org or you can join the Patreon to
support my work and join in the chat there. That’s at
https://patreon.com/chironcodex. I’ll talk to you again soon.

I keep bouncing off the Scheme language

I have a huge appreciation for the Scheme programming language. I just seem to be unable to get it to stick in my head. This seems like a huge revelation for someone who named their blog after the Scheme textbook, but there it is. This post is the public admission I need to make, to keep me accountable for trying again. And again.

One problem is that I’m an inconsistent LISPer. The first software I ever got paid for was an Emacs major mode for the GLE plotting language, which didn’t do much beyond syntax highlighting. But I didn’t really get deeply into Emacs customization or automation, so I still have to look at the manual or my outdated copy of Writing GNU Emacs Extensions whenever I want to do anything.

I’m OK at reading Scheme. During my investigations of AI coding assistants for the project that became Chiron Codex, I created a Smalltalk-like live environment with a module browser for the Racket dialect. Obviously an LLM generated the code, but I felt comfortable following along and understood what it was doing, reading and Trusting the Tests. And when I look at Scheme that other people have written, I think I get what’s going on.

My difficulty is with thinking the way that lets me write Scheme. I have the ALGOL neurotype. When I think about a programming problem, I think in terms of the sequence of instructions I need the computer to do, and the memory locations that can hold the information the computer needs to track. After decades of working with OOP, I can quickly identify smaller computers that run smaller programs to make it easier, but only because I’ve got experience using the Simula-derived, neurologically ALGOL-based OOP strands like Java and Smalltalk-80.

This is, unfortunately, a failure that breeds failure. I’ve started two web app projects recently, including SE100, the reading list for the SICPers podcast. In each case, I’ve thought about using GNU Artanis but ultimately fallen back into my ALGOL mindset (the SE100 catalog uses the Go programming language, for example).

I think Scheme makes for some powerful software that’s pleasant to read: when I use Linux, I use GNU Guix and GNU Shepherd. I want to contribute to that ecosystem, I just have to get over the hump that I know the other, more complex way better, and be willing to play junior developer with some unfamiliar tools. This is my admission. Check back in a while to hold me accountable to this.

Episode 56: A Plea for Lean Software

The topic for this episode is Niklaus Wirth’s A Plea for Lean Software.

The episode is sponsored by…your generous support. Head over to https://www.patreon.com/chironcodex/redeem/A31E3 to get a free month of Insider access to my Patreon, with my gratitude!

Episode Transcript

(Music plays)

Welcome to the Structure and Interpretation of Computer Programmers podcast, episode 56. This episode is about “A Plea for Lean Software,” an article that Niklaus Wirth wrote in the IEEE’s Computer magazine in 1995.

This episode is brought to you by you, the community who support my work through the Chiron Codex Patreon and your gifts to the Chiron Codex Ko-fi account.

Let’s talk a little bit about Niklaus Wirth’s career. He’s perhaps most famous for creating the Pascal programming language back in 1970. Pascal is designed to support structured approaches to programming, with procedures that operate on record types and dynamic types including lists. Pascal’s type system includes features that have recently come back into vogue, including strong types where a pointer to one type is incompatible with a pointer to another type.

I mostly encountered Pascal in my Amiga days, as a language for learning about computers. It was used a lot in teaching contexts, including my university computing labs (though that was on NEXTSTEP, and long after I had learned how to write Pascal software), and for writing simple applications. Pascal’s record type made it easy to create random access binary files in the days when you didn’t have a separate database management system like PostgreSQL, or even an integrated one like SQLite, and Berkeley DB wasn’t yet widely used.
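To make the random access point concrete, here is a minimal sketch of the same idea in Python rather than Pascal. The record layout (a name, a quantity and a price) and the helper names are made up for illustration. Because every record has the same size, the position of record n is just n times the record size, so any record can be read or rewritten without scanning the file.

```python
import io
import struct

# Hypothetical fixed-size record: a 16-byte name, an integer quantity, a float price.
# "<" selects little-endian layout with no padding, so every record is the same size.
RECORD = struct.Struct("<16sid")


def write_record(stream, index, name, quantity, price):
    """Overwrite the record at position `index` without touching its neighbours."""
    stream.seek(index * RECORD.size)
    stream.write(RECORD.pack(name.encode("ascii"), quantity, price))


def read_record(stream, index):
    """Jump straight to record `index` and decode it."""
    stream.seek(index * RECORD.size)
    raw_name, quantity, price = RECORD.unpack(stream.read(RECORD.size))
    return raw_name.rstrip(b"\x00").decode("ascii"), quantity, price


# io.BytesIO stands in for a binary file opened with open(path, "r+b").
stock = io.BytesIO()
write_record(stock, 0, "widget", 12, 1.5)
write_record(stock, 1, "gadget", 3, 20.0)
write_record(stock, 2, "sprocket", 40, 0.25)

# Update the middle record in place, then read records directly by index.
write_record(stock, 1, "gadget", 2, 20.0)
print(read_record(stock, 2))
print(read_record(stock, 1))
```

The fixed-size record is what makes this work: a variable-length name would break the simple multiplication, which is one reason a record type with fixed-length fields suited this job.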

Pascal was adopted by Apple and augmented with object-oriented features, first as Clascal and then Object Pascal, and became the basis for the MacApp framework for building classic Mac OS software. Object Pascal itself was integrated into Turbo Pascal and Delphi, and was at the core of the rapid application development movement in the 1990s.

Wirth hadn’t stayed still though, and had added modules to Pascal to create Modula, then co-routines and other features to create Modula-2. In the late 1980s, he co-created Oberon, both a programming language that supports data type inheritance (in which the data types themselves collaborate to define how or if they share implementations in their shared interfaces) and an operating system written almost entirely in the Oberon programming language, in a nod to the power of self-describing systems like Smalltalk: Wirth had spent two sabbaticals at Xerox PARC. In 2013, before his 80th birthday, Wirth updated Oberon to run on a CPU instruction set architecture that he had designed himself.

This brings us back to the article, because Wirth came up with three principles for software design based on his experiences with Jürg Gutknecht building Oberon. “A Plea for Lean Software” introduces these principles and derives nine lessons from them. Before I introduce those, the summary of the article that the magazine sub-editor put in a sidebar is a good description of what Wirth is looking for in so-called lean software:

“Software’s girth has surpassed its functionality, largely because hardware advances make this possible. The way to streamline software lies in disciplined methodologies and a return to the essentials.”

Wirth was frustrated that a text editor written in the 1970s used about 8 kilobytes of storage, while a text editor written in the 1990s used 800 kilobytes but only had the same capabilities. He described a law of software that we now call Wirth’s Law, but that he attributes in the article to Martin Reiser:

“Software is getting slower more rapidly than hardware becomes faster.”

Now I have two asides here. The first is that Sophie Wilson did a great talk where she talks about the amount that hardware has actually become faster, which has been decelerating for a long while now; there’s a link in the show notes. My second aside is that this situation, where Reiser’s Law has become Wirth’s Law, is one of the reasons I enjoy going back to these important texts in these podcasts and on my blog. Working out who said the thing and what they actually said usually means that we end up with a clearer idea of the thought they were trying to convey than the telephone game where people redefine ideas in software to support their current way of working or denigrate somebody else. In this case, Wirth is quoting a colleague he directly worked with, so it isn’t too bad, but eventually ideas seem to get homeopathically diluted into nothingness through the retelling.

Okay, so no more beating about the bush. We promised three principles of software creation, and here they are:

  1. First, concentrate on the essentials. Oberon has a text user interface because the creators considered graphics and icons to be superfluous to the goal of contributing power and flexibility. The deeper message is to identify the things that people need to do and commit only to delivering those things. If a small percentage of your users want to do some other things, make the system flexible and extensible so they can get that, but don’t make everybody manage those features mentally or physically.
  2. Second, use a type-safe object-oriented language. The benefits of type safety were a smaller team size, fewer problems generated in work or rework, and as a result, faster development and rework.
  3. Third, flexible extensibility. Design a system so that new features can be added by creating modules that combine operations supplied in existing modules, or by adding new data types that are compatible with existing operations that work on existing data types.

The article introduces nine lessons that Wirth and Gutknecht learned from their work on Oberon, which they contrasted with the way mainstream software development carried on, so let’s take a look at those lessons next.

(Music plays)


This is the advert break. It starts now.

This episode is brought to you by me, Graham Lee, but really by you. Chiron Codex is a community of people who are learning how to become better software engineers by adopting AI augmentation in a thoughtful way. We aren’t outsourcing our understanding to coding assistants like Claude or Codex, but becoming software engineering centaurs by using AI tools to improve our knowledge and the quality of our work.

Join the community over on Patreon to find out about interaction patterns that improve your work with AI coding tools, running LLMs for software development locally, discussions of recent research in the field, and more. If you’re a software engineer who’s interested in the promise of AI tools but skeptical about handing your skills over to the computer, this is the community for you. Go to patreon.com/chironcodex—that’s C-H-I-R-O-N-C-O-D-E-X—now for more information and to join.

Use the gift link in the show notes to get your first month of insider access completely free. Alternatively, you can show your appreciation by donating at Ko-fi, that’s ko-fi.com/chironcodex, K-O-hyphen-F-I dot com. Direct support by my audience is the only revenue I get for my work as a software engineer and communicator, so your support really means a lot to me and makes it possible for me to produce this podcast. Thank you so much.

That was the advert break. It’s over now.

(Music plays)


Wirth’s first lesson is:

“The exclusive use of a strongly typed language was the most influential factor in designing this complex system in such a short time.”

This is one lesson that I know hasn’t been universally learned. I’ve been on both sides of the argument, too. As somebody who really enjoys working with Smalltalk and Objective-C, I don’t feel less productive working with dynamic types. I also point out that Smalltalk has a single type, “object,” therefore all of its expressions type-check very nicely. Because I enjoy it, I might even be more productive in the sense that I’m willing to work more and procrastinate less because the work’s enjoyable.

However, I’ve also worked on JavaScript software where I’ve seen the problems caused by objects that have incompatible shapes being discovered at runtime. When I advocated for incrementally adopting a type-checking mechanism to catch exactly those errors (I suggested either TypeScript or Flow), I experienced a revolt from the developers, with one of them saying that if they ever had to understand what covariance and contravariance are again, they would leave the company. I might suggest that they do need to understand those things, even if they choose to use tools that don’t surface them. I can believe that strongly typed languages give the benefits Wirth claims, while also believing that nobody believes that they do, including, hypocritically speaking, myself. Given that Python and JavaScript are still two of the more popular programming languages in the world (and if you’re being uncharitable you might want to include C), there’s a lot of convincing to be done and a lot of inertia and potentially legacy code to account for.
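As a rough illustration of that class of error, here is a sketch in Python rather than JavaScript (Python being the other dynamically typed language mentioned above). The `Rectangle` and `Circle` types and the `area` and `describe` helpers are invented for the example. Python ignores the annotation on `area` at runtime, so the mismatched shape is only discovered when the offending line actually executes.

```python
from dataclasses import dataclass


@dataclass
class Rectangle:
    width: float
    height: float


@dataclass
class Circle:
    radius: float


def area(shape: Rectangle) -> float:
    # The annotation documents the expected shape; Python does not enforce it at runtime.
    return shape.width * shape.height


def describe(shape) -> str:
    """Return the area as text, or the runtime error that a mismatched shape causes."""
    try:
        return f"area = {area(shape)}"
    except AttributeError as error:
        return f"runtime failure: {error}"


print(describe(Rectangle(2.0, 3.0)))
print(describe(Circle(1.0)))
```

A direct call such as `area(Circle(1.0))` is the kind of line that a static checker like mypy is designed to reject before the program runs, which is the incremental safety net I was arguing for.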

Lesson two:

“The most difficult design task is to find the most appropriate decomposition of the whole into a module hierarchy, minimizing function and code duplications.”

I think that this is an evergreen statement about software design. Refactoring, which—going back to the whole type safety debate—comes from the Smalltalk world, gives us a way to deal with this a piece at a time by applying incremental design repair. I don’t think that’s what Wirth had in mind, and indeed he has a section in the article explaining that developers never have enough time to do the efficient designs because we’re always pressured to add more features, further bloating already inefficient software. So maybe while I think of refactoring as a way to tidy as we go, he might have thought of it as a way to kick the can down the road.

That’s an important piece of context about this article in itself. Oberon wasn’t developed by a stealth startup running along on a small bit of seed funding. It was the work of a tenured professor and an assistant professor at ETH Zurich. I don’t know what the funding landscape was like for Swiss research institutes in the mid-to-late 1980s, but I do expect it to be different to that in corporate software development, based on my experience in early 2020s academia.

Lesson three:

“Oberon’s type extension construct was essential for designing an extensible system wherein new modules added functionality and new object classes integrated compatibly with the existing classes or data types.”

In other words, having come up with a minimally duplicative set of expressive primitives, the way to keep the rest of the system efficient is to be able to design the correct compositions of those primitives into richer applications. The details of the paragraph on this lesson say “without access to the source code,” so this lesson is really about correctly designing interfaces so that people can see the expected way to use them and then use them in the expected way to achieve their own goals.
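Here is a small sketch of what that kind of extension can look like. It uses Python subclassing as a stand-in for Oberon’s type extension, so it is not Oberon’s syntax, and the `Figure`, `Circle` and `move_all` names are made up for the example. The point is that the existing operation keeps working on a type that was added later, without anybody editing the original code.

```python
from dataclasses import dataclass


@dataclass
class Figure:
    """Base type from the 'existing' module: every figure has a position."""
    x: float
    y: float

    def describe(self) -> str:
        return f"figure at ({self.x}, {self.y})"


def move_all(figures: list[Figure], dx: float, dy: float) -> None:
    """Existing operation: written against Figure, unaware of any extensions."""
    for figure in figures:
        figure.x += dx
        figure.y += dy


# A later 'module' extends the base type without editing the code above.

@dataclass
class Circle(Figure):
    """Extended type: inherits the position fields and adds a radius."""
    radius: float

    def describe(self) -> str:
        return f"circle of radius {self.radius} at ({self.x}, {self.y})"


scene = [Figure(0.0, 0.0), Circle(1.0, 1.0, 2.0)]
move_all(scene, 1.0, 0.0)
for figure in scene:
    print(figure.describe())
```

The author of `Circle` only needs to know the interface of `Figure`, not its implementation, which is the “without access to the source code” condition in practice.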

Lesson four:

“In an extensible system, the key issue is to identify those primitives that offer the most flexibility for extensions while avoiding a proliferation of primitives.”

It’s entirely possible that I’ve misunderstood lesson three, because I thought that covered the sentence I just read out. Maybe I just inferred the existence of correctly designed primitives—the stuff of lesson four—in the need to have a good mechanism for safely composing them—the stuff of lesson three.

Lesson five:

“The belief that complex systems require armies of designers and programmers is wrong. A system that is not understood in its entirety, or at least to a significant degree of detail, by a single individual should probably not be built.”

This strikes me as one of those arrogant-sounding European computer science academic quotes to which Alan Kay would respond by citing his probably apocryphal paper entitled “On the Fact That Most Software is Written on One Side of the Atlantic.” First, it’s clear that one feature of the project Wirth and Gutknecht created that makes it easier for one or two people to understand it is that it’s intentionally restricted in capabilities. It’s a system that you can build applications in, like Smalltalk, not a system you can apply to things. Okay, we can still accept the point of this article that much software complexity is incidental and down to the way that the software teams are run, rather than being essential parts of the software, and we can still maybe aspire to this goal. But we can also move the goalposts to fit or not fit this lesson at will, which means finding difficulty in applying it. Does one person understand the Minix operating system or Plan 9? Definitely. In fact, someone recently published a complete guide to Plan 9, and Andy Tanenbaum has always done the same for Minix. How about a GNU/Linux operating system of your choosing? Probably not. How about the bits of the GNU/Linux operating system that implement the same behavior as Minix? Possibly, but is that a useful boundary?

If I’m going to come up with so many questions, I think I need to provide some kind of answer. And here I’ll appeal to Alan Kay’s idea of recursive modularity and Conway’s Law, connecting software architecture and organization structure, letting me say that the only reasonable request Wirth can be making is that the scope and design of a software system undertaken by any particular team should be comprehensible by a small number of people, and that taking the other lessons into account, it should have an extensible design with well-considered primitives and a flexible interface that allows a small number of other people to incorporate it into their design. Otherwise, this question degenerates to working out how many angels can dance on the head of a pin, or perhaps closer to our topic, how many pins you can harmlessly stab an angel with.

Lesson six:

“Communication problems grow as the size of the design team grows.”

This is the topic of The Mythical Man-Month and the insight directly behind Brooks’s Law. The number of communication routes grows combinatorially with the number of communicators, so adding people to a project increases the amount of communication on the project and slows down progress. Given that Wirth was writing in 1995, 20 years after Brooks published The Mythical Man-Month, and didn’t cite that book in this article, I think it both charitable and likely to assume that Wirth hadn’t read the earlier work and had independently observed this communication issue.
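To put numbers on that growth, here is a minimal sketch that assumes the simplest model, in which every pair of people on the team is a potential communication route. That gives n(n-1)/2 routes for n people. The function name and the team sizes are mine, chosen for illustration.

```python
def communication_routes(team_size: int) -> int:
    """Number of distinct pairs of people in a team of the given size."""
    return team_size * (team_size - 1) // 2


for size in (2, 5, 10, 50, 200):
    print(f"{size:>3} people -> {communication_routes(size):>5} possible pairs")
```

Under this model, going from 10 people to 50 multiplies the headcount by five but the possible routes by more than 27, from 45 to 1,225.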

Lesson seven:

“Reducing complexity and size must be the goal in every step, in system specification, design, and detailed programming.”

I suspect that this is where Wirth found the most deviation between reality and expectation, because most people have remaining employed as the goal and translate that into a locally relevant outcome that their business, university, research facility, or charity’s management needs. I can see three ways to make reduced size and complexity into the more important goal over deliverability of profitable features, research outputs, or other organizational value.

The first is to remove the organization and make pursuit of minimalist software a hobby. Undoubtedly, many people in the free software world are doing this, and undoubtedly for the most part, that software isn’t going mainstream. When I think of the software people use to record and publish podcasts, for example, I don’t necessarily instantly think, “Oh yes, Audacity and WordPress, famously lean software.”

The second is to change the programmer-sponsor relationship from employee-employer to agent-client, become a licensed profession, make lean software a moral imperative, and threaten people with loss of license—i.e., with an inability to do the work—if they write software in another way. That’s such a revolutionary approach with so many downsides that I honestly can’t see it working.

The third approach is to rebalance the short-term and long-term needs of an organization and to create simple small programs because it’s in the best interests of the sponsors to have simple small programs. Small, to remain faster and more efficient than competing software; simple, to make future flexibility easier. This is the kind of principle upon which we can probably all agree, and yet in practice will rarely see enacted. For example, going back to that 8-kilobyte text editor that became an 800-kilobyte text editor: the built-in text editor on my computer now uses 48.9 megabytes with a single small file open, and honestly that’s more a result of creative accounting in the resource monitoring tool than an accurate reflection of the situation.

Lesson eight:

“Organizing a team into managers, designers, programmers, analysts, and users is detrimental. All should participate with differing degrees of emphasis in all aspects of development.”

This is a lesson that gets heard and ignored over and over again. One of the principles of agile software development is that the best architectures, requirements, and designs emerge from self-organizing teams, again observing that the external organization of people into teams is detrimental. Mob programming, which would be described as all the brilliant people working on the same thing at the same time in the same space and at the same computer, doesn’t even give managers the opportunity to corral people into specific roles.

But we need to understand a dichotomy here. Lesson six tells us to minimize communication lines, and the goal of organizing people into roles is to design interfaces and activities that constrain data flow to tame the communication problem. If you saw a software system designed along mob programming lines (all the brilliant functions working on the same thing at the same time in the same space and on the same computer) you’d think the designer had taken leave of their senses, and definitely Wirth would not believe his earlier lessons had been applied. However, if we accept that teams of humans are different from software, so of course you structure them differently, then we need to give up on Conway’s Law, the idea that the software architecture necessarily reflects the org chart. I so far haven’t participated in mob programming; indeed, it’s been six or more years since I’ve been in the same room as any of my colleagues, let alone all of them. People I’ve spoken to who have tried it are generally very much in favor of the practice, but I don’t have the experience needed to give a judgment. It does address lesson eight by removing the pigeonholes of different roles on the team.

Lesson nine:

“Programs should be written and polished until they acquire publication quality.”

Wirth goes on to say that this is infinitely more demanding than writing a program that runs, and that it “contradicts certain vested interests in the commercial world.” That makes it sound like he’s promoting a populist conspiracy where there are hidden influences in commercial software that stop people from being able to write publication-quality software, when the reality is that the commercial world doesn’t have infinite resources to throw at an infinitely demanding problem that doesn’t make them any money.

Unfortunately, this is the missing keystone that causes Wirth’s whole argument to collapse. He spends a page in the section on causes for fat software explaining why commercial software is bloated. Each customer uses a different subset of the features, so more features lead to more customers. Complexity gets mistaken for power, and time pressure pushes people to be first to market rather than to do their best engineering. He then gives an example of creating software without those pressures—the Oberon system which has no customers, a focus on simplicity, and no market to reach—and asks why everybody else can’t just do it like that. The missing piece is the part where either his suggestions are shown to remove the pressures that cause fat software, or at least to mitigate their symptoms. Either lean software makes it easier to add features, demonstrate power, or help people get to market quickly, or it completely upends market dynamics so that those factors aren’t important anymore. If neither of those is true, then this whole piece is nothing more than a call to adopt a principle because some Swiss academic would prefer it if you could all please do that, thank you very much.

Software is a very broad and inclusive discipline with low entry requirements. Indeed, I and many people in my generation had no relevant qualifications other than an interest in making computers do things and a stubborn determination to type in programs that were shared in magazines. As such, trying to instill a universal discipline is doomed to failure, especially one that works against getting software out to market. I’ve made the case before in the De Programmatica Ipsum issue on quality that software is a lemon market. Nobody knows what quality your software is before they try it, so nobody has any way to evaluate software by quality, so nobody believes software has any quality, which pushes prices down, which pushes affordable costs down, which pushes quality down, which pushes belief about quality down, and so on. That’s why the pressures Wirth discussed in 1995 still apply today.

This situation hasn’t stopped people from hoping that everybody would just adopt one simple trick that merely requires continuous discipline and perhaps a degree in mathematics. From Dijkstra to the pre-Cambrian monad tutorial explosion of the 2010s to the Rust evangelism strike force. Very infrequently, a particular practice becomes a meme and gets sort of somewhat maybe adopted broadly in a way that the originators wish it hadn’t been. Find me an agilist who’s happy with the way software companies do “agile”—and yes, I used scare quotes there—or the way everybody does TDD—scare quotes again—or doesn’t mind that Smalltalk isn’t more widely adopted. These people go on the circuit giving conference talks saying, “No, what we meant was,” and they do this to the kinds of conference that will book them, which are the kinds that have an audience who already believe the thesis of the talk, so not the people who allegedly need to hear their advice. They post snarky articles on LinkedIn about how nobody else in the software industry truly gets it and you should hire them if you want to be one of the few teams who do software properly. I know this because I was that person, particularly during my burnout phase a little over a decade ago; in some ways, I still am.

Thank you for listening. Please find one other person who would enjoy this podcast, tell them how much you liked it, and share a link to the podcast with them. If you’re able, support the podcast and my other work sharing software engineering information and insight on Patreon or Ko-fi. You can follow me on BlueSky at iamleeg.bluesky.social—that’s I-A-M-L-E-E-G dot bluesky dot social—or email me: grahamlee@acm.org. Thank you again, and I will talk to you soon.

(Music plays)

On industrial relations

Today, 2026 May 3, marks the centenary of the onset of a general strike in the UK. In response to a dispute over the organisation and pay in British coal mines following the end of a fixed-term government subsidy, the Trades Union Congress called out many affiliated unions in sympathy. Between 1.7 million and 3 million workers came out, with some unions striking only after the general strike officially ended on the 12th.

Reactions to the strike shaped relationships between government and workers, unions and the Labour Party, the secret service and the unions, the secret service and the Labour Party, unions and the TUC, the government and the BBC, the BBC and the newspapers, and other organisations in ways that still impact British society today.

This is not that story. This is the story of the truism at the core of the general strike and the government’s response: in a dispute between you and your employer, you can’t rely on the government for help. Only your trade union can support you in that dispute, by providing resources, training, expertise, and solidarity, by intervening in your case and by representing you and others like you collectively to try to create a situation where your working conditions don’t lead to a grievance.

This isn’t to say that industrial disputes and politics are completely unrelated, nor that trades unions don’t get involved in political issues. This happens, and it happens in matters that are important to their members. It can change outcomes in important ways; if you didn’t work over the weekend, or on May 1st (or instead, if you’re in the UK, aren’t working tomorrow on Monday May 4th), you can thank the unions for that. However, having been a union branch committee member and a caseworker, I know that political representation doesn’t change employer attitudes or policies as rapidly as case support and direct involvement.

Additionally, only trade unions can represent the positions of you and me, and of people like you and me, in the workplace, because they are democratically constituted bodies where everyone can submit and vote on motions to define their position. Employer-sponsored feedback groups or consultative committees are opportunities for employers to hear that everything is fine from people they trust to say everything is fine. Maybe they might stock more biscuits in the break room, if you work on site.

I’ve made the point before on this blog that technology doesn’t steal jobs; employers do. People worry about the effects of AI on employment for various reasons: the Taylorist argument that it increases productivity and reduces labour needs; the Captain Swing view that it replaces creativity and individual expression; the Weberian view that it centralises control and power.

All of these things can occur simultaneously, and other effects too. The point is that everybody is facing this. If your concern is inadequate training, or redundancy, or unfavourable contracts, the solution is to work together with others to address that. Industrial action, in the form of strikes or what UK law calls “action short of a strike”, is only the big tool at the bottom of the drawer that we workers use when negotiation and conciliation fail. But the negotiations, along with the other actions, succeed more when those doing the bargaining represent all of us.

Episode 55: Relaunch and Death March

In which I first apologise for the four-year gap between episodes, and then explain what I’m doing now and why that means I can start podcasting again. Other than creating valuable internet content I don’t have any work, so you can support this podcast by joining my Patreon.

With that out of the way, the topic for today’s episode is the book Death March, by Ed Yourdon. I look at what a death march project is, why they still occur in 2026, and Yourdon’s recommendations for coping with them.

Transcript

Hello, welcome to episode 55 of the Structure and Interpretation of Computer Programmers podcast. Yes, I am restarting this podcast. It’s been nearly four years since the last episode, but now I have more time available as I’ve voluntarily left paid work to focus on helping software engineers improve their craft. And this podcast becomes part of that assistance. I don’t make any money other than what you, my audience, give me to support this shift in my lifestyle, and the vehicle you use to provide that support is over on my Patreon.

This podcast will remain on this feed, and there’s other stuff that I share first, or even exclusively, over on the Patreon. Let me give you a quick pitch for that. My message to software engineers is: your job is safe. If you’re worried about whether AI means that there’ll be less need for software engineers in the next few years and that you need to retrain—don’t be. Once the field shakes out the adoption problems, identifies the tools that will work, and adapts its ways of working, this will unlock the huge latent demand for software that we’re still not meeting. There will be more people in software, not fewer.

So yes, there will still be need for software engineers, and yes, you will need to retrain because the role will, of course, change. And that’s what Chiron Codex is for. I want to help you understand that you can use AI coding assistants and not only remain in control of the software you create, but create better software by augmenting your skills and capabilities using those of the AI.

In the short term, I’m sharing techniques for interacting with chat-based coding assistants like Gemini CLI, Claude Code, or ChatGPT Codex that help you get better results or refine your ideas in ways that weren’t available before. These techniques come loaded with examples and a companion agent skills repository makes them ready to use. I’ll build out training, more agent skills and sub-agent prompts, and new tools to help you become a software engineering centaur instead of outsourcing your understanding to the computer.

Now that doesn’t mean that the SICP podcast is becoming AI-focused, and indeed it isn’t, for this simple reason: while the tools we have to apply software engineering knowledge might be changing, the actual knowledge areas—the need to understand systems and requirements, architecture, design and trade-offs, verification, validation, performance and more—all of that remains the same. So what I’ll do in this podcast is survey what we already know about software engineering through the lens of particular works—works from practitioners, consultants, researchers, and from adjacent fields—with a focus on the classics that people come back to decade after decade. I’ll look at what this literature teaches us about software and how we incorporate that knowledge into our work, with or without AI support.

I hope you join me on this journey. Remember you can support this podcast over on the Patreon, and that’s the only thing that contributes to my mortgage so that I can make these episodes. But the best way to help out is to tell one or more of your friends and colleagues about the podcast and recommend that they give it a listen. I’m always open to conversations and feedback. You can comment on the post for this podcast or send me an email at grahamlee@acm.org—that’s G-R-A-H-A-M-L-E-E at A-C-M dot O-R-G.

With that pitch and that explanation for the radio silence over the last few years out of the way, let’s get into the topic for episode 55, which is Ed Yourdon’s book, Death March, all about people and project management.

Ed Yourdon was a software consultant and a prolific author during the end of the 20th and the beginning of the 21st centuries. In fact, he was one of the people who was most vocal in reporting the issues of Y2K and of warning people of the risks associated with not updating software, which led to his reputation taking a bit of a hit when the year 2000 came and went without any big catastrophes. But of course, the reason that things went so smoothly is that there was a massive, massive effort to update all of this software, and that Ed Yourdon’s warnings were one of the reasons that people took this idea so seriously. This is, unfortunately, a recurring problem in software: that if you fix a problem before it becomes a disaster, people assume that you haven’t done anything.

However, the first edition of the Death March book was in 1995 and the second was in 2004, so he was able to keep some form of professional name and to carry on publishing beyond Y2K. So the first question that we have to ask about a Death March project is how we define a project to be in a Death March. And Yourdon’s definition is that any of the project parameters in this project exceed the norm by at least 50%. So for example, the schedule is less than half of that arrived at by rational estimates; the headcount is less than half the usual number for such a project; or the budget or associated resources for the project have been cut in half.
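
That definition is mechanical enough to write down. Here's a rough sketch in Python, under some assumptions of my own: the class, the field names, and the example figures are all hypothetical, and I've read "exceeds the norm by at least 50%" as "you've been granted no more than half of what a rational estimate says you need", which matches Yourdon's schedule, headcount, and budget examples.

```python
from dataclasses import dataclass


@dataclass
class ProjectParameter:
    """One planning parameter. The names and fields are hypothetical."""
    name: str
    rational: float  # what a reasonable estimate says is needed
    granted: float   # what the project has actually been given


def is_death_march(parameters, threshold=0.5):
    """True if any parameter was granted no more than `threshold` of the rational amount."""
    return any(p.granted <= threshold * p.rational for p in parameters)


# Made-up figures: only the schedule has been squeezed, but that is enough.
project = [
    ProjectParameter("schedule_months", rational=12, granted=6),
    ProjectParameter("headcount", rational=8, granted=8),
    ProjectParameter("budget", rational=500_000, granted=450_000),
]
print(is_death_march(project))
```

Note the `any`: it only takes one parameter being that far out for the whole project to qualify, however reasonable the others look.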

Now it may seem that in our modern era of agile projects and sprints, this is a bit of an outdated idea, so why should I pick this book and this topic of Death March projects? Unfortunately, it’s because I’ve seen a lot of Death March projects in recent years, including on projects that are notionally run according to agile principles, because the fundamental drivers of a Death March are not technological—they are political in nature. One company that I saw still had Death March projects because while they had switched to monthly sprints, they still had a project scope defined by annual conference attendance and the ability to release a new version of their product at the conference every year. Which meant that they had a feature list that they promised at one conference and aimed to deliver at the next conference without taking a reasonable approach to estimation, and so without guaranteeing that the project was rationally able to fit within that 12-month gap.

Other projects I’ve seen have been hobbled by technical debt practices, and so the ability to deliver over time gets reduced as the complexity of working in the code gets greater. With the result that what would previously have taken one iteration—sprint, whatever you want to call it—to deliver, starts to take longer, and as soon as it takes two sprints, you have doubled the rational estimate for delivering the feature. If you try to do it in one sprint, you’re on a Death March project. And so unfortunately, I do still see Death March projects even where each of the death marches is allegedly a two-week sprint or a one-month iteration.

So Ed Yourdon draws, as any valid consultant does in order to earn their money, a quadrant diagram to categorize the four types of Death March project. If you don’t see a quadrant diagram in a book by a consultant, then perhaps the editor decided they needed to save a page, but it was definitely there in the draft. And his quadrants are on the one axis whether the chance of success is lower or higher; he doesn’t say low or high because by definition a Death March project has a low chance of success. One of the project parameters is wrong by a factor of 50% at least.

And on the other axis—and this might be surprising—but he has whether there is a low or a high level of happiness. This goes back to the idea that a Death March project is political in nature; people are participating in it for various reasons, not least of which being the perception of not having an alternative. If the market—as it was in 2003 just after the dot-com crash and 9/11—is in a poor state, as it was also immediately after the global financial crisis and immediately after COVID-19—in fact, I would go as far as to say that we are still in the post-COVID-19 slump—then many of the employees, whether they are managers, project managers, programmers, testers, operations staff, whoever they are, may feel like they have no choice but to continue in their current job.

Of course, upturns in a market can lead to Death March projects as well, because you might plan and scope out a project with a team of people and then the higher-performance people on the team will go and get higher-paid jobs somewhere else, and you’re left with your existing schedule, your existing commitments, and fewer staff. So we see Death Marches in times of boom and times of bust.

But other reasons for people to participate include heroism: if you’re on one of those high-happiness, relatively high chance of success projects, that’s a kind of Mission Impossible—there may be great rewards, or at least great recognition, for completing the project no matter how unlikely that seems. People may be naive and not realize that the project they’re signing up to is a Death March. Or there might be career progression or resume-padding opportunities. Maybe this project is a chance to implement AI-augmented blockchain in the company and it’s the only such project that’s ever going to be initiated, and so you participate whatever the likelihood of success just so that you can have those technologies on your CV.

Now with the core properties of the Death March project typically being political rather than technical, we might find that market constraints or miscommunication with customers lead to aggressive deadlines or misunderstood requirements. And a lot of Yourdon’s recommendations for dealing with Death March projects are political in nature. They largely involve the project manager trying to save the project both from its own staff, who may be willing to try many things or give up on certain pieces of work, and from the company’s management. To illustrate: I remember back in my first programming job I was on a project that became a Death March, effectively because they asked the wrong people to estimate it.

Now on the one hand, this project was run as a typical waterfall project, and this would have been in approximately 2007. So we already knew all of the problems with waterfall, but the engineering management at this company were insistent on the kind of phased approach to running the project with managerial review at phase-exit gates. And that meant that the first thing that the team had to do was estimate how long it would take them to complete the project. Well, the team being me, who was in my first programming job; another programmer who was in their first job full stop out of university; and a third programmer who was an experienced member of the company, having worked there for five years, but who had never worked on the technology that this project was integrating with. So we were really the wrong people to estimate this project.

We were very naive, very optimistic, and came up with a completely unworkable project schedule. That project had many of the features that Yourdon describes in a Death March, including people suggesting that we give up on basic accessibility or usability requirements, or even on quality assurance tasks so that we finished something at some time rather than delivering a good product at the time that it was ready. And indeed a lot of Yourdon’s recommendations are either trying to save the project from its own team members who engage in this kind of behavior, or trying to rescue the project from the company management who are going to take a bit more of a keen interest because this project seems to be going off the rails, particularly if the project is going to be one of the high-visibility projects for the company or creates a new product that their customers are relying on.

So some of these recommendations do kind of seem a bit dated now, like they’re situated in the context of what you could get away with in employer-employee relations in 2003, but on the other hand, I’ve seen some of them relatively recently. Overtime, better office conditions, evening pizza orders: those are all things that I see Silicon Valley companies doing, and even doing preemptively to get project members into the Death March mindset. Now I used to work at one of the large Silicon Valley companies that’s most famous for its social networking product, and there the office had three free cooked meals a day in the refectory, drinks and snacks available for free at any time of day, and a workplace social program where you got financial support for a social event if a group of employees met at the office and left for this event in the evening. I think the particular criterion was after 7:00 PM.

That obviously encourages people to still be in the office after 7:00 PM, as does the free cooked dinner. And so therefore you build a Death March mentality where people give you overtime for free, and then you don’t need to be rational in your estimation processes because you can always assume that free overtime is available. Another of his suggestions from the kind of office arrangement perspective is to take the team out somewhere else and have them work in like a skunkworks facility, like a warehouse across town from the office. This is again related to the idea that you want to kind of take them away from the regular management oversight so they can actually focus on getting the work done, but also kind of embed them in this high-urgency environment where everyone understands what the mission is and how important it is to get it done. And the idea of war-rooming is still prevalent amongst some of these larger software companies.

But some of his other recommendations represent a partial acceptance of what would have been the radical but broadening-in-adoption new idea at the time of Agile software development. Don’t forget that even though the Manifesto was published in 2001, the people who were talking about it had been talking about it for a good few years beforehand, and that these were software methodologists who were talking with regular software companies all the time. So Yourdon would have been very aware of their work and of their recommendations and of the likelihood of success or otherwise of following these recommendations.

So there is actually a section in his chapter on triage about adopting XP. And the reason for that is that his triage chapter is about saying, well let’s accept that this project isn’t going to go successfully if we do it the way that we’re going to do it. So let’s ask the customer what the most important things are and give them those first, and work with the customer frequently to reprioritize the requirements, to get their feedback and to update what we’re doing based on what they need. Customer collaboration over contract negotiation and valuing the continuous delivery of working software to the customer.

Focus the process on meaningful contributions. Indeed, later chapters even discuss continuous delivery: there’s a section on having a daily build. Now the Death March project I talked about before, from earlier in my career, could not have had a daily build because the build was very strongly handheld. We used Perforce as our version control system, and the build definition was a change set that pointed at a file that listed components and the change set of those components to check out. And because there was an air-gapped network, someone would take that build specification and check out the requested source.

Because we were building for a Unix product and the build team was using Windows, frequently we would get problems where binary files had Windows line endings: a new line character in the binary file had been replaced by a carriage return and a new line when it was checked out, and so the build would fail. So we had a build failure rate of at least 50%. But nonetheless, having checked out the source, they would then burn it to a CD and take that over to the build network, run the CD through an antivirus program in the air-gapped build room, and then copy that source onto the build computer to run the build. So a daily build would have literally taken an employee to run.
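
If you haven't been bitten by this, here's a small Python sketch of why line-ending translation breaks binary files. It's a simulation of the effect I described, not a reproduction of what Perforce does internally; the function name and the byte values are made up for illustration.

```python
def text_mode_checkout(data: bytes) -> bytes:
    """Simulate a checkout that treats a file as text on Windows: LF becomes CRLF."""
    return data.replace(b"\n", b"\r\n")


# A made-up binary payload that happens to contain the byte 0x0A (newline) twice.
original = bytes([0x7F, 0x45, 0x0A, 0x00, 0x10, 0x0A, 0xFF])
corrupted = text_mode_checkout(original)

# Each 0x0A has gained a 0x0D in front of it, so the file is longer
# and every byte after the first newline sits at a different offset.
print(len(original), len(corrupted))
print(original.hex(" "))
print(corrupted.hex(" "))
```

A text file survives this because the extra carriage returns are just whitespace, but anything that reads a binary file by offset or checksum is now looking at different data.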

These days you can build your software multiple times a day, and so we’re used to continuous delivery where the daily build might even go into production, or we have feature flags so the code changes are getting integrated every day and going into the build every day, even if the new features aren’t necessarily available for the customer to use. But daily builds did exist in 2003: Microsoft were doing this for Windows. I don’t know how prevalent it was, but this was continuous delivery. The point is just to make sure that the software you’re working on is available for the customer so that they can adapt to it as quickly as possible, and so that the rest of the project team can integrate and adapt to the software that you’ve built as soon as they can.

He also talked about the risks of assuming that new processes or new tools could save the day though, and so contextually we understand why his recommendations for things like XP were guarded. He’s talking about the idea that there are people who just believe that some new process or new tool is going to be a silver bullet and that if only you would adopt that, you will absolutely turn your fortunes around. The risk with that, particularly on a Death March project—and he explains this in the book—is that everybody has to learn and master this new tool or this new process to be effective, and in the short term, that slows the project down just as adding new people would.

According to Fred Brooks, there’s a load of communication that has to be done, a load of learning, a load of practice. And so let’s imagine that you’re working on a Death March project in Ruby on Rails and someone says, well if only we used Elixir and Phoenix, we’d get this project done much quicker. Is that much quicker after you have learned how to be productive with Elixir and Phoenix, or is that much quicker assuming that you already know how to use them? Or is that just wishful thinking and the person wants to put those technologies on their resume?

Now an interesting recommendation at the end of the book that I don’t think I’ve ever seen put into practice is what he calls “wargaming”, which is the idea of preparing people for projects that go wrong or that require adaptation or that become death marches by letting them participate in simulated projects and simulating particular events—for example, half of the staff leaving or the customer deciding that they need the software much earlier than you had previously assumed or the estimates being incredibly wrong.

I don’t know that I have ever seen a software company simulate a project at all, or even insert into a real project a simulated catastrophe or failure for resilience testing. I’ve seen certainly technological simulations like fake data center outages or Red Teaming and what are basically simulated cyber attacks, but I don’t know that I’ve ever seen a software company simulate a software project or inject a simulated event into a real software project just to see whether people are ready and whether the organization is ready to adapt to it. That is an interesting idea that still belongs in the future despite this second edition of this book being written in 2004.

So sadly, I think that death marches are still relevant and that Ed Yourdon’s book still has something to teach us, particularly on the kind of agile projects that are called “Dark Scrum” by Ron Jeffries and that are all too common. I’d love to hear what you think; you can comment on the post for this podcast, you can send me an email (grahamlee@acm.org), or if you join the Patreon, you can join the community and get involved in the chat over there. So thank you for listening; I don’t entirely know when the next episode is going to come out, but I’m going to aim for a monthly cadence, so I hope to talk to you all again very soon.

Art or tool?

The Internet spaces I tend to inhabit are more polarised now than at many other recent times, with little explication of the worldviews that lead to different premises for discussion, which in turn lead to the polarisation and disagreement. Taking a step back to analyse the discussions, I think we see a debate that’s been raging for longer than I’ve been alive and that has no chance of reconciliation.

Is the program code that someone creates an artistic expression, or a tool that gets the job done? The useful answer is “both”, the pragmatic answer is “it depends on the context”, but the belief is often one or the other, or a large amount of one and a small amount of the other, and from there stem the arguments.

Code as art

When someone creates a program, they combine their technical skill with their humanitarian understanding and their aesthetic sensibilities to make something that has meaning to society and affects people in some way. They craft a design that expresses their current understanding of a situation, including their understanding of how that situation might evolve into future situations. Software serves two purposes: the use to which it’s put, and a demonstration of the skill of its creator.

Code as tool

When someone creates a program, they combine their technical skill with their humanitarian understanding and their aesthetic sensibilities to make something that has value to society and that people can apply in some way. They craft a design that solves a problem as they currently understand it, including their understanding of how that problem might evolve into future problems. Software serves two purposes: the use to which it’s put, and the adaptability toward future applications.

The half-century of discord

If you start from either of those places, people who start from the other place look like they don’t understand what software truly is.

To the code-artist, the act of programming is a creative effort that’s deeply personal and extractive, as there’s a part of themselves that goes into every interface, every abstraction, every carefully-considered parameter. “Technical debt” is a swear word because it means deliberately making unaesthetic choices. “Legacy code” is a swear word because some other, inferior artist created that, and the code-artist can do a better job.

Efficiency tools are swear words because they remove the creativity and expressivity from the craft, automating choices that by rights should be made by ingenious humans or—and this may be worse—allowing mass-production of art by duplicating a single work into multiple contexts, when the correct way is to hand-craft the bespoke design that’s most appropriate for each context. Of course, which specific tools are verboten depends on which tools are new at the time of the debate. Douglas Adams had it mostly right in The Salmon of Doubt:

  1. Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works.
  2. Anything that’s invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it.
  3. Anything invented after you’re thirty-five is against the natural order of things.

So code-artists of a certain age in the 1960s might have thought that compilers and linkers are preternatural, when a true artist hand-selects the accumulators to use for each variable to make their code more efficient, and assembles functions into libraries to optimally load them in when they’re needed. People of a certain age in the 1980s might have thought that copying BASIC listings from Sinclair User is preternatural, because that’s just uncreative plagiarism and the copyist can never truly understand what’s going on. Take this latter argument out of its time and apply it again to getting code from comp.lang.c on Usenet, from answers on Stack Overflow, or from generative AI—and have about as much success with it as earlier arguments against the printing press or portrait photography.

To the code-toolsmith, the act of programming is a production process that’s performed to achieve some aim, so the principle is to move from “working towards the aim” to “having achieved the aim” as quickly as possible so that you can achieve some other aim. “Technical debt” is an acceptable decision that optimises for being done. “Legacy code” is delightful because it’s already achieving its aims.

Efficiency tools are wonderful because they remove uncertainty, decision-making, and individual effort from the task, by enabling mass adoption of known solutions to general problems. Why choose which accumulator to use for each variable when you can automate that, and think about the problem you’re solving? Why commission an oil painter, when you can press a button and have a visual record of the person stood in front of you? Why write potentially incorrect code when you can copy it out of Sinclair User or from an answer in Stack Overflow?

Code as both

[In drafting this post I adopted the portmanteau “tort” here—part tool, part art—which also works in suggesting that code can be a harm a person inflicts on another person.]

In any given situation, we see that code has both artistic and pragmatic qualities. Even in the extreme case of “program as art”, such as a demo scene demo, the code needs to work in that it needs to perform the functions that support drawing the demo’s graphics and playing its music correctly. Going to the further artistic extreme of example code in a tutorial or article—where the code is an aesthetic component in a creative work that has the sole goal of communicating a message from its creator to its viewers—it still needs to work in that the viewer needs to understand the message conveyed in the code and how to apply that meaning to their own situations: they don’t merely appreciate the code, they learn from it. Computers and Typesetting, Volume B by Donald Knuth isn’t just a book people can read, it’s a working digital typesetter, and it would fail as a book if it didn’t work.

In the other extreme, the archetypal “program as tool”, such as a line-of-business application written by an employee programmer, the code needs to convey in that it needs to demonstrate what the programmer’s understanding of the line of business is, and how they reified that understanding in software, so that they and others can come back to it and modify it when they discover that the understanding was wrong, or that the line of business has changed. They don’t merely use the code, they appreciate it. TeX by Donald Knuth isn’t just a working digital typesetter, it’s a book people can read, and it would fail as a digital typesetter if people couldn’t read it.

A synthetic understanding

We therefore need to create a common paradigm for understanding software quality that includes both the artistic and the pragmatic; both the external qualities of what it does, and the internal qualities of how it does it. When we don’t have that, we have people talking past each other when it comes to making the software: new tools are either diabolical interference in the creative art, or the best thing ever. But more than that: when we don’t have a synthetic basis for understanding software, we can’t work together to achieve software with either quality attribute. We split into “the business” who just want the problems solved and don’t see the value in the expressive nature of software, and “the technical people” who understand the craft of making and don’t see the benefit in doing a lesser job, faster. In theory, this is the point of the “engineering” idea in “software engineering”: to understand the science and art of software and apply both to improve systems.

This isn’t a new idea. Just as the arguments over copying a BASIC listing from a magazine have been raging for decades, so the “intersection of technology and the liberal arts” has been understood and re-understood, told and re-told, for just as long. It’s no coincidence that Computers and Typesetting, Volume B and TeX are actually the same work. I tell this story again today because it’s relevant today, to avoid creating two different camps of software creators who don’t understand each other.

On working machines

In part one, on thinking machines, I explored two facets of the philosophy of artificial intelligence: “intelligence”, and consciousness. That left an important topic to consider for this post: the impact of artificial intelligence on work.

No technology has ever “stolen a job”. Not once. Technology automates and enables tasks. Some of these tasks were never part of “the market”, and other tasks were. If your job is defined by performing the same task over and over, be it knocking the base of a saggar, driving a vehicle, or typing JavaScript into somebody else’s computer, and that task can be automated, then there’s a chance that your employers won’t need you to do that task any more. But whether they keep you, redeploy you, retrain you to do something else, or let you go, is their choice: it’s the employers that stole your job.

Let’s imagine a hypothetical scenario where a company has ten JavaScript shovelers, and each outputs an average of one bushel of JS per day. Now some technological intervention—could be AI, sure, but it could be a syntax-highlighting text editor, TypeScript, or some other tool—makes each JS shoveler ten times more efficient (aside: it doesn’t). The employer’s choices (note: not the technology’s choices, not the inventor’s choices; the employer’s choices) might be represented in a diagram like this:

[Image: the “expanding brain” meme, where the four options are:]

  1. Fire nine employees.
  2. Redeploy nine employees.
  3. Keep all ten and get 10x work.
  4. Hire even more employees.

That last option is courtesy of Jevons’ paradox, which says that when a resource becomes more efficient to use, demand goes up. If a new technology makes knowledge-deployment more efficient, then demand for knowledge work increases, it doesn’t decrease. The employers who don’t increase their knowledge-working capacity when knowledge work becomes more efficient are, to paraphrase William Stanley Jevons, idiots.
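
To put rough numbers on the employer's four choices, here is a minimal sketch of the hypothetical scenario. The ten shovelers, the one bushel per day, and the tenfold speedup come from the scenario above; the `Option` class, the `employer_options` function, and the figure of five extra hires are my own assumptions, made up purely for illustration.

```python
from dataclasses import dataclass

STAFF = 10        # JS shovelers in the scenario
RATE = 1          # bushels per shoveler per day, before the tool
SPEEDUP = 10      # the claimed gain (which the text itself doubts)
EXTRA_HIRES = 5   # hypothetical: the text only says "even more"


@dataclass(frozen=True)
class Option:
    label: str
    js_staff: int   # people still shoveling JS
    employed: int   # people still on the payroll

    @property
    def bushels_per_day(self) -> int:
        # Everyone still shoveling works at the new, faster rate.
        return self.js_staff * RATE * SPEEDUP


def employer_options() -> list[Option]:
    """The four choices from the meme, in order."""
    return [
        Option("Fire nine employees", js_staff=STAFF - 9, employed=STAFF - 9),
        Option("Redeploy nine employees", js_staff=STAFF - 9, employed=STAFF),
        Option("Keep all ten", js_staff=STAFF, employed=STAFF),
        Option("Hire even more", js_staff=STAFF + EXTRA_HIRES,
               employed=STAFF + EXTRA_HIRES),
    ]


if __name__ == "__main__":
    print(f"Before: {STAFF} employed, {STAFF * RATE} bushels/day")
    for option in employer_options():
        print(f"{option.label}: {option.employed} employed, "
              f"{option.bushels_per_day} bushels/day")
```

The technology is the same constant in every row; only the employer's decision changes how many people keep their jobs and how much JS gets shoveled.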

The “AI is stealing our jobs” meme comes from a lack of understanding that software engineers are workers, not employers, and that the economic principles of employment and work apply to them the same as to other workers. Bringing in another paradox of economics, Robert Solow noted that “you can see the computer age everywhere but in the productivity statistics”. It took a long time for computers to start automating knowledge work: first record tabulation, then payroll and inventory management, then the typing pool and typesetting, then so on and so on through technical drafting and taxi dispatching.

Through the slow burn of the computer age, software engineers got comfortable with being the people who automate other people’s work. Throughout that period, demand for (the task of) computer programming rose. Now, two (mostly unrelated) things have happened: the first is that a new technology has promised to automate computer programming, placing us at the start of the next Solow age; the second is that headcount among people who repetitively do computer-programming tasks has been decreasing.

That means that the computer people are on the receiving end of capitalism for the first time since the dot-com crash, and they don’t like it. We automate other people’s work, it’s unfair to automate our work! This is another view through the same economic lens that gives us enshittification: wait, we worked hard to turn this manual task into an automated platform, you owners can’t seriously expect to capture additional value from this platform?! We’re supposed to continue to benefit from the lower costs we enabled for you!?

Those of us who do computering for a salary, wage, or day rate have always been on the receiving end of the exploitative nature of the wage relationship; unfortunately, the relatively high salaries and enjoyable tasks stopped many of us from engaging with that seriously. We’re now in a position of huge uncertainty for many employees in the field, and the short-term solution to that is the same solution it’s always been, one that’s demonstrated to work in many European economies: collective bargaining on behalf of the sector.

But becoming conscious of the benefits of increasing bargaining power through group organisation is insufficient to end the fundamentally exploitative relationship, and to stop the next round of automation, layoffs, and changes to employment conditions. So is any idea that employment will automatically disappear completely, or ebb away, in some Keynesian decline to a 15-hour working week. As we automate some tasks, we introduce new tasks, and new jobs that exploit people to get those tasks done; whether or not you think of them as bullshit jobs.
