On the value of old principles

People using AI coding assistants typically wrestle with three problems (assuming they know what they’re trying to get the model to do, and that that’s the correct thing to try to get it to do):

  • Prompt. How to word the instructions so that they yield the desired outcome, especially given the butterfly effect whereby small changes in wording can lead to large changes in the result.
  • Context. The models deal in a tokenised representation of information, and have the capacity to deal with only a finite list of tokens at once.
  • Attention. The more things a model is instructed to attend to, the less each thing contributes to the generated output stream. This tends to follow a U-shaped distribution, with the beginning and end of the input stream mattering more than the middle.

(It’s important to bear in mind during this discussion that all of the above, and most of the below, is a huge mess of analogies, mostly introduced by AI researchers to make their research sound like intelligence, and by tool vendors to make their models sound like they do things. A prompt isn’t really “instructions”, models don’t really “pay attention” to anything, and you can’t get a model to “do” anything other than generate tokens.)

Considering particularly the context and attention problems, a large part of the challenge people face is dividing the large amount of information available about their problem into the small amount that is relevant to the immediate task, such that the model generates a useful response that fails neither because relevant information was left out nor because too much irrelevant information was left in.

Well, it turns out human software developers suffer from three analogous problems too: failing to interpret guidance correctly; not being able to keep lots of details in working memory at once; and not applying all of the different rules that are relevant at one time. As such, software design is full of principles that are designed to limit the spread of information, and that provide value whether applied for the benefit of a human developer or a code-generation model.

Almost the entire point of the main software-design paradigms is information hiding or encapsulation. If you’re working on one module, or one object, or one function, you should only need to know the internal details of that module, object or function. You should only need to know the external interface of collaborating modules, objects, or functions.
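To sketch what that means in practice (the BankAccount example and its names are invented here for illustration):

```python
class BankAccount:
    """Encapsulation: collaborators see the interface, never the representation."""

    def __init__(self):
        self._ledger = []  # internal detail: how balances are stored is hidden

    # External interface: the only things a collaborator needs to know.
    def deposit(self, amount):
        self._ledger.append(amount)

    def balance(self):
        return sum(self._ledger)
```

A caller works entirely through deposit() and balance(); the ledger could become a database table or an event stream without any collaborating code needing to change.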

Consider the Law of Demeter, which says approximately “don’t talk to your collaborators’ collaborators”. That means your context never needs to grow past immediate collaborators.
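As a minimal sketch of the difference (the Car/Engine/Driver names are invented for illustration, not taken from any real system):

```python
class Engine:
    def start(self):
        self.running = True


class Car:
    def __init__(self):
        self._engine = Engine()  # internal collaborator, kept private

    # Demeter-friendly: expose an operation, not the collaborator itself.
    def start(self):
        self._engine.start()


class Driver:
    def drive(self, car):
        # car.start() talks only to the immediate collaborator.
        # car._engine.start() would reach through to a collaborator's
        # collaborator, and Driver would then depend on Car's internals.
        car.start()
```

Whoever works on Driver needs to know Car’s interface and nothing about Engine: the context stops at the immediate collaborator.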

Consider the Interface Segregation Principle, which says approximately “ask not what a type can do, ask what a type can do for you”. That means you never need to attend to all the unrelated facilities a type offers.
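For example (a hypothetical multifunction device, using Python’s structural Protocol typing to express the narrow interface):

```python
from typing import Protocol


class Printer(Protocol):
    """The narrow interface this client actually needs."""
    def print_document(self, doc: str) -> None: ...


class MultifunctionDevice:
    """A type offering many unrelated facilities."""
    def print_document(self, doc: str) -> None:
        self.last_printed = doc

    def scan(self) -> str:
        return "scanned page"

    def send_fax(self, doc: str) -> None:
        pass


def print_report(printer: Printer, report: str) -> None:
    # Declared against Printer, so this code never attends to
    # scanning, faxing, or whatever else the device can do.
    printer.print_document(report)
```

print_report can be read, tested, and changed while knowing only about printing, however many facilities the concrete device grows.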

Consider the Open-Closed Principle, which says approximately “when it’s done, it’s done”. That means you never need concern yourself with whether you need to change that other type.

Consider the Ports and Adapters architecture, which says approximately “you’re either looking at a domain object or a technology integration, never both”. That means you either need to know how your implementation technology works or you need to know how your business problem works, but you don’t need details of both at the same time.
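A compressed sketch of that separation (the order-placing names are invented for illustration):

```python
# Port: the domain's view of persistence, knowing nothing of technology.
class OrderRepository:
    def save(self, order):
        raise NotImplementedError


# Domain object: pure business rules, no technology details.
class PlaceOrder:
    def __init__(self, repository):
        self.repository = repository

    def execute(self, order):
        order["status"] = "placed"
        self.repository.save(order)
        return order


# Adapter: one technology integration, no business rules.
# A real system might back this with a database driver instead.
class InMemoryOrderRepository(OrderRepository):
    def __init__(self):
        self.saved = []

    def save(self, order):
        self.saved.append(order)
```

Working on PlaceOrder means thinking only about orders; working on an adapter means thinking only about the storage technology; never both at once.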

All of these principles help limit context and attention, which is beneficial when code-generating models have limited context and attention. Following the principles means that however large your system gets, it never gets “too big for the model” because the model doesn’t need to read the whole project.

Even were the models to scale to the point where a whole, ridiculously large software project fits in context, and even were they to pay attention to every scrap of information in that context, these principles would still help. Because they also help limit the context and attention that we humans need to spend, meaning we can still understand what’s going on.

And for the foreseeable, we still need to understand what’s going on.

About Graham

I make it faster and easier for you to create high-quality code.
This entry was posted in AI, design, OOP.