Yesterday, we observed that the goal of considering the
go to statement harmful was so that a programmer could write a correct program and have done with it. We noticed that this is never how computering works: many programs are not even instantaneously correct because they represent an understanding of a domain captured at an earlier time, before the context was altered by both external changes and the introduction of the software itself.
Today, let’s look at the benefits of removing the
go to statement. Dijkstra again:
My second remark is that our intellectual powers are rather geared to master static relations and that our powers to visualize processes evolving in time are relatively poorly developed. For that reason we should do (as wise programmers aware of our limitations) our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.
This makes sense! Our source code is a static model of our software system, which is itself (we hope) a model of a problem that somebody has along with tools to help with solving that problem. But our software system is a dynamic actor that absorbs, transforms, and emits data, reacting to and generating events in communication with other human and non-human actors. We need to ensure that the dynamic behaviour is evident in the static model, so that we can “reason about” the development of the system. How does Dijkstra’s removal of
go to help us achieve that?
Let us now consider how we can characterize the progress of a process. (You may think about this question in a very concrete manner: suppose that a process, considered as a time succession of actions, is stopped after an arbitrary action, what data do we have to fix in order that we can redo the process until the very same point?) If the program text is a pure concatenation of, say, assignment statements (for the purpose of this discussion regarded as the descriptions of single actions) it is sufficient to point in the program text to a point between two successive action descriptions. (In the absence of
go tostatements I can permit myself the syntactic ambiguity in the last three words of the previous sentence: if we parse them as “successive (action descriptions)” we mean successive in text space; if we parse as “(successive action) descriptions” we mean successive in time.) Let us call such a pointer to a suitable place in the text a “textual index.”
Why do we need such independent coordinates? The reason is – and this seems to be inherent to sequential processes – that we can interpret the value of a variable only with respect to the progress of the process. If we wish to count the number, n say, of people in an initially empty room, we can achieve this by increasing n by one whenever we see someone entering the room. In the in-between moment that we have observed someone entering the room but have not yet performed the subsequent increase of n, its value equals the number of people in the room minus one!
So the value of
go to-less programming is that I can start at the entry point, and track every change in the system between there and the point of interest. But I only need to do that once (well, once for each different condition, loop, procedure code path, etc.) and then I can write down “at this index, these things have happened”. Conversely, I can start with the known state at a known point, and run the program counter backwards, undoing the changes I observe (obviously encountering problems if any of those are assignments). With
go to statements present, I cannot know the history of how the program counter came to be here, so I can’t make confident statements about the dynamic evolution of the system.
This isn’t the only way to ensure that we understand a software system’s dynamic behaviour, which is lucky because it’s not a particularly good one. In today’s parlance, it “doesn’t scale”. Imagine being given the bug report “I clicked in the timeline on a YouTube video during an advert and the comments disappeared”, and trying to build a view of the stateful evolution of the entire YouTube system (or even just the browser, if you like, and if it turns out you don’t need the rest of the information) between
main() and the location of the program counter where the bug emerged. Even if we pretend that a browser isn’t multithreaded, you would not have a good time.
Another approach is to encapsulate parts of the program, so that the amount we need to comprehend in one go is smaller. When you do that, you don’t need to worry about where the global program counter is or how it got there. Donald Knuth demonstrated this in Structured Programming with
go to Statements, and went on to say that removing all instances of
go to is solving the wrong problem:
One thing we haven’t spelled out clearly, however, is what makes some go to’s bad and others acceptable. The reason is that we’ve really been directing our attention to the wrong issue, to the objective question of go to elimination instead of the important subjective question of program structure.
In the words of John Brown [here, Knuth cites an unpublished note], “The act of focusing our mightiest intellectual resources on the elusive goal of go to-less programs has helped us get our minds off all those really tough and possibly unresolvable problems and issues with which today’s professional programmer would otherwise have to grapple.”
Much has been written on structured programming, procedural programming, object-oriented programming, and functional programming, which all have the same goal: separate a program into “a thing which uses this little bit of software, according to its contract” and “the thing that you would like to believe implements this contract”. OOP and FP additionally make explicit the isolation of state changes, so that you don’t need to know the whole value of the computer’s memory to assert conformance to the contract. Instead, you just need to know the value of the little bit of memory in the fake standalone computer that runs this one object, or this one function (or indeed model the behaviour of the object or function without reference to computer details like memory).
Use or otherwise of
go to statements in a thoughtfully-designed (I admit that statement opens a can of worms) is orthogonal to understanding the behaviour of the program. Let me type part of an
Array class based on a linked list directly into my blog editor:
Public Class Array Of ElementType Private entries As LinkedList(Of ElementType) Public Function Count() As Integer Dim list As LinkedList(Of ElementType) = entries Count = 0 nonempty: If list.IsEmpty() Then GoTo empty Count = Count + 1 list = list.Next() GoTo nonempty empty: Exit Function End Function Public Function At(index As Integer) As ElementType Dim cursor As Integer = index Dim list As LinkedList(Of ElementType) = entries next: If list.IsEmpty() Then Err.Raise("Index Out of Bounds Error") If cursor = 0 Then Return list.Element() list = list.Next() cursor = cursor - 1 GoTo next End Function End Class
While this code sample uses
go to statements, I would suggest it’s possible to explore the assertion “objects of class
Array satisfy the contract for an array” without too much difficulty. As such, the behaviour of the program anywhere that uses arrays is simplified by the assumption “
Array behaves according to the contract”, and the behaviour anywhere else is simplified by ignoring the
Array code entirely.
Whatever harm the
go to statement caused, it was not as much harm as trying to define a “correct” program by understanding all of the ways in which the program counter arrived at the current instruction.