Concurrent objects and SCOOP

Representing concurrency in an object-oriented system has been a long-standing problem. Encapsulating the concurrency primitives via objects and methods is easy enough, but doesn’t get us anywhere. We still end up composing our programs out of threads and mutexes and semaphores, which is still hard.

Prior Art

It’s worth skimming the things that I’ve written about here: I’ve put quite a lot of time into this concurrency problem and have written different models as my understanding changed.

Clearly, this is a broad problem. It’s one I’ve spent a lot of time on, and have a few satisfactory answers to if no definitive answer. I’m not the only one. Over at codeotaku they’ve concluded that Fibers are the right solution, though apparently on the basis of performance.

HPC programs are often based on concurrent execution through message passing, though common patterns keep it to a minimum: the batch processor starts all of the processes, each process finds its node number, node 0 divvies up the work to all of the nodes (a message send), then they each run through their part of the work on their own. Eventually they get their answer and send a message back to node 0, and when it has gathered all of the results everything is done. So really, the HPC people solve this problem by avoiding it.

You’re braining it wrong, Graham

Many of these designs try to solve for concurrency in quite a general way. Serial execution is a special case, where you only have one object or you don’t submit commands to the bus. The problem with this design approach, as described by Bertrand Meyer in his webinar on concurrent Object-Oriented Programming, is that serial execution is the only version we really understand. So designing for the general, and hard-to-understand, case means that generally we won’t understand what’s going on.

The reason he says this is so is that we’re better at understanding static relationships between things than the dynamic evolution of a system. As soon as you have mutexes and condition locks and so on, you are forced to understand the dynamic behaviour of the system (is this lock available? Is this condition met?). Worse: you have to understand it holistically (can anything that’s going on at the moment have changed this value?).

Enter SCOOP

Meyer’s proposal is that as serial programs are much easier to understand (solved, one might say, if one has read Dijkstra’s A Discipline of Programming) we should make our model as close to serial programming as possible. Anything that adds concurrency should be unsurprising, and not violate any expectations we had if we tried to understand our program as a sequential process.

He introduced SCOOP (Simple Concurrent Object-Oriented Programming) developed by the Concurrency Made Easy group at ETH Zürich and part of Eiffel. Some of the design decisions he presented:

  • a processor is an abstraction representing sequential execution
  • there is a many-to-one mapping of objects to processors (this means that an object’s execution is always serial, and that all objects are effectively mutexes)
  • where an object messages another on a different processor, commands will be asynchronous (but executed in order) and queries will be synchronous
  • processors are created dynamically and opportunistically (i.e. whenever you create an object in a “separate” and as-yet unpopulated domain)

An implementation of this concurrency model in Objective-C is really easy. A proxy object representing the domain separation intercepts messages, determines whether they are commands or queries and arranges for them to be run on the processor. It inspects the objects returned from methods, introducing proxies to tie them to the relevant processor. In this implementation a “processor” is a serial operation queue, but it could equivalently be a dedicated thread, a thread pulled from a pool, a dedicated CPU, or anything else that can run one thing at a time.

This implementation does not yield all of the stated benefits of SCOOP. Two in particular:

  1. The interaction of SCOOP with the Eiffel type system is such that while a local (to this processor) object can be referred to through a “separate” variable (i.e. one that potentially could be on a different processor), it is an error to try to use a “separate” object directly as if it were local. I do not see a way, in either Swift’s type system or ObjC’s, to maintain that property. It looks like this proposal, were it to cover generic or associated types, would address that deficiency.
  2. SCOOP turns Eiffel’s correctness preconditions into wait conditions. A serial program will fail if it tries to send a message without satisfying preconditions. When the message is sent to a “separate” object, this instead turns into a requirement to wait for the precondition to be true before execution.

Conclusions

Meyer is right: concurrent programming is difficult, because we are bad at considering all of the different combinations of states that a concurrent system can be in. A concurrent design can best be understood if it is constrained to be mostly like a serial one, and not require lots of scary non-local comprehension to understand the program’s behaviour. SCOOP is a really nice tool for realising such designs.

This is something I can help your team with! As you can see, I’ve spent actual years understanding and thinking about software concurrency, and while I’m not arrogant enough to claim I have solved it I can certainly provide a fresh perspective to your team’s architects and developers. Book an office hours appointment and let’s take a (free!) hour to look at your concurrency problems.

About Graham

I make it faster and easier for you to create high-quality code.
This entry was posted in code-level, OOP, performance and tagged . Bookmark the permalink.

One Response to Concurrent objects and SCOOP

  1. Pingback: On or Between | Structure and Interpretation of Computer Programmers

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.