This post is a write-up of a talk I gave at Alt Tech Talks: London on the Objective-C runtime. Seriously though, you should’ve been there.
The Objective-C runtime?
That’s the name of the library of C functions that implement the nuts and bolts of Objective-C. Objects could just be represented as C structures, and methods could just be implemented as C functions. In fact they sort of are, but with some extra capabilities. These structures and functions are wrapped in this collection of runtime functions that allows Objective-C programs to create, inspect and modify classes, objects and methods on the fly.
It’s the Objective-C runtime library that works out which methods get executed, too. The [object doSomething] syntax does not directly resolve a method and call it. Instead, a message is sent to the object (which is called the receiver in this context). The runtime library gives objects the opportunity to look at the message and decide how to respond to it. Alan Kay repeatedly said that message-passing is the important part of Smalltalk (from which Objective-C derives), not objects:
I’m sorry that I long ago coined the term “objects” for this topic because it gets many people to focus on the lesser idea.
The big idea is “messaging” – that is what the kernal[sic] of Smalltalk/Squeak is all about (and it’s something that was never quite completed in our Xerox PARC phase). The Japanese have a small word – ma – for “that which is in between” – perhaps the nearest English equivalent is “interstitial”. The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be.
Indeed, in one article describing the Smalltalk virtual machine, the programming technique is called the message-passing or messaging paradigm. “Object-Oriented” is used to describe the memory management system.
Throughout this talk and post I’m talking about the ObjC runtime as if there were only one, but there are many. They all support an object’s introspective and message-receiving capabilities, but they have different features and work in different ways (for example, Apple’s runtime sends messages in one step, but the GNU runtime looks up messages and then invokes the discovered function in two steps). All of the discussion below relates to Apple’s most modern runtime library (the one that has shipped with OS X since 10.5, and with iOS).
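To make the one-step versus two-step distinction concrete, here is a toy model in plain C. It is not code from either runtime: the Object, Class, Method, lookup and send_message names are all invented for illustration, and selectors are compared as strings where a real runtime would compare unique pointers and cache the result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types: none of these names are real runtime API. */
typedef struct Object Object;
typedef int (*Imp)(Object *self, const char *sel, int arg);

typedef struct {
    const char *sel;   /* selector name */
    Imp imp;           /* function that implements it */
} Method;

typedef struct Class {
    const struct Class *super;   /* NULL for the root class */
    const Method *methods;
    size_t method_count;
} Class;

struct Object {
    const Class *isa;   /* the receiver's class */
    int value;
};

/* Two-step style, step one: find the function for a selector by walking
   up the superclass chain. Returns NULL when nothing matches. */
static Imp lookup(const Class *cls, const char *sel) {
    for (; cls != NULL; cls = cls->super)
        for (size_t i = 0; i < cls->method_count; i++)
            if (strcmp(cls->methods[i].sel, sel) == 0)
                return cls->methods[i].imp;
    return NULL;
}

/* One-step style: lookup and call are fused behind a single entry point. */
static int send_message(Object *receiver, const char *sel, int arg) {
    Imp imp = lookup(receiver->isa, sel);
    assert(imp != NULL && "toy model: unknown selectors are not handled");
    return imp(receiver, sel, arg);
}

/* A sample method and class to send messages to. */
static int add_imp(Object *self, const char *sel, int arg) {
    (void)sel;
    return self->value + arg;
}

static const Method number_methods[] = { { "add:", add_imp } };
static const Class Number = { NULL, number_methods, 1 };
```

A caller in the two-step style holds on to the result of lookup() and calls it directly; a caller in the one-step style only ever sees send_message().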
In the talk, I decided to examine a few specific areas of the runtime library’s behaviour. I looked for things that I wanted to understand better, and came up with questions I wanted to answer as part of the talk.
Dynamic class creation
Can I implement Key-Value Observing?
While I was preparing the talk, a post called KVO considered harmful started to get a lot of coverage. That post raises a lot of valid criticisms of the Key-Value Observing API, but rather than throw away the Observer pattern I wanted to explore a new implementation.
The observed (pardon the pun) behaviour of KVO is to privately subclass the observed object’s class, so that it can customise the object’s behaviour to call the KVO callback. That’s done through a function called objc_duplicateClass. Unfortunately, the documentation tells us that we should not call this function ourselves.
It’s still possible to implement an Observer pattern that uses the same secret-subclass behaviour, by allocating and registering a “class pair”. What’s a class pair? Well, each class in Objective-C is really two classes: the class object defines the instance methods, and the “metaclass” defines the class methods. So each class is really a singleton instance of its metaclass.
The ObserverPattern implementation shows how this works. When you add an observer to an object, the receiver first works out whether it’s an instance of the observable class. If it needs to create that class, it does so: adding our own implementations of -dealloc to clean up after ourselves, and -class so that, like KVO observable objects, the generated class name doesn’t appear when you ask an observed object its type.
Having created the class, the code goes on to add a setter for the conventional Key-Value Coding selector name for the property: this setter grabs the old and new values of the property and invokes the callback which was supplied as a block object. Because we can, the block is dispatched asynchronously.
Notice that the -addObserverForKey:withBlock: method uses object_setClass() to replace the receiver’s class with the newly-constructed class. The main effect of this is to change the way messages are resolved onto methods, but you need to be careful that the original and replaced class have the same instance variable layout too. Instance variables are looked up via the runtime too, and changing the class could alter where the runtime thinks the bytes are for any given variable.
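The class-swapping trick can be modelled in a few lines of plain C. This is a sketch of the idea only, not the ObserverPattern code: Account, AccountClass, add_observer and the rest are hypothetical names, a single global stands in for the per-object observer storage, and the callback is invoked synchronously rather than dispatched asynchronously as in the talk's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Account Account;

/* Hypothetical "class": a name to report and one overridable setter. */
typedef struct {
    const char *reported_name;
    void (*set_balance)(Account *self, int new_balance);
} AccountClass;

/* Both classes share this layout, so swapping isa is safe here. */
struct Account {
    const AccountClass *isa;
    int balance;
};

typedef void (*Observer)(int old_value, int new_value);
static Observer observer = NULL;   /* toy stand-in for per-object storage */

static void plain_set_balance(Account *self, int new_balance) {
    self->balance = new_balance;
}

/* The secret subclass's setter: grab old and new values, call "super",
   then notify the observer. */
static void observed_set_balance(Account *self, int new_balance) {
    int old_balance = self->balance;
    plain_set_balance(self, new_balance);
    if (observer != NULL)
        observer(old_balance, new_balance);
}

static const AccountClass PlainAccount = { "Account", plain_set_balance };
/* The generated class still reports the original name, like -class above. */
static const AccountClass ObservedAccount = { "Account", observed_set_balance };

/* "Message send": always dispatch through whatever isa points at now. */
static void set_balance(Account *account, int new_balance) {
    account->isa->set_balance(account, new_balance);
}

static const char *class_name(const Account *account) {
    return account->isa->reported_name;
}

/* Swap the receiver's class pointer, as object_setClass() does. */
static void add_observer(Account *account, Observer callback) {
    assert(account != NULL && callback != NULL);
    observer = callback;
    account->isa = &ObservedAccount;
}

/* Sample observer that records what it was told. */
static int calls, last_old, last_new;
static void record_change(int old_value, int new_value) {
    calls++;
    last_old = old_value;
    last_new = new_value;
}
```

Nothing about the Account instance is copied or reallocated: only the pointer that decides how set_balance resolves is changed, which is why the two classes must agree on layout.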
We have a little extra hurdle to overcome in storing the collection of observer tokens, because there’s nowhere to put them. Adding an instance variable to the ObserverPattern[…] class would not work, as instances of that class are never actually allocated. The objects involved have the instance variables of their initial class, which won’t include space for the observers.
The Objective-C runtime provides for this situation by giving us associated objects. Any object can have what is, conceptually, a dictionary of other objects maintained by the runtime. Observed objects can store and retrieve their observer tokens via associated references, and no extra instance variables are needed.
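Conceptually, associated references behave like a side table keyed by the owning object's address. The following plain-C sketch models only that lookup; the Association type and the set_associated and get_associated functions are invented here, the table is a fixed-size array, and the real runtime additionally applies a memory-management policy to each value and clears an object's associations when it is deallocated.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ASSOCIATIONS 64   /* arbitrary limit for this toy */

/* One (object, key) -> value entry. All names are hypothetical. */
typedef struct {
    const void *object;   /* address of the owning object */
    const void *key;      /* caller-chosen unique address */
    void *value;
} Association;

static Association table[MAX_ASSOCIATIONS];
static size_t used;

/* Store a value against an object without touching the object's layout. */
static void set_associated(const void *object, const void *key, void *value) {
    for (size_t i = 0; i < used; i++) {
        if (table[i].object == object && table[i].key == key) {
            table[i].value = value;   /* replace an existing association */
            return;
        }
    }
    assert(used < MAX_ASSOCIATIONS && "toy table is full");
    table[used].object = object;
    table[used].key = key;
    table[used].value = value;
    used++;
}

/* Fetch the value stored for (object, key), or NULL if there is none. */
static void *get_associated(const void *object, const void *key) {
    for (size_t i = 0; i < used; i++)
        if (table[i].object == object && table[i].key == key)
            return table[i].value;
    return NULL;
}
```

Because the table lives outside the object, it does not matter that the observed object only has the instance variables of its initial class.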
A little problem in the ObserverPattern implementation will become clear if you run it enough times. The observation callbacks are sent asynchronously, and can be delivered out of sequence. That means the observer can’t actually tell what the final state of the observed key is, because the “new value” received in the callback might have already been replaced. I left this fun issue in to demonstrate that KVO’s synchronous implementation is a feature, not a bug.
What are those extra bytes for?
When you create an Objective-C object, the runtime lets you allocate some extra storage at the end of the space reserved for its instance variables. What’s the point of that? All you can do is get a pointer to the start of the space (using object_getIndexedIvars)…hmm, indexed ivars. Well, I suppose an array is a pretty obvious use of indexed ivars…
Let’s build NSArray! There are two things to see in SimpleArray: the most obvious is the use of the class cluster pattern. The reason is that the object returned from +alloc, where we’d normally allocate space for the object, cannot know how big it’s going to be. We need to use the arguments to -initWithObjects:count: to know how many objects there are in the array. So +alloc returns a placeholder, which is then able to allocate and return the real array object.
One obvious question to ask is why we’d do this at all. Why not just use calloc() to grab an appropriately-sized buffer in which to store the object pointers? The answer is to do with a low-level performance concern called locality of reference. We know from the design of the array class that pretty much every time the array pointer is used, the buffer pointer will be used too. Putting them next to each other in RAM means we don’t have to look off at some dereferenced pointer just to find another pointer.
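The same layout decision can be shown in plain C with a flexible array member, which puts the element storage directly after the header in a single allocation. This is an analogue rather than the SimpleArray code: InlineArray and its functions are hypothetical names, and in the Objective-C version the extra bytes would be requested when the instance is created and found again with object_getIndexedIvars.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header and elements share one allocation; all names are illustrative. */
typedef struct {
    size_t count;
    const void *items[];   /* C99 flexible array member, right after count */
} InlineArray;

/* Size is only known once the arguments arrive, so allocation happens here
   (the same reason the placeholder defers allocation to the initialiser). */
static InlineArray *inline_array_create(const void *const *objects, size_t count) {
    InlineArray *array = calloc(1, sizeof(InlineArray) + count * sizeof(const void *));
    if (array == NULL)
        return NULL;
    array->count = count;
    for (size_t i = 0; i < count; i++)
        array->items[i] = objects[i];
    return array;
}

/* No second pointer to chase: the element lives next to the header. */
static const void *inline_array_get(const InlineArray *array, size_t index) {
    assert(index < array->count);
    return array->items[index];
}
```

The alternative design would store a pointer to a separately calloc()'d buffer in the header, costing one more dereference, and possibly one more cache miss, on every access.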
Just how does message forwarding work?
One of the powerful features of Objective-C is that an object doesn’t have to implement a method when it’s compiled to be able to respond to messages with that selector name. It can lazily resolve the methods, or it can forward them to another object, or it can raise an error, or it can do something else. But something about this feature was bugging me: message forwarding (which happens in the runtime) calls -forwardInvocation:, passing it an NSInvocation object. But NSInvocation is defined in Foundation: does the runtime library need to “know” about Foundation to work?
I tracked down what was going on and found that no, it does not need to know about Foundation. The runtime lets applications define the forwarding function that gets called when objc_msgSend() can’t find the implementation for a selector. On startup, CoreFoundation[+] injects the forwarding function that does -forwardInvocation:. So presumably my application can do its own thing, right?
Let’s build Ruby! OK, not all of Ruby. But Ruby has a #method_missing function that gets called when an object receives a message it doesn’t understand, which is much more similar to Smalltalk’s approach than to Objective-C’s. Using objc_setForwardHandler, it’s possible to implement methodMissing: in our Objective-C classes.
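The shape of that hook is easy to model in plain C: the send function knows nothing about what forwarding means, it just calls whichever handler was installed. Everything below is a hypothetical sketch (Object, send_message, set_forward_handler and method_missing are invented names), and it does not reproduce the real signature or calling convention of objc_setForwardHandler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy types; none of these names are real runtime API. */
typedef struct Object Object;
typedef int (*Imp)(Object *self, const char *sel);

typedef struct {
    const char *sel;
    Imp imp;
} Method;

struct Object {
    const Method *methods;
    size_t method_count;
    const char *last_missing;   /* set by the sample handler below */
};

/* The "runtime" side: one replaceable function pointer. */
static Imp forward_handler = NULL;

static void set_forward_handler(Imp handler) {
    forward_handler = handler;
}

/* Try the receiver's own methods first; fall back to the handler. */
static int send_message(Object *receiver, const char *sel) {
    for (size_t i = 0; i < receiver->method_count; i++)
        if (strcmp(receiver->methods[i].sel, sel) == 0)
            return receiver->methods[i].imp(receiver, sel);
    assert(forward_handler != NULL && "unrecognised selector, no handler");
    return forward_handler(receiver, sel);
}

/* The "framework" side: a handler in the style of method_missing. */
static int method_missing(Object *self, const char *sel) {
    self->last_missing = sel;   /* remember what was asked for */
    return -1;                  /* sentinel meaning "not understood" */
}

/* An ordinary method, to show the normal path is unaffected. */
static int answer_imp(Object *self, const char *sel) {
    (void)self;
    (void)sel;
    return 42;
}

static const Method sample_methods[] = { { "answer", answer_imp } };
```

The dependency points one way only: the code that installs method_missing knows about send_message, but send_message never needs to know what the handler does, which mirrors how the runtime avoids depending on Foundation.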
The Objective-C runtime is a powerful way to add a lot of dynamic behaviour to an application for very little work. Some developers don’t use it much beyond swizzling methods for debugging, but it has facilities that make it a powerful tool for real application code too.
[+]CoreFoundation and Foundation are really siblings, and they each expose pieces of the other’s implementation, but one has a C interface and the other an Objective-C interface. Various Objective-C classes are actually part of CoreFoundation, including NSInvocation and related classes. NSObject is not in either of these libraries: it’s now defined in the runtime itself, so that the runtime’s memory management functions know about -release and so on[++]. On the other hand, most of the *behaviour* of NSObject is implemented by categories higher up. And, of course, this is all implementation detail and the locations of these classes could be (and are) moved between versions of the frameworks.
[++]Other languages like Smalltalk and Ruby have a simple base class that does nothing except know how to be an object, called
BaseObject. You could imagine the runtime supplying—and being coupled to—
ProtoObject, and (Core)Foundation supplying
NSProxy as subclasses of