Class clusters, placeholder objects, value-oriented programming, and all that good stuff.

Have you ever seen this exception in your crash log?

2012-05-29 17:55:37.240 Untitled 2[5084:707] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '*** -length only defined for abstract class. Define -[NSPlaceholderString length]!'

What’s that NSPlaceholderString class?

Leaving aside NSMutableString for a moment[*], there’s no way for a developer who’s got an instance of an NSString to modify that string. In this model a string instance represents the value of that string: the word “hello” is always going to be “hello”. You can build a sentence that includes the word “hello” in a sequence – e.g. “hello, world”. You can build a different sentence, e.g. “goodbye, world”. You haven’t changed the value of the word “hello” to “goodbye”, you’ve changed the value of the sentence to include a word with a different value.
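To make the “hello” example concrete, here is a minimal sketch using NSString’s real -stringByAppendingString: method. The helper name SentenceWithGreeting is my own invention for illustration; the point is only that building the sentence produces a new string and leaves the word it was built from alone.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds a sentence from a greeting.
// -stringByAppendingString: returns a new string object; the receiver
// (the greeting) keeps the value it always had.
static NSString *SentenceWithGreeting(NSString *greeting)
{
  return [greeting stringByAppendingString: @", world"];
}
```

Calling SentenceWithGreeting(@"hello") and then SentenceWithGreeting(@"goodbye") gives two different sentences, and the word “hello” is still “hello” afterwards.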

OK, let’s take that to an extreme. If any string that a developer gets back from NSString’s API is immutable, then that should include the string she gets back from +allocWithZone:, right? So any extra data passed in an -initWith… method can’t be used to change the string object we just allocated.

That’s OK, because -init… methods are allowed to return a different object, preserving this “don’t change the value” principle. Imagine the C string initialiser for NSString looking like this (I doubt it does – I think it internally converts the string to UTF-16 – but it’ll do as an example):

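-(id)initWithCString: (const char *)cString encoding: (NSStringEncoding)encoding
{
  // NSCString is an imagined concrete subclass; allocate it before giving up self.
  NSCString *otherString = [NSCString allocWithZone: [self zone]];
  [self release];
  otherString->length = strlen(cString);
  // Reserve one extra byte for the terminating NUL.
  otherString->bytes = malloc(otherString->length + 1);
  // strlcpy's size argument counts the NUL, so pass length + 1 to copy every character.
  strlcpy(otherString->bytes, cString, otherString->length + 1);
  otherString->storedEncoding = encoding;
  return otherString;
}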

This doesn’t violate the no-modification contract, because it only changes an object that’s being built and that the end developer hasn’t seen yet. Once the developer gets to look at this string – when it’s returned from the initialiser – it’ll be immutable.

So this means that the string which was returned from +allocWithZone: represents a particular value of a string: the string that has yet to be assigned a value. Indeed, it’s a placeholder string. But any string that has yet to be assigned a value can be represented by the same placeholder, because they all mean the same thing. That means we can save some memory by creating a Flyweight instance of the placeholder. Even if multiple call sites in multiple threads all get the same instance of our placeholder string, there’s no danger of them tripping over each other because they’ll all then get different strings as they tell the placeholder what values they need to represent.
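Here is a rough sketch of that shared-placeholder arrangement, written against ARC. Everything in it is an assumption made for illustration: Word, ConcreteWord and PlaceholderWord are hypothetical classes of my own, not anything in Foundation, and I’m not claiming this is how NSString is really built. It only shows the shape: +allocWithZone: hands every caller the same placeholder, and the initialiser swaps it for a real value.

```objc
#import <Foundation/Foundation.h>

// Hypothetical abstract value class (not part of Foundation).
@interface Word : NSObject
- (instancetype)initWithText: (NSString *)text;
- (NSString *)text;
@end

// Hypothetical concrete subclass that actually stores a value.
@interface ConcreteWord : Word
@end

// Hypothetical placeholder: the one shared "no value yet" instance.
@interface PlaceholderWord : Word
@end

@implementation Word
+ (instancetype)allocWithZone: (NSZone *)zone
{
  // Only the abstract class hands out the shared placeholder;
  // subclasses fall through to a normal allocation.
  if (self == [Word class]) {
    static PlaceholderWord *placeholder;
    static dispatch_once_t once;
    dispatch_once(&once, ^{
      placeholder = [PlaceholderWord allocWithZone: zone];
    });
    return placeholder;
  }
  return [super allocWithZone: zone];
}

- (instancetype)initWithText: (NSString *)text
{
  [NSException raise: NSInvalidArgumentException
              format: @"-initWithText: only defined for abstract class."];
  return nil;
}

- (NSString *)text
{
  // Mimics the message from the crash log for this made-up class.
  [NSException raise: NSInvalidArgumentException
              format: @"-text only defined for abstract class. Define -[%@ text]!",
                      NSStringFromClass([self class])];
  return nil;
}
@end

@implementation PlaceholderWord
- (instancetype)initWithText: (NSString *)text
{
  // Return a different object; the placeholder itself never changes.
  return (id)[[ConcreteWord alloc] initWithText: text];
}
@end

@implementation ConcreteWord {
  NSString *_text;
}

- (instancetype)initWithText: (NSString *)text
{
  if ((self = [super init])) {
    _text = [text copy]; // set once while building, never again
  }
  return self;
}

- (NSString *)text
{
  return _text;
}
@end
```

With this in place, [[Word alloc] initWithText: @"hello"] comes back as a ConcreteWord, two calls to [Word alloc] return the same pointer, and asking the bare placeholder for its -text raises an exception much like the one in the crash log above.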

In fact, if code in two different threads needs to represent the same value, it’s safe to give them both references to the same object. Neither can change that object and spoil things for the other.

This pattern of keeping objects immutable in the eyes of client code, providing transformations that result in new objects rather than modifying existing objects, makes a raft of thread safety problems disappear and reduces the complexity of class APIs. I’ll be using it more often in my object models.
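As a small sketch of what that might look like in an object model, here is a hypothetical immutable Tally class (the class, its method names and the SumOfDerivedCounts helper are all made up for this example). Several concurrent workers read the same instance through dispatch_apply with no locking at all, because the only “change” available is to ask for a new object.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical immutable value class; the name and API are illustrative only.
@interface Tally : NSObject
@property (nonatomic, readonly) NSInteger count;
- (instancetype)initWithCount: (NSInteger)count;
// Returns a new Tally; the receiver is never modified.
- (Tally *)tallyByAdding: (NSInteger)amount;
@end

@implementation Tally
- (instancetype)initWithCount: (NSInteger)count
{
  if ((self = [super init])) {
    _count = count; // assigned once during construction
  }
  return self;
}

- (Tally *)tallyByAdding: (NSInteger)amount
{
  return [[Tally alloc] initWithCount: _count + amount];
}
@end

// Shares one Tally between concurrent workers without any locks.
// Worker i derives a new Tally with i added and records its count;
// the function returns the sum of those counts.
static NSInteger SumOfDerivedCounts(Tally *shared, size_t workers)
{
  NSInteger *results = calloc(workers, sizeof(NSInteger));
  dispatch_queue_t queue =
    dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
  dispatch_apply(workers, queue, ^(size_t i) {
    // Each worker only reads `shared` and writes to its own slot.
    results[i] = [shared tallyByAdding: (NSInteger)i].count;
  });
  NSInteger sum = 0;
  for (size_t i = 0; i < workers; i++) {
    sum += results[i];
  }
  free(results);
  return sum;
}
```

The shared Tally has the same count after the call as before it, whatever order the workers ran in.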

[*] To be honest, I’d like to leave it aside forever. It satisfies the Liskov Substitution Principle, but there’s a whole class of concurrency problems that only exist because “a mutable string isa string”.