I’ve been reading the Smalltalk-80 blue book (pdf) recently, and started to wonder what a Smalltalk-style object browser for Objective-C would look like. I don’t just mean presenting the information that makes up Objective-C classes in novel ways (though that is something I’ve discussed with Saul Mora at great length in the past). What would an object browser look like if the compiler were itself an object, so that you could define and manipulate classes in real time?
Well, the first thing you’d need to do is to turn the compiler into an object. I decided to see whether I could see what the compiler sees, using the clang compiler front-end library.
Wait, clang library? Clang’s a command-line tool, isn’t it? Well yes, but it and the whole of LLVM are implemented as a collection of reusable C++ classes. Clang then has a stable C interface wrapping the C++, and this is what I used to produce this browser app. This isn’t the browser I intend to write; it’s the one I threw away to learn about the technology.

Clang is a stream parser and, as with any other stream, there are two ways to deal with source files: event-driven[*], in which you let the parser run and get callbacks from it when it sees interesting things, or document-based[*], in which you let the parser build up a document object model (a tree, in this case) whose nodes you then visit to learn about the data.
[*] Computer scientists probably call these things something else.
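To make the two styles concrete, here is a toy sketch in plain C that has nothing to do with clang's real API. Whitespace-separated words stand in for declarations, and every name in it (`parse_events`, `word_document` and so on) is made up for illustration. The event-driven parser calls back once per word; the document-based one builds a small model first, here simply by using the event parser with a collecting callback.

```c
#include <assert.h>
#include <stddef.h>

/* Event-driven: the parser calls back for every word it sees. */
typedef void (*word_callback)(void *context, const char *word, size_t length);

void parse_events(const char *source, word_callback on_word, void *context) {
    assert(source != NULL && on_word != NULL);
    const char *p = source;
    while (*p != '\0') {
        while (*p == ' ') p++;                 /* skip separators */
        const char *start = p;
        while (*p != '\0' && *p != ' ') p++;   /* scan one word */
        if (p > start) on_word(context, start, (size_t)(p - start));
    }
}

/* An event-driven client: counts words without storing any of them. */
void count_word(void *context, const char *word, size_t length) {
    (void)word;
    (void)length;
    (*(size_t *)context)++;
}

/* Document-based: the parser builds a model that the client visits later.
   The fixed capacity is a simplification for this sketch. */
#define MAX_WORDS 16
typedef struct {
    const char *start[MAX_WORDS];   /* points into the original source */
    size_t length[MAX_WORDS];
    size_t count;
} word_document;

void collect_word(void *context, const char *word, size_t length) {
    word_document *document = context;
    if (document->count < MAX_WORDS) {
        document->start[document->count] = word;
        document->length[document->count] = length;
        document->count++;
    }
}

word_document parse_document(const char *source) {
    word_document document = { .count = 0 };
    parse_events(source, collect_word, &document);
    return document;
}
```

The trade-off is the usual one: the event-driven client keeps only what it cares about, while the document-based client can revisit the model as often as it likes at the cost of holding all of it in memory.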
Being perverse, I’m going to use the event-driven parser to build a parallel data model in Objective-C. First, I need to adapt the clang library to Objective-C, so that the compiler is an Objective-C object. Here’s my parser interface:
#import <Foundation/Foundation.h>
@protocol FZAClassParserDelegate;
@interface FZAClassParser : NSObject
@property (weak, nonatomic) id <FZAClassParserDelegate>delegate;
- (id)initWithSourceFile: (NSString *)implementation;
- (void)parse;
@end
The -parse method is the one that’s interesting (I presume…) so we’ll dive into that. It actually farms the real work out to an operation queue:
#import <clang-c/Index.h>
//...
- (void)parse {
__weak id parser = self;
[queue addOperationWithBlock: ^{ [parser realParse]; }];
}
- (void)realParse {
#warning Pass errors back to the app
@autoreleasepool {
CXIndex index = clang_createIndex(1, 1);
if (!index) {
NSLog(@"fail: couldn't create index");
return;
}
CXTranslationUnit translationUnit = clang_parseTranslationUnit(index, [sourceFile fileSystemRepresentation], NULL, 0, NULL, 0, CXTranslationUnit_None);
if (!translationUnit) {
NSLog(@"fail: couldn't parse translation unit");
clang_disposeIndex(index); // release the index on the failure path too
return;
}
CXIndexAction action = clang_IndexAction_create(index);
That’s the setup code, which gets clang ready to start reading through the file. The reading itself is done by this function:
int indexResult = clang_indexTranslationUnit(action,
(__bridge CXClientData)self,
&indexerCallbacks,
sizeof(indexerCallbacks),
CXIndexOpt_SuppressWarnings,
translationUnit);
This is the important part. Being a C callback API, clang takes a context pointer as its second argument: in this case, the parser object. It also takes a collection of callback pointers, which I’ll show next, after just showing that the objects created in this method need cleaning up.
clang_IndexAction_dispose(action);
clang_disposeTranslationUnit(translationUnit);
clang_disposeIndex(index);
(void) indexResult;
}
}
There’s a structure called IndexerCallbacks defined in Index.h; this class’s instance of that structure contains functions that call through to methods on the parser’s delegate:
int abortQuery(CXClientData client_data, void *reserved);
void diagnostic(CXClientData client_data,
CXDiagnosticSet diagnostic_set, void *reserved);
CXIdxClientFile enteredMainFile(CXClientData client_data,
CXFile mainFile, void *reserved);
CXIdxClientFile ppIncludedFile(CXClientData client_data,
const CXIdxIncludedFileInfo *included_file);
CXIdxClientASTFile importedASTFile(CXClientData client_data,
const CXIdxImportedASTFileInfo *imported_ast);
CXIdxClientContainer startedTranslationUnit(CXClientData client_data,
void *reserved);
void indexDeclaration(CXClientData client_data,
const CXIdxDeclInfo *declaration);
void indexEntityReference(CXClientData client_data,
const CXIdxEntityRefInfo *entity_reference);
static IndexerCallbacks indexerCallbacks = {
.abortQuery = abortQuery,
.diagnostic = diagnostic,
.enteredMainFile = enteredMainFile,
.ppIncludedFile = ppIncludedFile,
.importedASTFile = importedASTFile,
.startedTranslationUnit = startedTranslationUnit,
.indexDeclaration = indexDeclaration,
.indexEntityReference = indexEntityReference
};
int abortQuery(CXClientData client_data, void *reserved) {
@autoreleasepool {
FZAClassParser *parser = (__bridge FZAClassParser *)client_data;
if ([parser.delegate respondsToSelector: @selector(classParserShouldAbort:)]) {
return [parser.delegate classParserShouldAbort: parser];
}
return 0;
}
}
// …
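Stripped of clang's types, the pattern at work here is a table of function pointers plus an opaque context pointer that the library hands back untouched. The following is a minimal sketch of that pattern in plain C, not clang's actual implementation: `toy_callbacks`, `toy_index` and `toy_parser` are hypothetical stand-ins for IndexerCallbacks, clang_indexTranslationUnit and the parser object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CXClientData: an opaque pointer. */
typedef void *client_data;

/* Hypothetical stand-in for IndexerCallbacks; any entry may be NULL. */
typedef struct {
    int  (*should_abort)(client_data context);
    void (*found_declaration)(client_data context, const char *name);
} toy_callbacks;

/* The "library" side: it knows nothing about the client's types and
   just passes the context pointer through to every callback. */
int toy_index(const char *const *names, size_t count,
              const toy_callbacks *callbacks, client_data context) {
    assert(callbacks != NULL);
    for (size_t i = 0; i < count; i++) {
        if (callbacks->should_abort && callbacks->should_abort(context)) {
            return 1;   /* non-zero: indexing was cut short */
        }
        if (callbacks->found_declaration) {
            callbacks->found_declaration(context, names[i]);
        }
    }
    return 0;
}

/* The "client" side: it recovers its own state from the opaque pointer,
   much as the callbacks above bridge client_data back to the parser. */
typedef struct {
    size_t seen;    /* declarations reported so far */
    size_t limit;   /* ask to abort once this many have been seen */
} toy_parser;

int parser_should_abort(client_data context) {
    const toy_parser *parser = context;
    return parser->seen >= parser->limit;
}

void parser_found_declaration(client_data context, const char *name) {
    toy_parser *parser = context;
    (void)name;
    parser->seen++;
}

const toy_callbacks parser_callbacks = {
    .should_abort = parser_should_abort,
    .found_declaration = parser_found_declaration,
};
```

Because the library only ever sees a `void *`, the same indexing function can serve any client; the cost is that each callback has to cast the pointer back and trust that it is what it expects.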
Internally clang creates its own threads, so the callback functions wrap delegate messages in @autoreleasepool so that the delegate doesn’t have to worry about this.
The delegate still needs to understand clang data structures, of course; this is where the real work is done. Here’s the delegate that’s used to build the data model used in the browser app:
#import <Foundation/Foundation.h>
#import "FZAClassParserDelegate.h"
@class FZAClassGroup;
@interface FZAModelBuildingParserDelegate : NSObject <FZAClassParserDelegate>
- (id)initWithClassGroup: (FZAClassGroup *)classGroup;
@end
The FZAClassGroup class is just somewhere to put all the data collected by parsing the file: in a real IDE, this might represent a project, a translation unit, a framework or something else. Anyway, it has a collection of classes. The parser adds classes to that collection, and methods and properties to those classes:
@implementation FZAModelBuildingParserDelegate {
FZAClassGroup *group;
FZAClassDefinition *currentClass;
}
- (id)initWithClassGroup:(FZAClassGroup *)classGroup {
if ((self = [super init])) {
group = classGroup;
}
return self;
}
- (void)classParser:(FZAClassParser *)parser foundDeclaration:(CXIdxDeclInfo const *)declaration {
const char * const name = declaration->entityInfo->name;
if (name == NULL) return; //not much we could do anyway.
NSString *declarationName = [NSString stringWithUTF8String: name];
We’ve now got a named declaration, but a declaration of what?
switch (declaration->entityInfo->kind) {
case CXIdxEntity_ObjCProtocol:
{
currentClass = nil;
break;
}
case CXIdxEntity_ObjCCategory:
{
const CXIdxObjCCategoryDeclInfo *categoryInfo =
clang_index_getObjCCategoryDeclInfo(declaration);
NSString *className = [NSString stringWithUTF8String: categoryInfo->objcClass->name];
FZAClassDefinition *classDefinition =[group classNamed: className];
if (!classDefinition) {
classDefinition = [[FZAClassDefinition alloc] init];
classDefinition.name = className;
[group insertObject: classDefinition inClassesAtIndex: [group countOfClasses]];
}
currentClass = classDefinition;
break;
}
case CXIdxEntity_ObjCClass:
{
FZAClassDefinition *classDefinition =[group classNamed: declarationName];
if (!classDefinition) {
classDefinition = [[FZAClassDefinition alloc] init];
classDefinition.name = declarationName;
[group insertObject: classDefinition inClassesAtIndex: [group countOfClasses]];
}
currentClass = classDefinition;
break;
}
I’m ignoring protocols, but recognising that methods declared in a protocol shouldn’t go onto any particular class. Similarly, I’m adding methods found in categories to the class on which that category is defined: real Smalltalk browsers keep the categories, but for this prototype I decided to skip them. I’m using the fact that this is a prototype to justify having left the duplicate code in place, above :-S.
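For what it's worth, the duplicated "look the class up by name, create it if it's missing" step could be folded into one helper. Here is the shape of that idea as a toy sketch in plain C rather than in the Objective-C model classes; `toy_group`, `toy_class` and `class_named_or_new` are hypothetical names, and the fixed-size storage is only there to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CLASSES 8
#define MAX_NAME 32

typedef struct {
    char name[MAX_NAME];
} toy_class;

typedef struct {
    toy_class classes[MAX_CLASSES];
    size_t count;
} toy_group;

/* Returns the class called `name`, appending a new one to the group if
   it isn't there yet. Returns NULL if the group is full or the name
   doesn't fit in the fixed-size buffer. */
toy_class *class_named_or_new(toy_group *group, const char *name) {
    assert(group != NULL && name != NULL);
    for (size_t i = 0; i < group->count; i++) {
        if (strcmp(group->classes[i].name, name) == 0) {
            return &group->classes[i];   /* already known: reuse it */
        }
    }
    if (group->count == MAX_CLASSES || strlen(name) >= MAX_NAME) {
        return NULL;
    }
    toy_class *created = &group->classes[group->count++];
    strcpy(created->name, name);         /* length checked above */
    return created;
}
```

With a helper like this, the category and class cases would differ only in which name they pass in: the category's class name in one, the declaration's own name in the other.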
So now we know what class we’re looking at, we can start looking for methods or properties defined on that class:
case CXIdxEntity_ObjCClassMethod:
case CXIdxEntity_ObjCInstanceMethod:
{
FZAMethodDefinition *method = [[FZAMethodDefinition alloc] init];
method.selector = declarationName;
if (declaration->entityInfo->kind == CXIdxEntity_ObjCClassMethod)
method.type = FZAMethodClass;
else
method.type = FZAMethodInstance;
[currentClass insertObject: method inMethodsAtIndex: [currentClass countOfMethods]];
break;
}
case CXIdxEntity_ObjCProperty:
{
FZAPropertyDefinition *property = [[FZAPropertyDefinition alloc] init];
property.title = declarationName;
[currentClass insertObject: property inPropertiesAtIndex: [currentClass countOfProperties]];
break;
}
default:
break;
}
}
And that’s “it”. The result of collecting all of these callbacks is a tree:
ClassGroup -> Class -> [Method, Property]
I define a tree-ish interface for all of these classes, by adding categories that define the same methods:
@interface FZAMethodDefinition (TreeSupport)
- (NSInteger)countOfChildren;
- (NSString *)name;
- (id)childAtIndex: (NSInteger)index;
- (BOOL)isExpandable;
@end
@implementation FZAMethodDefinition (TreeSupport)
- (NSInteger)countOfChildren {
return 0;
}
- (BOOL)isExpandable {
return NO;
}
- (id)childAtIndex:(NSInteger)index {
return nil;
}
- (NSString *)name {
switch (self.type) {
case FZAMethodClass:
return [@"+" stringByAppendingString: self.selector];
case FZAMethodInstance:
return [@"-" stringByAppendingString: self.selector];
default:
return [@"?" stringByAppendingString: self.selector];
}
}
@end
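The point of giving every model class the same four methods is that whatever displays the tree never needs to know which kind of node it holds. As a rough illustration in plain C (the `tree_node` struct and `visible_rows` function are invented for this sketch, and stand in for the Objective-C categories), a generic walk over such an interface might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "any object with TreeSupport". */
typedef struct tree_node tree_node;
struct tree_node {
    const char *name;
    const tree_node *const *children;   /* NULL for leaves */
    size_t child_count;
};

size_t count_of_children(const tree_node *node) {
    return node->child_count;
}

int is_expandable(const tree_node *node) {
    return node->child_count > 0;
}

const tree_node *child_at_index(const tree_node *node, size_t index) {
    return index < node->child_count ? node->children[index] : NULL;
}

/* Number of rows a fully expanded outline would show for this node:
   one for the node itself plus the rows of each of its children. */
size_t visible_rows(const tree_node *node) {
    assert(node != NULL);
    size_t rows = 1;
    for (size_t i = 0; i < count_of_children(node); i++) {
        rows += visible_rows(child_at_index(node, i));
    }
    return rows;
}
```

The walk only ever calls the shared interface, so a group, a class, a method and a property can all be handled by the same code path; leaves simply report zero children.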
And, well, that’s it. libclang could be the kernel of a thousand visualizers, browsers and editors for C-derived languages; the start of one is outlined above.