Archive for the 'Programming' Category

A Critical Weakness


Last week, I touched on a tricky implementation of an eventListener. One object could register to receive events from another by implementing the appropriate protocol and then adding itself to an eventListeners array.

The code assumes you are using ARC, the automatic reference counting system introduced in the newer Objective-C releases. This frees us from the majority of memory management, and banishes the days of retain/release/autorelease. Unfortunately, it also introduces a tricky memory problem here.


Because our view controller stores a reference to the CharacterData object, the CharacterData object won’t be freed as long as the view controller exists. But since the view controller has been added to the eventListeners array on the CharacterData object, the reverse is also true – the view controller won’t be freed as long as the CharacterData object exists! This is called a circular reference (or retain cycle), and it prevents either of those objects from ever being freed! Obviously if we have a lot of these objects and listeners, this can add up to a real problem.

We can create a simple situation that demonstrates this:

// Dancer.h
    #import <Foundation/Foundation.h>

    // A Dancer
    @interface Dancer : NSObject

    // Strong reference to our dance partner (strong is the default under ARC).
    @property Dancer* dancePartner;

    // We log on dealloc.
    - (void) dealloc;

    // A class method for creating a dance pair.
    + (void) createDancePair;

    @end

// Dancer.m
    #import "Dancer.h"

    @implementation Dancer

    // We log on dealloc.
    - (void) dealloc
    {
        NSLog(@"Dancer %p was deallocated", self);
    }

    + (void) createDancePair
    {
        Dancer* man = [[Dancer alloc] init];
        Dancer* woman = [[Dancer alloc] init];
        man.dancePartner = woman;
        woman.dancePartner = man;

        // Leaving scope should dealloc both dancers, but…
    }

    @end

The standard fix for this is to make one of the references weak. A weak reference is valid as long as the target object exists, but doesn’t count towards keeping the target object alive. That is to say, the object will exist as long as at least one strong (non-weak) reference points to it, but as soon as only weak references remain, the object will be freed.

There are two syntaxes, depending on where you declare it:

    @interface WeakReferenceExample : NSObject
    {
        __weak NSObject* weakVariable;
    }

    @property (weak) NSObject* weakProperty;
    @end

And as an added boon, the weak references will be set to nil when the target object is freed. So we can fix our very contrived example with a simple ‘weak’ keyword like so:

    // A Dancer
    @interface Dancer : NSObject

    // Weak reference to our dance partner.
    @property (weak) Dancer* dancePartner;

    // We log on dealloc.
    - (void) dealloc;

    // A class method for creating a dance pair.
    + (void) createDancePair;

    @end

Now, when the dancers go out of scope at the end of the createDancePair method, the dealloc method is actually invoked on both, proving that they’ve been correctly destroyed.

Unfortunately, it isn’t immediately obvious how to apply this to our original problem. The issue is with the NSMutableArray, which always takes a strong reference. Making our reference to the array weak doesn’t work, because the array will simply be immediately freed. The solution here requires a wrapper class.

By creating a class whose sole purpose is to store a weak reference to another object, we can circumvent the NSArray problem.

// WeakWrapper.h
    #import <Foundation/Foundation.h>

    @interface WeakWrapper : NSObject

    // The weak reference to the actual object
    @property (weak, atomic, readonly) id object;

    // Basic initializer with the actual object
    - (id) initWithObject:(id)object;

    @end

// WeakWrapper.m
    #import "WeakWrapper.h"

    @implementation WeakWrapper

    // Back the property with an instance variable named 'object'
    @synthesize object;

    // Basic initializer with the actual object
    - (id) initWithObject:(id)pobject
    {
        if((self = [super init]))
        {
            // Store the weak reference.  This will become nil automatically when the wrapped object is deallocated.
            object = pobject;
        }
        return self;
    }

    @end

This does mean we have to be aware of the fact that we’re using this wrapper class. It doesn’t really make sense to expose the raw array and hope that no one does it incorrectly, so we will hide the internals and provide accessor functions.

// CharacterData.m
    // NEW: Add a new listener into the list
    - (void) addEventListener:(NSObject*)listener
    {
        // Wrap the incoming object in a weak reference wrapper
        WeakWrapper* wrapper = [[WeakWrapper alloc] initWithObject:listener];
        [eventListeners addObject:wrapper];
    }

    // NEW: Remove the specified listener from the list
    - (void) removeEventListener:(NSObject*)listener
    {
        // Iterate over a copy so we can safely remove from the real array
        NSArray* localEventListeners = [eventListeners copy];
        for(WeakWrapper* wrapper in localEventListeners)
        {
            // Try to get out the real object into a temporary strong reference
            id delegate = wrapper.object;

            // Remove this if it's nil or our object
            if(delegate == nil || delegate == listener)
            {
                [eventListeners removeObject:wrapper];
            }
        }
    }

    - (void) emitPropertyChangedEvent:(NSString*)propertyName
    {
        // Iterate over a copy so we can safely remove from the real array
        NSArray* localEventListeners = [eventListeners copy];
        // emit the event to any interested listeners
        for(WeakWrapper* wrapper in localEventListeners)
        {
            // Try to get out the real object into a temporary strong reference.
            // Typed as id so the protocol method below can be sent to it.
            id delegate = wrapper.object;

            // If this is nil, then the object has been deallocated so we can remove the wrapper
            if(delegate == nil)
            {
                [eventListeners removeObject:wrapper];
            }
            // Otherwise send it the event if it responds
            else if([delegate respondsToSelector:@selector(characterData:propertyChanged:)])
            {
                [delegate characterData:self propertyChanged:propertyName];
            }
        }
    }

We clean up any dead references whenever we iterate through the event listeners.

Now, objects can listen for events from other objects, but not be kept alive by them. We can implement our original design with impunity.
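The engine posts further down this page are written in C++, so here is the same weak-listener idea sketched in that language. This is a rough analogue and not a port of the project code: std::weak_ptr plays the role of WeakWrapper, and the PropertyListener, CharacterModel, and CountingListener names are made up for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical listener interface, standing in for CharacterDataEventListener.
struct PropertyListener {
    virtual ~PropertyListener() = default;
    virtual void propertyChanged(const std::string& name) = 0;
};

// Hypothetical subject that holds its listeners weakly.
class CharacterModel {
public:
    void addListener(const std::shared_ptr<PropertyListener>& listener) {
        assert(listener != nullptr);
        listeners_.push_back(listener);  // stored as a weak_ptr: no ownership taken
    }

    // Notify live listeners, then drop entries whose target has been destroyed.
    void emitPropertyChanged(const std::string& name) {
        // Iterate over a copy so listeners may register others while being notified.
        const std::vector<std::weak_ptr<PropertyListener>> snapshot = listeners_;
        for (const auto& weak : snapshot) {
            if (auto strong = weak.lock()) {
                strong->propertyChanged(name);
            }
        }
        listeners_.erase(
            std::remove_if(listeners_.begin(), listeners_.end(),
                           [](const std::weak_ptr<PropertyListener>& w) { return w.expired(); }),
            listeners_.end());
    }

    std::size_t listenerCount() const { return listeners_.size(); }

private:
    std::vector<std::weak_ptr<PropertyListener>> listeners_;
};

// A listener that just counts notifications, for demonstration.
struct CountingListener : PropertyListener {
    int calls = 0;
    void propertyChanged(const std::string&) override { ++calls; }
};
```

As in the Objective-C version, a listener that goes away is never kept alive by the subject; its dead entry is simply swept out the next time an event is emitted.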

Example Xcode iOS project: WeakReferenceProject

An Aspect of Aspect-Oriented Programming


Today I’d like to explain a solution I’ve cobbled together for what tends to be a rather common design obstacle. Imagine, if you will, that we’re creating a simple game and we’re trying to display the main character’s stats on screen in some fashion. The exact UI isn’t important, so let’s just start with the data object that represents the main character.

// CharacterData.h
    @interface CharacterData : NSObject

    @property NSInteger combatSkill;
    @property NSInteger endurance;
    @property NSInteger baseArmor;
    @property NSInteger equipmentArmor;
    @property (readonly) NSInteger totalArmor;

    @end

We have a basic character with four stored stats, and one derived stat (totalArmor just returns base + equipment). We can change the values by just assigning to them easily enough.
Now let’s say we have a view hooked up to display the character’s data to the user. This view needs to be updated any time the data changes. We can use the Observer Pattern for this. Wikipedia has the following to say about it:

The observer pattern is a software design pattern in which an object, called the subject, maintains a list of its dependents, called observers, and notifies them automatically of any state changes, usually by calling one of their methods.

Unfortunately, this can be a bit cumbersome in Objective-C. We would have to write a custom setter for every property, each one emitting the appropriate event. Perhaps there’s a way we can simply… intercept all property sets?

You’ve probably guessed that I’m about to propose a solution. Before I do, let me introduce NSProxy with a quote from the Apple documentation:

NSProxy is an abstract superclass defining an API for objects that act as stand-ins for other objects or for objects that don’t exist yet. Typically, a message to a proxy is forwarded to the real object or causes the proxy to load (or transform itself into) the real object.

In a nutshell, this will allow us to create an object to stand in for our CharacterData object and inspect incoming messages before passing them on to the original CharacterData object. If we see a ‘property set’ message come in, then we can emit an event. Let’s give it a shot.

// PropertyEventProxy.h
    #import <Foundation/Foundation.h>

    @interface PropertyEventProxy : NSProxy
    {
      id _proxiedObject;
      SEL _eventSelector;
      NSDictionary* _setterMap;
    }

    - (id) initWithProxiedObject:(id)proxyObject eventSelector:(SEL)eventSelector;
    @end

// PropertyEventProxy.m
    #import <objc/runtime.h>
    #import "PropertyEventProxy.h"

    @implementation PropertyEventProxy

    - (id) initWithProxiedObject:(id)proxyObject eventSelector:(SEL)eventSelector
    {
      // NSProxy does NOT inherit from NSObject, so there's no super implementation to invoke
      _proxiedObject = proxyObject;
      _eventSelector = eventSelector;

      NSMutableDictionary* registeredProperties = [NSMutableDictionary dictionary];

      // Loop through and find all declared properties
      unsigned int propertyCount = 0;
      objc_property_t* propertyList = class_copyPropertyList([_proxiedObject class], &propertyCount);
      for(unsigned int i = 0; i < propertyCount; ++i)
      {
        // Turn the property name into its setter selector name, e.g. endurance -> setEndurance:
        NSString* propertyName = [NSString stringWithUTF8String:property_getName(propertyList[i])];
        NSString* capitalizedFirstLetter = [[propertyName substringToIndex:1] capitalizedString];
        NSString* propertySetterName = [NSString stringWithFormat:@"set%@:",
          [propertyName stringByReplacingCharactersInRange:NSMakeRange(0,1) withString:capitalizedFirstLetter]];

        [registeredProperties setObject:propertyName forKey:propertySetterName];
      }
      // class_copyPropertyList hands us ownership of the list, so we must free it
      free(propertyList);

      _setterMap = [registeredProperties copy];
      return self;
    }

    @end

This gives us the basic object that tracks what it is proxying, and takes an ‘event emitter’ selector it will invoke on the proxied object when there is a ‘property set’. When this proxy object is initialized, we query the proxied object to find all of its properties. From the property names we build up the signature to expect for the set method, and store it by mapping it to the actual property name for lookup later.
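The name mapping is the only fiddly part of that initializer, so here it is isolated as a small C++ sketch (C++ being the language the engine posts on this page use). setterNameFor and buildSetterMap are hypothetical helpers written for this illustration. Like the Objective-C above, they assume the default setter naming convention and would miss a property declared with a custom setter name.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <unordered_map>
#include <vector>

// Build the conventional Objective-C setter selector name for a property,
// e.g. "combatSkill" -> "setCombatSkill:".
std::string setterNameFor(const std::string& propertyName) {
    assert(!propertyName.empty());
    std::string name = propertyName;
    name[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(name[0])));
    return "set" + name + ":";
}

// Map each setter selector name back to the property it assigns,
// mirroring what the proxy stores in _setterMap.
std::unordered_map<std::string, std::string> buildSetterMap(
    const std::vector<std::string>& propertyNames) {
    std::unordered_map<std::string, std::string> setterToProperty;
    for (const auto& name : propertyNames) {
        setterToProperty[setterNameFor(name)] = name;
    }
    return setterToProperty;
}
```

Looking up an incoming selector name in this map is then a single hash lookup: a hit means "this is a property set, and here is the property's name", and a miss means the message is something else and should just be forwarded.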

The magic comes in overriding the forwardingTargetForSelector: method. Again from Apple’s documentation:

If an object implements (or inherits) this method, and returns a non-nil (and non-self) result, that returned object is used as the new receiver object and the message dispatch resumes to that new object.

So, in essence, our proxy object gets to peek at every message that is going to the data object, before asking the runtime to pass it on to the original object. In this case, we check to see if the message matches the signature of a known property setter. If it does, we invoke the ‘event’ selector on the original object and pass it the name of the property that changed.

// PropertyEventProxy.m
    // Other code from above

    - (id)forwardingTargetForSelector:(SEL)aSelector
    {
      NSString* selectorName = [NSString stringWithUTF8String:sel_getName(aSelector)];
      NSString* propertyNameMatch = [_setterMap objectForKey:selectorName];
      if(propertyNameMatch != nil)
      {
        // This matches a property setter we're looking for, so trigger the event!
        // Note that this fires before the setter itself has run on the proxied object.
        [_proxiedObject performSelector:_eventSelector withObject:propertyNameMatch];
      }
      // Pass back the proxied object so the method will get called on the actual object next
      return _proxiedObject;
    }

Now we just need to fix up our original data object a little to emit the event, and actually use the proxy.

// CharacterData.h
    #import <Foundation/Foundation.h>

    // Forward declaration so the protocol can mention the class
    @class CharacterData;

    // A Protocol that other objects can implement to register for CharacterData property change events
    @protocol CharacterDataEventListener
    // Invoked whenever an assignment is made to a property
    - (void) characterData:(CharacterData*)data propertyChanged:(NSString*)propertyName;
    @end

    @interface CharacterData : NSObject

    // Existing properties here
    @property NSInteger combatSkill;
    @property NSInteger endurance;
    @property NSInteger baseArmor;
    @property NSInteger equipmentArmor;
    @property (readonly) NSInteger totalArmor;

    // NEW: A list of entities listening to a property change event
    @property (readonly) NSMutableArray* eventListeners;

    // NEW: Send the event to any listeners
    - (void) emitPropertyChangedEvent:(NSString*)propertyName;

    @end

// CharacterData.m
    #import "CharacterData.h"
    #import "PropertyEventProxy.h"

    @implementation CharacterData

    @synthesize eventListeners, combatSkill, endurance, baseArmor, equipmentArmor;
    @dynamic totalArmor;

    - (id) init
    {
      if((self = [super init]))
      {
        eventListeners = [NSMutableArray array];
        // Anything custom to be done here
      }
      // Now, very sneakily – return the proxy object instead of self!
      return [[PropertyEventProxy alloc] initWithProxiedObject:self eventSelector:@selector(emitPropertyChangedEvent:)];
    }

    - (void) emitPropertyChangedEvent:(NSString*)propertyName
    {
      // emit the event to any interested listeners
      // (typed as id so the protocol method can be sent to each one)
      for(id delegate in self.eventListeners)
      {
        if([delegate respondsToSelector:@selector(characterData:propertyChanged:)])
        {
          [delegate characterData:self propertyChanged:propertyName];
        }
      }
    }

    // Implementation for the derived totalArmor stat.
    - (NSInteger) totalArmor
    {
      return self.baseArmor + self.equipmentArmor;
    }

    @end

You can see here that we’ve created an array property (eventListeners) that objects can add themselves to in order to receive events. The emitPropertyChangedEvent: method then allows us to send that event to any registered delegate.

There’s also a bit of craziness in the init method. Instead of returning ‘self’ as all other good initializers do, we actually construct the proxy and return it instead! This hides the fact that there’s a proxy object wrapper from the rest of the application entirely.

This proxy object can be used to wrap almost any object, whether your own custom objects or built-in classes. Classes that implement the NSCoding protocol will need to be careful about the - (id)awakeAfterUsingCoder: method. We’ll cover dealing with that edge case in a future post.

If you’d like to play around with an example, I’ve created a simple Xcode iOS project: ProxyObjectTest

iOS Development Optimizations Part 5 – Referencing Entities


For the next couple of articles I’m going to get a little bit more technical, and start delving down into the murky depths of our game engine.

If you’ve done any development for OS X or iOS, you know that Apple’s preferred language is Objective-C. Their top-level frameworks and APIs all expose Objective-C interfaces and, while many parts are reachable from plain C, it’s not what I would call “a fun time”.

Fortunately, Objective-C itself has become a lot nicer about interacting with other languages over the years. At this point, you can actually intermix C++ and Objective-C in an inspirationally named language called Objective-C++. We use C++ internally for the majority of our projects, so for our game engine we use a very thin Objective-C wrapper for the out-of-game UI (the menu system, basically) and then jump right into C++.

We use C++ internally in the engine for a few reasons:

  • Familiarity – most of our other projects are written in C++, and it’s the language we’re most familiar with as a team.
  • Type-safety – this is somewhat of a personal preference, but C++ allows me to detect far more errors at compile time than Objective-C does.
  • Portability – an engine written in C++ using OpenGL can actually run on most OSes, including Windows, OS X, and Linux. Sadly, from my investigation it looks like it’s still not quite workable for Android. And Windows Phone (which I actually love, btw, but that’s another post) disallows ‘native’ code entirely.

There are a number of popular techniques for referencing entities in a game engine. There are two big issues to tackle in any solution – object lifetime, and serialization. Object lifetime is a particularly tricky issue – if you try to access an object that has already been destroyed, you’ll at best cause a crash, and at worst corrupt your data. Serialization is the process of ‘saving the game’, such that any reference to an entity (for example, what enemy a spell is targeting) can be restored when loading from disk.

The first solution we considered was having a master list of entities at the engine level, and handing out IDs unique to each entity. If you wanted to store a reference, you’d actually store the ID instead, and whenever you needed to access the entity you’d ask the engine to find it for you. Usually the engine will store this list as a dictionary of some kind – a hash table, or binary tree. If the entity you’re asking for has already been destroyed (perhaps the enemy was killed before the spell finished executing, for example), then the engine will return a sentinel value (NULL in our case) which you’ll need to remember to check for.

The downside is that any access entails a lookup cost, and that you need to remember to check for dead entities. The big advantage is that this method is very easily serialized – just write the ID number out to disk alongside the entity.
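To make that first option concrete, here is a minimal sketch of an ID-based registry. This is the approach we considered, not code from the engine, and every name in it (EntityRegistry, EntityId, the health field) is invented for the illustration. A hash table is assumed for the master list.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>

using EntityId = std::uint32_t;

struct Entity {
    EntityId id = 0;
    int health = 100;
};

// Hypothetical engine-level master list: owns every entity and hands out IDs.
class EntityRegistry {
public:
    // Create an entity and return its unique ID.
    EntityId create() {
        const EntityId id = nextId_++;
        assert(id != 0 && "ID space exhausted");
        auto entity = std::make_unique<Entity>();
        entity->id = id;
        entities_.emplace(id, std::move(entity));
        return id;
    }

    void destroy(EntityId id) { entities_.erase(id); }

    // Every access pays for this lookup. Returns nullptr (the sentinel)
    // when the entity no longer exists, so callers must check.
    Entity* find(EntityId id) const {
        const auto it = entities_.find(id);
        return it == entities_.end() ? nullptr : it->second.get();
    }

private:
    EntityId nextId_ = 1;  // 0 is reserved to mean "no entity"
    std::unordered_map<EntityId, std::unique_ptr<Entity>> entities_;
};
```

A spell would store only the EntityId of its target and call find() each time it needs the entity, which is exactly where both the lookup cost and the "did you remember to check for NULL?" risk come from.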

The second solution, and the one we ended up implementing, was smart pointers that reference count objects. One huge point in favor of this solution is that a library we were already using, boost, provides all of the infrastructure in the form of shared_ptr and weak_ptr.

Reference counting, in a nutshell, wraps an object with a container that counts how many references exist to the object. Every time a reference is made, the count goes up. When the reference goes away, the count goes down. If the count reaches zero (no one at all references the object) we can destroy the object safely. A special kind of smart pointer called a weak pointer maintains a reference without changing the count. Weak pointers allow us to say ‘hey, I want to use this object as long as it is alive, but go ahead and destroy it if no one else wants it’. An example is an entity’s ‘target’ – you want to know who your target is, but allow it to die if someone else kills it.
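The bookkeeping is easy to watch in a few lines. This sketch uses std::shared_ptr and std::weak_ptr, the standard-library equivalents of the boost classes (an assumption made so the example needs nothing beyond a C++ compiler), and the Entity, Counts, and function names are invented for the illustration.

```cpp
#include <cassert>
#include <memory>

struct Entity {
    int health = 100;
    std::weak_ptr<Entity> target;  // observes the target without owning it
};

// Owner counts recorded after each step, to make the bookkeeping visible.
struct Counts {
    long afterCreate;
    long afterCopy;
    long afterWeak;
    long afterCopyGone;
};

Counts demonstrateCounts() {
    Counts counts{};
    auto enemy = std::make_shared<Entity>();
    counts.afterCreate = enemy.use_count();      // one strong reference
    {
        std::shared_ptr<Entity> second = enemy;  // a new strong reference
        counts.afterCopy = enemy.use_count();
        Entity caster;
        caster.target = enemy;                   // weak: the count is unchanged
        counts.afterWeak = enemy.use_count();
    }
    counts.afterCopyGone = enemy.use_count();    // the copy went out of scope
    return counts;
}

// The target is destroyed as soon as its last strong reference goes away,
// even though the caster still "knows" who it was targeting.
bool targetExpiresWithLastOwner() {
    Entity caster;
    auto enemy = std::make_shared<Entity>();
    caster.target = enemy;
    assert(!caster.target.expired());
    enemy.reset();
    return caster.target.expired();
}
```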

There’s a different kind of overhead to this method. Rather than a lookup at access time, the reference count is incremented whenever a new reference is made to an entity, and decremented when that reference goes away. Initially, it seemed that this performance cost was far too significant to ever consider this method; fortunately, we discovered a few ways to reduce the cost until it was negligible.

  • Threading buys very little on the iPhone, so single-threaded access is the better choice, and we removed all atomic operations and thread-safety from boost. This cut out 99% of the cost.
  • We pass constant references around, rather than copying the reference every time.
  • For very tight loops where we know the lifetime behavior, we can cheat and get the raw pointer out of the smart pointer and use that. This is dangerous, and counter to the entire point of the solution, but we ended up using it in the inner rendering loop.
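The second and third tricks can be sketched as follows, again with std::shared_ptr standing in for the boost class as an assumption. The function names are invented, and stripping the atomics out of the library itself is not shown.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Entity {
    int health = 100;
};

// By value: copying the smart pointer bumps the count for the duration of the call.
long countSeenByValue(std::shared_ptr<Entity> entity) {
    return entity.use_count();
}

// By constant reference: no copy is made, so the count is never touched.
long countSeenByConstRef(const std::shared_ptr<Entity>& entity) {
    return entity.use_count();
}

// Tight-loop escape hatch: work on raw pointers. This is only safe because the
// caller guarantees that `entities` keeps every entity alive for the whole loop.
int totalHealth(const std::vector<std::shared_ptr<Entity>>& entities) {
    int total = 0;
    for (const auto& owner : entities) {
        const Entity* raw = owner.get();  // no reference-count traffic
        assert(raw != nullptr);
        total += raw->health;
    }
    return total;
}
```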

Strong pointers guarantee that the entity they reference is still alive, but weak pointers may point to dead entities. However, in order to use a weak pointer you must convert it to a strong pointer – which is a huge reminder that hey, you should probably check to make sure the entity is actually alive before using it!
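In std::weak_ptr terms (assumed here as the stand-in for the boost class), that conversion is the lock() call. A small sketch, with damageTarget as an invented helper:

```cpp
#include <cassert>
#include <memory>

struct Entity {
    int health = 100;
    std::weak_ptr<Entity> target;
};

// Try to damage the attacker's target. The weak pointer cannot be dereferenced
// directly; lock() yields a strong pointer that is empty if the target is gone.
bool damageTarget(Entity& attacker, int amount) {
    assert(amount >= 0);
    if (std::shared_ptr<Entity> target = attacker.target.lock()) {
        target->health -= amount;  // safe: `target` keeps the entity alive here
        return true;
    }
    return false;  // the target died before we got to it
}
```

The check and the access are the same step, so there is no window in which the entity can disappear between "is it alive?" and "use it".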

Smart pointers don’t help with serialization at all, sadly. Our fix was to add IDs to objects as they are written to disk, and convert all references to these IDs. Then we can load objects along with their IDs later, and restore the smart pointers then.
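Here is roughly what that looks like, as a sketch only. The actual file format is not shown (SavedEntity stands in for whatever gets written to disk), the names are invented, and it assumes every live entity is in the list being saved so that every target has an ID.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

using EntityId = std::uint32_t;
constexpr EntityId kNoEntity = 0;

struct Entity {
    EntityId id = kNoEntity;
    std::weak_ptr<Entity> target;
};

// What goes to disk: the entity's own ID and the ID of its target (if any).
struct SavedEntity {
    EntityId id;
    EntityId targetId;
};

std::vector<SavedEntity> save(const std::vector<std::shared_ptr<Entity>>& entities) {
    EntityId next = 1;
    for (const auto& entity : entities) {  // assign IDs at save time
        assert(entity != nullptr);
        entity->id = next++;
    }
    std::vector<SavedEntity> out;
    for (const auto& entity : entities) {  // convert references to IDs
        const auto target = entity->target.lock();
        out.push_back({entity->id, target ? target->id : kNoEntity});
    }
    return out;
}

std::vector<std::shared_ptr<Entity>> load(const std::vector<SavedEntity>& saved) {
    std::unordered_map<EntityId, std::shared_ptr<Entity>> byId;
    std::vector<std::shared_ptr<Entity>> entities;
    for (const auto& record : saved) {     // pass 1: recreate every entity
        auto entity = std::make_shared<Entity>();
        entity->id = record.id;
        byId[record.id] = entity;
        entities.push_back(entity);
    }
    for (std::size_t i = 0; i < saved.size(); ++i) {  // pass 2: restore the pointers
        const auto found = byId.find(saved[i].targetId);
        if (found != byId.end()) entities[i]->target = found->second;
    }
    return entities;
}
```

Loading needs two passes because a reference may point at an entity that appears later in the file.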

iOS Development Optimizations Part 4 – Pathfinding


Tower Defense and Hero Defense are actually quite different in terms of pathfinding. For Hero Defense, we had to statically create a nav-mesh at design time, and all of our entities – player and foe alike – navigated it using A* at both the high level and the local level. Since we had so many separate entities moving to so many different targets, the mesh needed to be able to reach the entire map, and the entities would have to calculate their exact path on the fly.

For Tower Defense though, the possible paths are constantly – and drastically – changing. Players create mazes using their buildings, cutting off certain openings and creating intricate side lanes. The nav mesh was useless, and A* was pretty much in its worst case scenario. The one advantage we did have was that every moving entity on the screen was heading towards the same destination – the resource the player is trying to protect!

Very helpful, thanks...

The best solution we found is also the simplest. We flood-fill out from the resource building to every square on the map it can reach. The maps are reasonably sized – roughly 20×40 squares – so flood filling the entire thing takes a fraction of a frame. If we had to do this for every enemy, or for many resource buildings, it would add up and cause a significant penalty. However, one flood fill will find the optimal path for any entity on the screen, no matter where it is! We also only need to update this when a tower is built. Ok, that’s a bit of a lie, because there’s one painful situation exposed here.
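A flood fill of this kind can be sketched as a breadth-first search outward from the goal. This is a minimal illustration, not the engine’s code: the Grid layout, four-way movement, and uniform step cost are all assumptions.

```cpp
#include <cassert>
#include <queue>
#include <vector>

constexpr int kUnreachable = -1;

struct Grid {
    int width = 0;
    int height = 0;
    std::vector<bool> blocked;  // row-major; true where a tower or wall sits

    bool isBlocked(int x, int y) const { return blocked[y * width + x]; }
};

// Breadth-first flood fill outward from the goal square. Returns, for every
// square, the number of steps to the goal, or kUnreachable.
std::vector<int> floodFill(const Grid& grid, int goalX, int goalY) {
    assert(goalX >= 0 && goalX < grid.width && goalY >= 0 && goalY < grid.height);
    std::vector<int> distance(grid.width * grid.height, kUnreachable);
    std::queue<int> frontier;
    distance[goalY * grid.width + goalX] = 0;
    frontier.push(goalY * grid.width + goalX);

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        const int current = frontier.front();
        frontier.pop();
        const int cx = current % grid.width;
        const int cy = current / grid.width;
        for (int i = 0; i < 4; ++i) {
            const int nx = cx + dx[i];
            const int ny = cy + dy[i];
            if (nx < 0 || nx >= grid.width || ny < 0 || ny >= grid.height) continue;
            const int next = ny * grid.width + nx;
            if (grid.isBlocked(nx, ny) || distance[next] != kUnreachable) continue;
            distance[next] = distance[current] + 1;
            frontier.push(next);
        }
    }
    return distance;
}
```

An enemy standing on any square then just steps to whichever neighbour has the smallest distance, so one fill serves every enemy on the map.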

All paths lead to the roman looking building

Players can create mazes as long as they want in our game – but at all times, the enemies must be able to reach their target. We simply don’t let players place towers where they would permanently block all paths. This means that whenever a player drags a tower out, we need to recalculate the paths for every square they drag the tower over, so we can give feedback when the path is blocked.
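The placement check can be sketched as a trial placement on a copy of the grid followed by one more fill. As before this is an illustration under assumed names and layout (Grid, towerWouldBlock, four-way movement), not the engine’s code.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

struct Grid {
    int width = 0;
    int height = 0;
    std::vector<bool> blocked;  // row-major; true where a tower or wall sits
};

// Squares that can reach the goal, found by flood filling outward from it.
std::vector<bool> reachableFrom(const Grid& grid, int goalX, int goalY) {
    std::vector<bool> seen(grid.width * grid.height, false);
    std::queue<std::pair<int, int>> frontier;
    seen[goalY * grid.width + goalX] = true;
    frontier.push({goalX, goalY});
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        const std::pair<int, int> current = frontier.front();
        frontier.pop();
        for (int i = 0; i < 4; ++i) {
            const int nx = current.first + dx[i];
            const int ny = current.second + dy[i];
            if (nx < 0 || nx >= grid.width || ny < 0 || ny >= grid.height) continue;
            const int index = ny * grid.width + nx;
            if (grid.blocked[index] || seen[index]) continue;
            seen[index] = true;
            frontier.push({nx, ny});
        }
    }
    return seen;
}

// True if a tower at (towerX, towerY) would cut any of the listed squares
// (enemy positions, spawn points) off from the goal.
bool towerWouldBlock(Grid grid, int towerX, int towerY, int goalX, int goalY,
                     const std::vector<std::pair<int, int>>& mustReach) {
    assert(!(towerX == goalX && towerY == goalY));
    grid.blocked[towerY * grid.width + towerX] = true;  // trial placement on a copy
    const std::vector<bool> reachable = reachableFrom(grid, goalX, goalY);
    for (const auto& square : mustReach) {
        if (!reachable[square.second * grid.width + square.first]) return true;
    }
    return false;
}
```

Because the grid is taken by value, the real map is untouched until the player actually drops the tower.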

Now that's just not fair...

Still, since each flood fill takes less than a frame, and checking for a blocking build is easy (does each entity have a path to the goal or not?), this painful situation is… tolerable. I had intended to implement a predictive solver for the background thread, which would monitor the user’s drag motion and attempt to pre-solve for squares it might end up in. Fortunately, play testing revealed this wasn’t necessary, as the delay was unnoticeable. Concrete data to the rescue again!

iOS Development Optimizations Part 3 – Resource Management


Even having done everything we could to minimize the number of textures we needed to load, there was still room for improvement. Sprite sheets were tightly packed, the entire level was rendered into its own texture set, and we’d even run all of the textures through Apple’s PVR compression to cut the file size down to 25% of the original. Still, we wanted to be able to run through 20 different kinds of enemies over the course of a level. We obviously can’t load all of the frames of animation for 20 enemies, for all of their possible actions – running, attacking, idling, and dying are the minimum set, with more for characters with emotes. And so we enter the realm of Dynamic Resource Management.

My first crack at this was a basic lazy implementation – lazy in the technical sense, not the social one. How do we get a graphics engine to be lazy? Simply don’t load any texture until it is about to be rendered! The moment that a texture is requested, make sure that there is enough memory for it. If not, kick out any texture that isn’t currently in use to make room. This means that we could have 20 enemies in one level – just, not all on the screen at the same time. Bring one enemy in, and if there isn’t room, evict the resources of an enemy not in use.
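A stripped-down sketch of that lazy policy is below. It is an illustration under stated assumptions, not the engine’s code: the Texture struct is a stand-in with no pixel data, the caller supplies the size that a real loader would discover, "in use" is approximated by the shared_ptr reference count, and LazyTextureCache is an invented name.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a GPU texture; only its size matters here.
struct Texture {
    std::string name;
    std::size_t bytes;
};

class LazyTextureCache {
public:
    explicit LazyTextureCache(std::size_t budgetBytes) : budget_(budgetBytes) {}

    // Load on first request, evicting unused textures if the budget is exceeded.
    std::shared_ptr<Texture> request(const std::string& name, std::size_t bytes) {
        assert(bytes > 0);
        const auto found = loaded_.find(name);
        if (found != loaded_.end()) return found->second;

        if (used_ + bytes > budget_) evictUnused(used_ + bytes - budget_);
        // (a real engine would read the file and upload it to the GPU here)
        auto texture = std::make_shared<Texture>(Texture{name, bytes});
        loaded_.emplace(name, texture);
        used_ += bytes;
        return texture;
    }

    bool isLoaded(const std::string& name) const { return loaded_.count(name) != 0; }
    std::size_t usedBytes() const { return used_; }

private:
    // Drop textures nobody else references until `needed` bytes have been freed.
    void evictUnused(std::size_t needed) {
        std::size_t freed = 0;
        for (auto it = loaded_.begin(); it != loaded_.end() && freed < needed;) {
            if (it->second.use_count() == 1) {  // only the cache still holds it
                freed += it->second->bytes;
                used_ -= it->second->bytes;
                it = loaded_.erase(it);
            } else {
                ++it;
            }
        }
    }

    std::size_t budget_;
    std::size_t used_ = 0;
    std::unordered_map<std::string, std::shared_ptr<Texture>> loaded_;
};
```

If nothing can be evicted, this sketch simply goes over budget; a real engine would have to decide what to do in that case.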

These are MY resources now!

This was a fantastic improvement – with one issue. By waiting until the moment the frame needed to be rendered in order to load it, I introduced a pause the moment any frame was shown for the first time, as the texture was read from disk. Being completely lazy about it didn’t quite solve the issue, so I implemented a two-phase mostly-lazy pattern using a custom smart pointer. Warning – the following is fairly technical stuff, but I’m not going to go in depth, so be strong!

A smart pointer is basically an object that acts like a reference to another object, but does something smart about the reference at the same time. We typically use them for reference counting things (create a smart pointer to an object, and the reference count goes up; destroy the smart pointer and the reference count goes down. Now we know how many references to an object there are!). In this case though, the smart pointer does some extra work. Instead of immediately creating a reference to the intended object, it sends a request to a background thread warning it that a resource has been requested. Then, when the application goes to actually use the referenced object, the smart pointer blocks until the resource has actually been loaded.

An example:

// Highly fake class to demonstrate the principle
// (BackgroundImageLoader and Image are assumed to be declared elsewhere)
    #include <string>

    class ImageSmartPointer
    {
        public:
            ImageSmartPointer(const std::string& resourceName)
            {
                 // Let the image loader know we'll want this image soon
                 BackgroundImageLoader->RequestLoad(resourceName);
                 m_ResourceName = resourceName;
                 m_pImage = NULL;
            }

            // Copying would release the image twice, so forbid it in this sketch
            ImageSmartPointer(const ImageSmartPointer&) = delete;
            ImageSmartPointer& operator=(const ImageSmartPointer&) = delete;

            Image* operator->()
            {
                 // if we haven't already gotten the image, get it from the loader
                 if(m_pImage == NULL)
                 {
                     // This may block for a load.  It will increase the reference count on the image as well.
                     m_pImage = BackgroundImageLoader->GetLoadedImage(m_ResourceName);
                 }
                 return m_pImage;
            }

            ~ImageSmartPointer()
            {
                 // if we've retrieved the image already, let the loader know we're done with it
                 if(m_pImage != NULL)
                      BackgroundImageLoader->ReleaseImage(m_ResourceName);
            }

        private:
            std::string m_ResourceName;
            Image* m_pImage;
    };

Although threading doesn’t provide any performance enhancements on the current iOS devices (unlike PCs with multiple cores, for example), it still provides one great benefit – anything done on a background thread doesn’t block the foreground (renderer) thread. This means that when a character is loaded, it can grab a reference to each of its frames, which are then loaded by the background thread without stalling the frame rate. By the time the renderer needs to actually render the character’s frame, 95% of the time it’s already been loaded. The other 5% of the time there’s a slight delay, but now that it’s only one frame (not every single one) it’s basically unnoticeable. So, with a little application of a lot of laziness, we now have a late-binding pre-loading resource tracking system.

iOS Development Optimizations Part 2 – Map Design


The key to having great content is having great tools.

Well, that, talent, and dedication.

We do have a great map editor that we use for our level design, though. Not only does a custom-built tool make designing our levels much easier, it allowed me another opportunity to apply some optimizations.

A Scene in our Editor

For Townrs Defender, we developed the designer on the principle of re-use – if we could repeat the same texture multiple times across the map, we’d get a ton of savings in texture memory. So, the background was composed of tiles that could be laid out like dominoes. Likewise, trees were many, many instances of the same few tree sprites.

We had fallen into a trap, a trap everyone falls into at some point: Premature Optimization. Strictly speaking, I suppose, this wasn’t premature – just mis-targeted. You see, this time around I took some measurements and tested some theories. It turned out that the scene overdraw – rendering a tree, then another one in front of it in the same place, for example – was the biggest performance killer for us.

Scene image on left, overdraw visualization on right

Because all of our sprites are alpha-blended, we necessarily have to render them back-to-front in sorted order. This is a worst case scenario then, because for any sample pixel, we may:

  • Render the background tile (the ground)
  • Render a background widget (a flower, for example)
  • Render a tree
  • Render another tree that is in front of the first
  • Render a character
  • Render any spell effects (flames, arrows, etc)

This was terrible! We needed to review our thought process.

  • The idea of tileable terrain grew out of our original design for terrain that could go on forever. You could walk from one map to the next, for example. That wasn’t even remotely true anymore.
  • Instancing the trees was supposed to save on texture memory. Well, it did – very minutely. Using a variety of widgets ate into those savings, and even worse, it cramped our design decisions: using a new widget would eat up valuable texture space, so we’d almost always choose to re-use an already placed widget rather than use a new one. This made some of Townrs Defender look repetitive and homogeneous.

Knowing now that this often hurt, rather than helped, we looked at it from another perspective. What if, no matter how many widgets or background tiles we used, the texture space remained constant? What if we could reduce the overdraw down to four or even three passes? How could we possibly achieve that?

In the end, it was simple. Each level is rendered out as its own texture. True, that means each level has its own dedicated texture, but only one is loaded at a time. Disk space is cheap compared to resident memory. Now, we couldn’t actually render out everything into one texture – characters have to be able to walk behind trees, after all. So after saving out the ‘background’ of the map, we now go through and render out horizontal stripes of widgets, and save them into one big texture atlas. There is some fidelity lost here – all widgets act as if they were in the center of the stripe, so some gradations of depth are lost. But, for the purposes of a small screen on an iPhone, it worked fine.
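One way to picture how stripes and characters interleave is as a single back-to-front sort in which each stripe’s depth is its center line. The sketch below is an illustration only: the names are invented, and it assumes screen y grows downward with lower objects being closer to the viewer.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Drawable {
    std::string label;
    float depthY;  // larger y = closer to the viewer, so drawn later
};

// One pre-rendered stripe of widgets. Every widget baked into it is treated
// as if it sat on the stripe's center line.
Drawable makeStripe(int index, float stripeHeight) {
    assert(stripeHeight > 0.0f);
    return {"stripe " + std::to_string(index), (index + 0.5f) * stripeHeight};
}

// Build the back-to-front draw order for the stripes plus the moving characters.
// (The background texture is drawn first, before anything in this list.)
std::vector<Drawable> buildDrawOrder(int stripeCount, float stripeHeight,
                                     const std::vector<Drawable>& characters) {
    std::vector<Drawable> order;
    for (int i = 0; i < stripeCount; ++i) {
        order.push_back(makeStripe(i, stripeHeight));
    }
    order.insert(order.end(), characters.begin(), characters.end());
    std::stable_sort(order.begin(), order.end(),
                     [](const Drawable& a, const Drawable& b) { return a.depthY < b.depthY; });
    return order;
}
```

A character standing at y = 120 on a map with 100-pixel stripes is drawn after the first stripe and before the second, which is how it ends up walking behind some trees and in front of others.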

Background Texture for Level 1

A Foreground Widget Atlas

Our render pass now looked more like:

  • Render the background texture
  • Render the foreground pseudo-widget
  • Render a character

An added benefit was that there were now far fewer texture changes, because 90% of our objects on the screen (somewhere in the hundreds) were now 15 objects. Our frame rate actually more than doubled. It also helped for memory budgeting – rather than an unknown number of textures, each level had a set cost that we could account for. And, much to the relief of our map designers, they were now free to use as many and as varied widgets in any level as their hearts could desire.

And players benefit in-game too: where Townrs Defender could only put 15 enemies on screen at a time, Spires can render closer to 50!

Gauntlet mode pits you against a massive wave of enemies.

Since I originally wrote this article, Spires has actually been published on iTunes – so if you want to check out the engine performance yourself, go pick it up!

iOS Development Optimizations Part 1 – Sprite Engine


In the time between Townrs Defender and Spires, I had the opportunity to rework the sprite engine to optimize performance in many ways. You may remember that Townrs Defender had a number of issues, memory issues in particular – on older devices it would frequently have to disable music to guarantee it would run. So memory usage was something I wanted to focus on for version 2 of the engine.

One very common optimization, used in both 3D and 2D engines, is called atlasing. Every texture change the renderer makes has overhead, so atlasing takes a series of textures and combines them into one big texture, on the assumption that if there are fewer textures, you’ll need to change them less frequently.

Sprite Sheet for an American Hero

This is a great optimization, and very easy to parse – take the frame number you want, and treat it like an index into the image as a 2D array. Creating it is as simple as downloading ImageMagick and running the ‘montage’ command from the command line.
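As a minimal sketch of that "index into a 2D array" lookup, assuming a uniform grid filled left to right, top to bottom (the function name and parameters are hypothetical, not taken from the engine):

```python
def frame_rect(frame, columns, frame_width, frame_height):
    """Return (x, y, width, height) of a frame inside a uniform sprite sheet.

    Assumes frames are laid out row by row, `columns` frames per row.
    """
    row, col = divmod(frame, columns)
    return (col * frame_width, row * frame_height, frame_width, frame_height)


# Frame 10 on a sheet that is 8 frames wide, each frame 128x128 pixels.
print(frame_rect(10, 8, 128, 128))
```

The lookup needs no per-frame metadata at all, which is the appeal of the uniform grid; the cost is that every cell must be as large as the largest sprite.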

But all that white space is a bummer, huh? File compression will collapse a lot of it, but we have to expand the whole file and keep it in memory eventually. We can try to crop the sprite more tightly, but we’d always be limited by the largest sprite – an outstretched limb could wreck everything! You can see it in this image alone – the sprites at the top, where our mystery hero is running upward, are much thinner than the ones at the bottom, where he runs sideways.

Well, I’m a programmer, so my fix is to write a new tool!

All we have to do is crop each individual sprite as tightly as we can, but when we do so, we have to remember how much we cropped from each side.

Our Hero Cropped Tightly

This allows us to push in from each side of the image without worrying about the center point of the image – by recording the edge offsets, we can re-align the sprite in-game with its original center point.

If we don’t record this information, the final sprite will bob around crazily!
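A rough sketch of the crop-and-remember step, working on a plain grid of alpha values rather than a real image file. The helper names are hypothetical, and this sketch stores only the left and top offsets; together with the original frame size, the right and bottom amounts can be derived.

```python
def crop_with_offsets(alpha):
    """Crop a grid of alpha values to its non-transparent bounding box.

    Returns (cropped_grid, (left, top)) where left/top are how many
    columns/rows were trimmed from those sides.
    """
    rows = [y for y, row in enumerate(alpha) if any(row)]
    if not rows:
        return [], (0, 0)  # fully transparent frame: nothing to keep
    cols = [x for x in range(len(alpha[0])) if any(row[x] for row in alpha)]
    left, top = cols[0], rows[0]
    right, bottom = cols[-1] + 1, rows[-1] + 1  # exclusive bounds
    cropped = [row[left:right] for row in alpha[top:bottom]]
    return cropped, (left, top)


def draw_position(anchor_x, anchor_y, original_w, original_h, offset):
    """Top-left corner for the cropped sprite so that the ORIGINAL frame's
    center lands on the anchor point."""
    left, top = offset
    return (anchor_x - original_w / 2 + left, anchor_y - original_h / 2 + top)


frame = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
cropped, offset = crop_with_offsets(frame)
print(cropped, offset, draw_position(100, 100, 4, 4, offset))
```

Dropping the `offset` term from `draw_position` is what produces the bobbing: each frame would then be aligned by its own cropped corner instead of by the shared original center.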

Top-Left Aligned Sprite

Correctly Centered Sprite

Once we’ve cropped each sprite as tightly as we can, it’s time to pack them all together. 2D packing is a topic that has been extensively researched and implemented. From my quick research and prototyping it seems that – at least for our rather simple ‘aligned rectangles packed within a square’ situation – a simple greedy algorithm gets us about 90% of the way to an optimal packing. Advanced algorithms can certainly boost that higher, but tend to take exponentially longer to run.
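The post doesn't say which greedy algorithm was used, so the following is just one common choice, a "shelf" packer: sort the sprites tallest first, fill a row left to right, and start a new row when the current one is full. `pack_shelves` and its parameters are hypothetical names for illustration.

```python
def pack_shelves(sizes, sheet_size):
    """Greedy shelf packing of axis-aligned rectangles into a square sheet.

    sizes: list of (width, height) tuples.
    Returns a list of (x, y) positions in the same order as `sizes`,
    or None if the sprites do not fit with this strategy.
    """
    # Place the tallest sprites first so each shelf wastes little vertical space.
    order = sorted(range(len(sizes)), key=lambda i: sizes[i][1], reverse=True)
    positions = [None] * len(sizes)
    x = y = shelf_height = 0
    for i in order:
        w, h = sizes[i]
        if w > sheet_size:
            return None  # wider than the sheet itself
        if x + w > sheet_size:
            # Current shelf is full: start a new one below it.
            y += shelf_height
            x = 0
            shelf_height = 0
        shelf_height = max(shelf_height, h)
        if y + h > sheet_size:
            return None  # ran out of vertical space
        positions[i] = (x, y)
        x += w
    return positions


print(pack_shelves([(60, 20), (50, 100), (70, 80)], 128))
```

A tool built on this could try the smallest sheet size first and step up only when the function returns `None`, which is one way to discover that two small sheets beat one large one.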

The result is quite nice. Rather than using up a portion of a 1024×1024 sheet, though, it makes more sense to split it into two 512×512 sheets:

Packed Sprite Sheet 1

Packed Sprite Sheet 2

1024 x 1024 sheet = 1048576 pixels

512 x 512 x 2 sheets = 524288 pixels

That’s a 50% savings off the bat – and as you can see, we even have some leftover space to stuff more sprites into!

Tools of the Trade

Hero Character in Blender

Once we’d figured out our implementation details, we needed to pick the tools to get us there. We quickly determined that Blender was well suited to our needs. Blender is an open source modeling program whose abilities span modeling to video editing to being an entire game engine. It has most of the features of 3ds Max or Maya, all of the useful ones, and many more to boot. Of course, the fact that it’s free helps a great deal. But honestly, out of all of the open source alternatives I’ve tried over the years (Gimp, OpenOffice, Thunderbird), Blender is the model of “getting it right” (alongside, perhaps, Firefox).

Particle Effect in Blender

Our modeling and rigging is done in Blender, and textures are created in Photoshop. From there, we render out a series of individual frames in the standard 8 directions to a PNG format. We then use ImageMagick, a command-line tool that comprises a lot of the functionality of Gimp or Photoshop. ImageMagick gives us the ability to run batch manipulations on the images, and compose them together in interesting ways.

From the resulting PNG format, we compress to PVR, using Apple’s texturetool program. PVR is an interesting format; in a nutshell, it’s a highly regular compressed image format that can use either 4 bits or 2 bits to represent each pixel. The huge advantage of this format is that the iPhone supports it natively, and can render directly from the compressed format. We found that PNGs didn’t work because a) decompressing the PNG format takes up a ton of CPU time and b) when uncompressed, a 1024×1024 image will be stored at its native size of 4MB. When you only have 20MB or so to work with, that’s pretty limiting. For comparison, a 4bpp PVR of the same image is only 512KB – yes, that’s 1/8th the size. However, for small images the compression artifacts become very noticeable, so for those we actually double the resolution of the initial images. For those keeping track, the compressed double-resolution image is still only ½ the size of the original-resolution uncompressed image.
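The arithmetic behind those figures can be checked with a short sketch. It assumes the uncompressed image is stored as 32-bit RGBA (4 bytes per pixel) and ignores any file header or mipmap overhead; the function names are hypothetical.

```python
def uncompressed_bytes(width, height, bytes_per_pixel=4):
    """Memory for a raw image, assuming 32-bit RGBA by default."""
    return width * height * bytes_per_pixel


def pvr_bytes(width, height, bits_per_pixel=4):
    """Memory for a fixed-rate compressed texture at the given bits per pixel."""
    return width * height * bits_per_pixel // 8


KB = 1024
MB = 1024 * KB

raw = uncompressed_bytes(1024, 1024)      # original resolution, uncompressed
pvr = pvr_bytes(1024, 1024)               # same resolution at 4bpp
pvr_double = pvr_bytes(2048, 2048)        # double resolution at 4bpp

print(raw // MB, "MB raw")
print(pvr // KB, "KB compressed")
print(pvr_double / raw, "of the raw size at double resolution")
```

Doubling the resolution quadruples the pixel count, so the 8x saving from 4bpp compression shrinks to 2x, which is where the "half the size" figure comes from.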

BlenderNation also has a quick writeup on us.

Investigation and iPhone Limitations


We’ve had a lot of requests for insight into our toolset and development process here at Prophetic Sky, so I thought I’d write a few entries to summarize some of what we do here. We’re typically a Windows/Linux workshop, and so we had to make a few adjustments to work on an OSX platform for iPhone development. Fortunately, OSX overlaps a great deal with Linux, and we tend to use a lot of great open source tools.

There aren’t a lot of real statistics on iPhone performance out there, but you can get a rough feel for it by just looking at existing games. Our tests led us to conclude that 3000 triangles/second was a good maximum (we’ve read 6000, but couldn’t reproduce that well enough to be comfortable with it). Even assuming our physics, game logic, AI and special effects didn’t eat into that time at all, if we wanted to fit 20 characters on the screen at once, each character had to be 150 triangles or less. That’s not terrible; many games run with those limitations. But once you realize you still have to fit terrain, trees, towers, walls and spell effects in there, it starts to become very cramped.

Nova Spell

Better detail, but less flexibility

So we turned to sprite assets. There are many advantages to sprites – namely that 3000 triangles equates to 1500 sprites on the screen at a time. Also, we can render the characters in much higher detail initially, from source models in the thousands of triangles. The feedback that we’ve gotten on the game leads us to believe that this was the right choice; universally we hear that the game looks beautiful.
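The budget arithmetic is simple enough to write down. The sketch assumes each sprite is drawn as one quad made of two triangles, which is what the 3000-to-1500 figure implies; the constant and function names are hypothetical.

```python
TRIANGLE_BUDGET = 3000      # the maximum quoted in the text
TRIANGLES_PER_SPRITE = 2    # assumption: one textured quad per sprite


def triangles_per_character(budget, characters):
    """Triangle allowance per character if the whole budget went to characters."""
    return budget // characters


def max_sprites(budget, triangles_per_sprite=TRIANGLES_PER_SPRITE):
    """How many sprite quads fit in the same triangle budget."""
    return budget // triangles_per_sprite


print(triangles_per_character(TRIANGLE_BUDGET, 20), "triangles per 3D character")
print(max_sprites(TRIANGLE_BUDGET), "sprites in the same budget")
```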

There were drawbacks as well. While we aren’t limited by the rendering power of the iPhone, we are limited by the available memory, and sprites take up a lot of space. In addition, it means our art is limited to running at a fixed frame rate, and in fixed directions – the characters can only run in one of the 8 standard directions, for example. Also, we cannot change the perspective – moving the camera will just show that the sprites are flat.