1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14.


Saturday, August 18, 2012

Gameplay Architecture Part 1: Component Entity Models in C++

Why Component Entity Models?
In the interest of re-usability, separation of concerns, and data driven needs, developers are starting to move away from maintenance prone class hierarchies and towards Component Entity Models.  Conceptually, such models are simple to understand.  Functionality is split into components and game entities are defined by a composition of these components.  So at its core, entities themselves are just a container of their components or an identifier that allows its components to relate and communicate with eachother.

There are a number of different ways to go about implementing such a model, and a number of things about C++ can make certain aspects of the implementation rather tricky.  Be forewarned, that the implementation I'm presenting does use templates and does stick rather closely to the model used in Unity.  While a bit tricky, the implementation here is not overly complicated, and has considerable benefits once it is set up.  Users familiar with the script interface in Unity can speak of how intuitive it is to use.

Establishing Relationships Between Components
In more radical models, the idea of an entity is removed completely in favor of pure aggregation.  Although not without its benefits, in my opinion this makes things unnecessarily complicated.  In addition, forcing a component to belong to one and only Entity solves a lot of problems.  So in the interest of keeping things simple that is what I will be presenting here.  A one to many relationship between entities exist, and each component keeps a pointer to the entity.  This makes sending messages and modifying components significantly easier for a new component.  For example, a camera tracking behavior will know to search for and operate on camera components attached to the same entity.  Likewise, a health component will know which components to send a message on something like a TAKE_DAMAGE event.

Accessing Components
At its core an Entity is a container of components.  We don't want any game logic to seep into the entities themselves, but instead want funcionality to stay decoupled and seperated into individual components.  Therefore, we need some way to access components.  The implementation I am showing here presents a few including:

- By Type (using templates for type safety)
- By A Unique Instance Id (given by a static instance counter)
- By Name (using a hashed string id for faster comparisons, but still provides a system where designers can name parts of their level)

Why Names/Name Id's?
It is very handy for designers to identify things in the world by name, save such things to a file, and have them work the same in each build.  Hashed string id's minimize the cost of comparisons, and can still be accessed in debug builds that make use of a giant string table.  This is a godsend for debugging compared to some random enum type id that could easily be reordered when merged!

Why Instance Id's?
A GUID can be quite handy if you need a unique handle and your scriping language doesn't have pointers.  In certain cases, name ids may not cut it since designers will keep giving objects the same name.

Providing Your Own RTTI
Identifying objects by their runtime type is a bit trickier.  We could punt and just use dynamic_cast while iterating and rely on the non NULL result to indicate a valid cast.  This will work and is more reliable than a typeid that won't respect class hierarchies.  However, enabling RTTI is generally considered undesirable because it must be implemented everywhere in your program.

Observe, given a type id we have rolled ourselves and the use of a virtual function like the one below we can determine if an object is derived from a given type:
bool virtual isDerivedFromType(u32 typeId) {
    return (TYPE_ID == typeId) ? true : PARENT::isDerivedFromType(typeId); 

This will continue until it finds a matching type id or reaches the root class in the hierarchy.  In the root of the hierarchy our logic would have to be simplified to compare its object id and not go to any parent class (since it doesn't exist).

bool virtual isDerivedFromType(u32 typeId) {
    return (TYPE_ID == typId); // no parent  

Hashed String Id's > Error Prone Enums
But how do we provide a unique type id?  We could roll an enum of all possible types, but enums do not scale well at all.  A better implementation, that would put less of a burden on its users, would use the hashed string id of the class name.  Since we need a unique isDerivedFrom() function anyways, this is a perfectly justifiable use of the preprocessor.  A good hash function like MD5 or CRC32 should let you use this simple pattern without any conflicts.

// goes in the public declaration of every component

static const U32 COMPONENT_TYPE_ID; \
static const char * COMPONENT_TYPE_NAME; \
virtual bool isDerivedFromComponent(U32 componentTypeId) const { \
    return (COMPONENT_TYPE_ID == componentTypeId) ? true : PARENT_COMPONENT_CLASS::isDerivedFromComponent(componentTypeId); \

// Goes in the cpp of every component
So at this point, a minimalist implementation could have its logic work like so when searching for the first component of a given type:
template <typename COMPONENT_TYPE> COMPONENT_TYPE * getComponent() {
    foreach (component in componentList) {
        if (component->isDerivedFrom(COMPONENT_TYPE::GetComponentTypeId())) {
            return static_cast<COMPONENT_TYPE *>(component);

Storing Components
Data Structures
Part of the entity interface
There are a number of ways we could potentially store components.  The first solution you might go to would be a dynamic array (e.g. std::vector) of component pointers.  After some thought, I have created a very specialized structure that uses handles and takes advantage of  template specialization.  To keep things simple I will instead present a system that just uses an intrusive linked list.  Since components should only belong to one list anyways this isn't a terrible choice, but in general if you want something that is concurrent and cache friendly you will probably want to use something else.

class Component {
    // the RTTI type id and name are stored statically
    Entity *m_entity;         // component container entity
    Component *m_next;        // intrusive next pointer
    Component *m_prev;        // intrusive prev pointer
    U32 m_nameId;             // hashed name id
    U32 m_instanceId;         // unique instance id
    ComponentEntityFlags m_componentEntityFlags;  // flags like enabled/disabled, etc etc

class Entity {
    Transform m_transform;     // head of the component linked list
    Component *m_tail;         // tail of intrusive component linked list
    U32 m_layers;              // bitwise layers
    U32 m_nameId;              // hashed name id
    U32 m_instanceId;          // unique object instance id
    ComponentEntityFlags m_componentEntityFlags; // entity flags
Therefore, we can get a component by type using the following code:
template <typename COMPONENT>
const COMPONENT * jlEntity::getComponent() const {
    const COMPONENT *ptr = NULL;
    for (const COMPONENT * cur = getComponents(); cur != NULL; cur = cur->getNext()) {
        if (cur->isDerivedFromComponent(COMPONENT::GetComponentTypeId())) {
            ptr = static_cast<const COMPONENT *>(cur);
    return ptr;

Getting multiple components by type also becomes trivial:

template <typename COMPONENT>
U32 jlEntity::getComponents(const COMPONENT *componentList[], U32 maxToAdd) const {
    U32 n = 0;
    for (const jlComponent *cur = getComponents(); cur && n < maxToAdd; cur = cur->getNext()) {
       if ( cur->isDerivedFromComponent(COMPONENT::GetComponentTypeId())) {
           componentList[n++] = static_cast<const COMPONENT *>(cur);
    return n;

It could also be expanded using the bitwise filtering I posted about previously here.  That way you can search for all enabled or disabled components of a given type.

Scenegraph Integration
Operations can be expanded to trees of entities/components
Parent-Child Relationships
One of the great things about Unity is that its parent child relationships carry over to the object model itself.  This lets designers and programmers define objects in ways that are very beneficial and still have ways for high level code to intuitively communicate with related data.  For example, if the wheel of a car needs to inherit rotation and translation, why not take this relationship a bit further?  If we have links to the scene nodes, and they have links back to their entities, we can perform operations on components and entities while traversing down the tree.  This not only means we can search for components in a subtree, but also send messages down them (or up them).  Therefore, if the car gets an important event, it can quickly and easily send it down to the wheels

A Consistent Message Passing System
If functionality is split into components, than a virtual function for sending and receiving messages is easy to implement.  If we have scene graph integration we can not only send messages to every component attached the given entity, but to every component in the entity subtree.  This can be rather slow, but extremely useful when you need it.  It is also very flexible for expansion (e.g. a component isn't doing anything with a message now, but could later without changing any other code),

As I will explore in another post, the nitty gritty details of implementing such a system is actually rather involved, especially if you want to queue messages with different data efficiently.  Hopefully this will get posted within a few weeks.

Performance Considerations/Concerns
Speeding Up Component Retrieval
At first glance the above might look a bit scary.  Sure it uses little memory, but a linear search that isn't cache coherent is scary.  Even with a dynamic array, we are still ultimately dealing with pointers and indirection anyways.  Also, since components do many different things, they will be allocated in many different ways.  This is one of the downsides of component entity models in general, but all is not lost.  Here are some of your options:

Exploiting Coupling
If every entity must have a transform this can serve as the head in your linked list.  If the Transform is really just an id into a procedural interface, you may be able to get away with storing the component itself inside the entity.  

Cache Frequently Used Components
In Unity, frequently used components are stored as members with separate references.  When combined with template specialization this can make operations on a type extremely fast.  For example, if we store a pointer to the transform our getComponent overload could resemble the following:

template <>
FORCE_INLINE const Transform * Entity::getComponent<Transform>() const {
    return m_transform;

However, this comes at the price of extra space in every entity.  In addition, C++ templates do not respect class hierarchies, so if I make a subclass of Transform, I need to make specializations for every operation on that type (addComponent, removeComponent, etc etc).  This is a considerable pain and maintenance overhead.

Minimize Components To A Light Interface
Try to avoid providing Update, Render, and a bunch of other virtual functions for everything unless you really hate I-CACHE efficiency.  Keep things light, make components nothing more than a way to access existing information if that is all you need.  Avoid storing the actual data for components inside the interface if it presents too much of an overhead.  If you have an existing scene graph that uses some crazy struct of arrays scheme, make the Transform interface store nothing more than a handle into those arrays.  Components do not have to present crippling changes to every piece of code.  You can still keep your existing procedural implementations if you want.  Components need only be an interface.

Allocate Entities Into A Pool
Since entities should not have subclasses and all share the same size, they are a perfect candidate for pool allocation.  We cannot make the same assumptions with components, since they will obviously have subclasses of varying sizes.  If the range isn't significant, you can allocate blocks capable of storing the largest component, or use separate pools for components of different sizes.