It's common in games to use bitfields to store numerous flags efficiently. Not only are booleans incapable of holding more than one flag, but they do so while taking up at least as much space as an unsigned char and sometimes as much space as a word (which has implications for space and alignment). It is much more efficient to store them into one variable and do simple bitwise operations to check if a flag is set. Just as a simple test case, lets say we have some flags for some generic entity in our game:
typedef unsigned int EntityFlags; const EntityFlags EF_ENABLED = (1 << 0); const EntityFlags EF_IS_DEAD = (1 << 1); const EntityFlags EF_IS_PLAYER = (1 << 2); const EntityFlags EF_IS_IN_VEHICLE = (1 << 3); const EntityFlags EF_IS_STUNNED = (1 << 4);So now we have a number of flags we can efficiently pack into one variable and pass to functions as one variable. This is how OpenGL handles almost all of its options, which is a good thing, particularly for compatibility reasons. For example, instead of some function with an inflexible signature:
int processState(bool isDead, bool isPlayer, bool isInVehicle); // called later int ret = processState(true, true, true);We instead have something more flexible to change, and with clearer calling code (in my opinion):
int processState(EntityFlags entityFlags); // called later int ret = processState(EF_IS_DEAD | EF_IS_PLAYER | EF_IS_IN_VEHICLE);Moving on from our simple review, lets take a look at a generic way to expand upon this knowledge. Lets say we have an input list and we need to filter it into another list given its flags. So our function looks something like this, with Entity being some type that at least has something resembling a getFlags() function that returns its EntityFlags from above.
int filterList(Entity * RESTRICT output[], const Entity * RESTRICT input[], int N, EntityFlags searchFlags) { int outIdx = 0; for (int inputIdx = 0; inputIdx < N; ++inputIdx) { if (filter(input[inputIdx]->getFlags(), searchFlags)) { output[outIdx++] = input[inputIdx]; } } return outIdx; }
So what goes into out filter function that differentiates filterList() from your standard list copy? If we are doing an inclusive search, whereby any flags shared in common we just need to AND the bits together and see if the result is non-zero:
int filterAnyBitsAreSet(EntityFlags flagsA, EntityFlags flagsB) { return (flagsA & flagsB); // search filters and object share at least one flag in common }
The algorithm for checking if all the bits in flagsB are set in flagsA is also very simple, whereby we take an additional step and check for equality:
int filterAllBitsAreSet(EntityFlags flagsA, EntityFlags flagsB) { return ((flagsA & flagsB) == flagsB); // all flags in flagsB must be set in flagsA }All of this is pretty trivial. However, wouldn't it be nice if we could pass an additional parameter to filterList which let us perform inclusive and exclusive searches. E.G. at one point I want all entities in the list that are both a player and in a vehicle. And at another time I want all the entities that are either stunned and/or dead. Our filter function could take an additional flag which determines which processing to use:
int filter(EntityFlags flagsA, EntityFlags flagsB, bool searchInclusive) { if (searchInclusive) { return (flagsA & flagsB); } else { // exclusive all bits must be set return ((flagsA & flagsB) == flagsB); } }
This will work. The searchInclusive flag could percolate up to our filterList() function and we could optionally search all flags in either manner. However, we are left with these annoying branches, which add up when processing a large list. How can we make this branchless?
Lets take a look at our previous conditions a little more closely. In both instances, some bits need to be shared in common. As such, (flagsA & flagsB) will have to be in our test in some way. Next, we are going to reverse the logic a bit of our all bits are set conditions. The XOR operation returns the difference in bits. If all bits in flagsB must be set in flagsA, the XOR difference should be 0000 for all of the relevant bits (since they should be the same). Therefore, if all bits must be set the following condition must be met ((flagsA ^ flagsB) & flagsB == 0). Use a pencil and paper if it helps you reason it out. So now we have two valid conditions, but they won't work together in one bitwise test. Notice the second half of our condition must equal zero to be valid. If we use another mask to zero out the expression in the case of an inclusive test we will be golden. Using one of the two constants below as our mask, we can perform the test without branches like so:
const int ANY_ARE_SET = 0; const int ALL_ARE_SET = 0xFFFFFFFF; int filter(EntityFlags flagsA, EntityFlags flagsB, int mask) { return ((flagsA & flagsB) && (((flagsA ^ flagsB) & (flagsB & mask)) == 0)); } // Or alternatively like... int filter2(EntityFlags flagsA, EntityFlags flagsB, int mask) { return ((flagsA & flagsB) && (((flagsA & flagsB) & (flagsB & mask)) == (flagsB & mask))); }
Now our list copy can filter results without forcing the calling code to perform a second pass and without excessive branches. And this simple concept is easily applicable to anything using bitfields. Now, as explained above I can filter lists of objects with bitfields quite easily and efficiently using the same function.
Entity *downedEntities[MAX_ENTITY_LIST_SIZE]; int sz = filterList(downedEntities, g_EntityList, g_numEntities, EF_IS_DEAD | EF_IS_STUNNED, ANY_IS_SET); for (int i = 0; i < sz; ++i) { // processing for enemies that are stunned and/or dead } Entity *playersInVehicles[MAX_ENTITY_LIST_SIZE]; int sz = filterList(playersInVehicles, g_EntityList, g_numEntities, EF_IS_PLAYER | EF_IS_IN_VEHICLE, ALL_ARE_SET); for (int i = 0; i < sz; ++i) { // processing for entities which are players and currently in vehicles }
Simple but powerful.
No comments:
Post a Comment