Thoughts on ECS
I mentioned in my year summary that I have a lot to say about ECS, and got several requests to write more about it, so I’ll try to write up my thoughts here.
First and foremost, I have very limited experience with ECS. I have implemented some prototypes using it, but my experience comes primarily from reading about it online and talking to people who have actually used it, so take all of this with a grain of salt. I’d be very interested to hear other people’s thoughts on it and how they align with my own conclusions.
The Entity Component System model, or ECS for short, has been around for a few decades and comes in many different flavors. In most implementations, an entity is just a unique identifier (a number). One or more components can be attached to each entity, but usually only one component of each type per entity. A component is typically just data (a struct). Systems implement all logic and operate on one or several component types. Each system has an update function that gets called periodically, selects the relevant components and performs some operation on them. Systems are generally not aware of each other, though there are probably a few exceptions in most real-world scenarios.
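To make the model concrete, here is a minimal sketch of the three parts, assuming the simplest possible storage: one hash table per component type, keyed by entity. The names `World`, `Position`, `Velocity` and `movementSystem` are made up for illustration and are not taken from any particular ECS library.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// An entity is just an identifier.
using Entity = std::uint32_t;

// Components are plain data.
struct Position { float x, y; };
struct Velocity { float dx, dy; };

// Hypothetical world: one table per component type, keyed by entity.
struct World
{
    Entity nextId = 0;
    std::unordered_map<Entity, Position> positions;
    std::unordered_map<Entity, Velocity> velocities;

    Entity createEntity() { return nextId++; }

    void setPosition(Entity e, Position p) { assert(e < nextId); positions[e] = p; }
    void setVelocity(Entity e, Velocity v) { assert(e < nextId); velocities[e] = v; }
};

// A system: visits every Velocity, skips entities without a Position,
// and integrates the rest.
void movementSystem(World& world, float dt)
{
    for (auto& entry : world.velocities)
    {
        auto it = world.positions.find(entry.first);
        if (it == world.positions.end())
            continue; // no Position attached, so this entity is not relevant here
        it->second.x += entry.second.dx * dt;
        it->second.y += entry.second.dy * dt;
    }
}
```

An entity that only has a `Velocity` is silently ignored by `movementSystem`, which is the filtering step described above. Real implementations replace the hash tables with something faster, but the division of labor is the same.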
It is often claimed that ECS leads to better performance because of data locality and cache-friendly access patterns; more on that later. Others use ECS primarily for code structure and separation of responsibilities, which could lead to improved reusability and code quality.
The problem at hand
ECS is often seen as the universal solution to deep, complex class hierarchies using inheritance. The argument usually goes something like this: “What if a monster in my game has the ability to walk, but also the ability to fly? Which base class should it inherit from? With ECS an entity can be given different capabilities arbitrarily by using composition instead of inheritance.”
It’s absolutely true that deep class hierarchies often run into these sorts of issues and should be avoided (especially multiple inheritance, which I think everyone now finally agrees was a bad idea to begin with). However, it is simply not true that ECS is the only alternative. I have not seen anyone implement behaviors with deep class hierarchies for decades, at least not in the games industry. There are many other ways to do it. One pretty common alternative is to use a much shallower class hierarchy with one common base class and all the different game objects immediately below that, occasionally using a shared base class where it makes sense, or using composition where appropriate. Others (myself included) primarily use scripts and tags to implement and compose behaviors.
ECS performance
ECS is often seen as a technique that is inherently associated with good performance. I don’t think this is necessarily the case. ECS is usually implemented with components of the same type packed nicely together in memory for good data locality when looping over them. In reality though, any meaningful system will operate on multiple components, which are now scattered in memory instead of being close together. Components are only packed tightly in memory if you read from one component type. As soon as you read from multiple, this is no longer the case. A movement system might read colliders from the collision component, velocity from the physics component and modify the position from the transform component. That’s three potential cache misses instead of one.
Some ECS implementations have a solution to this called archetypes, a rather complex scheme for storing all components that belong to the same entity close together in memory. Hang on a second, isn’t that exactly what the good old struct already does?
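The access pattern of such a movement system can be sketched like this, assuming each component type lives in its own packed array and each entity records where its components sit. `Storage` and `EntitySlots` are hypothetical names, and real implementations differ in how they resolve these indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Transform { float x, y; };
struct Physics   { float vx, vy; };
struct Collision { bool blocked; };

// Where one entity's components live in each packed array. The indices
// differ per array because entities own different sets of components.
struct EntitySlots { std::size_t transform, physics, collision; };

struct Storage
{
    std::vector<Transform>   transforms;
    std::vector<Physics>     physics;
    std::vector<Collision>   collisions;
    std::vector<EntitySlots> movers; // entities that have all three
};

void movementSystem(Storage& s, float dt)
{
    for (const EntitySlots& e : s.movers)
    {
        assert(e.transform < s.transforms.size());
        assert(e.physics < s.physics.size());
        assert(e.collision < s.collisions.size());

        // Three separate arrays, three unrelated addresses per entity.
        const Collision& c = s.collisions[e.collision];
        const Physics&   p = s.physics[e.physics];
        Transform&       t = s.transforms[e.transform];

        if (c.blocked)
            continue;
        t.x += p.vx * dt;
        t.y += p.vy * dt;
    }
}
```

Each array is tightly packed on its own, but one iteration of the loop still touches three different regions of memory. How much that costs in practice depends on how well the indices happen to line up, so treat this as an illustration of the access pattern rather than a benchmark.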
For archetypes to work with dynamic composition, data has to move around when new components are added or removed. This, of course, implies a performance hit, but it also means that any pointers to component data will be invalid after adding or removing components. The latter is particularly problematic and can lead to complicated bugs in systems altering components while iterating.
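A stripped-down sketch of why this happens, assuming just two hard-coded archetype tables and swap-and-pop removal (a common choice, but an assumption here). `StaticRow`, `MovingRow` and `addVelocity` are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Entity = std::uint32_t;
struct Transform { float x, y; };
struct Velocity  { float dx, dy; };

// Hypothetical archetype tables: one for entities with {Transform},
// one for entities with {Transform, Velocity}.
struct StaticRow { Entity id; Transform transform; };
struct MovingRow { Entity id; Transform transform; Velocity velocity; };

struct World
{
    std::vector<StaticRow> statics;
    std::vector<MovingRow> movers;
};

// Adding a Velocity changes the entity's archetype, so its row has to move
// to the other table. The hole is filled with the last row of the old table.
void addVelocity(World& w, std::size_t row, Velocity v)
{
    assert(row < w.statics.size());
    StaticRow moved = w.statics[row];
    w.movers.push_back(MovingRow{moved.id, moved.transform, v}); // may reallocate
    w.statics[row] = w.statics.back();
    w.statics.pop_back();
}
```

A `Transform*` taken from `statics[row]` before the call now points at a different entity's data, a pointer into the last row points past the live rows, and any pointer into `movers` may dangle if `push_back` reallocated. None of these failures announce themselves at the call site.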
ECS encourages frequent iteration over combinations of components. In some game genres, such as simulation games, this is probably a common pattern, but in most games I have worked on, looping over lots of objects is generally something that can and should be avoided altogether. Instead of looping over all tripwires in a level to see if the player crosses one, it’s more efficient to use a spatial query around the player, collecting all tripwires that are close and checking only those.
On top of this, ECS requires lots and lots of lookups for entity-component mappings. It’s usually a hashmap or some form of indirection table, but in either case it certainly doesn’t come for free. However, in defense of ECS I must add that if it’s used as an alternative to implementing behaviors in scripts, it’s likely a big performance win regardless of implementation.
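As a rough sketch of the tripwire case, assume a uniform grid that buckets tripwires by cell when the level loads. The grid is just one of several possible spatial structures, and `TripwireGrid` and `queryNear` are names invented for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Tripwire { float x, y; int id; };

// Hypothetical uniform grid: tripwires are bucketed by cell.
class TripwireGrid
{
public:
    explicit TripwireGrid(float cellSize) : cellSize(cellSize)
    {
        assert(cellSize > 0.0f);
    }

    void insert(const Tripwire& t)
    {
        cells[key(cellOf(t.x), cellOf(t.y))].push_back(t);
    }

    // Collects the tripwires in the 3x3 block of cells around (x, y).
    std::vector<Tripwire> queryNear(float x, float y) const
    {
        std::vector<Tripwire> result;
        const int cx = cellOf(x);
        const int cy = cellOf(y);
        for (int dy = -1; dy <= 1; ++dy)
        {
            for (int dx = -1; dx <= 1; ++dx)
            {
                auto it = cells.find(key(cx + dx, cy + dy));
                if (it != cells.end())
                    result.insert(result.end(), it->second.begin(), it->second.end());
            }
        }
        return result;
    }

private:
    int cellOf(float v) const { return static_cast<int>(std::floor(v / cellSize)); }

    // Packs two signed cell coordinates into one 64-bit map key.
    static std::uint64_t key(int cx, int cy)
    {
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(cx)) << 32) |
               static_cast<std::uint32_t>(cy);
    }

    float cellSize;
    std::unordered_map<std::uint64_t, std::vector<Tripwire>> cells;
};
```

The caller still runs the exact crossing test, but only on whatever `queryNear` returns for the player's position, so the per-frame cost depends on how many tripwires are nearby rather than on how many exist in the level.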
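For reference, this is roughly what the indirection-table variant looks like for a single component type. It is a minimal sketch with made-up names (`HealthStore`, `slotOf`), and it leaves out removal entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Entity = std::uint32_t;
struct Health { int value; };

const std::uint32_t kNone = 0xFFFFFFFFu; // marks "entity has no such component"

// Hypothetical storage for one component type: components are packed in
// `data`, and `slotOf` maps an entity ID to its index in that array.
class HealthStore
{
public:
    void add(Entity e, Health h)
    {
        assert(!has(e));
        if (e >= slotOf.size())
            slotOf.resize(static_cast<std::size_t>(e) + 1, kNone);
        slotOf[e] = static_cast<std::uint32_t>(data.size());
        data.push_back(h);
    }

    bool has(Entity e) const
    {
        return e < slotOf.size() && slotOf[e] != kNone;
    }

    // Every access pays for a bounds check, a table read and a dependent load.
    Health* get(Entity e)
    {
        return has(e) ? &data[slotOf[e]] : nullptr;
    }

private:
    std::vector<std::uint32_t> slotOf;
    std::vector<Health> data;
};
```

A single `get` is cheap. The point is that these calls end up in nearly every system, once per entity per component type, which is why the cost is hard to spot in a profiler.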
Cold data
Having the ability to associate cold data (data that is rarely used) with any object is a great idea. Memory locality is a real issue. Keeping your hot data small is a big performance win and ECS makes this very easy. Just attach any cold data as extra components to your objects and they will never be touched unless queried. Unfortunately this clashes completely with the idea of archetypes, where components are laid out linearly in memory, just as if they were a struct. The ideal scenario would be to use archetype storage for hot components and sparse storage for cold components, but that would be a very complex ECS implementation.
Data-driven design
In my opinion, this is the best (and possibly even the only) case for ECS, where it really has a clear advantage. Say the intent is to compose different behaviors and let the designer mix and match at runtime in an editor. This maps very well to ECS, since the dynamic nature of the system usually does not require entity-component relationships to be hard-coded. That said, a lot of boilerplate components would likely be needed for any meaningful game object, and crucial systems might expect certain combinations of components to do any meaningful work, so a true mix-and-match-anything-you-want might not be as attainable as you might first think.
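A small sketch of what mixing and matching at runtime amounts to, assuming the editor saves each entity as a list of component names. The registry, the `Walk` and `Fly` names and the `spawn` function are all invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

using Entity = std::uint32_t;

// Hypothetical world with two tag-like capabilities.
struct World
{
    Entity nextId = 0;
    std::set<Entity> walkers;
    std::set<Entity> flyers;
};

using AttachFn = std::function<void(World&, Entity)>;
using Registry = std::map<std::string, AttachFn>;

// Maps the component names an editor would save to code that attaches them.
Registry makeRegistry()
{
    Registry r;
    r["Walk"] = [](World& w, Entity e) { w.walkers.insert(e); };
    r["Fly"]  = [](World& w, Entity e) { w.flyers.insert(e); };
    return r;
}

// Builds an entity from a list of names. Returns false on an unknown name.
bool spawn(World& w, const Registry& registry,
           const std::vector<std::string>& componentNames, Entity& out)
{
    out = w.nextId++;
    for (const std::string& name : componentNames)
    {
        auto it = registry.find(name);
        if (it == registry.end())
            return false;
        assert(it->second); // registry entries must hold a callable
        it->second(w, out);
    }
    return true;
}
```

Nothing here knows at compile time that a monster can both walk and fly. That is the flexibility, and it is also why nothing stops a designer from building a combination that no system knows how to handle.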
Debugging
One aspect of ECS that is rarely discussed is how complicated it is to debug when something goes sour. Visual debuggers are excellent at displaying values in structs and following pointers. With ECS you have integer identifiers that are associated with component data through indirection. If using archetypes, the component data is dynamically composed into a memory blob with no debug info whatsoever. While it may be possible to achieve walkable data structures using natvis magic, it would certainly be a very complex and fragile setup.
Pros and cons
To summarize, here are the pros and cons of ECS, in my opinion:
Pros
- Entities can be dynamically composed at runtime in a data-driven way, for example via an editor. It also means that you can alter capabilities of existing entities at runtime by adding or removing components.
- It provides a framework for cold or optional data. It’s a great way to avoid expanding your base classes with data that only a fraction of all entities use or data that is rarely used. This can reduce the size of your objects and improve cache locality. Unfortunately this is rather complicated to combine with the archetype model.
- It provides a mental model that forces you to structure your code in a data-oriented way (this can also be done without ECS, but may require some discipline).
Cons
- Added complexity. You need to understand the details and limitations of your ECS architecture in order to implement features and understand the code.
- The dynamic nature of ECS requires a lot of lookups between entities and their associated components. Depending on the method and implementation this can lead to performance problems that are difficult to detect, since they are scattered all over the code.
- Hard to debug.
Neutral
- Performance can be better or worse depending on the implementation and the alternative considered.
- The code structure can be better or worse depending on the implementation.
Sensible ECS
I’d argue that a large portion of what ECS brings to the table can be implemented in a much simpler way that is equally data-oriented but does not require the complex, dynamic database that ECS really is. If your game does not require dynamic composition at run-time, you can simply create your entity types statically using composition in the following manner:
struct Monster
{
    Transform transform;
    Velocity velocity;
    Collider collider;
    PathFinding pathFinding;
};
There is no need for inheritance here, just include whichever components make sense for what you are implementing. Note that all data for a monster is laid out linearly in memory instead of being scattered.
Then instead of a function updating monsters:
void updateMonster(Monster* monster);
You can create reusable, system-like update functions for relevant combinations of components:
void updateMovement(Transform* t, Velocity* v, Collider* c);
void updatePathFinding(Transform* t, Collider* c, PathFinding* p);
Note that these functions operate only on components, without knowing which entity they belong to or how they are stored. Hence they can be reused for any entity type that uses the same components, without using inheritance.
This simple approach requires more explicit function calling. Nothing is automatic, so if you want to update movement, you have to manually do that for all the relevant entities:
// Update movement for monsters
for(int i=0; i<monsterCount; i++)
    updateMovement(&monster[i].transform, &monster[i].velocity, &monster[i].collider);
// Update movement for peasants
for(int i=0; i<peasantCount; i++)
    updateMovement(&peasant[i].transform, &peasant[i].velocity, &peasant[i].collider);
This can get very explicit if you have a lot of entity types, but in my experience explicit is a good thing. The code is easy to read and understand: it literally does exactly what it says from top to bottom, instead of implicitly calling things automatically. It interacts perfectly with any debugger, and most importantly you are in control of the exact control flow of the update loop and the order in which everything is called. What if you have an entity type with the same components, but that doesn’t need a movement update? No problem, simply don’t call updateMovement for such entities.
What about cold/optional data that ECS does so well? There is nothing stopping us from adding a similar system alongside the one described above. We can make all entities inherit from a common base with a unique identifier and create the same indirection tables as you normally would in ECS, but only use it for cold data, instead of using it for everything. Then we have hot data aggregated tightly in memory, while cold data is floating around somewhere else and only accessed when needed.
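A minimal sketch of that idea, assuming a hash table as the indirection and a made-up `Lore` struct standing in for cold data. `ColdTable` is a hypothetical helper, not part of any library.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

using EntityId = std::uint32_t;

struct Transform { float x, y; };
struct Velocity  { float dx, dy; };

// Common base: the only thing every entity shares is an identifier.
struct Entity { EntityId id; };

// Hot data stays in the struct, laid out contiguously.
struct Monster : Entity
{
    Transform transform;
    Velocity  velocity;
};

// Cold data (hypothetical example): rarely read, so it lives elsewhere.
struct Lore { std::string name; std::string description; };

// Side table keyed by entity ID, used for cold data only.
template <typename T>
class ColdTable
{
public:
    void attach(const Entity& e, T value)
    {
        assert(table.count(e.id) == 0); // one record per entity in this sketch
        table.emplace(e.id, std::move(value));
    }

    // Returns nullptr when the entity has no such data attached.
    const T* find(const Entity& e) const
    {
        auto it = table.find(e.id);
        return it == table.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<EntityId, T> table;
};
```

The update loops never touch `ColdTable`, so `Monster` stays small. The lookup cost is only paid in the rare places that actually need the cold data, for example when a tooltip is shown.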
The main limitation is of course that entity types are locked down at compile-time and cannot be changed dynamically. Whether this is problematic or not depends on the use case. If you are making an actual game, not a generic engine, this simple approach will probably take you all the way to the finish line. If the goal is to create a data-driven, generic engine where designers can dream up composite behaviors in an editor, a full ECS implementation is probably the better choice, but be aware of the complexity it brings.