The process of optimizing something from 10 mins to 1 second is quite different exercise than optimising something from 10 ms to 1 ms. I'm stumble upon this recently in my effort to make the temporary obstacle fast and scalable.
Once upon a time I used to work in a company where AI navigation data generation took easily 15-30 mins per level. I thought that it was waste of designers time, so I wanted to fix it. I ended up with a solution which can handle changes incrementally, so that instead of wasting 15 mins of designers time to tweak a simple scenario would become 150 ms instead.
Problems with 2.5D Heighfields
There were a lot of interesting lesson learned from building Recast. One of them was how to store 2.5D heighfields. I ended up using two different data structures, one which is fast to modify, and another which is has fast neighbour lookup.
Both of the heighfield representations has one annoying thing when trying to reach sub 10 ms performance. They need some extra data to describe the structure of the data. This additional data complicates random access and takes up additional memory. Something like 2D grid would be quite ideal data structure to work with.
Another challenge for sub 10 ms operation is amount of data. In Recast and Detour, the tiles span from the bottom of the world to the top of the world. Many games are usually quite flat–1 to 4 floors on top of each other–but it is still huge variable cost when you process different tiles. Firstly, updates unpredictable amount of time, and secondly the process uses unpredictable amount of memory. Ideally we should know the upper bound of the memory usage before hand. Even if it means a bit bloated memory usage for the general case.
Finally, the column data structure partitions the work unequally. Adding temporary obstacle at the top of a building requires to update the navmesh of the whole building, there might have been many floors which does not require any updates at all.
Side Tracks
I blogged earlier about an idea to create small navigation grids and connect them loosely like waypoints. I did some prototypes and I have done a lot of thinking on how to make it work, but so far it has been failure. But the idea spawned some other ideas.
My plan was to use simple 2D heighfields per grid to describe the world, and to use compressed cache of these tiles to describe the whole world. I liked the loose arrangement too, but I could never quite figure out how to make path following work very well in that context.
Anyways, at the same time I was working on temporary obstacles for Recast & Detour too. And I implemented a tile cache, which basically stores that layered heightfield in compressed form and uses it to kickstart a navmesh generation process. In case you just wanted to mark some areas impassable, that method speeds up the tile generation process my almost a magnitude.
I got it to a point where things are fast, but because of the layer structure, the speed is unpredictable. Adding ideas from one project to another I got a simple stupid idea: what if I partition the layered heighfield to stack of 2D heighfield layers?
2D Data for the Win
I did try that idea long time ago, but it did not work, it was slow and took too much memory. But few things has changed since. Firstly, now the process can use tiles, which means that the 2D heightfields are something like 64x64, not 1056x1522 per layer.
Secondly, the monotone region partitioning makes finding non-overlapping regions really fast. It also makes sure that the transitions between two neighbour layers will be axis aligned. Previously I had a lot of problems trying to match non-axis aligned edges. Currently Detour uses axis-aligned edges to create connections between tiles.
What this means in practice is that I have now a method that transforms a layered 2.5D heighfield into a separate layers of 2D heightfields.
The process starts by finding monotone regions in the 2.5D heighfield. During the process I also calculate which regions are on top of each other, as well as which regions are next to each other. In the next phase, I start from the first region, and flood fill to neighbours, and if they do not overlap the correctly flooded region, they will be assimilar to it. This process is repeated to all regions until they are all marked with unique label. It is a bit like depth peeling, but it considers connectivity.
Since 2D data does not need any additional structure to describe it, it is much simpler to compress it too. For example a simple lossy spline fitting could be use to store the height, and simple RLE should handle are types. Add some cheap LZ compression on top of that, and it should be a lot of win.
Going Forward
My plan is to change the rest of the navmesh generation process so that it can operate on 2D layers. It allows me to take certain shortcuts and make things faster. I hope to make the whole process from 2D heightfield to navmesh to only use stack allocator to make it more usable at runtim. In addition to that I will allow Detour to handle layers too. The current process of building a whole column worth of navmesh will remain intact.
The end result I hope to reach is fast preprocess step which partitions the data to a tile cache, were each tile contains 2D data only. This should allow faster generation, more local changes and better compression of the tiles.
All in all that should allow really fast and predictable process of temporary obstacles.