Sunday, January 30, 2011

Counting the Sheep



I had little time today, so I implemented off-mesh link handling for DetourCrowd. Yet another layer of state handling. The logic to detect and change state is not too complicated. I was mainly trying to see how to fit the link animation processing into the system. Not too much on how it looks for now.

It looks like I might be able to build the system so that all custom handlers for steering and off-mesh connection handling are just one (virtual) function call which handles all the agents. Something like:

void updateSteering(dtCrowdAgent** agents, const int nagents, const float dt);

DetourCrowd will just make sure the call happens in the right order, and the user can then do the update in any way they feel necessary. Which brings me to a question for the readers of this blog:
How do you mix and match locomotion and other navigation animations?
For example, an agent is running, then jumps over a fence, and then keeps on running. I'm especially interested in what kind of logic might be used to trigger the change from running to jumping, whether there is some kind of transition phase, etc. I know from the past that this is one nasty topic. I could really use some feedback on this. Good implementations, failed attempts, something that barely works, I want to hear it all :)

Another hard topic is how to manage potential queueing at an off-mesh link that becomes a choke point, or how to evenly distribute agents across multiple jump-links. Should the links be locked so that only one NPC can use one at a time, or should there be invisible colliders at each end of the jump-link so that others would avoid the location even more?

Some of those problems were touched on by this paper: Situation Agents: Agent-based Externalized Steering Logic. I hope to have some time to try some of the ideas out.

Saturday, January 29, 2011

DetourCrowd

I finally managed to clean up the CrowdManager code and put it in proper place, yay!

The code is now located in the DetourCrowd folder. My idea is to keep Detour a clean low-level library and build DetourCrowd on top of it. While I was organizing things, I also moved the velocity planning code to the DetourCrowd folder.

The crowd manager now allocates its own dtNavMeshQuery, so there is no need to pass one to all the update functions anymore.

I also implemented a time-sliced pathfinder. The slicing is controlled by a constant called MAX_ITERS_PER_UPDATE. You can probably crank it up to 500 or so in real games. Try different values and see how they affect the performance graph (i.e. it should not spike too much). I have currently set it very low for fuzz testing.

There are still some things missing:
  • Off-mesh link handling
  • Better handling of navmesh changes (i.e. how paths respond to temporary obstacles)
  • Off-mesh links with custom callbacks
  • Better handling of magic constants
  • Serialization
  • Custom movement styles
I will work on the items roughly in that order. I have a couple of other big things queued up for this spring, so no guarantees when things will be implemented.

Any feedback on the change, or on the crowd manager in general, is welcome!

(Sorry again Windows peeps, I will try to find a PC to fix the VC project.)

Monday, January 24, 2011

2D Grids for the Win!

Optimizing something from 10 minutes to 1 second is quite a different exercise than optimizing something from 10 ms to 1 ms. I stumbled upon this recently in my effort to make temporary obstacles fast and scalable.

Once upon a time I worked at a company where generating the AI navigation data easily took 15-30 minutes per level. I thought that was a waste of the designers' time, so I wanted to fix it. I ended up with a solution which handles changes incrementally, so that tweaking a simple scenario costs 150 ms of a designer's time instead of 15 minutes.

Problems with 2.5D Heightfields

There were a lot of interesting lessons learned from building Recast. One of them was how to store 2.5D heightfields. I ended up using two different data structures: one which is fast to modify, and another which has fast neighbour lookup.

Both of the heightfield representations have one annoying property when trying to reach sub-10 ms performance: they need extra data to describe the structure of the data. This additional data complicates random access and takes up additional memory. Something like a 2D grid would be quite an ideal data structure to work with.

Another challenge for sub-10 ms operation is the amount of data. In Recast and Detour, the tiles span from the bottom of the world to the top. Many games are usually quite flat–1 to 4 floors on top of each other–but it is still a huge variable cost when you process different tiles. Firstly, updates take an unpredictable amount of time, and secondly, the process uses an unpredictable amount of memory. Ideally we should know the upper bound of the memory usage beforehand, even if that means somewhat bloated memory usage for the general case.

Finally, the column data structure partitions the work unequally. Adding a temporary obstacle at the top of a building requires updating the navmesh of the whole building, even though many of the floors may not require any updates at all.

Side Tracks

I blogged earlier about an idea to create small navigation grids and connect them loosely like waypoints. I did some prototypes and a lot of thinking on how to make it work, but so far it has been a failure. But the idea spawned some other ideas.

My plan was to use a simple 2D heightfield per grid to describe the world, and to use a compressed cache of these tiles to describe the whole world. I liked the loose arrangement too, but I could never quite figure out how to make path following work well in that context.

Anyway, at the same time I was also working on temporary obstacles for Recast & Detour. I implemented a tile cache, which basically stores the layered heightfield in compressed form and uses it to kickstart the navmesh generation process. In case you just want to mark some areas impassable, this method speeds up the tile generation process by almost an order of magnitude.

I got it to a point where things are fast, but because of the layer structure, the speed is unpredictable. Combining ideas from one project with the other gave me a simple, stupid idea: what if I partition the layered heightfield into a stack of 2D heightfield layers?

2D Data for the Win

I did try that idea a long time ago, but it did not work; it was slow and took too much memory. But a few things have changed since. Firstly, the process can now use tiles, which means that the 2D heightfields are something like 64x64, not 1056x1522 per layer.

Secondly, the monotone region partitioning makes finding non-overlapping regions really fast. It also makes sure that the transitions between two neighbouring layers will be axis-aligned. Previously I had a lot of problems trying to match non-axis-aligned edges. Currently Detour uses axis-aligned edges to create connections between tiles.




What this means in practice is that I now have a method that transforms a layered 2.5D heightfield into separate layers of 2D heightfields.

The process starts by finding monotone regions in the 2.5D heightfield. During the process I also calculate which regions are on top of each other, as well as which regions are next to each other. In the next phase, I start from the first region and flood fill to its neighbours; if they do not overlap the currently flooded region, they are assimilated into it. This process is repeated for all regions until every region is marked with a unique layer label. It is a bit like depth peeling, but it considers connectivity.

Since 2D data does not need any additional structure to describe it, it is much simpler to compress too. For example, simple lossy spline fitting could be used to store the height, and simple RLE should handle the area types. Add some cheap LZ compression on top of that, and it should be a lot of win.

Going Forward

My plan is to change the rest of the navmesh generation process so that it can operate on 2D layers. This allows me to take certain shortcuts and make things faster. I hope to make the whole process from 2D heightfield to navmesh use only a stack allocator, to make it more usable at runtime. In addition, I will allow Detour to handle layers too. The current process of building a whole column's worth of navmesh will remain intact.

The end result I hope to reach is a fast preprocess step which partitions the data into a tile cache, where each tile contains 2D data only. This should allow faster generation, more local changes, and better compression of the tiles.

All in all, this should allow a really fast and predictable process for temporary obstacles.

Explainability

I have been doing a lot of UI design recently at work. We just got the first bunch of feedback from the users, and as you can imagine they had a lot of trouble trying to navigate through the app. This got me thinking.

I had a recent discussion with a good friend of mine, and he told me that most of his UX work nowadays is about copy. At that point in time my biggest concern was making our UI slick and lean, while he was pondering how to communicate things through words. That slick and lean version is the one that went to user testing and did not receive much praise.

As I worked further on my layout I noticed an interesting thing: we did not have explainable names for all the stuff in the UI. The concepts were there, but they were not exposed. Sometimes I had omitted text to make the layout look better, or crammed things together to make it more compact.

We had put a lot of effort into making the app discoverable. That is, there is one layer in the UI design which allows you to explore and test all the features of the app one at a time. There is another layer which allows really quick access to all these features via shortcuts and modifier keys. It looks pleasing to the eye too. To make the entry to the app even smoother and the structure more understandable, we added another layer of design: explainability.

Everything we have put in the UI has a name and a relation. Pretty much every feature and its relation to others can be explained in just one paragraph of text (as a matter of fact, that is often my litmus test). So the UI extends outside the app too. Explainability implies a literate UI, not a littered one. Tufte's rule of adding as much information as possible with as little ink as possible still applies.

Sounds stupidly obvious, but it took me a good 15 years to understand it!

Wednesday, January 19, 2011

My Recent Failures in Polygon Clipping

I have been prototyping some prerequisites for cookie-cutter temporary obstacles recently. The short story is that it won't happen. The longer story is that I had a couple of different ideas for how I would implement the clipping.


First Attempt


I always wanted to try out the dynamic navmesh subdivision algorithm from AGPW4. The same algorithm is also used in the FEAR SDK. The idea is that to subtract polygon B from polygon A, you split polygon A with each edge of polygon B. You know, like BSP.

The good thing is that the algorithm is really simple to implement and the resulting polygons are convex. The bad thing is that it creates a lot of extra vertices and polygons. After a few overlapping intersections you are left with nothing but magic fairy dust. And not the good kind.

I did a couple of tests to remove the extra vertices. Some tests were successful, but overall the whole effort just turned into a giant epsilon fest. That is not the kind of code I would like to support and maintain.

So the lesson was that if you choose to use the BSP-like algorithm to cut your polygons, make sure your code can handle T-junctions. That way you can always track which edges to connect without any additional epsilon magic.


Second Attempt


I think I might have gotten the previous attempt to work to some degree with a lot of sweat. But the fact that it would create huge amounts of polygons really turned me away from it. I like how grid-based methods make it easy to handle obstacles, so my second attempt was to use a grid.

The idea was that I would create a grid which covers a whole polygon plus some padding and rasterize the obstacles into that grid. Then I would use monotone partitioning to create polygons from that raster data. Finally I would clip those polygons against the edges of the original polygon.

While this was a kinda silly idea, I think it was worth pursuing. It gave me a lot of ideas, which I will discuss later. Anyway, this idea suffers from the same extra-vertex mania.


Third Attempt

Despite a lot of failing, I was not done yet! As my last resort I tried to use a proper intersection and tessellation library. While I was able to solve the case of one polygon with multiple obstacles, the performance was not too tempting. The GLU tessellator is robust and scales pretty well, but things are different when you want to do things in under 1 ms.

That would still have left me with all kinds of epsilon tests to handle the (necessary) new vertices added at the polygon edges. Vertex welding is not a particularly complicated thing to do, but it can and will create degenerate and flipped geometry in odd cases. I don't like odd cases.


While my attempts to create cookie-cutter obstacles have failed, the process has sprouted some great ideas to pursue. The one big lesson I have learned during this research so far is that it is really interesting to write code which has a known upper bound, be it in memory usage or processing requirements. And I think that is the key to making something dynamic happen quickly–hard, known, bite-sized limits on computation and memory.

Tuesday, January 18, 2011

Time-sliced Temporary Obstacle Handling



While I was testing the temporary obstacle handling, I noticed that some locations would take quite a bit more time to process than others. For example, in the level shown in the video above, the rebuild times would range from 1 ms to 15 ms depending on how many tiles the obstacle affected and how many layers there were at that particular location. The number of obstacles affects the build time too. The added performance cost is pretty linear in the number of obstacles (and their size). In my tests each obstacle takes about 0.05-0.1 ms to process.

To fight the variance I implemented a time-sliced version of the generation process. In the above video, the temporary obstacles are updated incrementally so that on each update the process consumes 1 ms. The code will find its way to the SVN later this week.

Friday, January 14, 2011

First Iteration of Temporary Obstacles Checked In

The very first iteration of temporary obstacle handling using compressed lean heightfields just got checked in (R258). Unfortunately, it is not yet a "just plug it in" solution, but merely some simple modifications plus a sample showing how it can be implemented using the toolkit.

I ended up calling the minimal representation needed to recreate a compact heightfield a "lean heightfield". I'm not sure if it is a particularly good name, but at least it is more distinct than "compressible heightfield".

If you were previously regenerating tiles using Recast at runtime, this modification allows you to save and compress the voxelized results and speed up the tile rebuilding by almost an order of magnitude (it usually varies between 5 and 10 times faster).

I will keep on prototyping some more to see if I can support modifying the Detour meshes directly.

In the process I also changed the navmesh builder so that the detail mesh is now optional. The detail mesh is important if you have large tiles or one big navmesh, or if you require some approximation of the height at runtime. When using small tiles–which is good practice when you are likely to change them a lot at runtime–the detail mesh is not always necessary. Not requiring the detail mesh speeds up the generation process and uses less memory.

[EDIT] As usual, the Visual Studio project is lagging behind. I will try to allocate a slot on some Windows machine to bring it up to date.

Tuesday, January 11, 2011

Temporary Obstacle Progress


I managed to make some progress with the temporary obstacle handling. Most of the required pieces are now in place; next up is a bunch of massaging to make it great.

So the stuff I have working now is a preprocess step which rasterizes the tiles, creates a minimal representation of the heightfield, and then "zips" it. These data chunks are stored in a tile cache. Upon request, the tile cache will create a piece of navmesh for you which includes all the temporary obstacles too.

I use a tile size of 48x48 in the video. It takes about 2 ms to update one tile, so in the worst case adding one obstacle takes 8 ms. Interestingly, the conversion from the minimal representation to a compact heightfield takes about 25% of the time, and generating the detail mesh takes another 25%.

For those in dire need of the extra nanoseconds, I think I'll provide a dummy detail mesh generation process which just triangulates the polygons instead of trying to add more detail. Or maybe even support that at runtime like in the old days.

Apart from a few offenders, each Recast call takes very little time, so it is possible to time-slice the generation.

Sunday, January 9, 2011

Compressed Heightfields

One of the items on my TODO list for 2011 is support for temporary obstacles. I posted earlier about the possibility of compressing the heightfield and using part of the Recast process to regenerate tiles much faster.

I have been prototyping heightfield data compression recently. The above picture shows the test scene I have been using. It has both simple plane-like surfaces as well as a building with many levels.

The compression process starts by stripping down and packing the heightfield data. I store the elevation, free space, and area type for each walkable span. In addition, the ground grid stores the number of layers per column. See my previous post on the topic for a bit more detail.

That is the minimum amount of data needed to reproduce a compact heightfield for a particular tile. The minimal heightfield data for the test scene takes 713 kB. This is more than the 512 kB I reported earlier. The reason is that each tile also contains a border, which is required in order to make the tiles match at the borders.

The compressed data for the whole scene takes 82 kB. This is the amount of data that needs to be kept in memory to regenerate any tile in the scene. It is quite a bit more than the 32 kB I reported earlier. The difference between the tests is that the old data was compressed as one huge chunk using zip, while my recent prototype compresses each tile individually and uses libLZF, which does not compress as well but is a lot faster.

For comparison, the actual navmesh for the test scene takes 98 kB. I will provide some timings as I get further with the implementation.

Changes in the Build Process

Currently the build process for a tiled navmesh goes like this:

for each tile
- rasterize triangles
- find walkable cells
- mark areas
- build regions
- build contours
- build poly mesh
- build detail mesh
- store navmesh

With a compressed heightfield, the process changes as follows:

Preprocess
for each tile
- rasterize static triangles
- find walkable cells
- mark areas
- store compressed heightfield

At Run Time
for a changed tile
- uncompress heightfield
- mark obstacles
- build regions
- build contours
- build poly mesh
- build detail mesh
- store navmesh

Triangle voxelization is usually the dominating factor when generating a tile. For example, it takes 71.5% of the time to generate the highlighted tile in the test scene.

The perfectionist in me still would like to find a way to use just the Detour data to adjust the navmesh for temporary obstacles, but I think this test has been worth the effort so far.

Thursday, January 6, 2011

Loosely Ordered Mega Navigation Grids

I got this stupid idea while reading the AiGameDev article on the fixed-point math used in PathEngine. A light lit up above my head when I read the section related to streaming. PathEngine does not try to match the navigation representation at neighbour tile borders; instead, the navigation tiles overlap, and the agent switches to another mesh when it is over it. At first it feels odd, but it is kinda awesome too.

That is sort of my idea: what about a navigation representation based on overlapping 2D representations of the space? Sort of like HPA*, but instead of square tiles laid out on a grid, the tiles could be placed anywhere in 3D and would have height data too. That way you could have full 3D navigation–overhangs and all that jazz–but the dynamic awesomeness of a 2D grid!

So I thought I'd test it out. In the above screenshot is the very first step. I initially had the idea of using a cubemap, but the irregular sample distribution on the ground made a lot of things really complicated.

What you see in the screenshot is a 32x32 voxelized section of the scene. The blue area represents a 2D heightfield placed at the location of the red cross. Notice the black patch under the bridge, since the bridge is visited first.

My idea is to use a graph at the higher level to find a path through the tiles, and to use my earlier attempt at a local grid planner to find the path between waypoints. The overall design fits well into the Navigation Loop thinking too.

I see all kinds of interesting opportunities for this method, such as using different accuracies for different waypoint patches, dynamically creating new patches as the scene changes, etc.

The main driver behind this design was handling changes in the navigation data at runtime without sacrificing any of the properties of navmeshes.

The amount of data is of course one concern. A heightfield and some flags probably take around 1 kB per patch, but the great thing is that you most likely need to store very few patches–just the ones around the agents–and use a cache to manage them: sort of a mega-texture for navigation grids.