Sunday, September 19, 2010

Experimenting with ORCA


Ever since RVO2 library came out, I've been itching to dust off my old ORCA test and try to get it to work. I really like the simplicity of the method.

I don't have the required intuition to debug linear programming, so I opted to calculate the union of the constraints as polygon clipping instead. I start with a rectangle, which roughly describes all the admissible velocities, and chip away each of the constraints using poly-plane clipping algorithm in 2D. In above picture the result polygon is drawn in white.

As suggested in the RVO2 forums, in case the resulting polygon would be empty (no admissible velocities), I remove the constraints one at a time, starting from the furthest agent. You can see this effect in the above picture, the green constraints would make the polygon null, so it is discarded. This is simpler initial alternative to 3D linear programming.

While implementing the method I ran into quite a few problems. These can be due to my lack of understanding the method, or because my actual implementation differs somewhat from the RVO2 code.

Firstly, I'm visual person so I always need under-the-hood view before I can understand some method. There was quite a big leap from the agent position and velocity to the actual constraint plane. So I first spent some time trying to see where the planes come from.

Secondly, I have hard time understanding all the intrinsic details of optimal velocity, and I'm even more confused when I compare the paper vs. the library code.

When reading the paper I get the feeling that the planes should be recalculated if you choose another optimal velocity but the library does not do that. The lib calculates the planes once, as if optimal velocity would be current velocity and then uses linear programming to try to find solution to different optimal velocities.

I get super horrible feedback effect if I set optimal velocity to the current velocity, so I used desired velocity instead and things kinda work. Kinda.

Thirdly, I cannot see how the method is able to choose good constraint planes by selecting a single choice for each object. I hope I'm missing something big here.

For example, let's take a look at the picture at the top of this post. That green plane messes things up big time. Had the method chosen the other side of the obstacle, the case would have had good solution.

Even in slightly crowded scenes that kind of problems occur a lot. Many cases I traced through, calculating the constraint using optimal velocity of zero would have solved the problem. I could not figure out good rule when to choose which one of them. I tried some obvious ideas but there always was another case which failed after the change.

I would love to love the ORCA method, but I have had a lot of trouble getting it to work.

I have a hunch that the structure of the combined velocity obstacle plays big role in temporally coherent velocity selection. Fiorini & Shiller described the structure very well in their early papers, but I have not seem much research on that field ever since.

Tuesday, September 14, 2010

On Navigation Robustness


One important point of the Navigation Loop method is that the loop is robust. From one point of view, this is realized by making each successive step in the loop to produce more accurate solution. On the other hand, the robustness comes from the fact that system stays in control.

Steering velocity is passed for the local avoidance to not to produce velocity which will collide with others, after the final movement a collision detection is applied so that small errors can be corrected and finally the new location is ensured to be inside the navmesh so that the next iteration can work with solid data again.

Traditionally path following is implemented so that you first find spline (linear or curved) through the world and try to follow it. Two horrible things will happen with this method.

Don't Walk That Tight Rope

Following a spline iteratively will always result somewhat low pass filtered results. You will cut curners. There are several sources in the net to help you with that task, but regardless what you do, you will never be able to exactly follow that spline (unless you are just interpolating over it), which means that you will collide with something or get into areas where you cannot move away. This effect is further amplified if you use physics to move the NPC, since the navigation data and collision data never match exactly.

So the first source of problems comes with trying to follow the path and trying to detect when you are too far away from the path that you will need to replan the whole path. This method is usually really quick to get up and running, but the end result is either 90% success rate (we need at least 99.9%!) or horrible bowl of special case code spaghetti to capture fix all the known problem cases (most of which you will learn 1 week after you have shipped).

The lesson is that you should never ever ever try to follow a path as means of navigating through the world. Instead you should have more loose structure of the path (the path polygons) and find the next corner to steer to each update and adjust the path corridor as you move to next polygon. This way you never get out of the path and you always have correct direction to steer to regardless even if you veer of the ideal path.

Always On the Mesh

The second source of problems is that the NPC may get into location where it cannot reach the navmesh anymore. You should always constrain your NPCs to be inside navigation mesh when they are walking. The navigation mesh represents areas where the NPC can walk and stand with moderate effort. The input parameters for the mesh generation are adjusted so that this holds true [1].

You should have special case code to handle locations outside the navmesh like off-mesh connections (jump over things) or animations and actions around MGs or other entities. These actions should always enter/exit on navmesh.

When you keep the NPC inside the navmesh, you can always have good and valid data to use for pathfinding. Of course some sloppiness needs to be tolerated because of floating point accuracy.

If you don't keep the NPC inside the navmesh then you have similar problem as with trying to follow a spline. The two representations will get out of sync and you will spend a lot of time writing code to cope with it.

Physics... (sigh)

Robust navigation is not just "it kinda pulls this dude past this obstacle", it must pass the nastiest of the tests when the player sheds havoc on your NPCs.

The physics and navigation representation never match 100%. For this reason something that looks completely valid from navigation point if view may result dead lock collision with physics system.

So a lot of the problems comes from the fact that physics is always in control. If you want to have robust navigation it must be in control and instead you should treat physics system as a sensor for the NPC.

The physics should be used to plant the NPC firmly on ground, or you should use it to detect when someone is throwing a barrel at the NPC and blend to ragdoll, etc. There is nothing realistic or believable about things bouncing off an invisible capsule!

When you move your NPC, the navmesh boundary is the final hard constraint and is always satisfied. Before that you can have simple collision handling between the agents for cases where local avoidance fails (and it will, if not badly, you will get at least accuracy errors).


To sum it up, don't try to follow a tight rope, you will fail. Always keep the NPC on navmesh, or you will fail. Physics is good tool, but bad master.

I know removing the physics roundtrip will be hard to swallow, but give it a try! :)
_

[1] If the navmesh generator does not produce mesh for all the locations you wish to be navigable, then you need to change the params or add extra annotation to handle these locations. For example the NPC might be able to walk down certain steep slopes, but these may be rejected from the navmesh. It is not trivial to detect these areas as you would need to extract the direction the agent could move. Annotation and special navigation code is easier solution for this case.

Sunday, August 29, 2010

My Summer of Code 2010

This where all the magic happened!

You may have noticed the increased traffic on the commits and the blog. But now Mikko's Summer of Code has come to an end. I used a few weeks of downtime between projects to flush my TODO queue. I'm starting a new project this week and I will be back to my 1 day a week schedule with RC&DT.

Here's some thoughts on what are likely to happen in near future.

Multi-Agent Navigation

I got the multi-agent navigation to much further state than I expected. The code is currently huge pile of spaghetti. The next step is put it all together. I have been trying to think about the API and it seems to be quite a big task. I think it is good that my progress is slowing down so that I can let things simmer in a bit.
If you have some thoughts how you would like to integrate the multi-agent navigation to your project, please let me know!
Would you like to use it as a component? Do you want a manager you can tick or do you prefer to do that yourself? Do you use physics for collision detection? Where in your agent update cycle you are planning to put the path following code? Any input is welcome. If you don't want to share the secrets here you can always send me email too.

Everyone is integrating path following slightly differently. I hope to support as many people as possible. On one extreme there are the folks who just would like to get plug'n'play, and on the other side are the folks who would like to get the max performance which means that I need to split the API so that you can tear everything apart as necessary.

My current idea is to roughly separate the code into roughly four pieces:
  • mover
  • steering
  • obstacle avoidance
  • collision detection
The mover takes care of handling path corridor and movement along navmesh surface. In theory you could use it to create NPCs or even player movement. It will also keep track if new path should be queried.

Steering is game and behavior specific so I think I will add some helper functions to the mover class which allows you to build your own. I will provide some examples how to do it too.

Obstacle avoidance will be separate class which can be used even without any of the mover stuff. Basically you feed in obstacles and the agent state (position, radius, velocity, desired velocity) and it will spit out new velocity which should result in less collisions than the desired one. The first implementation will be the sampled VO, I'm looking into adding ORCA stuff too later.

Collision detection will be responsible of finding the neighbors to avoid and to collide with. This is potentially something that your engine does already. I might include this only in the example code.

Many of the tasks required for multi-agent navigation are embarrassingly parallel (see all the for loops in the current CrowdManager::update()). My idea is to tailor the APIs so that when possible you just pass in the data that is absolutely necessary. For example the obstacle avoidance code will not figure out the neighbors itself, but you need to feed them.

I will create an example which can be used for learning and also something that can be integrated as is, for those who just like to get quick and dirty integration.

Detour Error Handling

There are some cases where Detour should be more verbose on error cases. My work on the multi-agent navigation has revealed some short comings in the error handling code too and the sliced pathfinder work surfaced some ideas too.

I'm planning to improve the Detour API at some point so that it will report errors better. The change will be most likely that every method will return a status instead. This allows me to report invalid input data and more elaborate reason why path finding may have failed or report that path finding succeed, but actually returned partial path. There are some discrepancies on the raycast() function return values too.

Temporary Obstacles

I got quite exited about the heightfield compression findings and got some really good feedback on it too. It is definitely something I wish to prototype more in detail.

This also ties into better Detour error handling. I will need to finally figure out how to handle interrupted and invalid paths. Detour will not replan automatically, but it will let you know what something went wrong or that something will go wrong in the future so that you can schedule new path query yourself.

Friday, August 27, 2010

RVO 2 Library

RVO2 Library has just been released. It uses optimal reciprocal collision avoidance (ORCA).
I have had tried to implement ORCA a couple of times before but did not succeed. It'll be interesting to compare my earlier attemps with the sources and give it another go. I did not quite get what they are doing in linearProgram3() with projLines stuff, though.

There is one comment in the examples, which I found particularly charming:
Perturb a little to avoid deadlocks due to perfect symmetry.

Wednesday, August 25, 2010

Handling Temporary Obstacles


I have been using some idle cycles every now and then to try to think about a nice method to update the navmesh without requiring to completely rasterize and rebuild a tile from scratch. This would be useful for obstacles, which may break and may be pushed around and usually only contribute blocking areas into the navmesh.

Cookie Cutter Obstacles

One method to do that is to punch holes in the navmesh using CSG. Some path finding middleware support that and some games shipped using it. So it definitely works, but I don't like the method.

Firstly, you have to work around the floating point accuracy, which is paramount pain. Secondly, it usually is too accurate, which means that you get some small polygons all over the place. Thirdly, it does not scale well when you have multiple overlapping obstacles. So it is an option, but very very low on my list of things to try.

Local Grid

Another option would be to use local grid and rasterize all the temporary obstacles into it and then use another simple pathfinder to find around the way.

The good thing about this method is that overlapping obstacles is not a performance issue anymore. Because grids often have lower resolution than actual world geometry, they tend to simplify the obstacle area too, so it solves the small triangle problem too.

The down side is that you have second representation of your world. The problem it creates is that what to do when the path is blocked? There is no way to reliably pass that data back to the navmesh and replan the path, as it would require to change the layout of the mesh. Simply blocking certain polygons temporarily does not fix it.

Recreate Piece of a Navmesh

Currently the way I prefer to handle dynamic obstacles is to recreate the tiles where an obstacle has changed. Tiling helps to limit the amount of work that needs to be done. While it works well in practice it is overkill for many situations, including the temporary obstacle case.

That has lead me to thinking that how much of the work that goes into rebuilding a tile could be cached?

In practice the answer is that at minimum you need to have the data that is stored in Recast compact heightfield to rebuild a tile and have the benefits of the grid. At the compact heightfield level each obstacle could be rasterized in the heightfield as specific area type and then the rest of the Recast pipeline would be run to obtain the final mesh for a tile.

There are some shortcuts that can be taken to make the rest of the Recast process quite a bit faster than usually. One optimization is to use monotone region building function. Another one is to use more loose parameters for detail mesh calculation. For small tiles, this can halve the build time which is spent after rasterization.

In practice it means, that if we could start from compact heighfield, we could rebuild a tile on 1 ms instead of 10 ms. And by using the simpler methods the work can by split into similar sized chunks so it is possible to handle the building over several frames.

But compact heightfield can be a lot of data which makes the idea impractical. Only if we could compress it...

Compressing 2.5D Heightfield Data

By inspecting the data stored in the compact heightfield, we can see that the data per span does not vary very much between neighbours. Also it happens to be stored in a way that helps usual compression algorithms to pack it well too.


The 2.5D heighfields in Recast are stored as columns stacked on top of each other. If there are multiple voxels on top of each other, they are stored as single span. This is often called run length encoding (RLE).

The spans in compact heightfield are stored in one flat array. In addition to that there is a grid of cells at XZ-plane, which stores index and span count for that column. So when you need to find all the spans at certain location, you first lookup the index to the large span array from the cell and loop through all the spans that belong to the column.

The layout is also awesome to store links to neighbours, you only need to store a small sub-index, which added to the index found from the neighbour cell to finally find the actualy span.

Because of the memory layout, all the parameters of spans can be easily stored in one contiguous array, which is good for compression since neighbour parameters are often the same. In practice only the span y and area type are needed to rebuild compact heightfield.

For the base grid, it is only necessary to store the span count per grid cell. The index can be recalculated by accumulating the counts.


The above picture shows 2.5D heightfield of size 208 x 503. The resulting compact heightfield data that is required by Recast to build the navmesh is 1662 kB.

When the data is stripped down to the minimum as explained above, the amount of data is 508 kB. Since the data is so similar it is only 29 kB after being compressed with zip! The height data should compress even better if delta encoding is added.

Conclusion

Using compressed compact heightfield could potentially be interesting solution to handle temporary obstacles. For small tiles, the rasterization can often take up to 90% of the total time required to build one tile worth of navmesh. The method would require extra data to be stored per tile, but the data should be manageable since it can be compressed by about 1:50.

Tuesday, August 24, 2010

Recast API Breakage Redux

I did some changes to the previous Recast API adjustments. The changes are not that big this time.

The first change you notice is that rcBuildContext is now called just rcContext. The second change you may notice that the time query functions are gone, and how there is startTimer()and stopTimer() instead. The third change is that you can disable the virtual function calls using a boolean flag if you want to.

These changes should make the rcContex simpler to implement and maintain.

Available in R206, the VC project is broken until I get back to my win machine again.

Monday, August 23, 2010

Path Following Performance pt. 2


Got the performance to quite constant sub 1 ms with adaptive sampling even in heavily crowded situations. The RVO samples (red) still dominate the performance. I think I'll stop here for a moment, remove a couple of quadratic loops, start figuring out how to package it better and open the flood gates for a couple of trillion bug reports.

That yellow spike is new path requests for all agents in a single frame. I'm quite happy about the Detour performance here. It stays about the same on larger meshes too. I'm particularly happy that I could keep the raycast there to optimize paths a little each update.