Digesting Duck

Friday, November 26, 2010

Detour Return Codes Take 2

I just update the Detour status labels based on recent feedback. The code is in in R255. The return value is now bitfield instead of a single enum. This to pass both error messages as well as few bits of information about the quality of the result.

The high level statuses are DT_FAILURE, DT_SUCCESS, and DT_IN_PROGRESS. The rest of the bits are used to describe mode detail about the status. To test if a status is one of the above, you should use the helper functions dtStatusSucceed(), dtStatusFailed() or dtStatusInProgress().

Example:

dtStatus status = m_navQuery->init(m_navMesh, 2048);
if (dtStatusFailed(status))
 return false;

In many cases of success and failure, you can get one or more bits of extra information why things failed or if the quality of result is degraded. The detail flags are:

DT_WRONG_MAGIC: the input data you passed had wrong magic number. In practice this means that you are trying deed bad data to Detour. Failure.
DT_WRONG_VERSION: the input data is in different version than Detour expects. Failure.
DT_OUT_OF_MEMORY: the operation could not allocate enough memory. Currently this applies to init methods. Failure.
DT_INVALID_PARAM: one or more of the input parameters had invalid values. Usually this means that you passed a poly ref which is not valid. Failure.
DT_BUFFER_TOO_SMALL: the buffer to store the result was too small. Success.
DT_OUT_OF_NODES: query ran our of nodes during search. If this happens often, you should increase maxNodes. Success.
DT_PARTIAL_RESULT: query did not reach the end location, returning best guess. Success.

In case a function call fails, it usually returns only one flag. If a function call succeeds, it may return multiple flags indicating the result quality. The flag DT_BUFFER_TOO_SMALL is also use store/restore tile state and in that case it is returned to describe why the function call failed.

Some more examples:

dtStatus status = m_navQuery->init(m_navMesh, 2048);
if (dtStatusFailed(status))
{
 if (dtStatusDetail(status, DT_OUT_OF_MEMORY))
  printf("Out of memory initializing navmesh query.\n");
 return false;
}

dtStatus status = m_navQuery->findPath(sref, eref, spos, epos, filter, polys, &npolys, MAX_POLYS);
if (dtStatusSucceed(status))
{
 if (dtStatusDetail(status, DT_BUFFER_TOO_SMALL))
  // The path result is longer than MAX_POLYS.
 if (dtStatusDetail(status, DT_PARTIAL_RESULT))
  // Could not reach end location, the path leads to a location near the end point.
}

Friday, November 12, 2010

Sundries - Visual Scripting

I found this old sketch of a visual language for AI behaviors while browsing through an old harddrive. I think it is from 2006 or 2007 or something. Few notes:

Concurrency: tasks (blue stuff) can be run at parallel and a certain point of a task (percentage of completion, distance to end of path, etc) can trigger another parallel task.
Variables: the usual flow diagrams break down when you have too many things to connect, solution? Use local variables!
Ordered: the connections have order. This was important lesson learned from countless hours of debugging Crysis flowgraphs.

The arrows at the top of the tasks are execution points. When triggered, the task starting at that location will start executing.

There was also a syntax to define how the next tasks would have been triggered. In the above examples the next task is always triggered when the previous on is finished. I had an idea to also have some kind of notation to require all tasks to be finished before the next one was carried on.

The orange boxes are variables. Grey boxes are functions. Some functions take variables as input (could be stacked ala werkkzeug).

I think that "patrol point is not valid" should have a trigger line coming to it from the same bundle where the patrol point selection and restart is triggered (that eval would be 2 and restart would be 3). The grey text in the patrol box should actually read "Patrol(StartPoint)".

The overall design was much inspired by werkkzeug and PD.

Sunday, November 7, 2010

Path Corridor Topology Optimization in Action

I just commited a change to the Crowd Manager which implements path corridor topology optimization (SVN R251).

In the above video, only the very first path is calculated using regular A* for all agents, after that the path end is adjusted by stretching the path corridor as I have explained earlier.

When the agents start to move, the path corridor is optimized as I explained in my previous blog post.

In the example the path topology optimization is done for only agent per update at most twice per second. Only 32 nodes are visited during the update. Search for optimizePathTopology in CrowdManager.cpp to see how it is implemented.

Feel free to try it our yourself!

Thursday, November 4, 2010

Path Corridor Optimizations

Path corridor consists of a current start location, end location and list of polygons connecting the two. This is the structure which used at runtime to navigate from the current location to the end location. We can use A* or his bearded friends to find this path corridor in the first place.

When the agent wants to move, we first calculate the furthest visible corner along the corridor and move towards that point. As the agent moves we adjust the path corridor. If the agent moves from the current first polygon to the next on the corridor, the current first polygon is dropped and the corridor's length shrinks.

The same idea can be used to adjust the corridor even if the agent does not move towards the goal (i.e. is being pushed around by other agents). The corridor adjusts based on how the agent moves. Even despite the detours, we can always quickly query the next visible corner using the funnel algorithm and try to move towards it. Eventually we will reach the end location.

This is flexible and fast way to move the agent towards the end location. But there is room for improvement. There are several cases where the shape of the corridor may lead to visually poor results. For example, if the agent is following a target or it needs avoid other agents too much, the path can get tangled up. Often there would be a trivially shorter path to the goal only if we could adjust just few polygons in the corridor.

This kind of problems can be solved using two kinds of optimization methods: visibility optimizations and topology optimizations.

Visibility Optimizations

The idea of visibility optimizations is to try to alter the beginning of the path corridor so that we could take more direct route towards the next corner. This problem usually only occurs if there are vertices in the navmesh which do not belong to a border. In Recast & Detour this happens if you use tiled navmeshes.

If the extra corners are not removed, sometimes it looks like the agent is trying to avoid invisible obstacles or sometimes they are just small odd kinks in the road.

One way to optimize the beginning of the path is to do a walkability raycast from the current location to the corner which comes after the next corner. That is, try to shortcut the next corner.

Detour walkability raycast returns the list of polygons it visited during the trace and that data can be used to patch the beginning of the path. Note that the raycast is not used to optimize the string pulled path, but the path corridor. This is really important difference.

The visibility optimizations should be done as often as possible, potentially every update. If the optimization is not done often enough, you will get visible jerks in the agent motion. To keep the computation in bounds, you should limit the ray distance (something like 20-30 m is good range).

The walkability raycast works ok, but It would be interesting to try other methods too which might do better optimizations. One potential idea would be to treat the edges between the navmesh polygons as portals and use the well known portal visibility algorithm in 2D to try to optimize the path.

Topology Optimizations

If the agent follows a target, or if it is moving in a large crowd the topology of the path corridor can get tangled up, like in the above picture. The visibility optimization can only handle problems that are visible from the current location but there can be an obvious shortcut which needs a bit more reasoning.

One way to fix topology problems is replan the whole path, but this is very wasteful and the time it takes to complete an A* search is really unpredictable. At worst case when it cannot find the goal it will visit all your navigation polygons!

The topology problems that a human inspector would notice are usually quite local. Only if we could replan the first 20m of the path. Well, yes we can!

One way to efficiently implement path corridor topology optimization is to search towards the end locations using A*, but instead of doing the full search we limit the search to something like 30-50 nodes (step 1 in above picture). This makes sure that we can guarantee that the search returns in finite time.

When the search has been done, we start traversing the existing path corridor from the end (step 2 in above picture). As we move towards the start location, we check at every polygon if it was visited during the search. If that polygon was visited, we trace the graph back to the start and substitute the beginning of the corridor with the trace result.

Conclusion

There are two kinds of optimizations which can be used to optimize the path corridor. They make the visual appearance of the navigation more pleasing. Firstly, we can make sure that we try to aim as far as we can see by doing visibility optimization, and secondly we can periodically optimize the topology of the path so that the agent tries to move towards the goal without taking too obvious detour.

The intuition is that the visual optimization tries to optimize the very near term reactive plan, whilst the topology optimization tries to optimize the medium term plan. The reactive plan should be updated very often, but the mid term plan can be updated less frequently, say few times a second.

Friday, October 29, 2010

Detour API Change

Heads up!

I changed the Detour API so that it always returns a status (success, failure and few others) instead of number of polygons on path, etc. When you sync to a version > R250, I suggest double checking the calls to Detour.

Many of the changes also changed the function signatures so you will get compiler errors and know where to changes stuff. Some functions such as init() can now have multiple failure cases si simply checking true/false will not result on correct behavior.

The biggest change was for the functions which previously returned number of results. Now the result count is passed as a parameter and the return value tells you if the operation succeed or failed. For example pathFind() changed from:

m_npolys = m_navQuery->findPath(m_startRef, m_endRef, m_spos, m_epos, &m_filter, m_polys, MAX_POLYS);

to:

m_navQuery->findPath(m_startRef, m_endRef, m_spos, m_epos, &m_filter, m_polys, &m_npolys, MAX_POLYS);

To ease the migration, I made the change so that result count is always initialized to 0 in the beginning of the call, so it is safe to use that as an indicator of success too.

Currently majority of the failure cases are reported when one of the input dtPolyRefs was null. There is only one succeed case, so it is good value to check for success.

If you have some ideas for more verbose error reporting or find bugs, let me know!

Following Moving Target

One interesting sub-problem of navigation is following a moving target. This could be a group of ninjas following your warrior, a horde of blood thirsty zombies trying to eat you, limping band of pirates on their patrol route.

So how do you go about implementing a moving movement target? Fire off new path request every frame (too expensive! decision aliasing!)... every few frames (still too expensive! still aliasing!)... when the target moves (it moves all the time, bad optimization!)?

The right answer is that you adjust your path corridor.

The path corridor consists of start point and end point and the list of polygons from the start to the end. When we move our character, we adjust the beginning of the path corridor so that the previous definition holds after the character has moved. We can do the same for the end of the path too!

Take a look at the case pictured at the top of the post. The looting party of three follows the target. The dark polygons represent the path polygons of each agents path corridor. When the target is moved along the navmesh as indicated by the arrow, following will happen.

When the target point is moved, we keep track of the polygons we moved through. Then we "fuse" those polygons to the end of the current path corridor. The fusing process tries to find the shortest common path amongst the two. Sounds complicated, but it is really simple algorithm really.

And that's it! Now we have a new path corridor which describes the navigation from the agents current position to the position it tries to follow.

Detour Implementation

The code is in SVN R249. If you use the Move Target tool with shift+click it will use the target adjusting instead of pathfinding. Turn on Show Path from the debug draw options to see the effect.

The CrowdManager example supports both adjusting the current path corridor of an agent as well as keeping track of the changes that might happen whilst a movement request is still being processed.

Future Improvements

This method of following comes with it own set of cons too.

The method does not try to find the shortest path, it just tries to patch the current path with minimal effort. Depending on your application this may be good or bad.

If you have quit small tile size, this may lead to odd behavior in open fields. Even if there is direct line of movement to the goal, the path will take detour based on how the target moved. The open field case eventually would optimize itself. Since the path optimization distance is limited, the agent will move to "old" direction for some time before the optimizer kicks in.

Conclusion

Patching the path corridor is really efficient way to implement path following. It has some limitations, but they are quite negligible when the follow distance is quite small, like when following player quite closely or when following a formation point.

My recent work on path following has surfaced and interesting problem to solve, that is the path corridor optimization. When the world is constantly in motion–and in games it always is–the quality of the initial path is less and less important. I hope I will have more time in near future to figure out how to optimize the path corridor in certain uses cases.

If the above method sounds crappy and limited, it still does solve the case for moving target for a whole range of the use cases. And did I mention it is really fast?

Tuesday, October 19, 2010

RVO Sample Pattern

I spent quite a lot of time last Friday on trying to fix some cases I had with ORCA and to implement line obstacles. I kept on running into cases which simply cannot be solved using it.

For example I could not solve the case in above picture. Regardless of my effort, I ended up having cases where the segments would block the movement completely. The ORCA planes would be aligned according to the dark edges and they would limit a convex region which is used to clamp the velocity. I think some of the problems might be due to the fact that the navmesh is offset by the agent radius.

I'm anxiously waiting for the next paper which improves ORCA. Until then, I'm sticking with the sampled approach. Which leads me to another thing I tried last week.

Improving Sampling

At the time when I was working on my first iteration of the adaptive sampling for RVOs Phil Carlisle's student Scott Bevin mailed me that he was using similar method in his thesis too.

One main difference between out methods were that I was using a regular grid as sampling pattern and he was using 8 samples on a circle aligned at the movement direction and subdivide towards the best sample. At the time I was too busy with my Paris presentation so I did not have any time to investigate it further.

I tried his method but it did not quite work in crowded situations. There simply was not enough initial samples to get good answers.

But the idea of aligning the sample grid helps a lot! I initially thought that it would cause a lot of problems because it might produce more feedback into the velocity control loop. I think I was wrong. The desired velocity usually is quite stable, so when the pattern is aligned towards it, I did not see any additional feedback effects.

As you might spot from the above picture, my implementation used two rings of 7 samples plus one sample at zero velocity. The inner circle's rotation is offset slightly to get better coverage. Since the samples are aligned towards the desired velocity there is always a sample exactly at the desired velocity too. This makes the overal control very smooth, the jaggies from the non-aligned method are gone.

Conclusion

In practice the new sampling pattern improved the movement through small corridors a lot. The hesitating movement is almost completely gone now. The agents do slow down quite a bit if they approach a choke point from an angle. From their point of view they are almost approaching a wall. Adding that extra sample at the zero velocity improved stopping a lot.

All in all, I'm pretty happy about the results now. Not perfect, but tolerable. I can now finally turn my attention into fixing the remainder code structure issues with the crowd code.