Digesting Duck: August 2010

Sunday, August 29, 2010

My Summer of Code 2010

This where all the magic happened!

You may have noticed the increased traffic on the commits and the blog. But now Mikko's Summer of Code has come to an end. I used a few weeks of downtime between projects to flush my TODO queue. I'm starting a new project this week and I will be back to my 1 day a week schedule with RC&DT.

Here's some thoughts on what are likely to happen in near future.

Multi-Agent Navigation

I got the multi-agent navigation to much further state than I expected. The code is currently huge pile of spaghetti. The next step is put it all together. I have been trying to think about the API and it seems to be quite a big task. I think it is good that my progress is slowing down so that I can let things simmer in a bit.

If you have some thoughts how you would like to integrate the multi-agent navigation to your project, please let me know!

Would you like to use it as a component? Do you want a manager you can tick or do you prefer to do that yourself? Do you use physics for collision detection? Where in your agent update cycle you are planning to put the path following code? Any input is welcome. If you don't want to share the secrets here you can always send me email too.

Everyone is integrating path following slightly differently. I hope to support as many people as possible. On one extreme there are the folks who just would like to get plug'n'play, and on the other side are the folks who would like to get the max performance which means that I need to split the API so that you can tear everything apart as necessary.

My current idea is to roughly separate the code into roughly four pieces:

mover
steering
obstacle avoidance
collision detection

The mover takes care of handling path corridor and movement along navmesh surface. In theory you could use it to create NPCs or even player movement. It will also keep track if new path should be queried.

Steering is game and behavior specific so I think I will add some helper functions to the mover class which allows you to build your own. I will provide some examples how to do it too.

Obstacle avoidance will be separate class which can be used even without any of the mover stuff. Basically you feed in obstacles and the agent state (position, radius, velocity, desired velocity) and it will spit out new velocity which should result in less collisions than the desired one. The first implementation will be the sampled VO, I'm looking into adding ORCA stuff too later.

Collision detection will be responsible of finding the neighbors to avoid and to collide with. This is potentially something that your engine does already. I might include this only in the example code.

Many of the tasks required for multi-agent navigation are embarrassingly parallel (see all the for loops in the current CrowdManager::update()). My idea is to tailor the APIs so that when possible you just pass in the data that is absolutely necessary. For example the obstacle avoidance code will not figure out the neighbors itself, but you need to feed them.

I will create an example which can be used for learning and also something that can be integrated as is, for those who just like to get quick and dirty integration.

Detour Error Handling

There are some cases where Detour should be more verbose on error cases. My work on the multi-agent navigation has revealed some short comings in the error handling code too and the sliced pathfinder work surfaced some ideas too.

I'm planning to improve the Detour API at some point so that it will report errors better. The change will be most likely that every method will return a status instead. This allows me to report invalid input data and more elaborate reason why path finding may have failed or report that path finding succeed, but actually returned partial path. There are some discrepancies on the raycast() function return values too.

Temporary Obstacles

I got quite exited about the heightfield compression findings and got some really good feedback on it too. It is definitely something I wish to prototype more in detail.

This also ties into better Detour error handling. I will need to finally figure out how to handle interrupted and invalid paths. Detour will not replan automatically, but it will let you know what something went wrong or that something will go wrong in the future so that you can schedule new path query yourself.

Friday, August 27, 2010

RVO 2 Library

RVO2 Library has just been released. It uses optimal reciprocal collision avoidance (ORCA).

http://gamma.cs.unc.edu/RVO2/

I have had tried to implement ORCA a couple of times before but did not succeed. It'll be interesting to compare my earlier attemps with the sources and give it another go. I did not quite get what they are doing in linearProgram3() with projLines stuff, though.

There is one comment in the examples, which I found particularly charming:

Perturb a little to avoid deadlocks due to perfect symmetry.

Wednesday, August 25, 2010

Handling Temporary Obstacles

I have been using some idle cycles every now and then to try to think about a nice method to update the navmesh without requiring to completely rasterize and rebuild a tile from scratch. This would be useful for obstacles, which may break and may be pushed around and usually only contribute blocking areas into the navmesh.

Cookie Cutter Obstacles

One method to do that is to punch holes in the navmesh using CSG. Some path finding middleware support that and some games shipped using it. So it definitely works, but I don't like the method.

Firstly, you have to work around the floating point accuracy, which is paramount pain. Secondly, it usually is too accurate, which means that you get some small polygons all over the place. Thirdly, it does not scale well when you have multiple overlapping obstacles. So it is an option, but very very low on my list of things to try.

Local Grid

Another option would be to use local grid and rasterize all the temporary obstacles into it and then use another simple pathfinder to find around the way.

The good thing about this method is that overlapping obstacles is not a performance issue anymore. Because grids often have lower resolution than actual world geometry, they tend to simplify the obstacle area too, so it solves the small triangle problem too.

The down side is that you have second representation of your world. The problem it creates is that what to do when the path is blocked? There is no way to reliably pass that data back to the navmesh and replan the path, as it would require to change the layout of the mesh. Simply blocking certain polygons temporarily does not fix it.

Recreate Piece of a Navmesh

Currently the way I prefer to handle dynamic obstacles is to recreate the tiles where an obstacle has changed. Tiling helps to limit the amount of work that needs to be done. While it works well in practice it is overkill for many situations, including the temporary obstacle case.

That has lead me to thinking that how much of the work that goes into rebuilding a tile could be cached?

In practice the answer is that at minimum you need to have the data that is stored in Recast compact heightfield to rebuild a tile and have the benefits of the grid. At the compact heightfield level each obstacle could be rasterized in the heightfield as specific area type and then the rest of the Recast pipeline would be run to obtain the final mesh for a tile.

There are some shortcuts that can be taken to make the rest of the Recast process quite a bit faster than usually. One optimization is to use monotone region building function. Another one is to use more loose parameters for detail mesh calculation. For small tiles, this can halve the build time which is spent after rasterization.

In practice it means, that if we could start from compact heighfield, we could rebuild a tile on 1 ms instead of 10 ms. And by using the simpler methods the work can by split into similar sized chunks so it is possible to handle the building over several frames.

But compact heightfield can be a lot of data which makes the idea impractical. Only if we could compress it...

Compressing 2.5D Heightfield Data

By inspecting the data stored in the compact heightfield, we can see that the data per span does not vary very much between neighbours. Also it happens to be stored in a way that helps usual compression algorithms to pack it well too.

The 2.5D heighfields in Recast are stored as columns stacked on top of each other. If there are multiple voxels on top of each other, they are stored as single span. This is often called run length encoding (RLE).

The spans in compact heightfield are stored in one flat array. In addition to that there is a grid of cells at XZ-plane, which stores index and span count for that column. So when you need to find all the spans at certain location, you first lookup the index to the large span array from the cell and loop through all the spans that belong to the column.

The layout is also awesome to store links to neighbours, you only need to store a small sub-index, which added to the index found from the neighbour cell to finally find the actualy span.

Because of the memory layout, all the parameters of spans can be easily stored in one contiguous array, which is good for compression since neighbour parameters are often the same. In practice only the span y and area type are needed to rebuild compact heightfield.

For the base grid, it is only necessary to store the span count per grid cell. The index can be recalculated by accumulating the counts.

The above picture shows 2.5D heightfield of size 208 x 503. The resulting compact heightfield data that is required by Recast to build the navmesh is 1662 kB.

When the data is stripped down to the minimum as explained above, the amount of data is 508 kB. Since the data is so similar it is only 29 kB after being compressed with zip! The height data should compress even better if delta encoding is added.

Conclusion

Using compressed compact heightfield could potentially be interesting solution to handle temporary obstacles. For small tiles, the rasterization can often take up to 90% of the total time required to build one tile worth of navmesh. The method would require extra data to be stored per tile, but the data should be manageable since it can be compressed by about 1:50.

Tuesday, August 24, 2010

Recast API Breakage Redux

I did some changes to the previous Recast API adjustments. The changes are not that big this time.

The first change you notice is that rcBuildContext is now called just rcContext. The second change you may notice that the time query functions are gone, and how there is startTimer()and stopTimer() instead. The third change is that you can disable the virtual function calls using a boolean flag if you want to.

These changes should make the rcContex simpler to implement and maintain.

Available in R206, the VC project is broken until I get back to my win machine again.

Monday, August 23, 2010

Path Following Performance pt. 2

Got the performance to quite constant sub 1 ms with adaptive sampling even in heavily crowded situations. The RVO samples (red) still dominate the performance. I think I'll stop here for a moment, remove a couple of quadratic loops, start figuring out how to package it better and open the flood gates for a couple of trillion bug reports.

That yellow spike is new path requests for all agents in a single frame. I'm quite happy about the Detour performance here. It stays about the same on larger meshes too. I'm particularly happy that I could keep the raycast there to optimize paths a little each update.

Friday, August 20, 2010

Path Following Performance

I started working a bit on the path following again. Naming things better and moving stuff around. I also added a simple profiler to the stuff too as seen on above picture. The case shows 25 agents moving in a quite simple level.

The bad news is that the performance is bad! The good news is that it's ok.

There are two times tracked in the above picture. The red-ish line shows how much time is spent in the brute force RVO sampling, and the lime-ish line shows the time taken by the total navigation loop calculation.

My goal is to get this case to around 0.5 ms, and I think it is doable.

Visibility Optimized Graph Experiment

I blogged earlier about A* node placement. I had the change to try out the visibility optimized node placement today. First impression, not impressed.

It generates quite aggressively goal directed paths, much like if you over estimate your heuristic (or was it vice verse). The final segment to the goal is always perfect, though.

In case someone wants to give it a try, here's the code I used:

static int isectSegSeg(const float* ap, const float* aq,
                      const float* bp, const float* bq,
                      float& s, float& t)
{
   float u[3], v[3], w[3];
   dtVsub(u,aq,ap);
   dtVsub(v,bq,bp);
   dtVsub(w,ap,bp);
   float d = dtVperp2D(u,v);
   if (fabsf(d) < 1e-6f) return 0;
   d = 1.0f/d;
   s = dtVperp2D(v,w) * d;
//    if (s < 0 || s > 1) return 0;
   t = dtVperp2D(u,w) * d;
   if (t < 0 || t > 1) return 0;
   return 1;
}

...

if (neighbourNode->flags == 0)
{
   float sa[3], sb[3];
   getPortalPoints(bestRef, bestPoly, bestTile,
                   neighbourRef, neighbourPoly, neighbourTile,
                   sa, sb);
   float s,t = 0.5f;
   if (!isectSegSeg(bestNode->pos,endPos, sa,sb, s, t))
       dtDistancePtSegSqr2D(endPos, sa,sb, t);
   t = dtClamp(t, 0.1f, 0.9f);
   dtVlerp(neighbourNode->pos, sa,sb, t);
}

Custom Detour Query Filters

I just committed the change for custom query filters and slight optimisation for the A* (SVN R200).

Custom Query Filters

I kept the earlier default behavior of the filter, so the API and functionality change is really small. Instead of calling dtNavMeshQuery.setAreaCost() you now call dtQueryFilter.setAreaCost().

There was quite some discussion if the passFilter() and getCost() should be virtual function calls or not. For people working on PowerPC processors it can be quite a big deal as both the functions are called per node during A*. On PC, the perfomance loss of using virtual function is not super bad (about 5-10%) compared to the flexibility it gives.

There is a define in the dtNavMeshQuery.h called DT_VIRTUAL_QUERYFILTER. If the define is set, the above mentioned functions are declared virtual and you can override them. In other case, the default implementations are inlined which should lead maximum performance. This was the best compromise I could think of.

Your custom query filter should be as fast as possible, as it can easily dominate the A* performance. If you want to query an agent flag, cache it locally inside the custom filter before you call pathFind(). If you need to do physics queries, try to cache them too before hand too. Use as little extra data as possible.

A* Optimization

While I was testing the impact of the virtual calls I noticed that the edge midpoint calculations were taking quite substantial time, so I tried caching the results. I earlier thought that having smaller node size and recalculating the edges on the fly would be more efficient than storing the node positions. It doubles the node struct size and all.

It turns out I was wrong (may vary per platform). Anyways, the node position is now stored for each node. This also opens up possibilities to try different node calculation methods. Caching the node positions resulted about 15% speed improvement.

This is the piece of code that calculates the node position:

            // If the node is visited the first time, calculate node position.
           if (neighbourNode->flags == 0)
           {
               getEdgeMidPoint(bestRef, bestPoly, bestTile,
                               neighbourRef, neighbourPoly, neighbourTile,
                               neighbourNode->pos);
           }

Be aware not to create neighbour nodes on top of each other (that is, zero length connections). For example if you want to place that new node so that it is the nearest point on the edge from parent node, then you should clamp the edge 't' to lie between for example 0.001 and 0.99, or something similar. You can use getPortalPoints() to query the segment which connects best and neighbour.

I also added a debug view mode which displays the nodes and link to their parents. This should help to see how well the node placement might work.

Thursday, August 19, 2010

Navmesh Node Locations

I have previously calculated the A* node locations on the fly to save memory. That is when I expand to neighbour polygons, I calculate the edge mid location on the fly. While profiling the filter change proposal, I also tested caching the A* nodes locations. This gave me the benefit to also being able to draw them.

There was one particular location in one of my test levels which always bothered me. I had plotted it out few times in Photoshop, but now I can finally see what was going on.

The above picture also illustrates the reason why navmeshes sometimes produce unoptimal paths. The A* is actually ran on that "skeleton". The problems pretty much always occur where small and large polygon are next to each other.

To answer the question in the picture. That detour is selected because the route along the skeleton is shorter than crossing through that large polygon. When you see the skeleton it is easy to understand, but when it is not visible, it looks plain odd.

Proposal for Custom Cost and Polygon Filter for Detour

I have been recently discussing a potential solution to issues 47 and 103 with Neil Armstrong. That is, how to allow the user to pass custom cost values for A*. In some cases the different travel costs could be based on NPC type, or maybe even some local context like trying to avoid the polygon where the target is.

I think this kind of functionality is really important to achieve correct gameplay, and usually there is no one solution that fits all.

I'm proposing a change to the dtQueryFilter which would allow custom logic on polygon filtering as well as custom cost for searches. Here's the interface and the current implementation as an example:

struct dtQueryFilter
{
   dtQueryFilter() :
       includeFlags(0xffff),
       excludeFlags(0) {}
  
   virtual bool passFilter(const dtPolyRef ref,
                           const dtMeshTile* tile,
                           const dtPoly* poly) const
   {
       return (poly->flags & includeFlags) != 0 &&
              (poly->flags & excludeFlags) == 0;
   }
  
   virtual float getCost(const float* pa, const float* pb,
                         const dtPolyRef prevRef,
                         const dtMeshTile* prevTile,
                         const dtPoly* prevPoly,
                         const dtPolyRef curRef,
                         const dtMeshTile* curTile,
                         const dtPoly* curPoly) const
   {
       return dtVdist(pa, pb) * areaCost[curPoly->area];
   }
  
   float areaCost[DT_MAX_AREAS];

   unsigned short includeFlags;
   unsigned short excludeFlags;
};

The input to the cost function is a line segment from pa to pb which travels along curPoly and the previous and next polygons.

I'm not big fan of interfaces and I was worried how much overhead it would add. In my test case the path queries vary from 0.030 ms to 0.135 ms (there are some small fluctuations from run to run). When the passFilter() is virtual function instead of inlined function it adds 0.010 ms to 0.020 ms per path query. I think the interface is worth it.

So this post is request for comments. Unless anyone disagrees or proposes better solution, I will check in the changes.

[EDIT] I did few more tests and the above did not actually include the cost as virtual call, only the polygon filter as virtual function call. In total the virtual function call version is something around 5-10% slower.

Recast API Breakage

I just committed a change (SVN R197 R199) which adds rcBuildContext, and breaks that API. Sorry for the trouble! I hope you welcome the new changes.

The newly added rcBuildContext abstracts the timer, logging and build time reporting. I'm planning to add more error and warning handling support via that interface in the future, for example drawing and high-lighting bad geometry.

Majority of the Recast functions now expect a valid pointer to rcBuildContext as first parameter. Allocators and a couple of really simple utility functions are an excetion. If you don't care about build timing or logging, you can just pass a pointer to an instance of the default implementation of rcBuildContext. Like this:

rcBuildContext dummy;
...
rcErodeWalkableArea(&dummy, m_cfg.walkableRadius, *m_chf);

There is an example implementation in the SampleInterfaces.h/cpp. I also moved the other example implementations of interfaces there (FileIO, DebugDrawGL).

The files RecastTimer.h/cpp and RecastLog.h/cpp were removed in the process. I added RecastAssert.h which is used in some places to check that validity of the passed parameter (more validation later).

I wish a smooth migration! Let me know if there is something I overlooked.

Off to update the Win32 project. [EDIT] updated.

Tuesday, August 17, 2010

Detour Sliced Path Find

Well, while I was in teh mood, I though I'd give it a go.

I just checked in (SVN R196) some extra methods for dtNavMeshQuery which implements sliced path finding. It works as follows:


// Request
m_pathFindState = m_navQuery->initSlicedFindPath(m_startRef, m_endRef, m_spos, m_epos, &m_filter);

...

// On update
if (m_pathFindState == DT_QUERY_RUNNING)
    m_pathFindState = m_navQuery->updateSlicedFindPath(maxIter);

if (m_pathFindState == DT_QUERY_READY)
    m_npolys = m_navQuery->finalizeSlicedFindPath(m_polys, MAX_POLYS);

There is new tool called "Pathfind Sliced" which allows you to see it in action. The tool advances the search one node at a time so you can see how the closed list is update in realtime.

The actual path find is a tiny little slower than the findPath() because it does more validation. It should catch the case that some of the data is being removed while the query is in flight, and if the update encounters invalid data it will return DT_QUERY_FAILED.

If the removed data was already in the closed list of the search, the path result will contain invalid data. The rest of the API can handle that, but I recommend restarting all currently running path requests after you have added or removed a tile.

Detour Navmesh Data and Queries Separated

I just checked in code (SVN R194) which separates the navmesh data from the queries. In the process I also cleaned up how the data is accessed via dtNavMesh.

API Changes

In a nutshell this means that instead of using dtNavMesh, you use dtNavMeshQuery, except when you initialize the data of course. That is pretty much the only change if you are using Detour in single threaded system.

All the query methods have been moved to dtNavMeshQuery. The dtNavMesh class now only has methods to add, remove or change the data.

I removed bunch of getters. Now the right way to access the data pointed by dtPolyRef is:

bool getTileAndPolyByRef(const dtPolyRef ref, const dtMeshTile** tile, const dtPoly** poly) const;

It takes polygon ref as input, and stores pointer to tile and polygon as output. The function returns false if the poly ref is not valid.

Parallel Stuff

The new change allows you to better control the memory access of Detour for multi-threaded systems. The navigation mesh data stored in dtNavMesh is considered read-only by the dtNavMeshQuery. This also allows multiple dtNavMeshQueries to work on the same data. If you organize your data modifications and queries right, there is no need for locks.

The basic outline of multi-threaded use of Detour is as outline by Bjoern in [1]:

Update the navigation mesh (set flags, add or remove tiles)
Collect new pathfinding requests or pathfinding-change requests. Requests are deferred and not processed when posting them.
Process all pathfinding requests.
Make pathfinding results available.
Clear out all closed pathfinding requests and prepare for the next main loop cycle.
Back to stage 1.

I recommend reading Bjoern's post if you are interested in parallelizing your path finder. Also, any feedback is welcome. I will not be providing task scheduler, though, no point asking for it :)

So this is the first step towards more parallel Detour. The big task yet to tackle is to make all the queries to finish in more predictable time. The findPath() is really unpredictable in its' current form.

There are two options to improve that: 1) slice it, 2) add hierarchy. The final choice might be both.

Slicing means that only a set number of nodes are visited by the A* every update. It is simple to implement, but since the data can change between the updates, there needs to be much more bookkeeping in the process. Also, slicing means that the dtNavMeshQuery will store an intermediate state. That is, it would not be possible to run other query methods between the path find iterations, as they might mess up the open and closed lists.

I think in the long run the hierarchy is the way to go, but it needs new data structures and all sorts of things to be prototyped before it is ready to go in.

As an intermediate step, I'm probably going to implement iterative findPath() at some point.

[1] Parallel Detour navigation mesh

Some Recast & Detour Projects

The image is composed of a screen shot from the game Kingdoms of Amalur: Reckonin, GSoC10 projects for CrystalSpace and Blender.

Recast & Detour has seen quite nice adaptation during the past year. During the Paris Game AI Conference I run out of fingers to count the AAA projects currently in production which are using some part of Recast and Detour. In addition to that looks like there are countless indie, open source or hobby projects which I'm totally unaware of too.

Some projects have ported the code to different platforms, some have made quite heavy modifications to add features needed specifically for their game, and some are just using some parts of the code. It's awesome that Recast and Detour are being put in use! This also means that quite a few bugs have been reported and fixed too.

Here's some recent projects I'd like to highlight (if you like to share yours, drop me a line or comment below):

Kingdoms of Amalur: Reckoning
I'm glad that I'm allowed to share that up coming game Kingdoms of Amalur: Reckoning by Big Huge Games is using Recast and Detour.

Google Summer of Code 2010
Thanks to Nick and Leonardo, Blender Game Engine and CrystalSpace both leveled up on pathfinding this summer.

I'm thankful to all the people who have sent bug reports, fixes, improvements, suggestions or challenged me. It has made the toolkit so much better!

Monday, August 16, 2010

Heads Up, API Changes Coming Soon

I try to get two API breaking changes done this week.

1) Recast Context: This change will add additional parameter to most of the recast calls. I have yet to decide if the parameter will be the first or last. The idea behind this change is to remove platform dependancy of error/warning logging, timer and profiling code from Recast. See:

2) dtNavMesh & dtNavMeshQuery: This change will allow multiple simultaneous navmesh queries to happen at the same time. Basically it will strip dtNavMesh to just the data management, and add new class which has all the query functions. This makes Detour more concurrency friendly. See:

I will make separate posts describing the changes once I get things checked in.

Thursday, August 12, 2010

Some More Movement Progress

Since last time, I implemented smooth steering, drunken movement style and sampled velocity obstacles.

The features are coming together, but the code is still huge pile of mess. I checked in first version (R193) so that the change list does not become too big.

When using tiled navmeshes and many agents, I got quite a bit of problems with agents trying to move backwards to catch up with their path. This happens mostly around the vertices which are at tile corners.

For the time being, I added that raycast trick which tries to optimize the path corridor by checking if there is line of sight from the current location to the corner after the next corner. If so, then the beginning of the path is patched to use the results from the raycast.

The performance is still realtime (it chokes when you have large group next to each other), but I bet it is actually really horrible. Once I get the structure of the code better, I'll start looking at the optimizations.

Tuesday, August 10, 2010

Oh, it moves!

The first test of a couple of agents moving around in a test level, yay!

I have implemented the navigation-loop stuff, simple steering, simple dynamic constraint, and simple collision handling. I need to rethink the way stuff is organized so that it is easier to grab the necessary code to your own projects.

It's always so amazing when it works the first time, even if it just barely does.

Monday, August 9, 2010

Detour Agent Movement Progress

I'm moving forward with the agent movement code for Detour. Today I added a couple of functions to Detour help me realize the goal. Check NavMeshTesterTool.cpp how to use the new functionality.

dtNavMesh.moveAlongSurface() - This function tries to move an agent from start position to target position along the navigation mesh and constraints the position to be inside the navigation mesh at all times. The function is Detour implementation of the algorithm I explained in a previous post. In the process I removed the old moveAlongPathCorridor(), which supported only subset of the functionality anyways.

dtNavMesh.findLocalNeighbourhood() - This function finds non-overlapping (in 2D) neighbourhood around a specified point. It is useful for local navigation. I have explained the algorithm earlier too.

dtNavMesh.getPolyWallSegments() - This functions returns wall segments of a specified navmesh polygon. In conjunction with findLocalNeighbourhood() it can be used to generate 2D collision segments for local obstacle avoidance.

The image at the top of the post shows local neighbourhood segments queried using the new functions.

All the above functions assume that they operate on really small set of polygons. To achieve a bit better cache coherence, I added another tiny node pool, which currently only contains 64 nodes. The aim was to improve cache coherence. The node pool uses hash table to lookup nodes based on ID. When dealing with small number of nodes this actually slows things down because of the random access through the hash table.

Additionally, instead of using priority queue, the above functions use fixed size small stack and breath first search. The random access of the navmesh polygons is quite cache pooper already, but the above gives around 30-50% speed up compared to using the existing p-queue and larger node pool.

I'm planning to spend some more time this week in the agent movement stuff. The above functions are bound to change as I try to find my way through the implementation details.