The way the Crowd Manager has evolved in the SVN is pretty much the way I usually massage my code. I start with something that just works. In my case it was the ugly but working prototype I made for my Navigation Loop presentation. First I ported that code almost one-to-one to use Detour. After that I have tried to refactor the hell out of it until I have seen the code from arranged from many points of view.
The point is to find the best structure of the code based on the constraints you have. Fully explore the depth based on each constraint. My major constraints have been speed, concurrency and reusability.
Speed was my first goal, which I pretty much reached. I have recently worked more on the concurrency and a lot more on trying to find good structure how to organize the code so that it is reusable.
I don't have much experience writing concurrent  code, so my approach has been to make all the hard work–like A*–asynchronous (PathQueue) and try to arrange the other code so that the work is done embarrassingly parallel (to make the transition from concurrent to parallel easier), to use read-only data, and try to reduce the sync points to as few as possible. Very much like the Nulstein stuff.
One of the most complicated things about writing toolkit or library code is how to organize the public API. I don't like huge monolithic systems which want to control all the updates themselves. They are simple to get something rolling quickly, but once you need to shuffle your own code around a bit, you start to get pretty horrible problems in update orders and what not. I'm big fan of Casey what it comes to API design, but I don't think I master that yet.
One of the hardest parts of arranging the code has been how to facilitate custom implementations of parallelizing the code, steering, collision avoidance and dynamic constraints. They are right there in the middle.
As with the rest of the RC&DT, the Crowd Manager will be again piece of usable example glue code, which uses the "real" public API. Ideally it should show you how the relation ship between different parts of the navigation loop, where to put the sync points, and which steps rely on previous ones. Also the sample will include some example implementation of steering and dynamic constraints.
The public API will be some classes which implement path corridor management, local avoidance and such. The sample code should help you to get things up and running quickly, but I try to make the sample code simple, light and rough enough that you will not have problem later to tear it into parts and rewrite to fit into your code base.
So there is still a lot of work to do, and I will probably postpone some of the features  to get a working piece of code out in reasonable time frame.
ETA? Before Christmas (fingers crossed).
 Concurrent = code "in-flight" at the same time, Parallel = code executed in parallel
 Like using off-mesh connections and more robust handling of mesh changes. Not complicated code, just annoyingly puzzley to figure out correct state handling, and how to juggle that state around.