There are about a dozen software packages on the market to help traffic engineers determine what intersection improvements should be made. They all talk about calibrating the models to existing conditions. But there's a real problem with needing to calibrate these software models.
How do you calibrate to something that doesn't exist? How do you take an existing all-way stop-controlled intersection and calibrate it to increased traffic forecasts under traffic signal or roundabout control?
The increased traffic alone may change driver behavior at the intersection. Layering on different lane configurations and traffic control makes the existing condition irrelevant.
Bryant Ficek at TKDA and I have been talking about this for quite a while, so we decided to do some research. We couldn't come up with any research on the topic (if you have it – please share, and yes we realize the whole Highway Capacity Manual is based on mountains of research).
So we decided to collect turning movement counts, delay, and queuing data at a two-lane roundabout in Washington County. We then built models of a "generic" multi-lane roundabout using each software package's defaults to see how the different packages performed.
Yes, yes – we understand the software companies are going to scream at us – you need to calibrate! But our response is – you can't calibrate to a future alternative that doesn't exist right now. We depend on you to build software models with parameters that will get us close to what that future condition would be. Or at least tell us what data we should collect locally to determine our region's defaults.
Bryant built models with RODEL, HCM 2010, SimTraffic, Synchro, and VISSIM. It's not easy to decide which defaults to use in each software package. Here's Bryant's write-up of the inputs for each model:
Download Roundabout Software Evaluation. And here's a spreadsheet that compares all of the results against the actual p.m. peak hour data measured –
Download Comparison Matrix. NOTE: Internet Explorer seems to have a tough time downloading these files – Firefox and Chrome work. You can email me if you can't get them downloaded – firstname.lastname@example.org.
The punchline is that none of them nailed it. They were all reasonable on the delay results, but the queuing results were pretty far off.
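One way to put a number on "reasonable" versus "pretty far off" is a signed percent error against the field measurement. The sketch below is only an illustration: the function name, the package labels, and the sample values are hypothetical and are not taken from the comparison matrix.

```python
def percent_error(modeled, observed):
    """Signed percent difference of a modeled value from the field value."""
    if observed == 0:
        raise ValueError("observed value must be non-zero")
    return 100.0 * (modeled - observed) / observed


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT results from the study.
    observed_delay_s = 10.0
    modeled_delay_s = {"Package A": 9.0, "Package B": 12.5}

    for name, delay in modeled_delay_s.items():
        print(f"{name}: {percent_error(delay, observed_delay_s):+.1f}%")
```

The same function works for queue lengths, as long as the modeled and observed values are in the same units and describe the same statistic (for example, both maximum queues or both 95th percentile queues).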
This is just the start of this study. Bryant and I want to collect data at a bunch of intersections with different types of traffic control and then test out all of the models. It would be great if we got other folks to pitch in with results from around the country or even the world.
But we didn't want to get too far on this if there's a fatal flaw in our reasoning or methodology. So this post is the start of a conversation. What do you think?
I’m guessing you and Bryant didn’t have access to aaSIDRA to include in the comparison? I used this extensively during my time down in STL (2004-2008) for roundabout analysis. The results from it seemed reasonable; much more so than RODEL at least!
I think it’s understood in the modeling community that you calibrate extensively for existing geometric and volume conditions so you can evaluate the impacts forecasted traffic will have with the same geometrics (a forecasted “no build” alternative). Once you get away from the existing geometrics, then you must apply the existing conditions calibration techniques as best you can for forecasted conditions. If the traffic control type changes completely, then you must calibrate based on local knowledge and experience of similar intersections in the area.
Traffic analysis software packages are just tools in the traffic engineer’s toolbox. Engineers must decide which tool is best for each situation and the proper way to apply it. This includes changing default parameters to best represent local conditions!
We don’t have Sidra, but it looks like we’ll be provided a temporary license so we can add that to the mix.
I agree with you on using calibration to get a feel for local conditions. But then the question is how good a feel the modeler has for local conditions. It’s a very hard thing to quantify or justify. Going by “gut” doesn’t seem to be very defensible.
So starting out, we’re hoping to home in on the software tool that gets closest to the mark out of the box. There will always be room for practitioners to make improvements, and that is somewhat of an art form.
We are also trying to illuminate some of the areas where we could tweak inputs across the board to match up with our local conditions.
How closely did the field counts match the default PHF of 0.92 that was used in the analysis?
The PHF from the p.m. peak hour was 0.91.
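For readers less familiar with the term, the peak hour factor is the hourly volume divided by four times the highest 15-minute volume within that hour. A minimal sketch of the arithmetic follows; the function name is an assumption and the counts are hypothetical, not the Washington County field data.

```python
def peak_hour_factor(counts_15min):
    """PHF = hourly volume / (4 * highest 15-minute volume)."""
    if len(counts_15min) != 4:
        raise ValueError("expected four consecutive 15-minute counts")
    peak = max(counts_15min)
    if peak <= 0:
        raise ValueError("peak 15-minute count must be positive")
    return sum(counts_15min) / (4 * peak)


if __name__ == "__main__":
    # Hypothetical 15-minute entering volumes for one peak hour.
    counts = [450, 500, 480, 390]
    print(f"PHF = {peak_hour_factor(counts):.2f}")
```

A PHF near 1.0 means traffic arrived evenly across the hour. Lower values mean a sharper 15-minute peak.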
Just something that caught my eye… Did you remember to convert the Max Queue output from RODEL from vehicles to ft.? If so, I didn’t know that RODEL would give Max queues shorter than the 95th percentile queues. I’ve seen that in SimTraffic, but didn’t know RODEL would do it as well.
We did do the conversion in RODEL.
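The conversion itself is simple multiplication, but the result depends entirely on the spacing assumed per queued vehicle. The sketch below uses 25 ft per vehicle purely as a placeholder assumption. It is not necessarily the value used in this study, and the function name is hypothetical.

```python
def queue_vehicles_to_feet(queue_veh, spacing_ft=25.0):
    """Convert a queue in vehicles to feet using an assumed spacing per vehicle."""
    if queue_veh < 0:
        raise ValueError("queue length cannot be negative")
    if spacing_ft <= 0:
        raise ValueError("spacing must be positive")
    return queue_veh * spacing_ft


if __name__ == "__main__":
    # 25 ft per vehicle is an ASSUMED placeholder, not a value from the study.
    print(queue_vehicles_to_feet(4), "ft")
```

Whatever spacing is chosen, the same value needs to be applied to every package that reports queues in vehicles, or the comparison in feet will be skewed.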
Observed queues are notoriously difficult to replicate in a model. One learner driver, one stalled vehicle, rain, fog, snow, etc., can affect observed queues but are almost certainly never included in a model.
I’d be very interested in the first instance to understand the variation in observed data. Say, count the same intersection for 100 days at different times of the year. I would expect a relatively simple correlation between demand and queue length, but I’d also expect it to show significant variation.
I would expect a model to be able to replicate the overall correlation. For the deterministic models, they won’t have any of the variation. But, even the stochastic models are unlikely to replicate the vast range of ‘events’ which will contribute to the variation in observed data.
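If repeated counts like these were collected, the overall demand-to-queue relationship and the day-to-day spread could be summarized with a correlation coefficient and a coefficient of variation. The following is a minimal sketch using only the standard library. The function names are assumptions, and the daily observations are hypothetical values made up for illustration.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.fmean(values)


if __name__ == "__main__":
    # Hypothetical daily observations: entering volume (veh/h), max queue (ft).
    demand = [820, 900, 950, 1010, 1100]
    max_queue_ft = [75, 100, 90, 150, 175]
    print(f"r = {pearson_r(demand, max_queue_ft):.2f}")
    print(f"queue CV = {coefficient_of_variation(max_queue_ft):.2f}")
```

A deterministic model could then be judged on whether it tracks the overall relationship, while the coefficient of variation shows how much scatter no single model run should be expected to reproduce.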
Excellent point that I hadn’t really considered. The maximum queue in the data appears to be an outlier. An interesting tangent could be to see how well queuing statistics correlate to peak hour turning movement counts and delay.
I use Paramics for queueing analyses, but share in your general grief over the limitations of all packages.
One issue I have experienced is the variability in driver behavior regionally. Drivers in the Midwest tend to respect the yield sign, while those in California or New Jersey are blind to it. My feeling is that different software packages use different default variables to capture this effect. So you might want to watch out for the driver characteristics such as headways, reaction times, speeds and speed distributions (to the extent they are applicable), to be sure you’re comparing apples to apples across platforms.
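To show how much headway assumptions alone can move a result, here is a minimal sketch of the exponential entry-lane capacity form used for roundabouts in HCM 2010, c = A * exp(-B * v_c), with A = 3600 / t_f and B = (t_c - t_f / 2) / 3600. This illustrates one equation-based method only, not how any of the simulation packages work internally. The headway values below are hypothetical inputs chosen to contrast cautious and aggressive drivers.

```python
import math


def roundabout_entry_capacity(conflicting_flow_pcph, critical_headway_s,
                              follow_up_headway_s):
    """Entry-lane capacity (pc/h) from critical and follow-up headways.

    c = A * exp(-B * v_c), A = 3600 / t_f, B = (t_c - t_f / 2) / 3600
    """
    if critical_headway_s <= 0 or follow_up_headway_s <= 0:
        raise ValueError("headways must be positive")
    a = 3600.0 / follow_up_headway_s
    b = (critical_headway_s - follow_up_headway_s / 2.0) / 3600.0
    return a * math.exp(-b * conflicting_flow_pcph)


if __name__ == "__main__":
    v_c = 600  # hypothetical conflicting circulating flow, pc/h
    # Hypothetical headway pairs (critical, follow-up) in seconds.
    for label, t_c, t_f in [("cautious", 5.2, 3.2), ("aggressive", 4.2, 2.6)]:
        cap = roundabout_entry_capacity(v_c, t_c, t_f)
        print(f"{label}: {cap:.0f} pc/h")
```

Measuring critical and follow-up headways locally is one concrete form of the "data we should collect locally" asked for in the post, since those two values set both constants in the equation.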
Fascinating research, I will be watching for further updates.
The best software is the one you know how to use. This may be an obvious comment, but in 25 years, I have not come across default values that represent local conditions. Could be something to do with the way people drive in Africa. However, it would be interesting to see which software does come close.
Great review. I recently saw a similar study that attempted to do the same thing (I’ll try to dig up the link).
You’re comparing static models with stochastic/microsimulation models. It’s a nice comparison. I often use VISSIM and SIDRA to model roundabouts and it’s nice to see when they show similar results.
I noticed you didn’t have an actual capacity for VISSIM or SimTraffic. You can actually go into the VISSIM model and adjust certain behavior settings and gap acceptance settings to create the capacity that you need to achieve (or a saturation flow rate). This is one key element to the calibration of a VISSIM model. I find the actual calibration of a roundabout pretty difficult because I’m usually modeling a proposed condition where one doesn’t already exist.
While I like the simplicity of the equation based models, I think microsimulation can capture the nuances a little bit better.
I disagree that you can’t calibrate to something that doesn’t exist… That’s because you are not supposed to… you are supposed to calibrate to the existing condition then carry that calibration forward to the proposed condition (i.e. what doesn’t exist). The extent of your calibration parameters is up to the level of detail that you want. Furthermore, queue and queuing theory is a very complicated subject based on various statistical theories, and caution should be used when attempting to use queuing as a calibration parameter. I have read many white papers over the years on queuing and they all seem to provide an approximation of queuing based on a degree of probability (i.e. 90th percentile queue) and can have results that vary significantly.
Since we (traffic engineers) generally need to relay our findings to decision makers in easily understood terms such as Level of Service (LOS), I think most modeling should be calibrated using delay and speed, provided there is no “Latent Demand,” a subject often overlooked and important in heavily congested urban areas.
PS We generally use queue distances or the back of queue as a “Soft” parameter when modeling.
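For observed data, a percentile queue such as the 90th or 95th can be read directly from a set of queue samples. The sketch below uses the nearest-rank method, which is one of several percentile definitions and an assumption here. The function name is hypothetical and the samples are made-up values for illustration.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical back-of-queue observations in feet.
    queues_ft = [0, 25, 25, 50, 50, 50, 75, 75, 100, 100,
                 100, 125, 125, 150, 150, 175, 200, 225, 250, 400]
    print("95th percentile:", percentile_nearest_rank(queues_ft, 95), "ft")
    print("maximum:", max(queues_ft), "ft")
```

Computed this way from a single sample, the maximum can never be shorter than the 95th percentile, so a package that reports the reverse is presumably deriving the two figures by different methods.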