Running Prediction Models Explained: Riegel, Cameron, and Daniels VDOT
If you've ever used a race time calculator, you've probably been handed a single number with no explanation of where it came from. Different calculators sometimes give different answers for the same input. Understanding why helps you know when to trust a prediction — and when to be skeptical.
Three models dominate serious running prediction: the Riegel formula, the Cameron formula, and Daniels' VDOT-based approach. Each has different strengths, different blind spots, and different scenarios where it performs best.
The Riegel Formula
In 1977, Peter Riegel published a study proposing that the relationship between running speed and distance follows a power law. His formula predicts how your performance at one distance translates to another based on how human fatigue compounds with distance.
The intuition: your pace degrades as distance increases, and it does so consistently enough to model mathematically. The formula scales a known time by the ratio of distances raised to an exponent, about 1.06 in Riegel's fit, which reflects the average human fatigue rate across distances.
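In code, the whole model is one line. A minimal sketch in Python (the function name and the 10K-to-half example are illustrative; the 1.06 default is Riegel's published average exponent):

```python
def riegel_predict(t1: float, d1: float, d2: float, k: float = 1.06) -> float:
    """Riegel's power law: T2 = T1 * (D2 / D1) ** k.

    t1 is a known race time (seconds) at distance d1. Distances can be
    in any unit as long as d1 and d2 use the same one. The exponent k
    defaults to Riegel's average of 1.06.
    """
    return t1 * (d2 / d1) ** k

# Example: project a 45:00 10K onto a half marathon (21.0975 km).
half_seconds = riegel_predict(45 * 60, 10.0, 21.0975)
print(f"{half_seconds / 60:.1f} min")  # ~99.3 min, about 1:39:17
```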
Where it works best. Middle-range predictions — going from a 10K to a half marathon, or a half marathon to a marathon. When the reference race and the target race are close in distance, Riegel tends to be reliable.
Where it struggles. At the extremes. Predicting a marathon from a 5K often produces an overly optimistic result, because the fatigue dynamics at 42.2 km aren't the same as at 5 km. The further you extrapolate, the wider the error band becomes.
The Cameron Formula
Dave Cameron developed a refinement of Riegel's approach using a different mathematical structure to model the distance-performance relationship. The key improvement is in how it handles longer predictions — particularly the ones most runners care about: translating a 10K or half marathon into a marathon estimate.
Cameron's formula tends to be more conservative than Riegel's at long-range predictions, which also makes it more realistic. It's less likely to tell a 10K runner they're capable of a 2:45 marathon.
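For reference, here is a sketch of Cameron's model using the coefficient set that circulates in public race-calculator implementations; treat the constants as the commonly cited values rather than anything verified against Cameron's original write-up. The curve f(x) falls as distance grows, so the reference pace gets scaled up for longer targets:

```python
def cameron_predict(t1: float, d1: float, d2: float) -> float:
    """Predict a time (seconds) at d2 meters from a time t1 at d1 meters.

    f(x) models sustainable "speed endurance" at distance x; scaling the
    reference pace by f(d1) / f(d2) slows it down for longer races.
    """
    def f(x: float) -> float:
        return 13.49681 - 0.000030363 * x + 835.7114 / (x ** 0.7905)

    return (t1 / d1) * (f(d1) / f(d2)) * d2

# The same 45:00 10K projected to a marathon.
marathon_s = cameron_predict(45 * 60, 10_000, 42_195)
print(f"{marathon_s / 3600:.2f} h")  # ~3.51 h vs. Riegel's more optimistic ~3.45 h
```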
Where it works best. Longer prediction ranges, especially when predicting marathon time from shorter reference races.
Where it struggles. Like Riegel, it's a pure performance model — it knows nothing about your training, the weather, or your individual physiology. Two runners with identical 10K times might have very different marathons, and Cameron can't distinguish between them.
Daniels VDOT
Jack Daniels, the coach behind Daniels' Running Formula, developed the VDOT system with Jimmy Gilbert as a way to estimate a runner's effective aerobic capacity from race performances.
Rather than directly scaling times, VDOT converts your race performance into a pseudo-VO2max score (VO2max is the maximum rate at which your body can take in and use oxygen) and then uses that score to look up expected times at other distances. It's grounded in exercise physiology rather than pure statistics.
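Under the hood are two published Daniels/Gilbert equations: one for the oxygen cost of running at a given speed, and one for the fraction of VO2max a runner can sustain for a race of a given duration. A sketch follows; the two equations are the published ones, while the bisection search for a predicted time is an illustrative stand-in for Daniels' printed tables:

```python
import math

def vdot_from_race(distance_m: float, time_min: float) -> float:
    """Estimate VDOT from one race using the Daniels/Gilbert equations."""
    v = distance_m / time_min  # running speed in meters per minute
    # Oxygen cost of running at speed v (ml/kg/min).
    vo2 = -4.60 + 0.182258 * v + 0.000104 * v ** 2
    # Fraction of VO2max sustainable for a race lasting time_min minutes.
    frac = (0.8
            + 0.1894393 * math.exp(-0.012778 * time_min)
            + 0.2989558 * math.exp(-0.1932605 * time_min))
    return vo2 / frac

def predict_time(vdot: float, distance_m: float) -> float:
    """Find the time (minutes) at which distance_m implies the given VDOT.

    Implied VDOT falls as the time gets slower, so a bisection over a
    wide time bracket converges on the predicted time.
    """
    lo, hi = 4.0, 1000.0  # minutes
    for _ in range(60):
        mid = (lo + hi) / 2
        if vdot_from_race(distance_m, mid) > vdot:
            lo = mid  # mid implies more fitness than we have: go slower
        else:
            hi = mid
    return (lo + hi) / 2

vdot = vdot_from_race(10_000, 45.0)        # ~45.3 from a 45:00 10K
marathon_min = predict_time(vdot, 42_195)  # ~207 min, about 3:27
```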
Where it works best. Comparisons across a wide range of distances. Because VDOT is anchored to physiology, it tends to produce consistent, well-grounded predictions. It also implicitly accounts for running economy: not all runners with the same VO2max run at the same pace, and the VDOT tables reflect real-world performance data.
Where it struggles. As a lookup-table approach, it can be less precise at unusual input values or edge-case distances. It also shares the same limitation as Riegel and Cameron: without additional context about training or conditions, it's predicting from fitness alone.
Why blending works better than picking one
Each model has different strengths and different assumptions baked in. On a given prediction, they may agree closely — or they may diverge by several minutes, especially on long-range extrapolations where small differences in assumptions compound.
The most robust approach is to weight and blend all three models rather than betting on one. When they agree, confidence is high. When they diverge, a weighted blend produces a middle estimate, and the spread between them gives you a natural measure of prediction uncertainty — which is exactly what a confidence range should reflect.
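A minimal sketch of the blend-and-spread idea; the weights below are invented placeholders for illustration, not RaceCast's actual scheme:

```python
def blend(predictions: dict[str, float],
          weights: dict[str, float]) -> tuple[float, float, float]:
    """Combine per-model predictions (seconds) into a weighted estimate.

    Returns (estimate, fastest, slowest); the fastest-to-slowest spread
    is a natural stand-in for prediction uncertainty.
    """
    total = sum(weights[m] for m in predictions)
    estimate = sum(predictions[m] * weights[m] for m in predictions) / total
    return estimate, min(predictions.values()), max(predictions.values())

# Marathon predictions (seconds) from the earlier 45:00 10K examples.
preds = {"riegel": 12421, "cameron": 12651, "vdot": 12438}
est, fast, slow = blend(preds, {"riegel": 1.0, "cameron": 1.5, "vdot": 1.0})
print(f"{est/3600:.2f} h, range {fast/3600:.2f}-{slow/3600:.2f} h")
# ~3.48 h, range 3.45-3.51 h
```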
That's the approach RaceCast uses: all three models run in parallel, their outputs are blended based on how much data you've provided, and the spread between them informs the confidence range in your result.
The limits of any formula
Every model shares the same fundamental limitation: it's built on population averages, not on you specifically.
Two runners with identical 10K times may have very different marathons because of training background, running economy, glycogen efficiency, and race-day execution. The formula is a starting point — what makes it actually useful is the context you add:
- Two race results instead of one. Two data points reveal your personal fatigue curve rather than assuming you match the average (see the sketch after this list).
- Training load. A runner doing 80km per week isn't the same as one doing 30km per week, even if their last race was identical.
- Race-day conditions. Heat, elevation gain, and course profile all shift the real outcome in ways that pure performance models ignore.
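On the first point: with two results you can solve Riegel's power law for your own exponent instead of assuming the 1.06 average. A small sketch:

```python
import math

def personal_exponent(t1: float, d1: float, t2: float, d2: float) -> float:
    """Solve T2 = T1 * (D2 / D1) ** k for k using two race results.

    t1 and t2 are times (any unit) at distances d1 and d2 (any matching
    unit), with d2 the longer race.
    """
    return math.log(t2 / t1) / math.log(d2 / d1)

# Example: a 21:30 5K and a 45:00 10K.
k = personal_exponent(21.5 * 60, 5.0, 45 * 60, 10.0)
print(round(k, 3))  # ~1.066: fades slightly faster than the 1.06 average
```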
The formula is the foundation. The adjustments are what turn a generic estimate into a prediction you can actually race to.