Thursday, July 14, 2016

Estimating 70.3 Triathlon Time from Olympic Triathlon Time

Setting goals can be difficult when moving to a new race distance. I recently read an article which had some example calculations for setting half-Ironman goals based on Olympic-distance results, and I decided to run some regressions to see what equations I could come up with based on actual race data.

I found two races, an Olympic triathlon and a 70.3, that were reasonably close to Chicago and about one month apart, and I ran a computer script to identify athletes who competed in both.  I then ran some regressions to estimate the expected 70.3 times from the Olympic times.
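The matching step can be sketched roughly as follows. This is not the script I used, just a minimal illustration that assumes each race's results are a list of dicts with a "name" key (a hypothetical layout) and that matching on a normalized name is good enough. Names that appear more than once within a race are skipped, since they can't be matched reliably.

```python
def normalize(name):
    """Lowercase a name and collapse whitespace so 'Jane  Doe' matches 'jane doe'."""
    return " ".join(name.lower().split())


def index_by_name(results):
    """Map normalized name -> result row, dropping names that occur more than once."""
    by_name = {}
    duplicates = set()
    for row in results:
        key = normalize(row["name"])
        if key in by_name:
            duplicates.add(key)
        by_name[key] = row
    return {k: v for k, v in by_name.items() if k not in duplicates}


def match_athletes(olympic_results, half_results):
    """Return (name, olympic_row, half_row) for athletes found in both races."""
    oly = index_by_name(olympic_results)
    half = index_by_name(half_results)
    common = sorted(oly.keys() & half.keys())
    return [(name, oly[name], half[name]) for name in common]
```

Matching on name alone can produce false matches (two different people with the same name), so in practice it helps to also compare age group or hometown where the results include them.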

The Races to Compare


ET Lake Zurich Olympic Triathlon
July 12, 2015
Swim: 1500m
Bike: 24.9 miles
Run: 6.2 miles
Finishers: 433 (Age Group Category)

Ironman 70.3 Steelhead
August 9, 2015
Swim: 1.2 miles
Bike: 56 miles
Run: 13.1 miles
Finishers: 2043

In comparing the results of these races, I was able to identify 57 athletes (one was omitted from the final analysis) who competed in the "Age Group" category in both races.

Regression Analysis


I ran some simple regressions using the results for the athletes who completed both races.  The Olympic splits were used as the independent variables, and the Half Ironman splits were used as the dependent variables.  Basically, each regression provides an equation to calculate the expected Half Ironman distance time based on the Olympic distance time.
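For anyone who wants to try this on their own data, here is a minimal pure-Python sketch of a simple least-squares fit. The function name fit_line is my own for illustration, and it assumes x holds the Olympic times and y the Half Ironman times for the same athletes, in the same order. It returns the intercept, the slope, and the residual standard error (computed with n - 2 degrees of freedom).

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, residual_standard_error).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals are the observed times minus the fitted times.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rse = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return intercept, slope, rse
```

An expected Half Ironman time for a new athlete is then intercept + slope * olympic_time.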

All times in the equations and plots are in minutes!
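Race results are usually published as clock times, so they need converting before they go into an equation. A small helper along these lines would do it, assuming the times are formatted as "H:MM:SS" or "MM:SS" (the function name is just for illustration):

```python
def to_minutes(clock):
    """Convert a clock time like '1:30:30' or '32:15' to decimal minutes."""
    parts = [float(p) for p in clock.strip().split(":")]
    if len(parts) == 3:
        hours, minutes, seconds = parts
    elif len(parts) == 2:
        hours = 0.0
        minutes, seconds = parts
    else:
        raise ValueError("expected H:MM:SS or MM:SS, got %r" % clock)
    return hours * 60 + minutes + seconds / 60
```

For example, to_minutes("1:30:30") gives 90.5.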

Swim Regression



The "residual standard error" of this regression is about 2 minutes. Roughly speaking, and assuming the residuals are approximately normally distributed, about 68% of swim results fall within +/- 2 minutes of the estimate provided by the equation, and about 95% fall within +/- 4 minutes (two times the standard error).
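The rule of thumb above is easy to turn into a range around any estimate. The sketch below is only a rough illustration: it ignores the uncertainty in the fitted coefficients themselves, and the 40-minute estimate in the usage note is a made-up number, not a result from these races.

```python
def rough_bands(estimate, rse):
    """Return approximate 68% and 95% ranges around an estimate, in minutes.

    Uses +/- 1 and +/- 2 residual standard errors as a rule of thumb.
    """
    return {
        "68%": (estimate - rse, estimate + rse),
        "95%": (estimate - 2 * rse, estimate + 2 * rse),
    }
```

With a hypothetical swim estimate of 40 minutes and the residual standard error of about 2 minutes, rough_bands(40.0, 2.0) gives 38 to 42 minutes for the 68% range and 36 to 44 minutes for the 95% range.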

Bike Regression



The residual standard error for the bike time estimate is about 10 minutes.

Run Regression



The residual standard error for the run time estimate is about 12 minutes.

Overall Regression



The residual standard error for the overall time estimate is about 18 minutes.

Conclusion


These results ended up matching fairly well with the example calculations in the article noted earlier. Of course, in practice, there are many additional factors which could affect the results when applying these equations to other Olympic and Half Ironman events, so use with caution! I may try to find some other pairs of events to check how similar the estimates turn out to be.  I'd also like to run some similar regressions to compare Half Ironman and full Ironman times.

One other note: I did run some multiple regressions using the swim, bike, and run splits from the Olympic race to estimate the Half Ironman overall time, rather than just directly estimating one overall time from the other.  I expected this might give a better result since the individual components don't all scale by the same amount between the two events.  However, the result was only slightly better than the simple regression equation I posted above, so I opted to stick with the simpler model.
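For reference, a multiple regression of this kind can be sketched in pure Python by solving the normal equations. This is an illustration rather than the code I ran: the function names are my own, each row is assumed to be a (swim, bike, run) tuple of Olympic splits in minutes, and y is the matching list of Half Ironman overall times. A statistics package would normally do this for you and would be more numerically robust.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple(rows, y):
    """Least-squares fit of y = b0 + b1*swim + b2*bike + b3*run.

    Returns ([b0, b1, b2, b3], residual_standard_error).
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    if len(y) <= k:
        raise ValueError("need more observations than coefficients")
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, d)) for d in design]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    rse = math.sqrt(rss / (len(y) - k))
    return coefs, rse
```

Comparing the residual standard error from fit_multiple with the one from the simple overall regression is one way to judge whether the extra variables are worth it.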


