Of That

Brandt Redd on Education, Technology, Energy, and Trust

30 November 2012

Learning from Data - An Automotive Example

Monday saw me driving 800 miles home from a family Thanksgiving celebration. Due to my wife's change in plans, my only companion for the drive was our small dog (who had a narrow escape the day before). I needed something to keep my attention. So I decided to perform an experiment in data collection. I learned a lot even from a small data sample.

The vehicle I was driving was a 2010 Subaru Forester. Some friends have the same vehicle and have been pleased with getting around 27 MPG on the highway. We typically get only 23-24 MPG on the highway and I had been wondering why.

Among the features of this car is an average gas mileage display that's tied to the trip odometer. So, sampling the gas mileage is as simple as setting the cruise control, resetting the trip odometer, driving a set distance and reading out the result. As I was crossing the relatively flat plains of Idaho (speed limit 75) this seemed to be a good opportunity to gather some data.

Over a period of several hours, I took a bunch of samples following the above method and using my GPS to track altitude changes. I abandoned samples where the altitude change was more than a few hundred feet. The result is 27 good samples. I've posted the raw data here in case you want to play with them. Most of the samples are for 20 mile segments but some are as long as 40 and some are as short as 5 miles.

As you can imagine, the lower-speed samples got a bit tedious. But I was curious enough that I even took a side trip on a remote road (off the freeway) to get samples below 45 MPH. There's considerable variability in those results as you can see in the plot below. Halfway through the trip I refilled with fuel. I switched from regular (87 octane) to premium (92 octane) to see how that might affect mileage.

[Figure: 2010 Subaru Forester Fuel Economy vs. Speed]

The results certainly aren't what I expected. EPA city and highway ratings have always led me to expect relatively flat miles per gallon across highway speeds, with city driving being much poorer due to stop-and-go traffic. Instead, I got a nearly linear downward slope. I used Excel's curve-fitting feature and found that a polynomial curve worked better than a line. The formula is embedded in the graph above.
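The line-versus-polynomial comparison can also be reproduced outside Excel. The following is a minimal Python sketch, not the spreadsheet used for this post. The helper names `fit_polynomial`, `sum_squared_error`, and `peak_speed` are hypothetical, the polynomial is assumed to be a quadratic (the post does not state the degree), and the sample points are synthetic values generated from a made-up curve, not the 27 measured samples.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit. Returns coefficients, lowest power first."""
    n = degree + 1
    # Rescale x to roughly [-1, 1] so the normal equations stay well conditioned.
    scale = max(abs(x) for x in xs) or 1.0
    us = [x / scale for x in xs]
    # Augmented normal equations (A^T A | A^T y), with A[i][j] = us[i] ** j.
    aug = [
        [sum(u ** (r + c) for u in us) for c in range(n)]
        + [sum(y * u ** r for u, y in zip(us, ys))]
        for r in range(n)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    scaled = [0.0] * n
    for r in reversed(range(n)):
        known = sum(aug[r][k] * scaled[k] for k in range(r + 1, n))
        scaled[r] = (aug[r][n] - known) / aug[r][r]
    # Undo the rescaling: a coefficient on u**j becomes one on x**j.
    return [c / scale ** j for j, c in enumerate(scaled)]


def sum_squared_error(coeffs, xs, ys):
    """Sum of squared residuals between the fitted curve and the samples."""
    def predict(x):
        return sum(c * x ** j for j, c in enumerate(coeffs))
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


def peak_speed(quadratic_coeffs):
    """Speed at the vertex of a quadratic c0 + c1*v + c2*v**2 (a peak if c2 < 0)."""
    _, c1, c2 = quadratic_coeffs
    return -c1 / (2 * c2)


# SYNTHETIC placeholder data from a made-up curve; NOT the post's measurements.
SPEEDS = [35, 40, 45, 50, 55, 60, 65, 70, 75]
MPGS = [20.0 + 0.6 * v - 0.007 * v * v for v in SPEEDS]

line = fit_polynomial(SPEEDS, MPGS, 1)
quad = fit_polynomial(SPEEDS, MPGS, 2)
print("line error:", sum_squared_error(line, SPEEDS, MPGS))
print("quadratic error:", sum_squared_error(quad, SPEEDS, MPGS))
print("speed at peak economy:", peak_speed(quad))
```

To try it on real measurements, replace `SPEEDS` and `MPGS` with the speed and MPG columns from the raw data. A lower squared error for the quadratic than for the line would be consistent with the observation that a polynomial fits better.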

More data points would be required to really validate this curve but it certainly fits within the margin of error of my samples. Therefore we can make some cost estimates using this formula. Notable is that peak economy is between 40 and 45 MPH – much slower than I had expected. From this I was able to calculate the cost of each hour saved on this long drive.

Here's a table of results for an 800-mile drive in a 2010 Subaru Forester with an average fuel cost of $3.759 per gallon. Time saved and additional cost are measured from a baseline speed of 55 MPH. Note that the distance cancels out in the Cost Per Hour Saved, so those numbers are accurate regardless of the length of the trip.

Speed | MPG | Fuel | Cost | Time | Hrs | Cost Per Hr Saved
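The arithmetic behind the Cost Per Hr Saved column can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the spreadsheet behind the table. The function name `cost_per_hour_saved` is hypothetical, and the MPG figures in the usage lines are placeholders rather than values from the fitted curve. Only the $3.759 fuel price, the 55 MPH baseline, and the 800-mile distance come from the post.

```python
def cost_per_hour_saved(speed, mpg, base_speed, base_mpg,
                        price_per_gallon, distance=800.0):
    """Extra fuel dollars spent per hour of driving time saved.

    speed, base_speed: MPH (must differ); mpg, base_mpg: economy at each speed.
    """
    # Extra fuel cost relative to the baseline speed over the same distance.
    extra_fuel_cost = price_per_gallon * distance * (1.0 / mpg - 1.0 / base_mpg)
    # Driving time saved relative to the baseline speed.
    hours_saved = distance / base_speed - distance / speed
    # Distance appears in both numerator and denominator, so it cancels.
    return extra_fuel_cost / hours_saved


# Placeholder MPG values (30 at 55 MPH, 24 at 75 MPH) are assumptions.
long_trip = cost_per_hour_saved(75, 24.0, 55, 30.0, 3.759, distance=800.0)
short_trip = cost_per_hour_saved(75, 24.0, 55, 30.0, 3.759, distance=100.0)
print(long_trip, short_trip)
```

Both calls give the same figure, which is the distance-cancelling property described above. Substituting MPG values read from the fitted curve would give the per-speed numbers for a specific vehicle.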

Here are a few things I've learned from this:

  • My friend with the other Forester drives slower on the highway than I do.
  • I had not known how sensitive vehicle gas mileage is to speed.
  • Everything I have read led me to expect no benefit from higher octane fuel once the vehicle's requirements have been met. In the case of the Forester, higher octane actually reduced fuel economy. This observation is confirmed by the official EPA ratings.
  • I would love to see tables like the one above before purchasing my next car.

I learned a lot from this tiny data sample and my future driving habits will be changed accordingly. Now imagine what we could learn if there were a large public database of fuel economy data. Car manufacturers could optimize for specific driving patterns. Consumers would be better informed about fuel economies to expect. There are fleet tracking devices like this one that are already reporting that data but it's locked up in private databases. If anonymous fuel economy data (speed, distance, altitude and MPG) were released there's a lot we could learn about fuel economy under a variety of conditions.

I can't wrap this post up without relating it to education. A relatively small data sample taught me a lot and will impact my future driving behavior. In the same way, it doesn't take a lot of data fed back to students and teachers before they see opportunities to improve. And when we grow from little data to big data, revolutionary changes are on the horizon.


  1. This is great data! I really wish there was more of it for more vehicles. I'm not surprised at all at the drop in fuel economy with speed. The drag caused by area of the car to the air is squared with change in velocity, isn't it? But the premium gasoline really surprised me. Really? Higher octane was worse? We've been doing an experiment with higher octane fuel on our Honda Odyssey, which says it requires 87 octane or higher. I believe we get about 3 MPG better performance with 91 octane, but I wish I could gather better data than a whole tank of gas in varying circumstances.

    Another question: you said you threw out the data if the altitude varied much during a segment. Did the altitude change much from the beginning of the trip to the end? Was the altitude (and therefore oxygen availability) significantly different for that second tank of gas?

    Thanks for sharing your findings!

  2. I started the trip in the neighborhood of 4500 feet going over a couple of passes (where samples were rejected) but otherwise gradually descending to about 1200 feet before going over the Cascade range and ending at sea level. In the samples I had, I didn't detect a noticeable effect from altitude.

    With more data, of course, it should be possible to detect a difference. Air friction would be reduced at higher altitude while I believe that internal combustion engines are more efficient at lower altitudes. It would be interesting to find out which effect is greater.

  3. Can't help but compare the EPA ratings to the Motion Picture Association movie ratings. In both cases they try to distill down an awful lot of nuanced information into a single rating, that taken alone is almost meaningless. At least the EPA isn't run by the very companies it is providing ratings for...