big at age forty, but by age eighty the risk more than doubles, from under 30 percent to over 60 percent. This is a clean and accurate way to present the data. But suppose you’re a young fourteen-year-old smoker who wants to convince your parents that you should be allowed to smoke. This graph is clearly not going to help you. So you dig deep into your bag of tricks and use the double y-axis, adding a y-axis to the right-hand side of the graph frame, with a different scaling factor that applies only to the nonsmokers. Once you do that, your graph looks like this:
From this, it looks like you’re just as likely to die from smoking as from not smoking. Smoking won’t harm you—old age will! The trouble with double y-axis graphs is that you can always scale the second axis any way that you choose.
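To see why the choice of scale matters so much, here is a minimal Python sketch. It assumes a simple linear axis, and the helper names (`plotted_height`, `apparent_rise`) and all the numbers are invented for illustration; none of them come from the graphs discussed here. It measures how far up the plot a line appears to climb when the same values are drawn against two different axis ranges.

```python
def plotted_height(value, axis_min, axis_max):
    """Position of a value on a linear axis, as a fraction of plot height
    (0 = bottom of the frame, 1 = top)."""
    return (value - axis_min) / (axis_max - axis_min)


def apparent_rise(series, axis_min, axis_max):
    """How much of the plot height the line climbs from its first point to its last."""
    start = plotted_height(series[0], axis_min, axis_max)
    end = plotted_height(series[-1], axis_min, axis_max)
    return end - start


# Hypothetical values, made up for this example only.
series = [10, 12, 14, 16, 18, 20]

# Same data, two arbitrary choices of axis range.
wide_axis = apparent_rise(series, axis_min=0, axis_max=100)
tight_axis = apparent_rise(series, axis_min=10, axis_max=20)

print(f"axis 0 to 100: line climbs {wide_axis:.0%} of the plot height")
print(f"axis 10 to 20: line climbs {tight_axis:.0%} of the plot height")
```

On the wide axis the line covers only a tenth of the frame and looks nearly flat; on the tight axis the very same numbers sweep from the bottom of the frame to the top. A second y-axis lets the graph maker make that choice separately for each line.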
Forbes magazine, a venerable and typically reliable news source, ran a graph very much like this one to show the relation between expenditures per public school student and those students’ scores on the SAT, a widely used standardized test for college admission in the United States.
From the graph, it looks as though increasing the money spent per student (black line) doesn’t do anything to increase their SAT scores (gray line). The story that some anti–government spending politicos could tell about this is one of wasted taxpayer funds. But you now understand that the choice of scale for the second (right-hand) y-axis is arbitrary. If you were a school administrator, you might simply take the exact same data, change the scale of the right-hand axis, and voilà—increasing spending delivers a better education, as evidenced by the increase in SAT scores!
This graph obviously tells a very different story. Which one is true? You’d need to have a measure of how the one variable changes as a function of the other, a statistic known as a correlation. Correlations range from −1 to 1. A correlation of 0 means that one variable is not related to the other at all. A correlation of −1 means that as one variable goes up, the other goes down, in precise synchrony. A correlation of 1 means that as one variable goes up, the other does too, also in precise synchrony. The first graph appears to be illustrating a correlation of 0; the second graph appears to be representing one that is close to 1. The actual correlation for this dataset is .91, a very strong correlation. Spending more on students is, at least in this dataset, associated with better SAT scores.
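As a rough illustration of those three reference points, here is a minimal Python sketch. It assumes the correlation meant here is the standard Pearson coefficient, which measures straight-line association. The function name `pearson` and every number below are invented for the example; this is not the Forbes dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together, relative to their own averages.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical data, made up for this example only.
x = [1, 2, 3, 4, 5]
rises_with_x = [2, 4, 6, 8, 10]   # goes up in lockstep with x
falls_with_x = [10, 8, 6, 4, 2]   # goes down in lockstep with x
no_trend = [5, 2, 1, 2, 5]        # no overall upward or downward trend

print(pearson(x, rises_with_x))
print(pearson(x, falls_with_x))
print(pearson(x, no_trend))
```

The first pair gives a correlation of 1, the second gives −1, and the third gives 0. Unlike the apparent slope of a line on a graph, none of these values change if you rescale either variable's axis.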
Squaring the correlation provides a good estimate of how much of the variation in the result can be explained by the variables you’re looking at. The correlation of .91, squared, is about .83, which tells us we can explain roughly 83 percent of the variation in students’ SAT scores by looking at the amount of school expenditures per student. That is, it tells us to what extent expenditures explain the diversity in SAT scores.
A controversy about the double y-axis graph erupted in the fall of 2015 during a U.S. congressional committee meeting. Rep. Jason Chaffetz presented a graph that plotted two services provided by the organization Planned Parenthood: abortions, and cancer screening and prevention:
The congressman was attempting to make a political point, that over a seven-year period, Planned Parenthood has increased the number of abortions it performed (something he opposes) and decreased the number of cancer screening and prevention procedures. Planned Parenthood doesn’t deny this, but this distorted graph makes it seem that the number of abortion procedures exceeded those for cancer. Maybe the graph maker was feeling a bit guilty and so included the actual numbers next to the data points. Let’s accept her bread crumbs and look closely. The number of abortions in 2013, the most recent year given, is 327,000. The number of cancer services was nearly three times that, at 935,573. (By the way, it’s a bit suspicious that the abortion numbers are such tidy, round numbers while the cancer numbers are so precise.) This is a particularly