International Math Study Doesn't Add Up
January 11, 2011

Review finds study comparing U.S. math achievement with other countries misleading and deceptive

Jeremy Kilpatrick
(706) 542-4163

Teri Battaglieri
(517) 203-2940

EAST LANSING, MI. (Jan. 11, 2011) – While a recent report asserts that U.S. students are far behind students in most other developed countries in mathematics achievement, a new review of the study finds that it makes deceptive comparisons and exaggerates small differences, rendering the study useless in helping educators improve U.S. students' math performance.

U.S. Math Performance in Global Perspective: How Well Does Each State Do at Producing High-Achieving Students was reviewed for the Think Twice think tank review project by Jeremy Kilpatrick, Regents Professor of Mathematics Education at the University of Georgia. Professor Kilpatrick has extensive background and expertise in national and transnational testing of student mathematics achievement.

The review was produced by the National Education Policy Center (NEPC), housed at the University of Colorado at Boulder School of Education, with funding from the Great Lakes Center for Education Research and Practice.

U.S. Math Performance in Global Perspective, published by the Harvard University Program on Education Policy and Governance and by the journal Education Next, compares the performance of high-achieving students domestically and internationally, using data from the 2005 National Assessment of Educational Progress (NAEP) and the Program for International Student Assessment (PISA). NAEP is administered domestically to a sample of eighth-grade students throughout the U.S.; PISA is administered internationally to a sample of 15-year-olds in Organization for Economic Co-operation and Development (OECD) member countries and partner countries.

The study compared the percentages of U.S. students in the 50 states and 10 urban districts who performed at an advanced level in mathematics on the 2005 NAEP with estimated percentages of students in other countries who would have reached that same level had they taken the NAEP 2005 mathematics assessment. Samples of U.S. students in the graduating class of 2009 had taken NAEP 2005 when they were eighth graders and PISA 2006 when they were tenth graders, so a statistical procedure could be used to calculate the PISA mathematics score that was presumed equivalent to scoring at an advanced level on NAEP. It was thus possible to estimate what percentage of students in each of the 57 countries in PISA 2006 would have attained that advanced NAEP level. The report indicates that most of those countries had far greater percentages of high achievers than the United States did.
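The linking logic described above can be sketched roughly as follows. This is only an illustrative reconstruction, not the report's actual procedure or data: the score distributions, the advanced-level share, and the country names are made-up stand-ins chosen solely to show the mechanics of percentile-based linking.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical score samples standing in for the real NAEP/PISA data
# (assumption: these distributions are invented for illustration only).
us_pisa_scores = rng.normal(474, 90, 10_000)       # simulated U.S. PISA math scores
other_pisa_scores = rng.normal(547, 93, 10_000)    # simulated comparison-country scores

# Illustrative share of U.S. students at NAEP's "advanced" level.
naep_advanced_pct = 6.0

# Step 1: find the PISA score that the same top share of U.S. students
# reaches or exceeds -- this is the presumed NAEP-equivalent cutoff.
pisa_cutoff = np.percentile(us_pisa_scores, 100 - naep_advanced_pct)

# Step 2: apply that cutoff to another country's PISA distribution to
# estimate what share of its students would have scored "advanced" on NAEP.
other_advanced_pct = (other_pisa_scores >= pisa_cutoff).mean() * 100

print(f"PISA cutoff linked to NAEP advanced level: {pisa_cutoff:.0f}")
print(f"Estimated advanced share in comparison country: {other_advanced_pct:.1f}%")
```

As Kilpatrick's review argues, a percentile match of this kind presumes the two tests measure the same thing in comparable populations, an assumption the review finds unwarranted for NAEP and PISA.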

In his review, Kilpatrick finds six areas of concern:

  1. NAEP is taken by eighth graders, whereas PISA is taken by 15-year-olds in each country, not all of whom will be in the same graduating class.
  2. The mathematical proficiency of those students who graduated in the class of 2009 was likely rather different from what it had been in 2005 or 2006.
  3. To rank districts and states measured on one scale against countries measured on another gives a misleading picture of how they are performing, especially when the scores come from one tail of a distribution (the very highest achievers) and are estimated using statistical linking.
  4. Although a statistical linkage had in the past been possible between NAEP and previous international tests that had been administered to equivalent representative samples of students in mathematics at the same time and in the same grades, no similar basis exists to create a linkage between PISA and NAEP.
  5. NAEP mathematics is designed to measure the knowledge, skills, and competencies needed by U.S. students in mathematics at different grades, whereas PISA is not intended to be tied to the school mathematics curriculum, calling instead on the ability to use and apply knowledge and skills in real-world situations; the study thus tries to mesh the results of two different tests that measure different domains of mathematics proficiency.
  6. Mathematics was a so-called minor domain in PISA 2006, which means that relatively few questions were asked and there was likely substantial variability within countries, raising unresolved reliability and validity concerns.

Kilpatrick doesn't argue with the report's claim that relatively few students in the U.S. score at advanced levels in math. "What is misleading is that the percentages of advanced-level students in countries, states, and districts can be put on the same scale so that they can easily be compared," he writes.

Moreover, such misleading comparisons aren't necessary, according to Kilpatrick: "At the website of the National Center for Education Statistics, state and urban district policymakers can obtain ample data on how their students at all levels of proficiency have been performing in NAEP mathematics. Those policymakers do not need, and should avoid, this flawed effort to enter their high-performing graduating seniors of 2009 into a mock international horse race in which they did not participate."

Find Jeremy Kilpatrick's review and a link to U.S. Math Performance in Global Perspective at:

The Think Twice think tank review project, a project of the National Education Policy Center, provides the public, policy makers and the press with timely, academically sound reviews of selected think tank publications. The project is made possible in part by the generous support of the Great Lakes Center for Education Research and Practice.

The review is also available on the National Education Policy Center website at:


The mission of the Great Lakes Center is to improve public education for all students in the Great Lakes region through the support and dissemination of high quality, academically sound research on education policy and practices.

Visit the Great Lakes Center Web Site at: