Mueller & Santos-Lozada: 2020 Census Discrepancies

In "The 2020 Census Differential Privacy Method Introduces Disproportionate Discrepancies for Rural and Non-white Populations," J. Tom Mueller (Geography and Environmental Sustainability, University of Oklahoma) and Alexis R. Santos-Lozada (Human Development and Family Studies, Penn State) investigate whether the U.S. Census Bureau’s disclosure avoidance system can provide accurate population counts and growth rates.

For the 2020 decennial tabulations, the U.S. Census Bureau adopted a new disclosure avoidance system called differential privacy. This approach uses an algorithm to inject noise into aggregate population counts to prevent disclosure, or the reidentification of individuals within census data, while still providing public data accurate enough for science and public policy.
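To make the idea of noise injection concrete, here is a minimal sketch in Python. It is not the Census Bureau's algorithm, which is considerably more elaborate; it assumes a textbook Laplace mechanism applied to a single count, and the function names, the epsilon value, and the population sizes are all hypothetical. It illustrates one general property: noise of a fixed size is a much larger share of a small count than of a large one.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    # Assumption: a counting query has sensitivity 1, so the noise scale is
    # 1 / epsilon. Smaller epsilon means more privacy and more noise.
    scale = 1.0 / epsilon
    # Round and clamp at zero so the published figure looks like a count.
    return max(0, round(true_count + laplace_noise(scale, rng)))


def mean_relative_error(true_count, epsilon, trials=2000, seed=0):
    # Average of |noisy - true| / true over many simulated releases.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        released = noisy_count(true_count, epsilon, rng)
        total += abs(released - true_count) / true_count
    return total / trials


# Hypothetical populations: a small group in a rural county versus a
# large group in an urban county, both protected with the same epsilon.
small_group_error = mean_relative_error(50, epsilon=0.5)
large_group_error = mean_relative_error(50_000, epsilon=0.5)
print(f"small group: {small_group_error:.4%}")
print(f"large group: {large_group_error:.4%}")
```

Under these assumptions the typical absolute error is about the same for both groups, so the relative error for the small group comes out far larger.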

The authors compare 2010 population estimates and county-level population growth rates from 2000 to 2010 as produced under the traditional disclosure avoidance approach and under the new differential privacy algorithm. They also analyze how the Census Bureau’s algorithm influences discrepancies between statistics generated via traditional disclosure avoidance methods and those generated via the new differential privacy algorithm in 2020 population counts across the rural-urban continuum.
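The comparison described above can be sketched with two simple measures. This is an illustration only: the formulas are common ways to express growth and discrepancy and are assumed here rather than taken from the paper, and the county figures are invented.

```python
def growth_rate(pop_start, pop_end):
    # Percent change in population between two census years.
    return (pop_end - pop_start) / pop_start * 100.0


def percent_discrepancy(traditional, dp):
    # Absolute gap between the differential privacy figure and the
    # traditional figure, as a percentage of the traditional figure.
    return abs(dp - traditional) / traditional * 100.0


# Hypothetical county: one 2000 count, and two versions of the 2010 count.
pop_2000 = 1000
pop_2010_traditional = 1100
pop_2010_dp = 1080

growth_traditional = growth_rate(pop_2000, pop_2010_traditional)
growth_dp = growth_rate(pop_2000, pop_2010_dp)
count_gap = percent_discrepancy(pop_2010_traditional, pop_2010_dp)

print(f"growth (traditional): {growth_traditional:.1f}%")
print(f"growth (differential privacy): {growth_dp:.1f}%")
print(f"discrepancy in 2010 count: {count_gap:.2f}%")
```

In this made-up case a gap of under two percent in the 2010 count shifts the measured growth rate by two percentage points, which is why growth rates are worth checking separately from the counts themselves.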

Mueller and Santos-Lozada found the differential privacy algorithm failed to accurately capture county-level population totals and population growth across the rural-urban continuum for all groups except the total and non-Hispanic white populations. Further, the authors found estimates for rural non-white populations were extremely prone to discrepancies between the old and new disclosure avoidance techniques. The authors suggest this means non-Hispanic whites were afforded a greater level of accuracy in the 2020 census than other ethnoracial groups. Mueller and Santos-Lozada also found discrepancies increased dramatically when moving from urban to rural counties. The authors therefore conclude that the Census differential privacy method likely led to significant discrepancies for rural and non-white populations.

The authors suggest the algorithm may be introducing an unacceptable level of noise while also failing to deliver the level of confidentiality that the Census Bureau asserts it provides. In other words, the algorithm sacrifices important levels of accuracy out of fear of disclosure. These discrepancies matter because important policy decisions and research are often based on Census data, and the inaccuracies could make it harder to evaluate the best options for addressing the challenges facing some of America’s most vulnerable populations.
