
Response to Kenny et al.’s Commentary

Published on Feb 27, 2023

We would like to thank Kenny et al. (2022) for their response to our article. In our article, we discuss the statistical illusion of ground truth and how this illusion reinforces epistemic disconnects around the adoption of differential privacy in the 2020 U.S. Census. In describing this illusion, we point out that Kenny et al.’s (2021) published analyses “ignor[ed] biases that the published 2010 data might have introduced into the redistricting process,” therefore amplifying divides between stakeholders regarding data quality and accuracy (boyd & Sarathy, 2022). Kenny et al. (2022) respond with three main critiques of our argument: first, that the main goal of their work is a policy evaluation to understand the impacts of implementing differential privacy as part of the 2020 Disclosure Avoidance System (DAS); second, that their analysis is based on reasonable approximations to ground truth (e.g., using election data and self-reported race in voter files); and third, that differential privacy (DP) still enables accurate predictions of individuals’ race, and stakeholders may care about actual disclosure risk in addition to relative risk.

In this note, we would like to focus on two higher-level takeaways from this discussion. First, Kenny et al.'s (2022) commentary reinforces our observation that contestations around data quality are often complicated by different epistemic orientations toward ground truth and uncertainty. Second, Kenny et al.’s commentary emphasizes the need to account for noise that exists not only for disclosure avoidance purposes, but also emerges from collecting, processing, and editing the data, raising questions about how to design more responsible, equitable policies that account for the limits of data. Finally, we address Kenny et al.’s third critique only briefly, as it does not directly relate to our article’s argument. We expand on these points below.

First, the commentary by Kenny et al. (2022) reinforces what we highlight in our article: census stakeholders disagree on how data quality should be evaluated, in part due to differing epistemic orientations toward ground truth and uncertainty. Kenny et al.’s (2021) work argues that the errors introduced by the proposed 2020 DAS tend to undercount racially heterogeneous areas relative to the 2010 Census data. Yet their analyses do not evaluate or discuss whether the errors induced by swapping within the 2010 Census data have racial components as compared to the 2010 Census Edited File (the ‘complete’ data that the Census Bureau uses as its internal ‘ground truth’), let alone how the editing and imputation procedures that the bureau uses to address anomalies and nonresponse in preparing the Census Edited File might generate additional complexities.1 Other works provide more insight into these procedures. Wang and Goldbloom-Helzner (2021) point out that “data swapping is known to alter the apparent minority population in areas where that population is scarce,” and evaluations from the Census Bureau suggest that swapping can be worse than its TopDown Algorithm for accurately representing minority populations (Hawes & Rodriguez, 2021). Christ et al. (2022) compare swapping and differential privacy independently of the bureau’s implementation and conclude that the two have similar utility, but that differential privacy offers more equitable privacy protection for minority populations. Additionally, Steed et al. (2022) show that the noise introduced by sampling and by methods like imputation, which the bureau uses to address nonresponse, is much larger than, and indeed drowns out, the noise introduced by either swapping or differential privacy.

It is clear from these debates that technical methods of comparing swapping vs. differential privacy—let alone analyses that account for uncertainty holistically—are themselves contested. Differing conclusions stem, in part, from whether one takes the 2010 swapped data as a comparison point, conducts restricted comparisons to non-noised data, or evaluates swapping and differential privacy independently. In addition, these conclusions rely on different assumptions about whether current methods used by data users and in courts can be improved by the very experts involved in this debate. For example, even with some similar findings about utility and privacy, Kenny et al. (2021) and Cohen et al. (2022) come to very different conclusions about what their findings imply about making data useful in practice. While Kenny et al. frame their findings under the assumption that data users will continue to use noise-infused data with unsuitable methods (Kenny et al., 2021), Cohen et al. (2022) come to a rather different conclusion. They write: “a finding that DP unsettles the current practices might lead us to call to refine the way it is applied, but might equally lead us to interrogate the traditional practices and seek next-generation methods for redistricting. In particular, it is clear that the practice of one-person population deviation across districts was never reasonably justified by the accuracy of Census data nor required by law, and the adoption of differential privacy might give redistricters occasion to reconsider that practice.” Not only do these opposing viewpoints reflect the epistemic tensions that we attempted to flag in our article, but they also demonstrate how epistemic tensions can be used—even by experts who are well-poised to change the status quo—in order to further reinforce statistical illusions around data.

In their commentary, Kenny et al. (2022) note that they use some data sources that approximate ground truth, including block-level population counts held invariant under swapping, election data, and self-reported race in voter files. We would like to note that this approach introduces other forms of uncertainty into the equation. First, even for population counts, ‘invariants’ are not the same as ground truth. Invariants are counts that the bureau produces after editing, de-duplicating, and other forms of ‘repair-work’ designed to ensure consistency and completeness, and to account for people who exist but were not initially counted. Invariants also include errors that the bureau makes; some of these become visible, for example, when the bureau announces that it accidentally counted a military vessel in the wrong place. These may not match external sources of data. Election data and voter files come with their own limitations: they follow different residence rules, include different populations, and have different types of error.

In addition, in their final set of experiments, Kenny et al. (2021) treat voter record data as ground truth for the purposes of voting policy. Many of these files make names available, and Kenny et al. infer race from the names provided in these files. Treating these inferences as close to ground truth introduces new questions. The Census Bureau asks the public to self-identify race whenever possible, believing that self-identification is more appropriate than the bureau labeling people (which is what enumerators did for decades). With this as a backdrop, the Census Bureau evaluated earlier methods of predicting race from location and surname as part of its quality assurance programs, but deemed them too inaccurate for evaluating data quality. The bureau’s analysis concluded that accuracy limitations are inequitably distributed among different populations (Perkins, 1993; Word et al., 2007). The most common current approach, also used by Kenny et al. (2021) to construct their base data—Bayesian Improved Surname Geocoding (BISG)—depends on published census data but may not be acceptable for evaluating census data. Kenny et al.’s finding that BISG works as well as it did before, even with differential privacy, is not as straightforward as it seems, given that BISG was never that precise to begin with; certainly, BISG cannot provide the level of precision that current policies demand. Like any other statistical model, this approach also carries statistical uncertainty, which complicates the ability to effectively compare the demonstration products against the swapping data.
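For readers unfamiliar with the mechanics, the BISG update described above can be sketched as a simple application of Bayes’s rule: a surname-conditional race distribution is combined with the racial composition of a geography, under a conditional-independence assumption. The numbers below are purely hypothetical, chosen only to illustrate the calculation; they do not come from any real surname list or census block.

```python
# Minimal sketch of the Bayesian update behind BISG, with purely
# hypothetical probabilities (not real surname or census figures).
# Assumes surname and geography are conditionally independent given race.

# Hypothetical P(race | surname), e.g., from a surname list
p_race_given_surname = {"white": 0.70, "black": 0.20, "other": 0.10}

# Hypothetical P(race | geography), e.g., a census block's composition
p_race_given_geo = {"white": 0.30, "black": 0.60, "other": 0.10}

# Hypothetical national marginal P(race), used to avoid double-counting
# the prior that both conditional distributions already contain
p_race = {"white": 0.60, "black": 0.13, "other": 0.27}

# Posterior P(race | surname, geography) ∝ P(r|s) * P(r|g) / P(r)
unnorm = {r: p_race_given_surname[r] * p_race_given_geo[r] / p_race[r]
          for r in p_race}
total = sum(unnorm.values())
posterior = {r: unnorm[r] / total for r in unnorm}
print(posterior)
```

Even in this toy version, the posterior can swing sharply with the geographic composition, which is one reason the precision of BISG depends so heavily on the quality of the published census data it draws on.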

We recognize that Kenny et al. (2021) are not necessarily advocating for the use of BISG in practice, but rather mimicking its use in voting rights cases to impute individual race, as well as leveraging it to understand how well one can predict individual race, as compared to self-reported race in voter files, under various privacy protections. Nevertheless, it would be imprudent not to acknowledge that the history of using surnames to assign race is fraught. People’s sense of their racial identity evolves over time (Doyle & Kao, 2007), and the bureau has struggled for decades to evolve its methods to capture race and Hispanic origin to the best of its ability (Cohn, 2012). Since 1960, the Census Bureau has emphasized the importance of allowing people to self-respond rather than having their race ascribed to them by enumerators; the bureau has also repeatedly updated the categories to reflect shifts in self-identification. While organizations like RAND believe that techniques like BISG can be used toward ensuring equity (RAND, n.d.), critics have raised concerns that this approach can reify structural discrimination (Partnership on AI, 2021). Moreover, given the historical misuse of surnames to categorize people in disturbing ways (e.g., Tocher, 1910), we urge caution before presuming that these techniques provide insight into racial self-identification vis-à-vis policymaking. We also think that it would be constructive for the data science community to have a much more thorough conversation about the technical, social, and political dimensions of using statistical techniques to infer race. This, however, was beyond the scope of our article.

Second, while the focus of this special issue and this thread is on differential privacy, Kenny et al.'s (2021) work also highlights the risks of not accounting for other types of noise in the data. Kenny et al.'s (2021) redistricting analysis, which relies on ‘bright line’ tests used in court, highlights how our current legal methods are highly sensitive to even small amounts of noise in the data. Their work emphasizes how these tests are insufficiently robust given the amount of noise even in the pre-privatized data (i.e., the Census Edited File). Any policy evaluation that is so sensitive to subtle shifts in noise is likely to be significantly perturbed by the range of imperfections in these data. Indeed, as mentioned above, work by Steed et al. (2022) shows that the noise introduced by repair-work methods is even more significant than the noise introduced for disclosure limitation. This then raises the questions: Are policy frameworks this sensitive to noise responsible or ethical? Should it be acceptable to ignore the imperfections that have always existed in the data?
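The sensitivity of bright-line tests described above can be illustrated with a toy simulation of our own (the district size, minority share, and noise scale below are hypothetical, not drawn from Kenny et al.’s data): when a district sits just above a 50% majority-minority threshold, noise far smaller than routine enumeration and imputation error can flip the test’s answer in a large fraction of cases.

```python
# Illustrative simulation (hypothetical numbers): how often does small
# noise flip a bright-line majority-minority determination?
import random

random.seed(0)

minority, total = 50_100, 100_000  # hypothetical district: 50.1% minority
noise_scale = 200                  # hypothetical noise, ~0.2% of the district
trials = 10_000

flips = 0
for _ in range(trials):
    # Additive Gaussian noise on the minority count, standing in for the
    # combined effect of swapping, DP noise, or repair-work error
    noisy_minority = minority + random.gauss(0, noise_scale)
    if noisy_minority / total <= 0.5:
        flips += 1  # the bright-line test now gives the opposite answer

flip_rate = flips / trials
print(f"{flip_rate:.1%} of trials flip the majority-minority call")
```

Because the true share (50.1%) sits only half a noise standard deviation above the threshold, roughly three in ten noisy draws land on the other side of the line, even though every draw is within a few hundred people of the truth.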

The new approach at play is not as radical a change from 1990, 2000, or 2010 as some suggest. In each of those decades, the bureau injected noise into the data to ensure confidentiality, and in each it worked to minimize problems in the data it collected. What is different this decade is the visibility of this work.

As we argue in our article, statistical illusions have made it difficult for us to ‘see’ all of the ways in which the policies that depend on data obscure their errors, even though many scholars have been making this argument for decades (Gitelman, 2013; Hacking, 1990; Poovey, 1998; Porter, 1995). We would argue that the problem here is not the data—which will never be precise down to the person level—but the illusions that policies depend on. Policies that presume that errors can be ignored, and that data can be used ‘as-is,’ create conditions for discrimination and inequity (Steed et al., 2022). We argue that all forms of uncertainty must be grappled with rather than ignored, especially by those who have expertise in working within an inferential framework. Policy evaluations by such experts should not take the use of data ‘as-is’ as a given, which simply reinforces dangerous statistical illusions, but rather pave the way toward more responsible uses of data that contend holistically with different limitations to data.

Finally, although our article focuses on epistemic disconnects around data quality, we also note that stakeholders have differing perspectives on privacy risk. Kenny et al.'s (2022) commentary highlights one aspect of this debate, namely, whether statistical inference regarding racial identification violates the privacy of respondents. As this critique does not relate to the main argument in our article, we will not discuss it at length. We would simply like to point out that, like racial self-identification, privacy is not a static issue; how people understand privacy evolves. A controversy in 2002 over the bureau’s decision to produce a special tabulation of previously published data about Arab American communities in small area geographies (El-Badry & Swanson, 2007) highlights how even the status quo of publishing statistics can cause furor. However, the bureau has a legal, moral, and bureaucratic need to protect confidentiality. After all, if people’s data are exposed in the present, their willingness to participate in the future decreases. To that end, we support Kenny et al.’s (2021, 2022) invitation for the community to grapple with whether the bureau does enough to protect the confidentiality of inferred characteristics.

Given that our article in this special issue is not focused on the technical issues at hand, we have been encouraged not to address those aspects of Kenny et al.'s (2021) work in our response. Other scholars have responded to some of the technical issues and analyzed the broader technical tradeoffs at play; we encourage curious readers to see these works for detailed discussion of these points (Bun et al., 2021; Brief for Data Privacy Experts, Alabama v. Department of Commerce, 2021; Christ et al., 2022; Cohen et al., 2022; Hawes & Rodriguez, 2021; Steed et al., 2022; Wang & Goldbloom-Helzner, 2021).


In our article, we traced how the adoption of differential privacy in the 2020 U.S. Census, which aimed to work with uncertainty to protect the confidentiality of respondents, dissolved these illusions and exposed the epistemic disconnects around census data. Kenny et al.'s (2022) commentary highlights some of the contours of these disagreements—for example, around the merits of policy evaluations using swapped data, and whether statistical inference regarding racial identification violates the privacy of respondents—and sheds light on the intricacies of the epistemic disconnects we trace in our article.

We admire Kenny et al.’s (2021, 2022) mission to evaluate the impacts of differential privacy on providing equitable, useful census data, and we agree with the authors that policy evaluations are critical to understanding tradeoffs between privacy and accuracy. But to work toward this goal, we believe that we must first have a more holistic conversation about the limits of data, analyzing impacts of different disclosure and repair-work techniques in ways that do not rely on and reinforce statistical illusions. Only then can policy evaluations attend to the uncertainty inherent in all data, rather than obscure it. Our article attempts to showcase how epistemic fractures complicate such an effort.

Disclosure Statement

danah boyd and Jayshree Sarathy have no financial or non-financial disclosures to share for this article.


Alabama v. Department of Commerce, Case No. 3:21-cv-211-RAH-ECM-KCN (WO) (M.D. Ala. 2021).

boyd, d., & Sarathy, J. (2022). Differential perspectives: Epistemic disconnects surrounding the U.S. Census Bureau’s use of differential privacy. Harvard Data Science Review, (Special Issue 2).

Bun, M., Desfontaines, D., Dwork, C., Naor, M., Nissim, K., Roth, A., Smith, A., Steinke, T., Ullman, J., & Vadhan, S. (2021). Statistical inference is not a privacy violation.

Brief for Data Privacy Experts as Amici Curiae Supporting Defendants, Alabama v. Department of Commerce, Case No. 3:21-cv-211-RAH-ECM-KCN (WO) (M.D. Ala. 2021).

Christ, M., Radway, S., & Bellovin, S. M. (2022). Differential privacy and swapping: Examining de-identification's impact on minority representation and privacy preservation in the U.S. Census. In 2022 IEEE Symposium on Security and Privacy (SP) (pp. 457–472). IEEE.

Cohen, A., Duchin, M., Matthews, J., & Suwal, B. (2022). Private numbers in public policy: Census, differential privacy, and redistricting. Harvard Data Science Review, (Special Issue 2).

Cohn, D. (2012, August 7). Census Bureau considers changing its race/Hispanic questions. Pew Research Center.

Doyle, J. M., & Kao, G. (2007). Are racial identities of multiracials stable? Changing self-identification among single and multiple race individuals. Social Psychology Quarterly, 70(4), 405–423.

El-Badry, S., & Swanson, D. A. (2007). Providing census tabulations to government security agencies in the United States: The case of Arab Americans. Government Information Quarterly, 24(2), 470–487.

Gitelman, L. (Ed.). (2013). “Raw data” is an oxymoron. The MIT Press.

Hacking, I. (1990). The taming of chance. Cambridge University Press.

Hawes, M., & Rodriguez, R. (2021, June 4). Determining the privacy-loss budget: Research into alternatives to differential privacy [PowerPoint presentation].

Kenny, C. T., Kuriwaki, S., McCartan, C., Rosenman, E. T. R., Simko, T., & Imai, K. (2021). The use of differential privacy for census data and its impact on redistricting: The case of the 2020 U.S. Census. Science Advances, 7(41), Article eabk3283.

Kenny, C. T., Kuriwaki, S., McCartan, C., Rosenman, E. T. R., Simko, T., & Imai, K. (2022). Comment: The essential role of policy evaluation for the 2020 Census Disclosure Avoidance System. arXiv.

Partnership on AI. (2021, December 2). Fairer algorithmic decision-making and its consequences: Interrogating the risks and benefits of demographic data collection, use, and non-use.

Perkins, R. C. (1993). Evaluating the Passel-Word Spanish surname list: 1990 Decennial Census post enumeration survey results.

Poovey, M. (1998). A history of the modern fact: Problems of knowledge in the sciences of wealth and society. University of Chicago Press.

Porter, T. M. (1995). Trust in numbers: The pursuit of objectivity in science and public life. Princeton University Press.

RAND. (n.d.). RAND Bayesian Improved Surname Geocoding (BISG).

Steed, R., Liu, T., Wu, Z. S., & Acquisti, A. (2022). Policy impacts of statistical uncertainty and privacy. Science, 377(6609), 928–931.

Tocher, J. F. (1910). The necessity for a national eugenic survey. The Eugenics Review, 2(2), 124–141.

Wang, S., & Goldbloom-Helzner, A. (2021). Comment on “The impact of the U.S. Census Disclosure Avoidance System on redistricting and voting rights analysis,” by Kenny et al.

Word, D. L., Coleman, C. D., Nunziata, R., & Kominski, R. (2007). Demographic aspects of surnames from Census 2000. United States Census Bureau.

©2023 danah boyd and Jayshree Sarathy. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.
