Editor-in-Chief’s Note: The U.S. Census Bureau revolutionized its 2020 Census disclosure avoidance system, guided by the principles of differential privacy (DP). The venture sparked both cheers and fears, as reflected in the HDSR 2022 special issue on DP for the 2020 Census. At the urging of the research community, the bureau has released the Noisy Measurement File (NMF), an intermediate product created by injecting an appropriate amount of DP noise into the confidential census data. While unsuitable for broad use—exhibiting negative counts, among other issues—the NMF is valuable for researchers conducting statistical analyses that account for DP-induced distortions. The accessibility and usability of this intermediate product have been closely examined by researchers, as discussed by McCartan, Simko, and Imai (2023). Because HDSR is a platform for direct exchanges among all stakeholders of data science, I invited John Abowd, who led the bureau's 2020 efforts to innovate its disclosure avoidance system, to comment on McCartan et al. and their recommendations to the bureau on making the NMF more user-friendly. Abowd has completed his leadership terms at the bureau, and hence his response reflects his personal views as a scholar, not those of the bureau. Nevertheless, because of “thousands of insights gained from years of working with the teams at the Census Bureau,” as Abowd acknowledges, this exchange between Abowd and McCartan et al.—who also provide a rejoinder (McCartan et al., 2024)—demonstrates the need for direct communication between data curators and data users, and shows how their perspectives are influenced by the experiences and information access pertaining to their respective roles.
Keywords: 2020 Census, differential privacy, Disclosure Avoidance System, noisy measurements, post-processing
There are many, many uses of data from the U.S. Decennial Census of Population and Housing, but only three have direct constitutional and federal statutory foundations. The first is apportionment of the House of Representatives (U.S. Const., art. I, § 2). The second is statistical support for redistricting every legislative body in the country (Census Act, 1954, § 141). The third is statistical support for the Census Bureau’s Population Estimates Program (Census Act, 1954, § 181). These uses dominate the publication format and accuracy assessments of modern U.S. censuses. While academics like McCartan, Simko, and Imai properly focus on valid statistical inferences based on published data, that is only one feature required to assess their fitness for use. For the redistricting data considered here, the dominant use case is the ability to support accurate, approximately equal-population new voting districts that meet the requirements of the Voting Rights Act (1965), even though their boundaries obviously cannot be specified in advance except for the political perimeter encompassing all districts of a particular legislature. This commentary suggests how research using the 2020 Census Noisy Measurement Files (NMFs) can inform the design of future decennial census data products in a meaningful way.
The redistricting data NMF, released on June 15, 2023, and the massive demographic and housing characteristics NMF, released on October 23, 2023, are the first data publications by any statistical agency in the world of the raw output of a confidentiality protection system. They are, effectively, harbingers of 21st-century replacements for public use microdata files because they contain massively more information than is embodied in the official tabular releases. In particular, they contain information on every high-order interaction consistent with the publication schema of every variable in any published tabulation for a given population (persons or housing units) at every level of geography specified in the hierarchy used to create the tabular publications. That’s a dense sentence, and it takes a while to sink in. Some details clarify what I mean.
The official 2020 Redistricting Data (P.L. 94-171) Summary File (redistricting data, hereafter) contains two tables of race and ethnicity counts for all persons, two tables of race and ethnicity counts for all adults, one table of population counts in households and major group quarters categories, and one table of counts of occupied and vacant housing units—approximately 1.5 billion linearly independent statistics. But the redistricting NMF also contains data for the race and ethnicity of all persons and all adults living in each major group quarters type. And these data are available for the same geographic hierarchy that was used to produce the official data—approximately 16 billion linearly independent statistics. The Census Bureau’s willingness to consider releasing the NMFs, and its active solicitation of user input for that process, began at the December 2019 Committee on National Statistics (CNSTAT) workshop on the 2020 Census Disclosure Avoidance System (DAS), where one of the authors of the DAS, William Sexton, reported:
The useful point here is that the noisy measurements are (as promised by differential privacy) future-proof to any subsequent attacks, meaning that providing direct access to the noisy measurements need not require the use of the Census Bureau’s Research Data Centers as a broker. They could be released as an alternative product to the census. (However, the approach would require the Census Bureau to support alternative releases surrounding the decennial census.) Given resource constraints, Sexton said that the Census Bureau is awaiting feedback from the user community before committing to producing alternative data releases and products. (National Academies of Sciences, Engineering, and Medicine, 2020)
Nothing in the August 2021 letter to the Census Bureau from 62 prominent researchers (Dwork et al., 2021, August 12) that further alerted the academic community to the role of the NMFs in the 2020 DAS foreshadowed the enormous volume of information in those noisy measurements that was not part of the official data. The official 2020 Demographic and Housing Characteristics File (DHC) contains approximately eight billion linearly independent statistics (including the 1.5 billion statistics in the redistricting data), but the demographic and housing characteristics NMF contains 25 trillion linearly independent statistics. It is so large that all the storage space on the Census Bureau’s data dissemination server farms could not hold it, even compressed.
The NMFs were developed in a modern, massively distributed cloud computing environment. That is the only environment where it makes sense to analyze them. The NMFs are experimental, and the suggestions in McCartan et al. (2023; MSI, hereafter) for improving the documentation are valuable contributions. The current documentation was developed in just a few weeks by the same staff working on the official products, which were appropriately prioritized. As experimental products, the expectation is that users will provide the kind of feedback that MSI have provided, and that this feedback may be reflected in future products.
The NMFs were not designed for direct publication. The internal NMFs used in the production DAS could not be released because their storage format commingled confidential information with the noisy measurements. New software was written to extract the noisy measurements, embedded in a 400TiB pickled data structure, and array them in the 40TiB Parquet files used for publication. This is why the noisy measurements could not be released under a Freedom of Information Act (FOIA) request, as the Census Bureau correctly responded when it received and denied such a request on December 1, 2022 (Phillips v. U. S. Bureau of the Census, 2023, p. 2). Justin H. Phillips, represented by the Election Law Clinic at Harvard University, then sued the Census Bureau to reverse the FOIA decision. The plaintiff continued to press the suit even after the agency agreed to publish the noisy measurements after the reformatting described above as an experimental data product with documentation.
Neither the reprogramming nor the documentation would have been required in a FOIA release. But the plaintiff would not agree to dismiss the suit until the Census Bureau agreed to a firm publication date of August 23, 2023, for the Redistricting Noisy Measurement File (Phillips v. U. S. Bureau of the Census, 2023, p. 4). The assistant U.S. attorney for the Southern District of New York and the plaintiff agreed to dismiss the suit with prejudice given that publication deadline (Phillips v. U. S. Bureau of the Census, 2023, p. 4). This is exactly what happened: the NMFs were approved for release, as the original scholars had requested, and sufficient documentation was developed to permit their use without delaying any other 2020 Census data products. The plaintiff pressed for attorneys’ fees, arguing that he had “substantially prevailed” by forcing the Census Bureau to release the NMFs. The court was sympathetic to this argument but ultimately held that “the Census Bureau has provided persuasive evidence that it was intending to release files containing the noisy measurements data, like those requested by plaintiff, well in advance of plaintiff filing his FOIA request and lawsuit” and denied attorneys’ fees (Phillips v. U. S. Bureau of the Census, 2024, p. 10).
As MSI note, the experimental NMFs cannot be used without a working understanding of the production DAS code base, which is also public (U.S. Census Bureau, 2021). They note, in particular, three improvements that could be made to the existing noisy measurement files: (1) centralized documentation and codebooks to properly read and format the file, (2) aggregation specifications that link queries to tabulated statistics, and (3) full block assignment files for NMF geographies. MSI make other suggestions that appear to require major programming investments: (1) unnesting the NMF Parquet files, (2) filling in missing geographies that are on or off the 2020 DAS spine, and (3) providing application programming interface access to the full set of NMFs.
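To make the first of those suggestions concrete, the sketch below shows one way a user might inspect and flatten a nested NMF Parquet shard with standard open-source tools. It is a minimal sketch only: the file path and the column names (`geocode`, `query_name`, `noisy_measurement`) are hypothetical placeholders, not the actual NMF schema, which is exactly the kind of detail a centralized codebook would pin down.

```python
import pyarrow.parquet as pq

# Hypothetical path to one shard of the redistricting NMF; the real release
# is split across many Parquet parts totaling tens of terabytes.
PATH = "nmf_redistricting/part-00000.parquet"

# Inspect the (possibly nested) schema before loading anything into memory.
print(pq.read_schema(PATH))

# Load only the columns of interest; the names here are placeholders for
# whatever the codebook specifies.
table = pq.read_table(PATH, columns=["geocode", "query_name", "noisy_measurement"])
df = table.to_pandas()

# If a column arrives as nested lists, flatten it to one row per measurement.
if df["noisy_measurement"].map(lambda v: hasattr(v, "__len__")).any():
    df = df.explode("noisy_measurement", ignore_index=True)

print(df.head())
```

Even a toy script like this depends on codebook-level details (column names, nesting, geography identifiers), which is why MSI’s documentation and codebook suggestions matter.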
Given that the 2020 NMFs are an experimental data product whose improvements do not directly support the 2030 Census, the most productive next step for improving the quality of the public NMFs would be for a group of scholars like MSI to secure a National Science Foundation, National Institutes of Health, Sloan Foundation, or other grant to develop tools to analyze the NMFs. That grant should include adequate budget for Census Bureau staff time, expert consultants, and cloud computing resources. With these resources in hand, in particular, direct Census Bureau staff salary commitments, it would be reasonable for the scholars to set the priorities for improving the usefulness of the existing 2020 Census NMFs. Absent such an intellectual and financial commitment, it is hard to find fault with the Census Bureau’s prioritization of 2030 Census research.
Excellent models for how dedicated scholars like MSI can collaborate with the Census Bureau already exist. These include the Longitudinal Employer-Household Dynamics Program (Abowd et al., 2009), Business Dynamics Statistics (Haltiwanger et al., 2013), Opportunity Atlas (Chetty et al., 2020), Criminal Justice Administrative Record System (Finlay et al., 2022), and Census-Enhanced Health and Retirement Study (Institute for Survey Research, n.d.). The Census Bureau cannot be expected to provide all the resources to make an experimental product better. Productive engagement means that external scholars commit to sharing the development costs and to putting the resulting enhanced product and documentation in the public domain.
There are now official publications from the 2020 Census that do release the noisy measurements and the margins of error associated with the noise added by the differential privacy mechanism, for example, the 2020 Census Detailed Demographic and Housing Characteristics File A (Detailed DHC-A). This data release demonstrates that when the disclosure avoidance system is designed from the outset to publish the noisy measurements, a statistically better product can be released. The Detailed DHC-A uses the same differential privacy framework as the DAS (ρ-zero-concentrated differential privacy) and the same mechanism (discrete Gaussian noise), but the implementation has, approximately, one noisy measurement for each publication statistic. Design constraints imposed on the DAS that caused the redistricting and DHC data to have massively more noisy measurements than publication statistics (such as the requirement to generate synthetic microdata, which required noisy measurements to estimate all possible interactions between variables) were relaxed for the Detailed DHC-A, permitting publication of the noisy measurements with only very minor postprocessing. Virtually every one of the 500 million statistics in the Detailed DHC-A is the noisy measurement itself.
What I want to do in the rest of this commentary is focus researchers on the real issue: Why did the Census Bureau design the DAS to transform noisy measurements via postprocessing into microdata to publish conventional tabular summaries? My hope is that researchers can suggest acceptable modifications to the official products that still meet their dominant use cases, while better supporting the kinds of statistical inferences that MSI suggest and still properly protecting confidentiality. I focus on the redistricting use case because it drove virtually all the design decision-making for the first application of the 2020 DAS, which was then extended and enhanced for its second application to the DHC. The discussion is not a rehash of old arguments. It is a focused attempt to explain to nontechnical readers why the redistricting data took the form they have and what, if anything, the Census Bureau and redistricting community can do to improve the published data product in future censuses. To serve that goal, the reader must understand what determined the publication constraints and what options future censuses might employ. Even more importantly, readers must understand that over the course of the intercensal decade, all parties to the production of redistricting data must consider and ultimately agree to changes in their content and format.
Using the differential privacy framework, the engineering begins with the query workload, which is the technical term for the collection of statistics for which the data steward wishes to control the error in anticipation of publication. For the 2020 redistricting data, the query workload is a set of tables that display 252 counts for the resident population, eight counts for the group quarters population, and two counts for the housing units—a total of 262 statistics—for 8.6 million geographies composed of aggregations of a prespecified atom, the census block, which tessellates the United States—that is, divides its physical area into mutually exclusive and exhaustive pieces.
This query workload is the result of a decade-long negotiation between the Census Bureau’s Redistricting and Voting Rights Data Office—part of the Decennial Census Programs Directorate—and the National Conference of State Legislatures—a nonpartisan body encompassing representatives from all 50 states and the District of Columbia—that culminated, as in previous decades, with the Federal Register Notice in 2018 (U.S. Census Bureau, 2018, May 1) containing the detailed specifications for these statistics. Following that publication, state, county, municipal, tribal, and other organizations empowered their own redistricting offices to develop software that ingested the official redistricting data, combined them with other data, and produced tentative new voting districts. This software runs immediately after the release of the official data to produce legislative districts whose boundaries are defined using the same census blocks as the official redistricting data. A considerable investment, also occurring over the full decade, ensures that the boundaries embodied in the geographic areas from the nation to the census block properly reflect the boundaries of the political entities whose legislative bodies require new districts.
The second element in engineering a typical differentially private data publication system is the query strategy, which is the technical term for the collection of statistics to which the privacy-loss budget will be allocated. The statistics in the query strategy are calculated directly from the confidential microdata. Then, they are passed to the chosen differential privacy mechanism along with the privacy-loss budget allocation assigned to each statistic. For each statistic, the mechanism draws random bits, transforms them as specified to achieve the appropriate probability mass function, and adds that random number to the confidential statistic. The output of the mechanism is called a noisy measurement. The collection of noisy measurements from the 2020 Census redistricting data, referenced by MSI, is called the 2020 Census Redistricting Noisy Measurement File.
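To make the mechanism step concrete, here is a minimal sketch, assuming sensitivity-1 counting queries under ρ-zero-concentrated differential privacy. It is illustrative only: the sampler draws from a truncated integer support rather than using the exact discrete Gaussian sampler of Canonne et al. (2020) employed in the production DAS, and the per-query allocation `rho` is an arbitrary value chosen for the example.

```python
import numpy as np

def discrete_gaussian_noise(sigma, size, trunc=10, rng=None):
    """Draw (approximately) discrete Gaussian noise with scale sigma.

    Illustrative truncated-support sampler over the integers in
    [-trunc*sigma, trunc*sigma]; NOT the exact rejection sampler of
    Canonne et al. (2020) used in the production DAS.
    """
    rng = np.random.default_rng() if rng is None else rng
    radius = int(np.ceil(trunc * sigma))
    support = np.arange(-radius, radius + 1)
    pmf = np.exp(-(support.astype(float) ** 2) / (2.0 * sigma ** 2))
    pmf /= pmf.sum()
    return rng.choice(support, size=size, p=pmf)

def noisy_measurement(confidential_counts, rho):
    """Add discrete Gaussian noise to sensitivity-1 counting queries.

    Under rho-zero-concentrated DP, a per-query allocation of rho implies a
    nominal noise variance of sigma^2 = 1 / (2 * rho) when adding or removing
    one person changes the count by at most 1.
    """
    sigma = np.sqrt(1.0 / (2.0 * rho))
    counts = np.asarray(confidential_counts, dtype=np.int64)
    return counts + discrete_gaussian_noise(sigma, counts.shape)

# Example: three confidential block-level counts with rho = 0.1 per query.
print(noisy_measurement([12, 0, 57], rho=0.1))
```

The output of `noisy_measurement` plays the role of one set of noisy measurements; negative values are possible and expected, which is why the published NMFs contain negative counts.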
If the query strategy for the 2020 redistricting data were identical to the query workload, then there would be exactly one noisy measurement for every linearly independent statistic in the published data. Because the differential privacy mechanism in the DAS used discrete Gaussian noise, the variance of each published statistic would, in this case, depend only on the privacy-loss budget allocated to that statistic and could be easily calculated from the formulas in Canonne et al. (2020). Because the DAS added independent discrete Gaussian noise to each confidential statistic, the variance of any aggregation of the 2020 redistricting data could, again in this case, be calculated by summing the variances of each independent statistic composing the aggregation. To be succinct, for any proposed new voting district, the variance of its total population and of the component racial and ethnic subpopulations would, in this case, be the sums of the variances of those statistics in the geographic areas aggregated to produce the district.1
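Stated compactly for this hypothetical case (one independent discrete Gaussian measurement per workload statistic), the district-level calculation described in the preceding paragraph is just

$$
\widehat{N}_D \;=\; \sum_{g \in D} \widehat{N}_g,
\qquad
\operatorname{Var}\!\left(\widehat{N}_D\right) \;=\; \sum_{g \in D} \sigma_g^2,
\qquad
\sigma_g^2 \;=\; \frac{\Delta^2}{2\rho_g},\quad \Delta = 1,
$$

where D is the set of geographic units aggregated into the proposed district, the hatted N’s are the noisy counts, ρ_g is the privacy-loss budget allocated to the query for unit g, and σ_g² is the nominal noise variance (the discrete Gaussian’s variance is bounded above by, and at these scales essentially equal to, σ_g²). This is my shorthand for the calculation described in the text, not a formula reproduced from the DAS documentation.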
Table 7 in Abowd et al. (2022) reveals that the query strategy for the redistricting data was much more extensive than the query workload. The query workload consists of the 1.5 billion linearly independent statistics noted in Section 1. The query strategy consists of 16 billion linearly independent statistics—meaning that the NMF is an order of magnitude larger than the 2020 redistricting data. Why did this happen? Should it be repeated? Can it be improved? These are the research questions that experimental products like the NMF support.
The basic reason that the redistricting data query strategy produced 16 billion statistics instead of 1.5 billion lies in the design constraints imposed by consensus among the Census Bureau’s career senior executives—the members of its Operating and Data Stewardship Executive Policy Committees, including me. These constraints were that the 2020 Disclosure Avoidance System:
could not affect the apportionment use case;
had to use the query workload already defined by the 2018 Federal Register Notice for the 2020 Redistricting Data (P.L. 94-171) Summary File;
had to accept the 2020 Census Edited File (CEF) as input;
had to deliver its output to the 2020 Census tabulation system as unweighted microdata respecting the schema used for the CEF;
could not delay the publication of the 2020 Redistricting Data (P.L. 94-171) Summary File beyond March 31, 2021.
To meet the first constraint, the total populations of the 50 states and the District of Columbia, as well as the population of the Commonwealth of Puerto Rico, were not perturbed. The second constraint meant that a value for every one of the 1.5 billion statistics specified in the workload had to be published. This constraint effectively ruled out adaptive implementations, such as the one used for the recently released Detailed DHC-A, that could have tested thresholds, then aggregated cells according to specified precision (inverse variance) targets. The third constraint meant that the definition of the confidential data was the record-level image in the CEF and not the actual response data, which were contained in the 2020 Census Unedited File (CUF). The fourth constraint meant that the noisy measurements had to be postprocessed into microdata forcing the estimation of many statistics not in the query workload. The fifth constraint, which was unofficially relaxed to August 12, 2021, due to the pandemic, meant that the processing required to run the DAS could not push the publication date beyond the statutory deadline. Programming efficiencies produced code that ran in less than a day for the redistricting data, allowing adequate time for the scheduled human reviews of the output.
2020 DAS postprocessing of the noisy measurements into microdata had two important mathematical consequences. First, it implied nonnegativity constraints on all cells. Second, it implied that some privacy-loss budget had to be allocated to what is called the detailed query: the interaction of all the tabulation variables with each other. For the redistricting data, the person-level detailed query has 2,016 cells—far more than the 260 person-level cells in the published tables. These detailed query cells, as Table 7 in Abowd et al. (2022) shows, collectively got the lion’s share of the privacy-loss budget in the 2020 DAS—far more than the core queries that represent the 252-cell workload of the published tables for the resident population.
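One way to see where those cell counts come from is the cross-classification of the tabulation variables; assuming the usual redistricting schema of eight household/group-quarters types, two voting-age categories, two Hispanic-origin categories, and 63 race-alone-or-in-combination categories:

$$
2{,}016 \;=\; \underbrace{8}_{\text{hhgq type}} \times \underbrace{2}_{\text{voting age}} \times \underbrace{2}_{\text{Hispanic origin}} \times \underbrace{63}_{\text{race categories}},
\qquad
252 \;=\; 2 \times 2 \times 63.
$$

Producing microdata therefore forces noisy measurements of the full cross-classification, including its interaction with the household/group-quarters dimension, which the published redistricting tables report only as eight marginal counts.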
As Abowd et al. (2021) showed, the nonnegativity constraints are the accuracy culprit. They create a postprocessing error that cannot be reasonably controlled via the privacy-loss budget; however, it can be greatly reduced by algorithmic tuning, which was accomplished for the redistricting data. There are good reasons to do this postprocessing in spite of the error because the postprocessing discovers sparsity in the confidential tables, which permits most of the zeros in the confidential data to appear as zeros in the published data. Evaluating the postprocessing error has nothing to do with the values of the noisy measurements; it can only be studied in combination with the published tables and an assessment of the simulation properties of the 2020 DAS, which a team at the Census Bureau is doing (e.g., Cumings-Menon, 2024).
The 2020 Census is not the only recent population census to encounter this issue. The publication tables from the 2021 Census of England and Wales, published by the Office for National Statistics (ONS) in the United Kingdom, have exactly the same error due to nonnegativity constraints, even though the noise added to their publication tables was not produced using a differential privacy framework. When adding the noise, ONS processed the tables to eliminate negative values and constrain population margins to sum to fixed population totals. This processing, which takes a lighter touch than the DAS postprocessing required to produce microdata, nevertheless means that, in the 2021 Census of England and Wales, cells with low counts have a slight positive bias and cells with high counts a slight negative bias (Office for National Statistics, 2023).
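A small simulation makes this bias mechanism concrete. The sketch below is a deliberately minimal stand-in, not the TopDown algorithm or the ONS method: it adds ordinary (continuous) Gaussian noise to a sparse vector of counts and then projects the noisy vector onto the set of nonnegative values that sum to a fixed, invariant total. Cells that are truly zero end up slightly positive on average, and, because the total is held fixed, the large cells absorb an offsetting negative bias.

```python
import numpy as np

def project_nonneg_fixed_total(m, total):
    """Euclidean projection of m onto {x >= 0, sum(x) == total}
    (standard sort-based simplex projection, rescaled to `total`)."""
    u = np.sort(m)[::-1]
    css = np.cumsum(u) - total
    j = np.arange(1, m.size + 1)
    k = np.nonzero(u - css / j > 0)[0][-1]
    tau = css[k] / (k + 1.0)
    return np.maximum(m - tau, 0.0)

rng = np.random.default_rng(2020)
true = np.zeros(1000)
true[:50] = rng.integers(1, 200, size=50)              # sparse confidential counts
noisy = true + rng.normal(0.0, 5.0, size=true.size)    # unbiased stand-in for DP noise
post = project_nonneg_fixed_total(noisy, true.sum())   # nonnegativity + invariant total

print("mean error on true-zero cells:", (post - true)[true == 0].mean())  # positive
print("mean error on nonzero cells:  ", (post - true)[true > 0].mean())   # negative
```

The pattern in the two printed numbers, upward bias where the truth is zero and downward bias where the truth is large, is the qualitative signature both of the DAS nonnegativity constraints and of the ONS processing described above, even though neither agency uses this particular projection.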
The privacy-loss budget allocations shown in Table 7 of Abowd et al. (2022) were determined by extensive experiments documented therein. Those experiments were necessary because the atom of the geographical hierarchy in the redistricting use case cannot have a minimum population, and the tables at every geographic level of the redistricting data must have an entry in every cell. Because of these design constraints, there can be neither minimum populations in a census block nor block-level tables that aggregate race and ethnicity combinations. Thus, the work-horse geographic unit became the census block group, but not the publication block group—rather a custom block group that isolated blocks containing group quarters or tribal areas from the rest of the blocks in the block group (Cumings-Menon et al., 2023).
Block groups have large enough populations to proxy for a voting district in lower population municipalities like census places. The work-horse table became the detailed query because it prevented the group quarters population characteristics from cross-contaminating the housing unit population characteristics when the microdata were created. Neither of these outcomes is essential to the confidentiality protection framework. They are consequences of the design constraints listed above, particularly imposing nonnegativity on every cell, implementing hierarchical consistency via tabulation from microdata, and forcing every cell in every table to have a value (no collapsing of cells). As noted in related work (Kenny et al., 2024), “the NMF contains too much noise to be directly useful without measurement error modeling, ... TopDown’s post-processing reduces the NMF noise and produces data whose accuracy is similar to that of swapping.” That is precisely the point of the postprocessing, and confirms the analysis in Wright and Irimata (2021). If the redistricting community, especially the academic contributors, want the noisy measurements to be less noisy, some of the constraints must be relaxed. Relaxing those constraints requires consensus in the redistricting community, practitioners as well as academics, concerning the geographic atom and the publication format.
The redistricting community largely drove the content and form of the 2020 Census redistricting data. Unless the Census Act is amended or the Voting Rights Act is repealed or further weakened by the Supreme Court this decade, the redistricting community will drive the content and form of 2030 Census redistricting data. That is how the statutory mandate to produce these data works. Now that the Census Bureau has acknowledged that publication formats that include detailed tables for geographies like census blocks require much more comprehensive confidentiality protections, there is no going back to the systems used for the 2010, 2000, and 1990 Censuses.
Redistricting researchers could ask: Given a fixed privacy-loss budget, what format for the publication tables best meets the redistricting use case? That format could be the one embodied in the 2020 DAS, which does an excellent job of preserving zeros in the confidential data (more than 90% of the confidential zeros are zeros in the published tables2) while still carefully protecting confidentiality. Or, it could be the application of the full privacy-loss budget to the publication query (the 262 cells in the 2020 redistricting data tables) with minimal postprocessing. Those tables would have many negative entries, but every entry would be unbiased (at least from the disclosure avoidance noise), and their aggregates might produce low-variance voting district statistics that could withstand Voting Rights Act scrutiny. Or something in between. Or something completely different.
The author is the former chief scientist and associate director for research and methodology at the U.S. Census Bureau. The opinions expressed in this commentary are his and not those of the Census Bureau. He acknowledges thousands of insights gained from years of working with the teams at the Census Bureau, both internal and external, who implemented and assessed the 2020 Census Disclosure Avoidance System. Dan Kifer and Philip Leclerc provided helpful comments on this article.
John M. Abowd has no financial or non-financial disclosures to share for this article.
Abowd, J. M., Ashmead, R., Cumings-Menon, R., Garfinkel, S., Heineck, M., Heiss, C., Johns, R., Kifer, D., Leclerc, P., Machanavajjhala, A., Moran, B., Sexton, W., Spence, M., & Zhuravlev, P. (2022). The 2020 Census Disclosure Avoidance System TopDown Algorithm. Harvard Data Science Review, (Special Issue 2). https://hdsr.mitpress.mit.edu/pub/7evz361i
Abowd, J. M., Ashmead, R., Cumings-Menon, R., Garfinkel, S., Kifer, D., Leclerc, P., Sexton, W., Simpson, A., Task, C., & Zhuravlev, P. (2021). An uncertainty principle is a price of privacy-preserving microdata. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, & J. W. Vaughan (Eds.), Advances in Neural Information Processing Systems (pp. 11883–11895, Vol. 34). Curran Associates. https://bit.ly/abowdetal2021
Abowd, J. M., Stephens, B., Vilhuber, L., Andersson, F., McKinney, K. L., Roemer, M., & Woodcock, S. (2009). The LEHD infrastructure files and the creation of the Quarterly Workforce Indicators. In Producer dynamics: New evidence from micro data (pp. 149–230). University of Chicago Press for the NBER. http://www.nber.org/chapters/c0485
Canonne, C. L., Kamath, G., & Steinke, T. (2020). The discrete Gaussian for differential privacy. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, & H. Lin (Eds.), Advances in Neural Information Processing Systems (pp. 15676–15688, Vol. 33). Curran Associates. https://bit.ly/canonneetal2020
Census Act, 13 U.S.C., §§ 1–402 (1954). https://www.govinfo.gov/link/statute/68/1025
Chetty, R., Friedman, J. N., Hendren, N., Jones, M. R., & Porter, S. R. (2020). The Opportunity Atlas: Mapping the childhood roots of social mobility (tech. rep.). National Bureau of Economic Research. https://doi.org/10.3386/w25147
Cumings-Menon, R. (2024). Full-information estimation for hierarchical data. ArXiv. https://doi.org/10.48550/arXiv.2404.13164
Cumings-Menon, R., Abowd, J. M., Ashmead, R., Kifer, D., Leclerc, P., Ocker, J., Ratcliffe, M., & Zhuravlev, P. (2023). Geographic spines in the 2020 Census Disclosure Avoidance System. ArXiv. https://doi.org/10.48550/arXiv.2203.16654
Dwork, C., King, G., Greenwood, G., Adler, W., Alvarez, J., Ballesteros, M., Beck, N., Bouk, D., Boyd, D., Brehm, J., Bun, M., Cohen, A., Cook, C., Desfontaines, D., Evans, G., Flaxman, A., Franzese, R., Gaboardi, M., Geambasu, R., & . . . Zhang, L. (2021, August 12). Letter [to Dr. Ron Jarmin, Acting Director, United States Census Bureau]. https://bit.ly/kingdworklist
Finlay, K., Mueller-Smith, M., & Papp, J. (2022). The Criminal Justice Administrative Records System: A next-generation research data platform. Scientific Data, 9, Article 562. https://doi.org/10.1038/s41597-022-01620-y
Haltiwanger, J., Jarmin, R. S., & Miranda, J. (2013). Who creates jobs? Small versus large versus young. The Review of Economics and Statistics, 95(2), 347–361. http://www.jstor.org/stable/43554390
Institute for Survey Research. (n.d.). The Census-Enhanced Health and Retirement Study: About. Retrieved April 15, 2023, from https://cenhrs.isr.umich.edu/about/
Kenny, C. T., McCartan, C., Kuriwaki, S., Simko, T., & Imai, K. (2024). Evaluating bias and noise induced by the U.S. Census Bureau’s privacy protection methods. ArXiv. https://doi.org/10.48550/arXiv.2306.07521
McCartan, C., Simko, T., & Imai, K. (2023). Making differential privacy work for census data users. Harvard Data Science Review, 5(4). https://doi.org/10.1162/99608f92.c3c87223
McCartan, C., Simko, T., & Imai, K. (2024). Rejoinder: We can improve the usability of the Census Noisy Measurements File. Harvard Data Science Review, 6(2). https://doi.org/10.1162/99608f92.f9f4b9a4
National Academies of Sciences, Engineering, and Medicine. (2020). 2020 Census data products: Data needs and privacy considerations: Proceedings of a workshop. The National Academies Press. https://doi.org/10.17226/25978
Office for National Statistics. (2023). Protecting personal data in Census 2021 results. Retrieved November 29, 2023, from https://bit.ly/onspriv2023
Phillips v. U. S. Bureau of the Census, 22-cv-9304 (JSR)(S.D.N.Y. April 10, 2023). Stipulation and order of dismissal. https://bit.ly/phillips_dismissal
Phillips v. U. S. Bureau of the Census, 22-cv-9304 (JSR)(S.D.N.Y. Jan. 22, 2024). Opinion and order. https://bit.ly/phillips_fees
U.S. Census Bureau. (2018, May 1). Final content design for the prototype 2020 Census redistricting data file. Federal Register, 83(84), 19042–19043. https://www.govinfo.gov/content/pkg/FR-2018-05-01/pdf/2018-09189.pdf
U.S. Census Bureau. (2021). DAS 2020 redistricting production code release. GitHub. Retrieved November 29, 2023, from https://github.com/uscensusbureau/DAS_2020_Redistricting_Production_Code
U.S. Const. art. I, § 2.
Voting Rights Act, 52 U.S.C., Subtitle I (1965). https://uscode.house.gov/view.xhtml?path=/prelim@title52/subtitle1&edition=prelim
Wright, T., & Irimata, K. (2021). Empirical study of two aspects of the TopDown Algorithm output for redistricting: Reliability & variability (August 5, 2021 update). U.S. Census Bureau. https://www.census.gov/content/dam/Census/library/working-papers/2021/adrm/SSS2021-02.pdf
©2024 John M. Abowd. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.