Column Editor’s Note: In this Effective Policy Learning piece, Victoria A. Velkoff, U.S. Census Bureau Associate Director for Demographic Programs, and Christine Hartley, Assistant Division Chief for Estimates and Projections in the Population Division at the U.S. Census Bureau, describe the process that the Bureau will use to update U.S. population estimates between the 2020 and 2030 decennial censuses of population and housing. The process, used between each decennial census, has become more complicated after factoring in the effects of the COVID-19 pandemic, which was raging during the data collection period for the 2020 Census. The population estimates rely heavily on an accurate 2020 Census population count, birth and death records, and immigration estimates, all of which were affected by the pandemic. In addition, the U.S. Census Bureau initiated some new approaches to preventing personal information in the census from being identifiable in publicly available files. These complications have required a major revamping of the population estimation methodology.
Keywords: 2020 Census, Population Estimates Program, COVID-19 pandemic, Vital Records, blended base, immigration estimates
Much attention is paid to the challenge of enumerating the entire population of the United States once per decade. The constitutionally and legislatively mandated decennial census of population and housing, hailed as the nation’s “largest peacetime mobilization effort” (Nagashybayeva, 2013), is a comprehensive lesson in logistics, adaptability, and patience. But in the years between the census counts, the Population Estimates Program (PEP) at the U.S. Census Bureau dedicates itself to keeping current data in the hands of the public by publishing annual time series of population and housing unit estimates. These files begin with data for Census Day (April 1 of the decennial year), but they leverage a variety of administrative records to capture annual change and update totals. This way, we can be 7 years out from the decennial census, for example, but we are not stuck with data that are similarly 7 years out of date. This is critical for the population estimates’ wide range of use cases, which include the allocation of federal funds to states and localities, the development of key statistical indicators and weights for demographic surveys, planning at the state and local levels, and so much more.
Following on a 2020 decennial census that featured major interruptions, obstacles, and complexities, bringing the annual population estimates forward into a post-2020 Census environment was not the straightforward exercise it would otherwise have been. Instead, PEP found itself contending with its own set of unique complications pertaining to its methodologies and administrative record inputs that challenged staff to adapt the estimates in a number of ways.
For example, the cohort-component method used to produce estimates for counties and higher levels of geography relies on an array of administrative record inputs to measure changes in births, deaths, and migration relative to the date of the census. In preparing for the Vintage 2021 estimates, PEP needed to determine whether COVID-19 had impacted these input data, and whether methodological adjustments would be needed to properly capture the impact of COVID-19 on these components of change.
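At its core, the accounting behind a cohort-component approach is a balancing equation: the population at the estimate date equals the base population plus births, minus deaths, plus net international and net domestic migration. The Python sketch below illustrates only that arithmetic for a single total. The class name, function name, and all numbers are hypothetical, and the production method works by age, sex, and other detail rather than on one total.

```python
from dataclasses import dataclass


@dataclass
class Components:
    """Components of change between the base date and the estimate date.

    Hypothetical container used only for this illustration.
    """
    births: int
    deaths: int
    net_international_migration: int
    net_domestic_migration: int


def estimate_population(base_population: int, change: Components) -> int:
    """Apply the demographic balancing equation to a base population."""
    natural_change = change.births - change.deaths
    net_migration = (change.net_international_migration
                     + change.net_domestic_migration)
    return base_population + natural_change + net_migration


# Made-up numbers for one hypothetical county, not actual estimates.
county_change = Components(
    births=1_200,
    deaths=1_500,
    net_international_migration=150,
    net_domestic_migration=-400,
)
county_estimate = estimate_population(100_000, county_change)
print(county_estimate)
```

Keeping natural change and net migration as separate terms mirrors the way the text treats each component as having its own administrative record source.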
In the case of births and deaths, this consisted of evaluating the vital records we receive from the U.S. National Center for Health Statistics (NCHS), which is part of the U.S. Centers for Disease Control and Prevention (CDC). Previously, finalized data used to estimate births and deaths featured a 2-year lag. Our usual method to compensate for this gap would not effectively depict changes in the trends resulting from the pandemic. For both, we were able to incorporate far more current provisional data from NCHS to ensure that the inputs reflected changes in the birth rates and increased deaths that were witnessed during the first year of the pandemic.
The situation surrounding the international migration component was more complex due to our usual reliance on data from the U.S. Census Bureau’s American Community Survey (ACS), which had struggled with data collection during the pandemic. Using the 2020 ACS data on foreign-born immigration would not produce plausible demographic patterns, and so we turned to other sources such as the U.S. Department of Justice, U.S. Department of Homeland Security (Citizenship and Immigration Services), U.S. State Department (Bureau of Consular Affairs and Refugee Processing Center), and the Institute of International Education. By combining data from these sources to create a benchmark for foreign-born immigration, we were able to determine a proportional adjustment to apply to the 2019 ACS estimates to create estimates for 2020.
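One way to picture a proportional adjustment of this kind is as a ratio applied to the earlier survey estimate. The sketch below assumes the adjustment is the benchmark level in the pandemic period divided by the benchmark level in the comparison period. That ratio form, the function names, and the numbers are assumptions made for illustration, since the article does not spell out the formula.

```python
def proportional_adjustment(benchmark_2019: float, benchmark_2020: float) -> float:
    """Return the ratio of the 2020 benchmark to the 2019 benchmark.

    Assumed form of the adjustment; the benchmark itself would combine
    several administrative sources.
    """
    if benchmark_2019 <= 0:
        raise ValueError("the 2019 benchmark must be positive")
    return benchmark_2020 / benchmark_2019


def adjust_estimate(acs_2019_estimate: float, factor: float) -> float:
    """Scale the 2019 ACS-based estimate by the adjustment factor."""
    return acs_2019_estimate * factor


# Made-up benchmark values and ACS estimate, for illustration only.
factor = proportional_adjustment(benchmark_2019=1_000.0, benchmark_2020=500.0)
immigration_2020 = adjust_estimate(acs_2019_estimate=800.0, factor=factor)
print(factor, immigration_2020)
```

The appeal of a ratio is that it carries the demographic detail of the 2019 ACS forward while letting the outside sources set the overall level.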
Finally, we needed to consider whether our method of measuring net domestic migration was robust to the effects of the pandemic or similarly required an adjustment. The primary source for our domestic migration data is tax returns from the Internal Revenue Service (IRS). PEP staff sought to evaluate whether tax filing extensions that had been granted during the pandemic affected these data. Comparisons were made against historical filing trends, and differences were examined between patterns of domestic migration in the IRS data and data from other sources, such as the U.S. Postal Service’s National Change of Address database. We were able to conclude that no adjustment to our usual method of measuring net domestic migration was needed.
Ensuring the inputs to the cohort-component method were properly adapted to withstand (or reflect) the impact of the pandemic was critical in producing reliable estimates for July 1, 2021. But the greatest challenge PEP faced was creating the updated base population, or starting point, for the Vintage 2021 time series. Under normal circumstances, following the decennial census enumeration, PEP would simply import the data from the new census in the format required for processing and build the new time series from that ‘base.’ However, the COVID-19 pandemic resulted in major disruptions for field operations, which translated into significant delays for data availability. The annual estimates production process exists in the window of time between when necessary data inputs become available and the end-of-year deadline to publish the estimates, which is legislatively mandated. This meant that beyond a point, delays in the availability of the 2020 Census results could not be effectively absorbed by PEP. To that end, we began to explore options for creating the April 1, 2020, estimates base within the limited amount of time available to identify, test, and evaluate possible alternatives. This was particularly challenging given the lack of precedent; research into estimates methodologies dating back multiple decades supported the notion that the estimates base was consistently developed from the census count, typically only reflecting changes to capture geographic updates or the outcome of programs such as Count Question Resolution.
However, we were not starting from scratch: the Vintage 2020 estimates series, which was built from the 2010 Census, included an estimate for April 1, 2020, with all the geographic and demographic detail required for estimates processing. During our research to prepare for Vintage 2021, the 2020 Census count of the total population was finalized, which revealed that the Vintage 2020 estimate for April 1, 2020, was 2,050,539 lower than the 2020 Census. Although this difference amounted to just 0.6% of the population, our intent was to identify a way to leverage detail from Vintage 2020 without retaining that difference (referred to as the ‘error of closure’) in the Vintage 2021 estimates. We examined multiple sources of data with priority given to those that were readily available and vetted regarding their methodology and quality. One data source was found to be a particularly good fit: the 2020 Demographic Analysis (DA) estimates, which are national-level estimates for April 1, 2020, that had been produced by PEP staff using an alternative administrative records-based methodology. We determined that by applying the age and sex distribution from these estimates to the national-, state-, and county-level population totals from the 2020 Census PL 94-171 Redistricting File, the resulting estimates could serve as a national control to essentially ‘pull up’ the Vintage 2020 estimates. Ultimately, with the time available to us, this represented the most detail that we could confidently incorporate into the estimates base, and the resulting April 1, 2020, estimates came to be known as the ‘blended base.’
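The blending idea described here can be sketched in two steps: spread a census total across age and sex cells using the DA distribution to form controls, then scale the Vintage 2020 detail within each cell so that it sums to its control. The Python below is a deliberately simplified sketch at one level of geography with made-up inputs. The function names, cell labels, and single-pass scaling are assumptions for illustration and do not reproduce PEP's production procedure.

```python
from __future__ import annotations


def build_controls(da_by_cell: dict[str, float],
                   census_total: float) -> dict[str, float]:
    """Spread a census total across age-sex cells using the DA distribution."""
    da_total = sum(da_by_cell.values())
    if da_total <= 0:
        raise ValueError("the DA distribution must sum to a positive number")
    return {cell: census_total * value / da_total
            for cell, value in da_by_cell.items()}


def pull_to_controls(estimates: dict[str, dict[str, float]],
                     controls: dict[str, float]) -> dict[str, dict[str, float]]:
    """Scale the area detail in each cell so the cell sums to its control."""
    blended = {}
    for cell, by_area in estimates.items():
        cell_total = sum(by_area.values())
        if cell_total <= 0:
            raise ValueError(f"no population to scale in cell {cell!r}")
        factor = controls[cell] / cell_total
        blended[cell] = {area: pop * factor for area, pop in by_area.items()}
    return blended


# Made-up inputs: two age-sex cells and two areas (hypothetical labels).
da_distribution = {"female_all_ages": 52.0, "male_all_ages": 48.0}
vintage_2020 = {
    "female_all_ages": {"area_a": 300.0, "area_b": 200.0},
    "male_all_ages": {"area_a": 250.0, "area_b": 220.0},
}
controls = build_controls(da_distribution, census_total=1_000.0)
blended_base = pull_to_controls(vintage_2020, controls)
blended_total = sum(sum(by_area.values()) for by_area in blended_base.values())
print(round(blended_total, 6))
```

In this toy example the Vintage 2020 detail sums to 970 while the census total is 1,000, so scaling to the controls removes the gap (the analogue of the error of closure) while preserving each area's share within a cell.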
An evaluation of the blended base confirmed that incorporating the age and sex detail from 2020 DA and totals from the 2020 Census successfully improved the Vintage 2020 estimates, producing a suitable base population that featured some degree of consistency with the census counts. As with all PEP products, these estimates were subject to a rigorous review, which included a focus on the methodological reasonableness of the blended base. Additionally, although no perfect benchmark exists to determine the exact impact of the blended base across all demographic and geographic subgroups, the method was initially simulated using data from 2010. This enabled us to make comparisons between a blended base for April 1, 2010, and the actual April 1, 2010, estimates base to assess the model prior to applying it to 2020 data.
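A comparison of this kind, between a simulated blended base and the actual estimates base for the same date, can be summarized with simple percent differences. The sketch below uses mean absolute percent error as one possible summary. The article does not name the metrics that were used, so the choice of metric, the function names, and the numbers are all assumptions for illustration.

```python
from __future__ import annotations


def percent_differences(simulated: dict[str, float],
                        actual: dict[str, float]) -> dict[str, float]:
    """Percent difference of the simulated base from the actual base, by area."""
    return {area: 100.0 * (simulated[area] - actual[area]) / actual[area]
            for area in actual}


def mean_absolute_percent_error(simulated: dict[str, float],
                                actual: dict[str, float]) -> float:
    """Average of the absolute percent differences across areas."""
    diffs = percent_differences(simulated, actual)
    return sum(abs(d) for d in diffs.values()) / len(diffs)


# Made-up populations for two hypothetical areas on April 1, 2010.
actual_base_2010 = {"area_a": 1_000.0, "area_b": 2_000.0}
simulated_blend_2010 = {"area_a": 1_010.0, "area_b": 1_960.0}

diffs_2010 = percent_differences(simulated_blend_2010, actual_base_2010)
mape_2010 = mean_absolute_percent_error(simulated_blend_2010, actual_base_2010)
print(diffs_2010, mape_2010)
```

Signed differences show whether the blend runs high or low in a given area, while the absolute average gives a single figure for comparing alternative blending choices.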
However, combining these data raised questions regarding why other detail available from the 2020 DA or PL 94-171 Redistricting File was not also included. Early research sought to pull in more detail—namely, race and Hispanic origin from DA and the population aged 18 and older from the 2020 Census—but incorporating this information piecemeal resulted in implausible demographic patterns within the base without sufficient time remaining to work through these issues.
Furthermore, during the process of producing and publishing the Vintage 2021 estimates series, the U.S. Census Bureau released measures of net coverage error from its Post-Enumeration Survey (PES) and 2020 DA, which reinvigorated discussions pertaining to 2020 Census quality. In response, a team of subject matter experts across the Census Bureau was assembled to research the feasibility of taking coverage measures such as the PES and DA into account in the development of the population estimates. This comprehensive research endeavor is expected to extend into next year and beyond, and its results will inform whether and how additional detail from the 2020 Census, PES, or DA is used in the estimates base. Although this approach represents a stark departure from previous decades, as was noted above, it provides a mechanism to ensure that the use of census data in the estimates base is thoughtful and deliberate so as not to retain significant documented coverage issues across the decade.
Whereas many challenges of bringing the annual population estimates forward into a post-2020 Census environment were overcome, other major challenges remain: namely, determining the best means to systematically incorporate data into our blended base to continue to improve the estimate for April 1, 2020; deciding which additional data from the 2020 Census, if any, we should incorporate; and ensuring the estimates are in compliance with the U.S. Census Bureau’s disclosure avoidance modernization in order to protect the privacy and confidentiality of the population. PEP staff are already exploring solutions, seeking a degree of adaptability that does not sacrifice quality as we work toward our ever-present goal of producing the most accurate estimates possible of our nation’s population.
Victoria A. Velkoff and Christine Hartley have no financial or non-financial disclosures to share for this article.
Nagashybayeva, Gulnar. (2013, March). History of the U.S. Census. Library of Congress Research Guides. https://guides.loc.gov/census-connections/census-history
Pub. L. No. 94-171, 89 Stat. 1023 (1975). https://www.govinfo.gov/content/pkg/STATUTE-89/pdf/STATUTE-89-Pg1023.pdf
©2022 Victoria Velkoff and Christine Hartley. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.