Abstract

The use of data derived from electronic health records (EHRs) to describe racial and ethnic health disparities is increasingly common, but there are challenges. While the number of patients covered by EHRs can be quite large, such patients may not be representative of a source population. One way to evaluate the extent of this limitation is to link EHRs to an external source, in this case the American Community Survey (ACS). Relying on a stratified random sample of about 200,000 patient records from a large, public, integrated health delivery system in North Carolina (2016–2019), we assess linkages to restricted ACS microdata (2001–2017) by race and ethnicity to understand the strengths and weaknesses of EHR-derived data for describing disparities. The results in this research note suggest that Black–White comparisons will benefit from standard adjustments (e.g., weighting procedures) but that misestimation of health disparities may arise for Hispanic patients because of differential coverage rates for this group.

Introduction

As survey response rates fall and costs of data collection rise, researchers increasingly look to administrative and clinical data as supplements or substitutes (Meyer et al. 2015). For those interested in health disparities, data derived from electronic health records (EHRs) are one such source. Although researchers do not have direct access to EHRs, for simplicity, we refer to EHR-derived data as EHRs throughout. EHRs are increasingly being used to describe racial and ethnic disparities in health care utilization, treatment, and outcomes. Recent examples include studies of childhood obesity (Sharifi et al. 2016), maternal and neonatal delivery complications (Huennekens et al. 2020), and the receipt of treatment medications among adults seeking care and with a positive SARS-CoV-2 test (Boehmer et al. 2022; Wiltz et al. 2022). However, there are concerns related to the representation of racial and ethnic groups in EHRs, which has implications for studying racial and ethnic disparities in population research. This research note considers lessons learned about the strengths and weaknesses of EHRs for health disparities research. Such research has long been of interest to demographers (Foster et al. 2024; Montez et al. 2019). By linking EHRs from a large, integrated health delivery system in North Carolina to microdata from the American Community Survey (ACS), our findings demonstrate the potential for the use of EHRs in population research.

Background

The use of EHRs has tremendous potential for increasing knowledge and understanding of racial and ethnic health disparities. They contain up-to-date and detailed information pertinent to people's health (e.g., utilization, laboratory tests and results, procedures, problem lists, visit-specific diagnoses, medications prescribed) for very large patient populations (Casey et al. 2016) and have benefits over the use of other health data sources. Compared with health care administrative data (e.g., insurance claims data), which provide billing diagnoses and types of care regardless of where care is received, EHRs contain results of tests and biologic data such as weight or blood pressure. Compared with vital statistics, which describe health disparities at the beginning and end of life, EHRs cover the years in between. Finally, in contrast to most surveys, which rely on subjective reports and are subject to potential recall bias regarding test results and diagnoses, EHRs contain the actual test results and whether they exceed clinical thresholds, as well as the diagnoses based on them.

EHRs are a promising source for documenting racial and ethnic disparities in health and health care and can inform targeted interventions to ameliorate them (Rumball-Smith and Bates 2018), yet there are challenges: the data are not representative of a source population. Information based on EHRs pertains to people who obtain health care from a particular provider (or group of providers). Numbers can be quite large, but as a nonprobability sample of a source population, they are not representative of a population (Friedman et al. 2013; Goldstein et al. 2016; Groves 2006). Estimation of health disparities can be biased if racial and ethnic groups are differentially represented. Representativeness may be growing since health facilities are increasingly aggregated into large, integrated delivery systems that care for up to several million people each. Nevertheless, the potential for bias remains.

As a further complication, EHRs are selective of people who choose to seek health care, and this selection may vary between groups depending on insurance coverage and other factors (Goldstein et al. 2016; Weiskopf and Weng 2013). Numerators and denominators may be affected differently, as individuals who do not need or want health care or who lack insurance may have reduced representation. Denominators may exclude some individuals at risk of the disease or health condition recorded in the numerator, which can lead to overstated prevalence. Analysts typically correct for this by weighting prevalence measures or incorporating inverse proportional weights in regression analyses. Bias is lessened but not eliminated when EHRs incorporate a full range of care, from preventive to acute (Bower et al. 2017; Klompas et al. 2017). Other factors affecting health care utilization are also relevant to inclusion in EHRs, including education, health insurance coverage, and access to transportation (Bower et al. 2017). Racial and ethnic differences in access to health care have narrowed over the past two decades, especially for the Hispanic population, but nevertheless persist (Ma et al. 2022; Mahajan et al. 2021).
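As a rough illustration of the weighting correction described above, the following sketch compares a crude prevalence with a weighted one. The records, the field names `has_condition` and `weight`, and the weight values are invented for illustration; in practice weights would be derived from an external benchmark, and this is not the procedure of any study cited here.

```python
def weighted_prevalence(records):
    """Return (crude, weighted) prevalence for a list of patient records.

    Each record is a dict with a 0/1 `has_condition` flag and a positive
    `weight` (hypothetical field names). A weight above 1 means the patient
    stands in for people who are underrepresented in the EHR.
    """
    n = len(records)
    crude = sum(r["has_condition"] for r in records) / n
    total_weight = sum(r["weight"] for r in records)
    weighted = sum(r["has_condition"] * r["weight"] for r in records) / total_weight
    return crude, weighted


# Synthetic example: patients without the condition are assumed to be
# underrepresented, so they receive larger weights.
patients = [
    {"has_condition": 1, "weight": 1.0},
    {"has_condition": 1, "weight": 1.0},
    {"has_condition": 0, "weight": 2.0},
    {"has_condition": 0, "weight": 2.0},
]
crude, weighted = weighted_prevalence(patients)
print(f"crude={crude:.3f} weighted={weighted:.3f}")
```

In this toy case the weighted prevalence falls below the crude one because the underrepresented patients are those without the condition.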

With full information on the source population, it would be straightforward to identify potential bias and adjust for it. Instead, in this research note, we take an indirect approach, assessing strengths and weaknesses of EHRs for the study of racial and ethnic health disparities through comparison with the ACS. The ACS is an annual cross-sectional survey of about 1–1.5% of the U.S. population and was designed as a replacement for the decennial census long-form in 2001. Critically, it is based on a probability sample designed to be representative. Although the sampling fraction is small, the number of individuals included each year is very large, more than 5 million nationally and exceeding 100,000 in North Carolina, the state served by the integrated health system we study.1 Participation in the ACS is mandatory and response rates have historically topped 90% (U.S. Census Bureau 2022b), although coverage rates vary by race and ethnicity (U.S. Census Bureau 2022a), a point to which we will return below.

Data and Methods

This research note leveraged data collected as part of a pilot study to assess the feasibility of integrating the detailed health information available in EHRs with social, economic, and demographic data from the ACS (Udalova et al. 2022). The pilot study focused on EHRs from a large, integrated health delivery system in North Carolina over the period 2016–2019. The health system is composed of a large academic health center, 11 community hospitals, and many hundreds of practices across the state, including both urban and rural areas. Demographic and clinical data are collected using a single enterprise-level EHR with consistent policies regarding registration and demographic and clinical data collection across hospitals and practices. Care is offered to all residents of the state regardless of ability to pay, including the uninsured, resulting in a diverse patient population comprising citizens and noncitizens. More than two million patients were seen during 2016–2019, prior to Medicaid expansion in the state.

In the integrated health delivery system, race and ethnicity are generally obtained from a patient or proxy as part of registration during the first visit. Patients provided the information directly to a clerk or as part of an intake questionnaire. Answers were coded as follows: for race, American Indian or Alaska Native (AIAN), Asian, Black or African American, Native Hawaiian or other Pacific Islander, other race, patient refused, unknown, and White or Caucasian (categories were listed in alphabetical order on the screen); and for ethnicity, Hispanic or Latino, not Hispanic or Latino, patient refused, and unknown. Clerks were instructed to accept whatever answer they were given and specifically not to push if there was resistance and not to fill in on the basis of their own observations. Race was missing, that is, recorded as unknown or refused, for about 10.5% of patients (Table 1). This figure compares favorably to averages based on the 56 health care institutions in the National COVID Cohort Collaborative, in which 11.3% of patients were missing data on race and an additional 8% were recorded as refusals (Cook et al. 2022). Although the missing data are included in our analyses, a detailed assessment of these data is beyond the scope of this research note. We return to this in the Discussion.

We drew a disproportionate stratified random sample of about 200,000 patients aged 25–74 with at least two visits (e.g., hospitalization, inpatient visit, or outpatient visit) between 2016 and 2019 (Udalova et al. 2022). We oversampled patients who identified as Black, Asian, or Hispanic, as well as those for whom race and ethnicity were unknown or refused (collectively identified as missing), to obtain more precise estimates for these groups. Table 1 presents descriptive statistics for the sample drawn, weighted to account for the stratified sample design. Weights were constructed using the racial and ethnic composition of the original EHR population. Patients in the weighted sample predominantly identify as White (61.71%) or Black (19.05%); smaller fractions identify as AIAN, Asian, or Hispanic (0.47%, 1.84%, and 5.58%, respectively). For this analysis, patients identifying as Native Hawaiian or other Pacific Islander were grouped with other race.
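The logic of weighting a disproportionate stratified sample back to its source population can be sketched as follows. The stratum labels and counts are hypothetical, and the simple ratio of population count to sample count is an assumed textbook form of a design weight, not necessarily the exact construction used in the pilot study.

```python
def design_weights(population_counts, sample_counts):
    """Weight for each stratum: population size divided by sample size."""
    return {g: population_counts[g] / sample_counts[g] for g in sample_counts}


# Hypothetical strata: group "B" is small in the population but oversampled.
population = {"A": 9000, "B": 1000}
sample = {"A": 500, "B": 500}

weights = design_weights(population, sample)

# Weighted sample totals recover the population composition.
weighted_totals = {g: weights[g] * sample[g] for g in sample}
grand_total = sum(weighted_totals.values())
weighted_shares = {g: weighted_totals[g] / grand_total for g in sample}
print(weights, weighted_shares)
```

Although the two strata contribute equally to the unweighted sample, the weighted shares match the assumed population shares of 90% and 10%.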

Structured EHRs were transferred via secure means to the Census Bureau IT environment and a protected identification key (PIK) was assigned.2 A PIK is a unique anonymized person identifier used at the Census Bureau to link surveys and administrative records. Personally identifiable information from EHRs is passed through successive modules of the Person Identification Validation System, which compares Social Security number (SSN), address, name, sex, and full date of birth to a reference file maintained at the Census Bureau (Wagner and Layne 2014:23). When a linkage could be made between the incoming record and the reference file, a PIK was appended to the patient record. The match to SSN is exact; matches involving the other identifiers are probabilistic (for additional information regarding PIK rates, see Udalova et al. 2022). Once a PIK was assigned, SSN and name were dropped, leaving information on sex, birth year, race, ethnicity, language, health insurance, and broad categories of residence for analysis. No health information was included in the transfer.

Our first interest was in whether the success of PIK assignment depended on race and ethnicity. There are two reasons why it might (Bond et al. 2014:30). First, some patients may not be included in the federal and state data sources on which the reference file is based, for example, recent immigrants. In North Carolina, the Hispanic population has increased rapidly, accounting for almost 10% of the overall population in 2019, the end of our window (U.S. Census Bureau 2019a). Immigration played a major role in this change: about 40% of the Hispanic population was foreign-born in 2019 (U.S. Census Bureau 2019b). A substantial portion of all immigrants in North Carolina were undocumented in 2016 (39%), most of whom (56%) were from Mexico (Pew Research Center 2019). Undocumented immigrants may purposely avoid being included in government data. Second, although patients may be included in federal data sources, discrepancies between these sources and EHRs in the way information is recorded may undermine the match.

Once a PIK was assigned to an EHR, we attempted to match it to ACS data from 2001 to 2017. There are three reasons why we might fail to make a match. First, ACS data pertain to a sample of the state population, about 1–1.5% per year. An EHR that received a PIK might not match an ACS record because that person was not part of the ACS sample in any of the study years. Without knowing the population of eligible potential matches, it is not possible to say what the match rate should be, but it likely lies between 12% and 18% (Udalova et al. 2022). Match rates will likely be lower for recently arrived immigrants as they are less likely to have been included in the early years of the ACS. Second, there are long-standing differences in the coverage of racial and ethnic groups in the ACS. The Census Bureau measures coverage rates as the ratio of the ACS population of a group to an independent estimate for that group multiplied by 100 (U.S. Census Bureau 2022d). In 2017, the end of our observation window for the study's use of ACS data, national coverage rates were 94.9% for White non-Hispanic, 82.5% for Black non-Hispanic, and 86.9% for Hispanic individuals (U.S. Census Bureau 2022a). Although aggregate ACS population estimates are adjusted for under-coverage, some individuals who should have been included will be missing and not available for a match. Third and finally, only ACS records that have received a PIK can be matched to EHRs with PIKs. The challenges with PIK assignments noted for EHRs also apply to PIK assignments for the ACS (Bond et al. 2014).
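To give a feel for the magnitudes involved, the sketch below computes the chance that a person is sampled at least once across repeated annual surveys, together with the coverage rate as defined above. It assumes independent draws each year and, purely as an illustration, 13 effective survey years; both assumptions are ours, and this is not necessarily how the range cited from Udalova et al. (2022) was derived. The function names and the inputs to `coverage_rate` are hypothetical.

```python
def prob_ever_sampled(annual_fraction, n_years):
    """Probability of being sampled at least once, assuming independent years."""
    return 1.0 - (1.0 - annual_fraction) ** n_years


def coverage_rate(survey_estimate, independent_estimate):
    """Coverage rate: survey population over an independent estimate, times 100."""
    return 100.0 * survey_estimate / independent_estimate


# Assumed inputs: annual sampling fractions of 1% and 1.5%, 13 effective years.
low = prob_ever_sampled(0.01, 13)
high = prob_ever_sampled(0.015, 13)
print(f"ever sampled: {low:.3f} to {high:.3f}")

# Made-up population counts chosen to reproduce a coverage rate of 86.9.
print(coverage_rate(869, 1000))
```

Under these assumptions the bounds come out at roughly 0.12 and 0.18; using more survey years or larger sampling fractions would raise them.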

Results

Table 2 shows unadjusted PIK and ACS match rates for racial and ethnic groups as well as for other social attributes available in the EHRs. Table 3 presents the unadjusted PIK and ACS match rates for racial and ethnic groups from Table 2 alongside PIK and ACS match rates adjusted for sex, birth cohort, language, health insurance, and residence. The adjusted rates are predicted probabilities from logistic regression models for each outcome, weighted to account for sample design. Of interest is the degree to which disparities narrow when we account for the (limited) information in the EHRs relevant to access and utilization. Full results are shown in Table A1 in the online appendix, which reports regression coefficients from linear probability models and average marginal effects for logistic models. To aid in interpretation, Table 3 also includes information about national ACS PIK and coverage rates available from published sources. While we focus our interpretation of results on White, Black, and Hispanic patients, all groups are included in the tables.

Beginning with unadjusted PIK rates, almost all patients who identified as White (98.65%) or Black (99.54%) were assigned PIKs (Table 3). These patients were well covered in administrative data systems, and the information provided to the health care system was of sufficiently high quality for PIK assignment. In contrast, only 77.61% of patients who identified their race as something “other” than the categories provided in the EHRs were assigned PIKs. With respect to ethnicity, 71.65% of patients who identified as Hispanic or Latino received PIKs, compared with a high degree of success for patients who did not identify as Hispanic or Latino (98.49%). When comparing these patterns to unadjusted PIK rates for the 2010 ACS (Bond et al. 2014), which were broadly similar, we make two observations. First, PIK rates were higher for White and Black patients in the EHRs than for the population generally, probably because SSNs were available for many (although not all) of them. Second, PIK rates were lower for “other” race and Hispanic/Latino patients in our study, suggesting something distinctive about the population of patients in our EHRs relative to the national sample.

Next, we investigated the likelihood of receiving a PIK while adjusting for age, sex, race, ethnicity, language, and insurance status as reported in the EHRs. Table 3 shows predicted probabilities based on logistic regression estimates, with all other variables held at their means. All adjusted PIK rates were quite high, with differences substantially narrowed relative to the unadjusted rates. Adjusted PIK rates for “other” race and for Hispanic or Latino patients were within two percentage points of the other racial and ethnic categories, indicating the importance of language, health insurance, and to a lesser extent age and residence. These results suggest that a substantial portion of the difference in the unadjusted PIK rates reflects differences in health care access among racial and ethnic subpopulations.
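The idea of an adjusted rate as a predicted probability with other covariates held at their means can be sketched as follows. The intercept, coefficients, and covariate mean are hypothetical values chosen for illustration and are not estimates from Table 3 or Table A1.

```python
import math


def predicted_probability(intercept, coefs, values):
    """Inverse-logit of the linear predictor for one covariate profile."""
    linear = intercept + sum(coefs[name] * values[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical log-odds coefficients for a model of PIK assignment.
intercept = 2.0
coefs = {"hispanic": -1.0, "uninsured": -1.5}

# Hypothetical sample mean of the covariate that is held fixed.
covariate_means = {"uninsured": 0.2}

# Vary only the group indicator; keep the other covariate at its mean.
adjusted = {
    group: predicted_probability(intercept, coefs, {"hispanic": flag, **covariate_means})
    for group, flag in [("not_hispanic", 0), ("hispanic", 1)]
}
print(adjusted)
```

Because the same covariate means are used for both groups, the remaining gap reflects only the group coefficient rather than differences in insurance status.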

Once PIKs are assigned to EHRs, the next step is to match them to individuals in the ACS who have been assigned PIKs. Unadjusted conditional ACS match rates were 17.82% for White patients, 14.46% for Black patients, and 10.21% for Hispanic or Latino patients. Adjustments for sex, birth cohort, health insurance, and residence were of little consequence to conditional ACS match rates for White and Black patients but made a substantial difference for Hispanic patients. For the latter, the adjusted conditional ACS match rate was almost 40% higher than the unadjusted rate, again suggesting the importance of access and utilization. That there is not more of an improvement for Black patients is likely due to under-coverage in the ACS (Table 3). Although the ACS coverage rates refer to a combined race/ethnicity category rather than separate categories, they are nevertheless instructive. ACS coverage rates for 2017 were highest for non-Hispanic White individuals (94.9%), lower for Hispanic or Latino individuals (86.9%), and lowest for non-Hispanic Black individuals (82.5%). Additionally, the adjusted ACS match rates in Table 3 were highest for White patients and lowest for Black and Hispanic patients. It is important to remember that population estimates based on ACS data adjust for coverage.

Discussion

Prevalence is measured relative to a population at risk. Research that draws on EHRs to describe health disparities typically uses the total number of patients of a specific race or ethnicity for the denominator. However, because the patients included in EHRs may be differentially selected from the source population, these estimates are vulnerable to bias (Goldstein et al. 2016). Prevalence may be overstated if representation is better in the numerator than the denominator. If the extent of this bias differs by race or ethnicity, the interpretation of health disparities may be misestimated, which has implications for the use of EHRs for population research. The goal of this research note was to reflect on this potential bias by using information gleaned from linking EHRs from a large integrated health delivery system in North Carolina to ACS microdata.
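A small numerical sketch can make the direction of this bias concrete. It assumes, for illustration only, that people with a condition are captured in the EHR at a higher rate than people without it; the capture rates and population counts are invented.

```python
def observed_prevalence(cases, non_cases, capture_cases, capture_non_cases):
    """Prevalence among EHR patients when capture differs by condition status."""
    seen_cases = cases * capture_cases
    seen_non_cases = non_cases * capture_non_cases
    return seen_cases / (seen_cases + seen_non_cases)


# Hypothetical source population: 100 cases among 1,000 people (10% prevalence).
true_prevalence = 100 / 1000

# Equal capture leaves prevalence unchanged; better capture of cases inflates it.
equal = observed_prevalence(100, 900, 0.6, 0.6)
selective = observed_prevalence(100, 900, 0.9, 0.6)
print(f"true={true_prevalence:.3f} equal={equal:.3f} selective={selective:.3f}")
```

If the gap between the two capture rates differs across racial and ethnic groups, the inflation differs too, which is how a disparity can be misestimated even when each group is large.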

The first step in this linkage, PIK assignment, provided information about the quality of personal information in the EHRs and patient representation in government data systems. In terms of our results, PIK assignments were successful. Availability of SSNs in many of the health records facilitated these assignments, helping to explain why PIK rates for the EHRs were higher than for the 2010 ACS. The exceptions were for patients who identified as “other” race and Hispanic or Latino ethnicity in the EHRs. Unadjusted PIK rates for these groups were 77.61% and 71.65%, respectively. The disproportionate representation of recent immigrants could help explain why PIK rates are so low for Hispanic patients. Problems with PIK assignments for Hispanic patients in our sample may also relate to documentation status.3 More than half of undocumented immigrants in North Carolina are Hispanic (Pew Research Center 2019). Underutilization of health services among people with undocumented status is well-known (Cabral and Cuevas 2020), possibly for that reason but also because of a lack of insurance (except in the case of pregnancy), yet those urgently needing medical care may opt to take the risk. EHRs thus may include more undocumented immigrants than government sources, especially those in poor health.

The second step, matching to the ACS conditional on PIK assignment, provided information about representation in the ACS. Our procedures consider ACS data over a large interval, 2001–2017. A conditional ACS match rate that falls short of the average might be due to changes in racial and ethnic composition over that period. In North Carolina, the Hispanic population has increased, more than doubling between 2000 and 2010 and increasing by a further 40% the following decade (U.S. Census Bureau 2020). Because the Hispanic population was smaller in earlier years of the ACS, we expected conditional ACS match rates to be lower as a result, and they were.4

Conditional ACS match rates also depend on ACS coverage, measures of which reflect inclusion of addresses in the frame as well as response rates for those found. Despite best efforts, and the fact that participation is mandatory, the ACS falls short of complete coverage. Challenges associated with enumerating the Black population are long-standing and well-documented (O'Hare 2019). Lower coverage for the Hispanic population may partly reflect undocumented status. Not only are people with undocumented status less likely to participate in federal data collections, those with whom they live may also avoid participation, even those who are documented or citizens (Hall et al. 2019; Kopparam 2022).

What does this mean for health disparities research based on EHRs? Our results suggest that differences between disease prevalence rates for Hispanic individuals relative to other groups could be misrepresented by biases in the EHRs. On one hand, our results suggest that undocumented immigrants may be better represented in EHRs than in government systems. Yet, this representation is likely selective, related to the need for health care. Compounding the problem is that a large proportion of Hispanic individuals are uninsured, almost three times the uninsured proportion of the state population (31% compared with 11% in 2019) (Khachaturyan and Dreier 2020). Using ACS data to adjust the denominator of the prevalence rate would be an improvement but would not fully correct the problem given that Hispanic individuals missing from government records are also less likely to participate in the ACS.

The situation is different for Black patients. It is possible that lack of trust and other characteristics that reduce the participation of Black individuals in the ACS also lead these individuals to avoid interacting with the health care system. However, this seems unlikely. Considering their high PIK rates, it appears that Black patients are part of government data collection more broadly. Further, the fraction of uninsured for Black individuals is about the same as for North Carolina as a whole (12% compared with 11%) (Khachaturyan and Dreier 2020). Moreover, according to Behavioral Risk Factor Surveillance System data for 2019, Black North Carolinians saw a doctor in the past 12 months for routine medical care (checkups) at a rate roughly equivalent to White North Carolinians (84% compared with 80%) and substantially more than Hispanic North Carolinians (64%; NC State Center for Health Statistics 2020). In addition to differences in insurance coverage, differences in age structure and the healthy migrant effect for first-generation Hispanic individuals help to explain this difference. Given all of this, using ACS data to correct the denominator of prevalence rates for potential underrepresentation seems like a reasonable approach for Black patients.

Of course, our analysis suffers from some limitations. First, the data come from a single health system, which while large is one of several in the state of North Carolina. Its mission is to serve the health needs of all North Carolinians, so we have compared patients with the population of the state, but residents do have alternatives and coverage is far from complete. Second, our analysis assumed that EHR-based information about race and ethnicity reflected patient identities. This information was missing for about 10% of our sample, a significant fraction even if low compared with many other health systems. A detailed discussion of these patients is beyond the scope of this research note, but we would like to point out that PIK and conditional ACS match rates for this group were quite high. Third, the results may reflect some of the peculiarities of North Carolina, perhaps especially the recent growth in the Hispanic population and the role immigration is playing in this growth. While our results may not completely generalize to other health systems or states, there are larger lessons learned, particularly about potential bias in the EHRs, the value of linking EHRs to external sources for assessment purposes, and finally the possibility of leveraging EHRs for research on social determinants and population health.

Acknowledgments

We are grateful for support from the Carolina Population Center (Eunice Kennedy Shriver National Institute of Child Health and Human Development grant P2C HD050924), the UNC Translational and Clinical Sciences Institute (CTSA UL1TR002489), and the Enhancing Health Data (EHealth) program at the U.S. Census Bureau (census.gov/ehealth). This research note is intended to inform interested parties of ongoing research and to encourage discussion. Any opinions and conclusions expressed herein are those of the authors and do not reflect the views of the U.S. Census Bureau. All results were approved for release by the Disclosure Review Board of the U.S. Census Bureau, authorization numbers CBDRB-FY21-POP001-0087, CBDRB-FY23-SEHSD003-021, CBDRB-FY23-POP001-0074, CBDRB-FY23-POP001-0053, and CBDRB-FY22-POP001-0141. All numeric values were rounded according to U.S. Census Bureau disclosure protocols to preserve data privacy.

Notes

1

The total number of housing units included in the ACS averages around 2 million nationally (with the exception of 2020) and 60,000 for the state of North Carolina (U.S. Census Bureau 2022c). The number of persons per household between 2018 and 2022 was 2.57 nationally and 2.48 for North Carolina (U.S. Census Bureau 2023). The number of housing units was multiplied by the number of persons per household to get the number of individuals included in each year of the ACS.
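The multiplication described in this note can be reproduced directly from the rounded figures it reports; the variable names below are ours.

```python
# Approximate annual ACS housing units and persons per household, as given in the note.
acs_housing_units = {"national": 2_000_000, "north_carolina": 60_000}
persons_per_household = {"national": 2.57, "north_carolina": 2.48}

# Approximate number of individuals included in the ACS each year.
persons = {
    area: acs_housing_units[area] * persons_per_household[area]
    for area in acs_housing_units
}
for area, count in persons.items():
    print(f"{area}: about {round(count):,} persons")
```

These products, about 5.14 million nationally and about 148,800 for North Carolina, are consistent with the statement in the text of more than 5 million nationally and more than 100,000 in the state.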

2

This research is the result of a collaboration between researchers at the University of North Carolina at Chapel Hill (UNC) and the Enhancing Health Data (EHealth) Program at the U.S. Census Bureau. The EHealth Program (census.gov/ehealth) partners with health data organizations to produce high-quality statistics and research related to population health. Title 13 of the U.S. Code authorizes the Census Bureau to collect information from other entities and requires the Census Bureau to keep the information confidential and to use it only for statistical purposes. If you are interested in using existing Census Bureau restricted microdata, please see the FSRDC website: https://www.census.gov/about/adrm/fsrdc.html.

3

If immigration status were the primary explanation for these low PIK rates, we would expect similarly low PIK rates for Asian patients, who are on average even more recent arrivals in North Carolina. Unadjusted PIK rates for Asian patients are very high, only two percentage points lower than for White patients (see Table 3).

4

The Asian population has also increased dramatically, especially between 2010 and 2020 (64%; U.S. Census Bureau 2020). The conditional ACS match rate for Asian patients in the EHRs is the lowest of any group in our sample (14%; see Table 3).

References

Boehmer, T. K., Koumans, E. H., Skillen, E. L., Kappelman, M. D., Carton, T. W., Patel, A., . . . Block, J. P. (
2022
).
Racial and ethnic disparities in outpatient treatment of COVID-19—United States, January–July 2022
.
Morbidity and Mortality Weekly Report
,
71
,
1359
1365
.
Bond, B., Brown, J. D., Luque, A., & O'Hara, A. (
2014
).
The nature of the bias when studying only linkable person records: Evidence from the American Community Survey
(CARRA Working Paper Series, No. 2014-08).
U.S. Census Bureau
. Retrieved from https://www.census.gov/content/dam/Census/library/working-papers/2014/adrm/carra-wp-2014-08.pdf
Bower, J. K., Patel, S., Rudy, J. E., & Felix, A. S. (
2017
).
Addressing bias in electronic health record-based surveillance of cardiovascular disease risk: Finding the signal through the noise
.
Current Epidemiology Reports
,
4
,
346
352
.
Cabral, J., & Cuevas, A. G. (
2020
).
Health inequities among Latinos/Hispanics: Documentation status as a determinant of health
.
Journal of Racial and Ethnic Health Disparities
,
7
,
874
879
.
Casey, J. A., Schwartz, B. S., Stewart, W. F., & Adler, N. E. (
2016
).
Using electronic health records for population health research: A review of methods and applications
.
Annual Review of Public Health
,
37
,
61
81
.
Cook, L., Espinoza, J., Weiskopf, N. G., Mathews, N., Dorr, D. A., Gonzales, K. L., . . . 
N3C Consortium
. (
2022
).
Issues with variability in electronic health record data about race and ethnicity: Descriptive analysis of the national COVID Cohort Collaborative data enclave
.
JMIR Medical Informatics
,
10
,
e39235
. https://doi.org/10.2196/39235
Foster, T. B., Fernandez, L., Porter, S. R., & Pharris-Ciurej, N. (
2024
).
Racial and ethnic disparities in excess all-cause mortality in the first year of the COVID-19 pandemic
.
Demography
,
61
,
59
85
. https://doi.org/10.1215/00703370-11133943
Friedman, D. J., Parrish, R. G., & Ross, D. A. (
2013
).
Electronic health records and U.S. public health: Current realities and future promise
.
American Journal of Public Health
,
103
,
1560
1567
.
Goldstein, B. A., Bhavsar, N. A., Phelan, M., & Pencina, M. J. (
2016
).
Controlling for informed presence bias due to the number of health encounters in an electronic health record
.
American Journal of Epidemiology
,
184
,
847
855
.
Groves, R. M. (
2006
).
Nonresponse rates and nonresponse bias in household surveys
.
Public Opinion Quarterly
,
70
,
646
675
.
Hall, M., Musick, K., & Yi, Y. (
2019
).
Living arrangements and household complexity among undocumented immigrants
.
Population and Development Review
,
45
,
81
101
.
Huennekens, K., Oot, A., Lantos, E., Yee, L. M., & Feinglass, J. (
2020
).
Using electronic health record and administrative data to analyze maternal and neonatal delivery complications
.
Joint Commission Journal on Quality and Patient Safety
,
46
,
623
630
.
Khachaturyan, S., & Dreier, A. (
2020
, June 6).
North Carolina's overall uninsured rate masks stark differences across racial and ethnic groups
. North Carolina Justice Center. Retrieved from https://www.ncjustice.org/publications/north-carolinas-overall-uninsured-rate-masks-stark-differences-across-racial-and-ethnic-groups/
Klompas, M., Cocoros, N. M., Menchaca, J. T., Erani, D., Hafer, E., Herrick, B., . . . Land, T. (
2017
).
State and local chronic disease surveillance using electronic health record systems
.
American Journal of Public Health
,
107
,
1406
1412
.
Kopparam, R. (
2022
, October 14).
What federal statistical agencies can do to improve survey response rates among Hispanic communities in the United States
.
Washington Center for Equitable Growth
. Retrieved from https://equitablegrowth.org/what-federal-statistical-agencies-can-do-to-improve-survey-response-rates-among-hispanic-communities-in-the-united-states/
Ma, A., Sanchez, A., & Ma, M. (2022). Racial disparities in health care utilization, the Affordable Care Act and racial concordance preference. International Journal of Health Economics and Management, 22, 91–110.
Mahajan, S., Caraballo, C., Lu, Y., Valero-Elizondo, J., Massey, D., Annapureddy, A. R., . . . Krumholz, H. M. (2021). Trends in differences in health status and health care access and affordability by race and ethnicity in the United States, 1999–2018. JAMA, 326, 637–648.
Meyer, B. D., Mok, W. K. C., & Sullivan, J. X. (2015). Household surveys in crisis. Journal of Economic Perspectives, 29(4), 199–226.
Montez, J. K., Zajacova, A., Hayward, M. D., Woolf, S. H., Chapman, D., & Beckfield, J. (2019). Educational disparities in adult mortality across U.S. states: How do they differ, and have they changed since the mid-1980s? Demography, 56, 621–644.
NC State Center for Health Statistics. (2020). 2019 BRFSS Survey results: North Carolina health care access. NCDHHS Division of Public Health. Retrieved from https://schs.dph.ncdhhs.gov/data/brfss/2019/nc/all/checkup1.html
O'Hare, W. P. (2019). Differential undercounts in the U.S. Census: Who is missed? Cham: Springer Nature Switzerland. https://doi.org/10.1007/978-3-030-10973-8
Pew Research Center. (2019, February 5). U.S. unauthorized immigrant population estimates by state, 2016. Retrieved from https://www.pewresearch.org/hispanic/interactives/u-s-unauthorized-immigrants-by-state/
Rumball-Smith, J., & Bates, D. W. (2018). The electronic health record and health IT to decrease racial/ethnic disparities in care. Journal of Health Care for the Poor and Underserved, 29, 58–62.
Sharifi, M., Sequist, T. D., Rifas-Shiman, S. L., Melly, S. J., Duncan, D. T., Horan, C. M., . . . Taveras, E. M. (2016). The role of neighborhood characteristics and the built environment in understanding racial/ethnic disparities in childhood obesity. Preventive Medicine, 91, 103–109.
Udalova, V., Carey, T. S., Chelminski, P. R., Dalzell, L., Knoepp, P., Motro, J., & Entwisle, B. (2022). Linking electronic health records to the American Community Survey: Feasibility and process. American Journal of Public Health, 112, 923–930.
U.S. Census Bureau. (2019a). American Community Survey 1-Year estimates: B03003—Hispanic or Latino origin: North Carolina. Retrieved from https://data.census.gov/table/acsdt1y2019.b03003?q=b03003&g=040xx00us37
U.S. Census Bureau. (2019b). American Community Survey 1-Year estimates: S0201—Selected population profile in the United States: North Carolina. Retrieved from https://data.census.gov/table/acsspp1y2019.s0201?text=s0201&g=040xx00us37&y=2019
U.S. Census Bureau. (2020). Decennial Census 2000, 2010, and 2020: DP1—Profile of General Population and Housing Characteristics: North Carolina. Retrieved from https://data.census.gov/table/decennialdpsldh2000.dp1?q=dp1%20north%20carolina&tid=decennialdpcd110h2000.dp1
U.S. Census Bureau. (2022a). American Community Survey (ACS): Coverage rates. Retrieved from https://www.census.gov/acs/www/methodology/sample-size-and-data-quality/coverage-rates/
U.S. Census Bureau. (2022b). American Community Survey (ACS): Response rates. Retrieved from https://www.census.gov/acs/www/methodology/sample-size-and-data-quality/response-rates/
U.S. Census Bureau. (2022c). American Community Survey (ACS): Sample size. Retrieved from https://www.census.gov/acs/www/methodology/sample-size-and-data-quality/sample-size
U.S. Census Bureau. (2023). QuickFacts: North Carolina; United States. Retrieved from https://www.census.gov/quickfacts/fact/table/nc,us/hsd310222
Wagner, D., & Layne, M. (2014). The Person Identification Validation System (PVS): Applying the Center for Administrative Records Research and Applications’ (CARRA) record linkage software (CARRA Working Paper Series, No. 2014–01). U.S. Census Bureau. Retrieved from https://www.census.gov/content/dam/Census/library/working-papers/2014/adrm/carra-wp-2014-01.pdf
Weiskopf, N. G., & Weng, C. (2013). Methods and dimensions of electronic health record data quality assessment: Enabling reuse for clinical research. Journal of the American Medical Informatics Association, 20, 144–151.
Wiltz, J. L., Feehan, A. K., Molinari, N. M., Ladva, C. N., Truman, B. I., Hall, J., . . . Boehmer, T. K. (2022). Racial and ethnic disparities in receipt of medications for treatment of COVID-19—United States, March 2020–August 2021. Morbidity and Mortality Weekly Report, 71, 96–102.
Freely available online through the Demography open access option.

Supplementary data