To counter declining response rates and rising data-collection costs, survey methodologists have devised new techniques for using process data (“paradata”) to address nonresponse by altering the survey design dynamically during data collection. We investigate the substantive consequences of responsive survey design: tools that use paradata to improve the representative qualities of surveys and control costs. By improving representation of reluctant respondents, responsive design can change our understanding of the topic being studied. Using the National Survey of Family Growth Cycle 6, we illustrate how responsive survey design can shape both demographic estimates and models of demographic behaviors based on survey data. By juxtaposing measures from regular and responsive data collection phases, we document how special efforts to interview reluctant respondents may affect demographic estimates. Results demonstrate the potential of responsive survey design to change the quality of demographic research based on survey data.
The sample survey has been a fundamental building block of demographic research. Many key advances in both empirical evidence and theoretical reasoning are founded on information from surveys. But even as sophistication of survey measurement and analysis advances, the general population’s growing reluctance to participate in surveys poses a key threat to the field. This problem is greatest in relatively rich countries of Europe and North America, but it is growing across the world. The problem has been documented in detail (Groves and Couper 1998), but demographers’ standards for acceptable survey response rates continue to drop, and social scientists devote increasing effort to the study of the consequences of nonresponse for the issues that they investigate. In the midst of this potential scientific crisis, methodologists continue to pioneer innovative approaches for using new data collection technologies to address the nonresponse problem. Together, these approaches are termed “responsive survey design,” and they can be used to simultaneously improve survey representation of reluctant respondents and control costs of data collection. Here, we describe the application of responsive survey design to a key demographic survey in the United States: the National Survey of Family Growth (NSFG). Using the NSFG example, we demonstrate how responsive survey design can change both demographic estimates and models of demographic behaviors.
Computerization of surveys was the technological shift that stimulated new forms of responsive design. Although “paper and pencil” data collection continues to be used in some rural parts of the world, computer-assisted personal interviewing (CAPI) is now used for the majority of the world’s demographic surveys. Use of computer software for questionnaires promoted data-collection instruments that could be more easily tailored to respondents’ unique circumstances or previous responses and also allowed for dynamic error detection during fieldwork and more rapid release of data in electronic form. All these are desirable in the creation of new demographic data. But computerization also provided the means for the creation of survey “paradata,” or data about the data collection process itself (Couper 1998). Now, paradata from CAPI data collection, combined with Internet technologies that allow paradata to flow from decentralized data collection staff to centralized management, provide the means to centrally manage responsive survey designs in large-scale face-to-face demographic surveys.
The NSFG Cycle 6 (2002–2003) featured CAPI interviewing, collection and analysis of paradata, and responsive design on a large scale.1 The study involved more than 12,500 personal interviews collected nationwide by a staff of more than 300 interviewers. The study was designed in two phases: a main phase, designed following protocols established before data collection began; and a responsive phase, designed explicitly to use analyses of paradata to direct changes of protocol targeted to improve representation of reluctant respondents (Groves et al. 2005). By juxtaposing measures from these two data-collection phases, we are able to document changes in survey estimates produced by the special effort to interview reluctant respondents characteristic of responsive design. Then, using NSFG data, we are able to demonstrate how respondents recruited during the responsive phase produce different results, indicating that the addition of reluctant respondents through responsive survey design may change what we learn from demographic models based on survey data. Together, this body of evidence shows how responsive survey design provides a methodological tool that can change the quality of demographic research based on survey data. The evidence also provides one of the first evaluations of the substantive consequences of implementing responsive design methods.
Key Building Blocks of Responsive Design
The dynamic nature of modern societies presents survey researchers with increased uncertainty about the performance of their survey design, increased effort required to obtain interviews, and thus increased costs of data collection (de Leeuw and de Heer 2002; Groves and Couper 1998). Computer-assisted methods of data collection provided survey researchers with tools to capture a variety of paradata about the data-collection process (Couper 1998; Hapuarachchi et al. 1997; Scheuren 2001). Paradata can be used to change the design during the course of data collection, in efforts to achieve response rate targets or lower survey errors and costs. The responsive use of paradata to modify the design during the field period has been labeled “responsive design” (Groves and Heeringa 2006).
Survey researchers employ a variety of tools to intervene in data collection to achieve more desirable distributions of respondent attributes. There is a consensus from survey methodology that survey participation is enhanced by repeated calls on sample households (Goyder 1985), lengthened data-collection periods and interviewer workloads permitting those calls (Botman and Thornberry 1992), prenotification of the survey request through advance letters (Traugott et al. 1987), use of incentives (Singer 2002), reduced interview burden (Goyder 1985), interviewer behavior customized to the concerns of the householder (Groves and Couper 1998), and alternative modes of data collection. Because each of these tools is also directly associated with a cost, paradata-driven responsive design can be used to maximize the effectiveness of these tools while controlling costs. Fundamentally, all survey design options involve cost-error tradeoffs (Groves 1989). Responsive design uses paradata to systematically intervene to optimize these tradeoffs in ways that can be both documented and replicated (Groves and Heeringa 2006). The systematic application of responsive design is an important advance over the ad hoc application of these tools during data collection.2
Theory and Process of Responsive Design
Responsive designs are organized around “design phases” (Groves and Heeringa 2006). The first phases often involve collecting paradata that inform the cost and error properties of alternative design features (e.g., number of calls made to sample cases, nature of incentives). By changing these features in subsequent phases, the researcher can improve the quality of estimates given fixed data collection budgets. Responsive survey designs do five things: they (1) pre-identify a set of design features potentially affecting costs and errors of survey statistics; (2) identify a set of indicators of the cost and error properties of those features; (3) monitor those indicators in initial phases of data collection; (4) alter the active features of the survey in subsequent phases based on cost/error tradeoff decision rules; and (5) combine data from the separate design phases into a single estimator.
During Phase 1 of a survey, paradata are collected to inform the researcher of the interviewer hours spent calling on sample households, driving to sample areas, conversing with household members, and interviewing sample persons. The paradata may include observations about the characteristics of housing units (e.g., whether they have some access impediments) or comments by contacted sample persons that are predictive of later actions. Supplementing the paradata are key statistics from the survey analyzed as functions of interviewer effort, computed on intermediate data sets as interviews are completed.
At the end of Phase 1, the researcher makes a decision about the Phase 2 design options that appear to be prudent. This decision will be guided by the paradata information on costs and sensitivity of values and standard errors of key statistics. Either this phase or a final phase is often introduced to control the costs of the final stages of data collection while attaining desirable nonresponse error features for key statistics. This might involve a second phase sampling of remaining nonrespondents, the use of different modes of data collection, or the use of larger incentives. After the final phase is complete, the survey data collected in all phases are combined to produce the final survey estimates.
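The final step described above, combining data from all phases into a single estimate, can be illustrated with a minimal Python sketch. It assumes the later phase is a probability subsample of the remaining nonrespondents, so that each later-phase respondent's weight is inflated by the inverse of the subsampling rate. The class, function, and variable names and all numbers are hypothetical and are not taken from the NSFG.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    y: float            # survey outcome, e.g., 1 if employed full-time, else 0
    base_weight: float  # inverse of the original selection probability
    phase: int          # 1 = initial phase, 2 = later (subsampled) phase


def combined_mean(respondents, phase2_sampling_rate):
    """Weighted mean pooling respondents from both phases.

    Assumption: phase 2 cases were subsampled from the remaining
    nonrespondents at a known rate, so each phase 2 respondent's
    weight is divided by that rate.
    """
    if not 0 < phase2_sampling_rate <= 1:
        raise ValueError("sampling rate must be in (0, 1]")
    numerator = 0.0
    denominator = 0.0
    for r in respondents:
        weight = r.base_weight
        if r.phase == 2:
            weight /= phase2_sampling_rate
        numerator += weight * r.y
        denominator += weight
    return numerator / denominator


# Hypothetical data: three phase 1 respondents and one phase 2 respondent.
sample = [
    Respondent(y=1, base_weight=100, phase=1),
    Respondent(y=0, base_weight=100, phase=1),
    Respondent(y=0, base_weight=100, phase=1),
    Respondent(y=1, base_weight=100, phase=2),
]
estimate = combined_mean(sample, phase2_sampling_rate=0.5)
```

In this toy example the single later-phase respondent counts twice as much as each initial-phase respondent, which is how a small later phase can still shift the combined estimate.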
The Practice of Responsive Design
Computer-assisted interviewing software offers a needed infrastructure for responsive design. The software system used in the NSFG permits daily uploading from the field interviewer of all call records and travel documentation for her day’s activities. These administrative data contain some paradata deliberately introduced into the NSFG field design: observations about whether the household may contain children, on the likelihood of non-English speakers, and about concerns raised by a household toward the survey request. In addition, the software uploads all completed interview records. These data are used in background analytic processes to estimate the propensity that the next call on a case will yield an interview, whether the interviewer’s effort might be redirected to improve the balance of the respondent data set on key auxiliary variables, and whether the level of calling on some cases has reached an unproductive level.3 Based on these statistical analyses of paradata, the survey researcher can choose to flag some active cases for greater attention by interviewers. The information downloaded to the interviewer’s laptop during the nightly transmission flags certain cases, directing the interviewer to call on these cases first at the next work shift.
Applying Responsive Design Approaches in the NSFG
The fieldwork for Cycle 6 of the NSFG was organized in two distinct phases of operation. The main data-collection phase occurred during an 11-month period from March 2002 through January 2003. During this initial phase, paradata were collected to monitor information about the data collection. Paradata included information such as interviewer performance, observations on neighborhoods and housing units, day and time of call attempts, and observations on contact with household members (e.g., whether they asked a question about the survey or responded with a negative statement about the survey). These paradata were used to build predictive response propensity models: logistic models predicting the odds that the next call on a sample case would produce an interview, given a set of prior experiences with the sample case. (For a full description of the paradata collected and the propensity model specification and coefficients, see Groves and Heeringa 2006.) Then sample segments from the main phase were stratified on two major dimensions: the number of cases in the segment that were still active (i.e., not finalized) and the total expected propensities for active cases in the segment. Segments with large total expected propensities to be interviewed or large numbers of active cases were assigned the highest probabilities of selection into the responsive phase sample. These steps resulted in a responsive phase sample with high mean probability of being interviewed, but including some low-probability cases (Lepkowski et al. 2006). A cluster sample design was employed to reduce travel costs, with all nonrespondent cases in a selected segment included in the responsive phase sample.
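The propensity-based stratification of segments can be sketched in Python as follows. This is only an illustration under stated assumptions: the predictor names, coefficient values, and cutoffs are hypothetical, and the actual NSFG propensity model specification is the one documented in Groves and Heeringa (2006).

```python
import math
from collections import defaultdict


def next_call_propensity(intercept, coefs, features):
    """Logistic model: probability that the next call yields an interview."""
    linear = intercept + sum(coefs[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-linear))


def summarize_segments(active_cases, intercept, coefs):
    """Return {segment_id: (n_active, total_expected_propensity)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for segment_id, features in active_cases:
        totals[segment_id][0] += 1
        totals[segment_id][1] += next_call_propensity(intercept, coefs, features)
    return {seg: (n, p) for seg, (n, p) in totals.items()}


def stratify(summary, n_cut, p_cut):
    """Cross-classify segments on the two stratification dimensions."""
    return {
        seg: ("many" if n >= n_cut else "few", "high" if p >= p_cut else "low")
        for seg, (n, p) in summary.items()
    }


# Hypothetical coefficients and paradata for four active (not finalized) cases.
intercept = -1.0
coefs = {"prior_contact": 1.0, "negative_statement": -1.5}
active_cases = [
    ("A", {"prior_contact": 1, "negative_statement": 0}),
    ("A", {"prior_contact": 1, "negative_statement": 0}),
    ("A", {"prior_contact": 0, "negative_statement": 0}),
    ("B", {"prior_contact": 0, "negative_statement": 1}),
]
summary = summarize_segments(active_cases, intercept, coefs)
strata = stratify(summary, n_cut=2, p_cut=1.0)
```

Segments classified as having many active cases or a high total expected propensity would then receive higher probabilities of selection; the selection step itself is omitted here.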
The second, responsive design phase occurred during the last month of fieldwork: February 2003. For this phase, the recruitment protocol was altered in an attempt to attract sample people on whom the earlier phase protocols were not effective. The responsive phase recruitment protocol entailed the use of the most productive interviewers on staff, increased use of proxy informants for the screening interview to lower the burden of obtaining screener information, a small prepaid token incentive (one-eighth of the main interview incentive as compared with no prepaid incentive in the main phase) for completing the screening interview, and the promise of an additional incentive (double the main incentive) for completing the main interview. The responsive phase was successful in increasing the overall response rate, by recruiting a large number of respondents (824) who failed to participate in initial phases.
Evaluating Responsive Design in the NSFG
The second phase of data collection in responsive design adds cases to the database, necessarily improving the overall response rate in the study. The American Association for Public Opinion Research has published a standardized set of guidelines for determining the overall response rate of a study that includes a phase of responsive design, weighting those cases to provide an appropriate response rate calculation. (To learn more about those calculations for the NSFG, see Groves et al. 2005.)
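To illustrate why weighting matters for the response rate, the following Python sketch computes a weighted response rate under the simplifying assumptions that all cases share the same base weight and that responsive phase cases were subsampled at a known rate. The function name and all counts are hypothetical; the sketch is not the published guideline formula or the NSFG calculation.

```python
def weighted_response_rate(n_eligible, phase1_interviews,
                           phase2_interviews, phase2_sampling_rate):
    """Response rate in which each phase 2 interview represents
    1 / phase2_sampling_rate of the original nonrespondents.

    Assumption: equal base weights for all eligible sample cases.
    """
    if not 0 < phase2_sampling_rate <= 1:
        raise ValueError("sampling rate must be in (0, 1]")
    weighted_interviews = phase1_interviews + phase2_interviews / phase2_sampling_rate
    return weighted_interviews / n_eligible


# Hypothetical counts: 1,000 eligible cases, 600 phase 1 interviews, and
# 100 interviews from a one-in-two subsample of remaining nonrespondents.
rate = weighted_response_rate(1000, 600, 100, 0.5)
unweighted = (600 + 100) / 1000
```

With these made-up counts the weighted rate exceeds the unweighted one, because each responsive phase interview stands in for two of the original nonrespondents.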
But a key question remains: exactly how different are the cases added to the study through responsive design? Responsive design brings more respondents into the study, but those additional respondents change what we know only if they are different from respondents in the main study in some important ways. Moreover, even if they are different, the importance of those differences depends on the relationship between the differences and the topic being studied. Sample differences closely related to key substantive topics will be more important than those differences that are not related to the topic being studied. Thus, evaluation of the substantive consequences of responsive design depends greatly on assessing the substantive differences between respondents in the main study and respondents in the responsive design phase.
The overall hypothesis guiding our analyses is that differences between respondents in the main study and respondents in the responsive design phase will vary greatly across substantive domains, sometimes with strong consequences. Studies of nonresponse bias usually rely on explicit external data to evaluate bias in survey measures. Our aim is not an investigation of nonresponse bias. However, extensive previous research on nonresponse bias demonstrates that this bias can vary greatly across substantive measures within a single survey (Groves 2006; Groves and Peytcheva 2008). As a result, response rates per se are a poor predictor of the level of bias in any particular substantive domain (Groves 2006). Instead, hypotheses regarding the specific substantive consequences of adding reluctant respondents must derive from what is known about the likely reasons for reluctance and the potential relationship between those reasons and the subjects being studied.
Theories of nonresponse emphasize that busy people are less likely to participate in surveys and are harder to locate (Groves and Couper 1998). This principle yields predictions for the expected difference in characteristics for the two phases of data collection: respondents from the responsive design phase should be characterized by life circumstances that make them busier than those in the main study. So, with regard to employment, we would expect respondents in the responsive design phase to be more likely to be employed full-time. Other characteristics that create time pressure in individuals’ lives—such as relationship transitions (e.g., marriage, divorce) and parenthood—should similarly be more prevalent among respondents in the responsive design phase.
Theories of nonresponse also identify other systematic reasons for reluctance to participate in surveys (Groves and Couper 1998). The subject matter may affect reluctance such that some respondents are more willing to answer questions in some substantive areas than others, with those for whom the subject matter is the most salient generally being the most willing to participate (Groves et al. 2000, 2004). The sponsorship of the study may also affect reluctance (Groves and Couper 1998). For example, those who feel most disconnected from the federal government (potentially minorities and the foreign-born) are expected to be the least likely to participate in a government-sponsored study. The substantive consequences of these various mechanisms reducing survey participation then depend on the subjects being studied.
The main substantive objectives of the NSFG are partnering (sexual, cohabiting, and marital relationships) and parenting (childbearing, family planning, contraceptive use, and child rearing). The literatures on both entry into sexual partnerships (Thornton et al. 2007) and entry into parenthood (Rindfuss et al. 1988) emphasize how both have the potential to significantly increase role conflict with other social activities. Because of the known relationship between role conflict and the subjects studied in the NSFG, intensive time demands from behaviors that are known to be associated with reluctance to participate in surveys, such as full-time employment, are also likely to produce an association between reluctance to participate and the substantive domains of the NSFG. Likewise, theories of parenting and partnering point toward the potential importance of both minority status and foreign origin (Rindfuss et al. 1988; Thornton et al. 2007). Because these factors may be associated with reluctance to participate in a federally sponsored survey such as the NSFG, again we predict that reluctance to participate will be associated with the subjects studied in the NSFG. Finally, because studies of both partnering and parenting demonstrate that individuals’ positive attitudes toward these behaviors affect their behavior (Barber 2000, 2001; Thornton et al. 2007) and these attitudes are likely to predict willingness to participate in the NSFG, we again predict reluctance to participate in the NSFG will be associated with the subjects studied in the NSFG.
Hypotheses regarding the magnitude of the substantive consequences of these expected associations between NSFG measures and the reluctance to participate in surveys are beyond the scope of this article. Instead, we aim to provide an initial empirical investigation into the potential magnitude of such consequences. To do this, we compare models of demographic behavior estimated separately among cases selected across differing design phases. This approach implies a full multivariate model of an important demographic outcome, built as closely as possible to the specifications produced by previous research, with known expectations for values of key parameters. By estimating such a model once with data from the main study and a second time with data from the responsive phase, one can capture a heuristic description of the differences in substantive conclusions likely to result from adding reluctant respondents to the study by using responsive design. Differences in parameters across models fit to these two different data sources provide information on potential consequences of responsive design for substantive interpretations of demographic models. This test for consequences of responsive design goes beyond the simple identification of differences in characteristics of respondents to assess the extent to which such differences may alter our substantive conclusions.
In the following sections, we evaluate responsive design using two approaches. First, we compare the characteristics of respondents from the main study with characteristics of respondents from the responsive phase. Second, we estimate multivariate models of key NSFG outcomes twice: once using data from the main study, and then using data from the responsive phase. Our objective in this second approach is not innovative modeling, so we use simple models drawn from established strategies or previously published research based on the NSFG.
Data and Methods
Data for this study were taken from the NSFG Cycle 6. Fieldwork for the NSFG, conducted between March 2002 and February 2003, was done by professional female interviewers who questioned 12,571 men and women ages 15 to 44 in their homes. The NSFG obtained detailed information on factors affecting childbearing, marriage, and parenthood.
For these analyses, we focus on two groups among the respondents: those interviewed during the main data-collection phase and those interviewed during the responsive design phase. Furthermore, we subdivide these groups by gender, so that our sample contains men interviewed during the main phase (n = 4,601), women interviewed during the main phase (n = 7,146), men interviewed during the responsive phase (n = 327), and women interviewed during the responsive phase (n = 497). Of course, the responsive phase of the study is, by design, a small proportion of the total interviews collected. One consequence is limited statistical power for testing differences between the main and responsive phases. This limitation prevents us from detecting small differences and focuses our analysis instead on large differences across phases. Both the descriptive statistics and the regression analyses we present are weighted using the National Center for Health Statistics–recommended final adjustment weights, which account for all key aspects of the complex sampling design of the NSFG (Lepkowski et al. 2006).
Comparing Characteristics of Respondents from Responsive Design in the NSFG
To compare characteristics of respondents from the responsive design phase in the NSFG, we first compare basic demographic characteristics of those respondents interviewed in the main study with those interviewed in the responsive design phase. Next, we compare behaviors across the domains of greatest importance to the NSFG: partnering and parenting.
In Table 1, we present the percentage of respondents in various age, labor force, race/ethnicity, genealogical, and educational categories by interview phase and gender. We assess whether the percentages in each category in the main phase are different from the same percentages in the responsive phase. Overall, the responsive phase sample is older than the main phase sample. For both men and women, the responsive phase sample is significantly less likely to be younger than 20 and more likely to be 30 or older. There is also a strong difference in labor force participation, manifested more among women. Women in the responsive phase are much more likely to be employed full-time compared with women in the main phase. The proportion of Hispanics in each interview phase is significantly different for both genders: 20% of women in the responsive phase compared with 13% of women in the main phase, and 24% of men in the responsive phase compared with 15% of men in the main phase. Among both genders, the responsive design phase also contains significantly fewer blacks. Being foreign-born is also significantly more likely among those in the responsive design phase for both men and women. We also see some potentially important differences in educational attainment, with those in the responsive design phase being characterized by somewhat higher educational attainment than those in the main phase.
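As a rough check on the size of such differences, the Python sketch below applies a simple two-proportion z statistic to the Hispanic percentages for women reported above (20% of 497 responsive phase women versus 13% of 7,146 main phase women). It assumes simple random sampling and ignores the weights and complex sample design used in the actual analyses, so it should be read only as a back-of-the-envelope illustration, not as a reproduction of the tests behind Table 1.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Unpooled z statistic and standard error for p1 - p2.

    Assumption: independent simple random samples (no weights,
    no clustering), which the NSFG design does not satisfy.
    """
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se, se


# Percent Hispanic among women: responsive phase vs. main phase.
z, se = two_proportion_z(0.20, 497, 0.13, 7146)

# Share of the sampling variance contributed by the small responsive phase.
responsive_share = (0.20 * 0.80 / 497) / se ** 2
```

Nearly all of the sampling variance in this comparison comes from the small responsive phase sample, which is why only fairly large differences between phases can be detected.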
Our initial comparison is consistent with the conclusion that the addition of a responsive design phase can add significantly different people to those represented in a survey. Not only are these differences statistically significant, but they are so in spite of the relatively small case base for the responsive phase of the survey. Some of these results appear consistent with the hypothesis that responsive design may draw those with the most survey participation role conflict into the respondent pool. Full-time work and older age may be the clearest evidence of this. Being foreign-born or having higher education may also create more role conflict for extra activities like a survey interview. Country-of-birth and race/ethnicity differences may also reflect consequences of the U.S. government sponsorship of this survey. Or, of course, other factors may also be at work.
We continue comparing samples from the different interview phases at the bottom of Table 1. As the results show, the samples display some important differences in patterns of both partnering and parenting behaviors. First, in terms of lifetime sexual partners, responsive phase men have had significantly more opposite-sex sexual partners than main phase men. Second, strong differences in marital status pertain only to women: women in the responsive phase are significantly more likely to be currently married (54% compared with 45% in the main phase) and ever married (64% compared with 57% in the main phase). Finally, in terms of childbearing, there is clear evidence that responsive phase men are more likely to have biologically fathered a child than main phase men.
This pattern of results for partnering and parenting behaviors is also consistent with the hypothesis that busier people are more likely to be added to a survey using responsive design, given what we know about the role conflict associated with these behaviors. Those with many sexual partnerships and those who are or have been married are more likely to be added in responsive design. Consistent with other evidence that there is less role conflict associated with nonmarital cohabitation (Thornton et al. 2007), we find here that responsive design does not produce as large a gap in measurement of cohabiting experience as it does in marital experience. Our evidence is also consistent with the expectation that parents are likely to be added in the responsive phase of a survey, although we find stronger evidence for an association for fathers than for mothers. Of course, more detailed measures of statuses producing the greatest potential role conflict, such as the ages of the children, would likely demonstrate these differences more strongly than the gross categories shown here.
We also investigated whether these phase differences in partnering and parenting behaviors hold within age and labor force status groups (analysis not shown). For this extension of our analysis, because of the smaller number of responsive phase respondents, we investigated two age groups (younger than 30 vs. 30 and older) and then two labor force status groups (working full-time vs. not working full-time). In our analysis by age, our finding that the responsive phase recruited more currently and ever-married women holds among women younger than 30. However, among women age 30 and older, the difference between phases for ever-married disappears, and the difference between phases for currently married is reduced. Similarly, our finding that the responsive phase recruited more men who have had many sexual partners and have biologically fathered a child holds among men younger than 30, but these differences between phases disappear among men age 30 and older. Thus the evidence is consistent with the hypothesis that responsive design adds people who are busy with partnering and parenting, but this is mainly true for people younger than 30. In our analysis by labor force status, our finding that the responsive phase recruited more currently and ever-married women holds among both women working full-time and those not working full-time; however, the differences between phases are reduced among women not working full-time. For men, labor force status explains the differences between phases in terms of the proportion of men who have had many sexual partners and have biologically fathered a child. Among men who are not working full-time, the responsive phase recruited men who are more likely to have had many sexual partners and to have ever been married, and those who are slightly more likely to have fathered a child. However, among men working full-time, all differences in these partnering and parenting behaviors between the main phase and the responsive phase disappear.
We also compared responses to questions about attitudes across the main and responsive phases (analysis not shown). Attitudes and related subjective phenomena are known to have different measurement properties than behaviors (Tourangeau et al. 2000). We compare the samples on agreement (a response of either “Strongly Agree” or “Agree”) with key attitudinal measures. We find some significant differences in attitude measurement in the responsive phase of the survey, but these differences follow a less clear pattern, making interpretation of them more difficult. There is no significant difference between the male samples in response to “It is better for a person to get married than to go through life being single,” but there is a significant difference between the female samples: responsive phase women are more likely to agree than main phase women. This is consistent with the result that women added in the responsive phase are also more likely to have been married (Thornton et al. 2007). In response to the statement “Sexual relations between two adults of the same sex is always wrong,” men added through the responsive phase are more likely to agree, but women added through the responsive phase are less likely to agree. This interesting result points toward a pernicious characteristic of responsive survey design: not only may factors other than role conflict be at work, but it may be difficult to predict the consequences of adding these reluctant respondents across substantive domains.
Comparing Models of Demographic Outcomes
Our next step is to estimate models that are common in social demographic literature, first on the main phase sample only, and then on the responsive phase sample. For these multivariate analyses, we use logistic regression to model the odds of the demographic outcome in question, and present model coefficients in tables as odds ratios. Because our objective here is methodological, not substantive, we do not interpret the model coefficients themselves, but instead focus on statistically significant differences in coefficients estimated on the different samples. In keeping with this methodological focus, we draw the models themselves from the previous literature on each topic and do not construct any theoretical frameworks for these topics here. The specific model parameters, variable construction, and coding in the models were derived from recently published papers on these topics based on NSFG data (Bloom and Bennett 1990; Darroch et al. 1999; Finer and Henshaw 2006; Guzzo and Furstenberg 2007; Hayford and Morgan 2008; Manlove et al. 2006, 2008; Zhang 2008). Likewise, because our methodological objective is to learn the extent to which responsive design may alter our substantive conclusions based on survey data across multiple subject matters, we investigate several different types of models spanning dimensions of partnering and parenting. To represent common uses of NSFG data, the specific subjects were chosen based on an analysis of highly cited recent scholarly works using NSFG data. Finally, because our methodological objective is the evaluation of the responsive design data collection strategy, not advances in modeling or analytic strategies, we replicate the modeling strategies used in previous highly cited work with NSFG data. In some cases, these modeling strategies do not reflect the most sophisticated possible analysis techniques applied to the specific subject. In such cases, we comment on our investigation of more sophisticated modeling strategies to supplement the results.
Responsive Design and Multivariate Models of Partnering
Next, we investigate a dimension of partnering behavior often analyzed using NSFG data: lifetime number of sexual partners. In Table 2, we present multivariate models of the likelihood of experiencing many lifetime sexual partners. Because, in general, men report more lifetime sexual partners than women in the United States (Laumann et al. 1994; Smith 1992), we code “many” as seven or more partners for men and four or more partners for women. As described earlier, our analysis focuses on differences between model coefficients for models estimated on the main phase sample versus models estimated on the responsive phase sample, and not on the coefficient values themselves. Table 2 presents three models for men and three models for women. This format simplifies visual inspection of differences between estimated coefficients from the two different samples of men and women. In a third column for each gender, statistical significance of differences between coefficients is determined in pooled models that add interaction terms between each independent variable and a dichotomous indicator of the sample phase during which the respondent was added to the study.4
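The pooled test of coefficient differences can be written compactly. The following is a sketch in notation introduced here only for illustration; the symbols do not appear in the tables, and the inclusion of a main effect for sample phase is an assumption rather than a documented detail of the estimated models. Let R_i equal 1 if respondent i was added during the responsive phase and 0 otherwise, and let x_{ik} denote the k-th independent variable:

```latex
\operatorname{logit}\,\Pr(Y_i = 1)
  = \beta_0 + \sum_{k=1}^{K} \beta_k x_{ik}
  + \delta_0 R_i + \sum_{k=1}^{K} \delta_k \,(x_{ik} \times R_i)
```

Under this specification, \exp(\beta_k) is the odds ratio for variable k in the main phase sample and \exp(\beta_k + \delta_k) is the corresponding odds ratio in the responsive phase sample. A test of H_0: \delta_k = 0 is then a test of whether the two coefficients differ.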
Examining the results displayed in Table 2, among men, the coefficients for age; being Hispanic; being another race/ethnicity besides white, black, or Hispanic; having a high school diploma; having a college degree or higher; and frequency of attendance at religious services are each significantly different between the main phase sample and the responsive phase sample. Religious service attendance reduces the likelihood of experiencing a large number of sexual partners in both models, but the magnitude of this effect is estimated as significantly larger among the main phase respondents than among the responsive phase respondents.5 Effects of being Hispanic and being another race/ethnicity, on the other hand, appear to be more significant among responsive phase respondents than main phase respondents. Among women, the coefficients for age, having ever been married, and having a mother work full-time during youth are each significantly different in the responsive design phase than in the main phase. In each case, the two different coefficients are in the same direction, and the difference in each pair is in the magnitude of the estimate.
Of course, it is common in social demography to place greater theoretical emphasis on the direction of such coefficient estimates than on the magnitude. So one might argue that even though these are statistically significant differences, they are substantively similar and would lead to similar substantive conclusions, at least at a gross theoretical level.6 As we will show later, this is not always the case; responsive design sometimes produces coefficient estimates in the opposite direction. In the meantime, however, it remains clear that in models of sexual partnerships, responsive survey design adds cases that lead to significantly different estimates of the magnitude of model coefficients. Again, note that we document these significant differences in spite of the relatively small size of the responsive phase sample, which greatly limits the statistical power of such tests.
Responsive Design and Multivariate Models of Parenting
We now investigate multivariate models of dimensions of parenting behavior often analyzed by using NSFG data. We focus on estimates from models of the likelihood of having fathered/mothered two or more children. We also investigate models of ever becoming a father/mother as well as models of a newly emerging topic in the literature: multiple-partner fertility (analyses not shown in tables).
In Table 3, we present models of having fathered/mothered two or more children. Among men, the coefficient for being Hispanic in the responsive phase sample is significant in the opposite direction from that in the main phase sample. The effect of being foreign-born follows a pattern of being insignificant among the main phase sample of men and strongly positive among the responsive phase sample of men. So, for both of these characteristics, the responsive phase sample adds cases with estimated effects that are quite different from those among the cases in the main study design. Also among men, significant differences in coefficient estimates between responsive and main phase samples are found for both levels of income (income between the poverty level and twice the poverty level, and income at least twice the poverty level).
Among women, the results in Table 3 demonstrate many differences in coefficient estimates between the main phase sample and the responsive phase sample. The responsive phase incorporates respondents who produce significantly different coefficient estimates for six different model parameters: other racial/ethnic group, some college, college degree or higher, income at least twice the poverty level, living in an urban area, and number of siblings. In some cases (e.g., other racial/ethnic group, living in an urban area), effects appear stronger among the responsive phase sample, whereas in other cases (e.g., income at least twice the poverty level, number of siblings), effects seem to be reduced among the responsive phase sample.
Unfortunately, we cannot fully explain why so many model parameters work differently for predicting lifetime number of sexual partners and high fertility among respondents recruited during the responsive design phase. However, our results clearly imply that responsive phase samples have the potential for altering substantive conclusions. Furthermore, Tables 2 and 3 show that many model parameters (e.g., age, Hispanic and other racial/ethnic groups, and college degree or higher) consistently produce different coefficient estimates between main phase samples and responsive phase samples across models predicting various demographic outcomes.7
We also explored the recently emerging topic of multiple-partner fertility: having fathered children with more than one woman. Research is beginning to document the prevalence of multiple-partner fertility and factors associated with it (Guzzo and Furstenberg 2007; Manlove et al. 2008). Because we found differences in models of other parenting behaviors, such as fathering two or more children, for the responsive versus main phase samples, we expect differences in models of multiple-partner fertility as well. Multivariate models of this particular demographic outcome stratified by sample phase, however, are the most limited in statistical power because of the very small number of men in the responsive phase sample who have fathered two or more children (n = 76) and are therefore at risk of multiple-partner fertility. Nevertheless, we estimated the likelihood of multiple-partner fertility as an exploratory exercise (analysis not shown). Results demonstrate one significant difference in the coefficient estimates for an intact family background between the main phase sample and the responsive phase sample, although this result must be interpreted with caution. However, the significant differences between samples in coefficient estimates for models predicting other parenting behaviors suggest that examining patterns of difference in models of multiple-partner fertility will be a fruitful avenue for future research as more respondents are recruited through responsive design phases.
Before leaving the results, note the pattern across the multivariate models discussed here. We chose models of outcomes spanning the important dimensions of partnering and parenting at the core of NSFG measurement. There are many other topics represented in the NSFG as well. It is now well documented that nonresponse bias of respondent-based estimates varies greatly over different estimates in the same survey (Groves and Peytcheva 2008). However, across these different dimensions of NSFG measurement, we repeatedly see that responsive design draws reluctant respondents into the study who not only are different from main study respondents (Table 1) but also change our understanding of the factors associated with each of the outcomes (Tables 2 and 3). Each multivariate model is based on previously published work using the NSFG, and each model is relatively parsimonious. We also estimate the models separately for men and women. In all models but one, we find that some coefficients are significantly different among the responsive phase respondents than among the main phase respondents. The majority of the coefficients we estimate are either unchanged among these respondents, or not significantly changed. Most of the significant changes among the responsive phase respondents are still in the same direction as those among the main phase respondents; this means that adding responsive phase respondents may change estimates of the magnitudes of effects but will not change the substantive interpretation of hypothesis tests relying only on the direction of the estimated association.
However, in some cases, the coefficients estimated among responsive phase respondents are not only significantly different but also in the opposite direction of those among main phase respondents. These cases would lead to opposite conclusions about the direction of association. Moreover, the pattern of differences does not appear obvious or easily predicted. Models of men and women often produce different results for the consequences of the responsive phase additions. Sometimes the additions produce greater change in models of men’s outcomes, and sometimes they produce greater change in models of women’s outcomes. Together, this body of results demonstrates that responsive design has the potential to substantially alter the substantive conclusions we reach from analyses of survey data across a range of demographic topics. The results also suggest our understanding of these consequences of adding cases through responsive design will require investigation of each specific topic we study, even within the same survey.
The findings are consistent with the growing body of evidence from survey methodological research, which shows that for some estimates in a survey, nonresponse bias can be fatal; for others, it remains a minor issue (Groves 2006; Keeter et al. 2000). In fact, this same evidence demonstrates that nonresponse bias varies greatly across measures within the same survey (Groves and Peytcheva 2008). This means that at any one level of overall survey nonresponse, different measures will suffer from different levels of nonresponse bias. A key implication of this finding is that the overall survey response rate actually tells us little about the level of nonresponse bias in any particular measure (Groves 2006). The value of the responsive design phase sample in the NSFG is that the researcher is alerted to the variation in nonresponse bias sensitivity across key estimates of interest. Knowing this, the researcher can then use paradata to guide interventions to target efforts that increase response rates in specific subgroups and thereby reduce nonresponse bias in those key measures of interest. This process does not eliminate nonresponse bias, but uses the information to reduce nonresponse bias.
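A standard deterministic expression for the nonresponse bias of a respondent mean helps show why the overall response rate says little about any single measure. This expression is textbook survey methodology rather than a result of the present analysis, and the notation is introduced here for illustration. Let N be the size of the target population, M the number of nonrespondents, \bar{Y}_r the mean among respondents, and \bar{Y}_m the mean among nonrespondents:

```latex
\operatorname{Bias}(\bar{y}_r) = \bar{Y}_r - \bar{Y}
  = \frac{M}{N}\,\bigl(\bar{Y}_r - \bar{Y}_m\bigr)
```

The nonresponse rate M/N is common to every measure in the survey, but the gap between respondents and nonrespondents is specific to each measure. With 25% nonresponse, for example, a proportion that does not differ between the two groups carries no bias, whereas a proportion that differs by 0.20 carries a bias of 0.25 × 0.20 = 0.05.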
Recent Advances in Responsive Design and Prospects for Demographic Data Collection
Reluctance to participate in surveys appears here to stay. Reluctance has been growing among all types of surveys, including both phone surveys and face-to-face surveys, which are the mainstay of demographic research (de Leeuw and de Heer 2002). We have little reason to expect that reluctance to participate in surveys will decline significantly. This growing reluctance substantially increases the costs of creating survey data with the same basic nonresponse properties. Survey data collection is always characterized by fundamental cost-error tradeoffs (Groves 1989). Methodological decisions in survey design and execution are fundamentally efforts to either reduce costs at a given level of quality or improve quality at a given level of costs. In the face of increasing nonresponse and the potential for increasing nonresponse bias in survey results, responsive design is an essential tool for balancing control of survey data–collection costs with control of potential nonresponse error and bias.
Responsive design clearly has the potential to improve the representation qualities of surveys by bringing other people into a study who are different from those who participate in response to the main study protocol. Using analyses of NSFG Cycle 6 data, we demonstrate that responsive design adds respondents who are more likely to be older, working full-time, foreign-born, highly educated, and from specific racial/ethnic groups. In reference to the core topics of NSFG measurement, these responsive design phase respondents are also likely to have had many sexual partners, to have been married, and to have had children. Respondents added during the responsive design phase even have significantly different attitudes, at least with respect to some attitude domains. Many of these differences among those added in the responsive design phase fit theories of nonresponse that hypothesize that people facing the most time pressure in their daily lives, those least interested in the topic, or those least connected to the sponsor of the study will be the least likely to accept requests for survey interviews (Groves and Couper 1998). However, other factors may also keep people from agreeing to participate, and the specific factors affecting any particular survey may be closely associated with the subjects included in the survey, the sponsorship of the survey, or the target population for the survey. These factors may also change over time, so that replication of the same survey may not replicate the determinants of reluctance to participate. Responsive design is an essential tool for adding significantly different groups of people into a survey.
Not only are these additions to the survey different, but adding them can potentially alter the substantive conclusions we reach from analyzing multivariate models of demographic behaviors. Estimating several different multivariate models, we find numerous statistically significant differences among coefficients estimated for respondents added to the study during the responsive design phase. Sometimes these differences are big enough to produce sign changes in the estimated coefficients that would dramatically alter substantive conclusions based on these analyses. Because our investigation is constrained by the relatively small size of the NSFG Cycle 6 responsive design phase sample, we detect only the largest of these differences. As efforts to use responsive design to control nonresponse bias increase, the numbers of respondents added in responsive phases will likely grow; and as responsive design protocols become more effective, the differences between the responsive phase respondents and main phase respondents may also grow. Thus, we have every reason to expect the consequences of responsive design for conclusions based on demographic models to grow in the future.
In fact, the NSFG is both using and improving responsive design in Cycle 7. Cycle 7 of the NSFG reflects a substantial design change relative to all previous cycles, called “continuous interviewing” (Groves et al. 2009). Like the U.S. Census Bureau’s American Community Survey, the continuous-interviewing NSFG is always in the field collecting data that are cumulated over time into large data sets that continuously span historical time. The NSFG Cycle 7 rotates annually across primary sampling units (PSUs) to provide lower sampling error when aggregated across larger units of time. Within each annual set of PSUs, the sample is worked in four replicates that provide quarterly sampling units and a fresh work flow four times per year. Within each quarter, the study is conducted in two phases: a main phase and a responsive design phase. The replicate study design allows lessons learned from previous replicates to be applied in each new replicate, so that responsive design–based interventions can be continuously adapted to changing social conditions and can provide continually improving responsive design effects. As a result, responsive design has been extraordinarily effective at maintaining high response rates and highly balanced samples across key subgroups in the NSFG sample design (Groves et al. 2009).
At the same time, the tools to implement responsive designs continue to improve. The computerization of surveys continues to improve through the creation of newer and more capable automated systems for the centralized management of large-scale, geographically dispersed survey data–collection operations (Cheung 2007). These improvements continue to fuel the creation of more and more paradata, giving survey methodologists more information to use in building management models for responsive design that draw on paradata to improve the efficiency of data collection (Couper and Lyberg 2005). Greater application of these tools across the Internet allows paradata to flow to centralized managers continuously, greatly improving the speed of analysis and the speed of adjustments to data collection protocols. All indications are that these technical breakthroughs will continue to occur at a fast clip. From this, we expect greater use of paradata to design more effective responsive design strategies, greater use of responsive design in large-scale data collections, and more effective implementations of responsive design, so that the resulting survey data are shaped substantially by these approaches.
Finally, although the technologies and methods needed for responsive design are most prevalent in the United States, Canada, and a few rich countries of western Europe, we have every reason to expect this approach to surveys will soon be adopted elsewhere. The technologies for using responsive design are quickly spreading worldwide. In fact, in some poor countries with large populations in rural areas, electronic data collection is frequently being proposed to overcome other logistical barriers. For example, China is currently launching its first large-scale, national survey data collection using CAPI technologies and electronic tools for centralized management of a national field staff. As computerization and Internet access spread, not only will it be possible for other countries to implement responsive designs in survey data collection, but it will also be possible to engage in centralized management and coordination of international data collections across many different countries. Technical and legal barriers may impede free flow of survey management data across national borders, but eventually, the efficiency gains and improved scientific quality of such coordination are likely to produce this outcome.
Thus, we investigate responsive design here, fully expecting it to be a central aspect of demographic data collection for decades to come. Based on our analysis of NSFG Cycle 6 data, we argue that responsive design has enormous potential to improve the representative qualities of demographic data while controlling the costs of producing the most scientifically advanced data possible. However, we also caution that consequences of responsive design include the possibility of fundamentally altering the substantive conclusions we reach from analyses of complex multivariate models. In fact, these consequences are not only likely to differ across studies but also to differ across topics, models, and even coefficients within the same study. So, careful attention to the consequences of these responsive design advances will also likely become a fundamental aspect of demographic analyses of survey data.
The authors wish to thank the NSFG staff at both the National Center for Health Statistics and the University of Michigan for all their effort to produce the NSFG data with the most innovative survey tools available. We also wish to thank Paul Schulz and Sarah Brauner-Otto for their contributions to previous versions of this paper, and Mick Couper, Peter Granda, Nicole Kirgis, Jim Lepkowski, and James Wagner for their helpful comments regarding the analysis of responsive design effects. The responsibility for all errors remains with the authors. An earlier version of this paper was presented at the 2009 annual meetings of the Population Association of America.
Many earlier studies also used CAPI (e.g., the NSFG Cycle 5 in 1995), and some earlier studies used analyses of paradata and responsive design on a large scale.
In fact, in many ways, responsive design is similar to common practices in telephone survey call centers of using information about the respondent’s circumstances to maximize the efficiency of phone calls (Groves et al. 1988). It is also similar to the common practice of having experienced study managers devise intelligent strategies for the focus of interviewer effort in large- and small-scale face-to-face surveys. The key advantages of responsive design are the ability to simultaneously consider many different factors using statistical models and concrete documentation forced by explicit modeling that can be replicated. These tools provide a powerful means for systematic intervention.
For example, one analytic process involves estimating key statistics (e.g., the mean number of live births) and plotting the cumulative estimates of these statistics by the number of call attempts. During the course of the data-collection period, these plots are monitored to see whether estimates of key statistics begin to show stability after a certain number of call attempts. At this point, additional call attempts would produce values for these statistics yielding the same substantive conclusions. This information is used to determine the maximum number of call attempts to be made on future cases. For additional examples of background analytic processes using paradata, see Groves et al. (2005).
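The monitoring process described in this note can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for the NSFG: the function names, the tolerance-based definition of "stability," and the small data set are all hypothetical.

```python
# Hypothetical illustration: cumulative estimates of a key statistic
# by number of call attempts, and a simple stability check.

def cumulative_means_by_attempt(cases):
    """Return [(k, mean)] where mean is the average value among all
    cases completed in k or fewer call attempts.

    `cases` is a list of (attempts_needed, value) pairs.
    """
    max_attempt = max(attempts for attempts, _ in cases)
    results = []
    for k in range(1, max_attempt + 1):
        values = [v for attempts, v in cases if attempts <= k]
        if values:
            results.append((k, sum(values) / len(values)))
    return results


def first_stable_attempt(cumulative, tolerance):
    """Return the smallest k whose cumulative estimate stays within
    `tolerance` of every later cumulative estimate (assumed rule)."""
    for i, (k, estimate) in enumerate(cumulative):
        later = [est for _, est in cumulative[i + 1:]]
        if all(abs(est - estimate) <= tolerance for est in later):
            return k
    return None  # only reached when `cumulative` is empty


# Made-up data: (call attempts needed, reported number of live births).
cases = [(1, 0), (1, 1), (2, 1), (2, 2), (3, 3),
         (3, 2), (4, 2), (5, 1), (6, 2)]

cum = cumulative_means_by_attempt(cases)
stable = first_stable_attempt(cum, tolerance=0.1)

for k, estimate in cum:
    print(f"attempts <= {k}: cumulative mean = {estimate:.3f}")
print("estimate stable from attempt:", stable)
```

In practice the choice of tolerance, and whether stability is judged on means, proportions, or model coefficients, would depend on the estimates of greatest interest to the study.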
Of course, some differences that appear statistically significant may be the result of random chance rather than systematic processes. Further advances in hypotheses regarding the mechanisms producing specific differences will be needed to adjudicate this possibility.
Note that more sophisticated approaches to estimation of the relationship between religious service attendance and sexual partnerships use longitudinal data, not cross-sectional data as in the NSFG. This is because there is known reciprocal causation between religious service attendance and sexual partnerships, in which religious service attendance affects subsequent sexual partnering behaviors but sexual partnering behaviors also affect subsequent religious service attendance (Thornton et al. 1992). We do not use such an approach here because our aim is to evaluate how responsive survey design may affect model estimates in typical uses of NSFG data, and NSFG data are used by some analysts for this purpose.
In fact, when the size of the responsive phase sample is small relative to the size of the main phase sample, estimates based on the pooled sample will rarely differ greatly from estimates based on the main phase only. However, as we argue later, responsive design is becoming a more common feature of survey data collection and is likely to shape a higher proportion of cases in the future, so that these differences are likely to become an increasingly common feature of demographic analyses based on survey data.
We also examined models of ever having biologically fathered or given birth to a child, estimated both with logistic regression and with the dependent variable modeled as the hazard of first birth. This was the only case where, at least among women, there were no significant differences in coefficient estimates between the main phase sample and the responsive phase sample. Among men, the effect of being foreign-born is markedly stronger among the responsive phase sample than among the main phase sample (an odds ratio of 4.93 among the responsive phase sample compared with an insignificant effect among the main phase sample).