Most survey data on sexual activities are obtained via face-to-face interviews, which are prone to misreporting of socially unacceptable behaviors. Demographers have developed various private response methods to minimize social desirability bias and improve the quality of reporting; however, these methods often limit the complexity of information collected. We designed a life history calendar, the Relationship History Calendar (RHC), to increase the scope of data collected on sexual relationships and behavior while enhancing their quality. The RHC records detailed, 10-year retrospective information on sexual relationship histories. The structure and interview procedure draw on qualitative techniques, which could reduce social desirability bias. We compare the quality of data collected with the RHC with that of data collected with a standard face-to-face survey instrument through a field experiment conducted among 1,275 youth in Kisumu, Kenya. The results suggest that the RHC reduces social desirability bias and improves reporting on multiple measures, including higher reported rates of abstinence among males and of multiple recent sexual partnerships among females. The RHC fosters higher levels of rapport and respondent enjoyment, which appear to be the mechanisms through which social desirability bias is minimized. The RHC is an excellent alternative to private response methods and could potentially be adapted for large-scale surveys.
In the last 25 years, the HIV/AIDS epidemic has challenged researchers across the globe to explain the patterns, determinants, and consequences of its spread. Population scientists have made significant contributions to this work using demographic methods of data collection and analysis. They have monitored trends, projected the course of the epidemic, and demonstrated its consequences for the family and society (Bongaarts et al. 2008; Case et al. 2004; Heuveline 2003; Madhavan et al. 2009; Merli et al. 2006). Demographers have also made important inroads into understanding the determinants of HIV infection, including sexual behavior, which is the primary route of transmission in sub-Saharan Africa (Cleland et al. 2004; Morris and Kretzschmar 1997; Orubuloye et al. 1997).
Most population-based data on sexual behavior are obtained via self-reports through structured, face-to-face interviews, such as those used in the Demographic and Health Surveys (DHS). Questions have been raised regarding the quality of sexual behavior information collected through this method, however (Cleland et al. 2004). The main type of measurement error associated with self-reports of sensitive topics like sexual behavior is social desirability bias, which occurs when the respondent gives an inaccurate response in order to conceal information that is considered socially unacceptable (Gregson et al. 2002). This bias tends to result in men overreporting and women underreporting their sexual activities (Nnko et al. 2004). Demographers have been among the most creative in developing new survey approaches that aim to minimize social desirability bias by allowing respondents to record their responses privately with computer-assisted interviewing and other self-administered procedures (Gregson et al. 2002; Jaya et al. 2008; Lindstrom et al. 2010; Mensch et al. 2003). Overall, evaluations of these methods show improvements in reporting compared with face-to-face interviews, although the results are not always consistent. Moreover, the self-administered nature of private response methods places limits on the complexity of information that can be elicited. In order to more fully explain patterns of sexual behavior, how they change over time, and their linkages to HIV infection and other reproductive outcomes, detailed and high-quality data on sexual relationships and behavior are required. This is particularly relevant for youth, who experience multiple and often rapid sexual and reproductive transitions, which are accompanied by some of the poorest associated health outcomes.
As an alternative data-collection strategy, we designed a life history calendar—the Relationship History Calendar (RHC)—to increase the scope of data collected on sexual relationships and behavior while enhancing their quality. Demographers have used life history calendars to gather retrospective information on the contextual and dynamic aspects of the life course, including birth, marriage, contraceptive use, migration, schooling, and employment histories from diverse populations around the world (e.g., Axinn et al. 1999; Curtis and Blanc 1997; Freedman et al. 1988; Goldman et al. 1989; Leridon 1990; White et al. 2008). Life history calendars are designed to help respondents accurately report the timing of past events, and multiple evaluations have found that calendars reduce recall error and significantly increase data reliability (Belli and Callegaro 2009; Belli et al. 2007; Caspi et al. 1996; Freedman et al. 1988; Goldman et al. 1989; Smith 2009; Strickler et al. 1997). The structure and interview procedure of the life history calendar also draw on qualitative techniques, which could reduce social desirability bias and potentially lead to more truthful reporting as well.
Past surveys, including the DHS, have used calendars to collect information on sexual partnership histories, which generally pertain to marital and cohabiting relationships and childbearing within them (e.g., Ali et al. 2003; Balán et al. 1969; Belli et al. 2007). However, there have been very limited attempts to use calendars to capture the changing nature of relationships and the sexual behaviors of youth, and none have been tested for their ability to decrease social desirability bias with respect to these sensitive topics.
In this article, we describe the design of the RHC and the scope of information on relationship histories it collects. We assess the quality of reporting compared with a standard face-to-face survey questionnaire through a field experiment conducted among youth in Kisumu, Kenya. Kisumu, the capital of Nyanza Province, is the epicenter of a mature HIV/AIDS epidemic in the region. HIV prevalence in the province was estimated at 13.9% in 2008–2009 (KNBS & ICF Macro 2010), and young people (especially young females) are among the most severely affected (Kabiru et al. 2010). In addition, other sexually transmitted infections (STIs) are common among youth in this setting (Buvé et al. 2001; Weiss et al. 2001). These outcomes are a consequence of the relationships that young people enter into and their sexual behaviors within them. Kisumu, therefore, provides an important context in which to implement the RHC and assess its potential to improve the quality of sexual behavior data.
The analysis has three aims. First, we evaluate the RHC for improvements in reporting for sexual behaviors that have been found to be important correlates of HIV infection and other STIs. We expect that, compared with the standard survey instrument, the RHC will reduce social desirability bias, resulting in higher levels of reporting of socially unacceptable sexual behaviors, which vary by sex.1 In particular, we expect that young women interviewed with the RHC will be more forthcoming about their level of sexual activity, including age at first sex, number of sexual partners, and multiple partnerships, whereas young men will be more willing to admit their relative lack of sexual experience with respect to these measures. For both sexes, we expect that reporting of inconsistent condom use will be higher on the RHC. Second, we consider additional explanations for the differences in reporting across instruments, including fatigue and recall error. Finally, we assess differences in levels of rapport, comfort, and enjoyment across instruments to investigate the mechanisms by which the RHC could reduce social desirability bias and improve reporting of sexual behavior among young people.
Methodological Advances in Sexual Behavior Reporting
Multiple surveys across sub-Saharan Africa have found imbalances in male and female reports of sexual partnerships and behaviors within the same population, with males consistently reporting more nonmarital sexual activity and higher numbers of partners than females. Although sampling procedures that miss certain types of high-partnered women, such as sex workers, could account for these differences, evidence suggests that social desirability bias is the main explanation, with men exaggerating and women understating their behaviors in face-to-face interviews (Curtis and Sutherland 2004; Gersovitz et al. 1998; Nnko et al. 2004).
The direction of this systematic reporting bias stems from cultural norms and expectations that vary by sex. In much of sub-Saharan Africa, including Kenya, males garner social prestige from being sexually active and attracting multiple partners (Smith 2007; Watkins et al. 1997). Expectations regarding sexual activity are more restrictive for females. For young women, virginity is stressed by many cultures, religious organizations, and the educational system, and engaging in multiple partnerships threatens one’s reputation (Mensch et al. 2003; Munthali and Zulu 2007; Wight et al. 2006). With respect to condom use, at this stage of the HIV/AIDS epidemic, awareness of the risks of unprotected sex is high among youth, and condom use portrays urban, modern behavior (Smith 2000; Tavory and Swidler 2009). Therefore, both males and females will likely overreport consistent condom use in face-to-face interviews.
Methodological advances in survey research have sought to minimize social desirability bias by removing the interviewer from the process and allowing respondents to record their answers privately. Audio computer-assisted self-interviewing (ACASI), in which respondents listen to a recording of questions via headphones and type their answers directly onto a computer keypad, is state-of-the-art in developed countries and is widely implemented (Anderson et al. 2006; Turner et al. 1998). Demographers have tested its potential in developing-country settings, with mixed results. For example, in comparative trials of ACASI and face-to-face interviews in Kenya, Malawi, and India, ACASI produced some statistically significant differences in reporting of sexual behaviors in unexpected directions for young women, while the results for males were more in line with expectations (Hewitt et al. 2004; Jaya et al. 2008; Mensch et al. 2003, 2008). Researchers conclude that unfamiliarity with the technology and the impersonal nature of computerized interviews may have led to these inconsistent and unanticipated findings.
Low-technology private response methods aim to ensure confidentiality through some form of questionnaire self-administration. Respondents either write their answers privately on precoded forms and submit them into ballot boxes or envelopes or designate their answers in code via response cards. The results of most experiments that compare sexual behavior data obtained through these procedures with face-to-face interview reports are in expected directions, although not always statistically significant (Gregson et al. 2002, 2004; Hanck et al. 2008; Jaya et al. 2008; Lindstrom et al. 2010; Plummer et al. 2004b; Potdar and Koenig 2005).
In spite of these innovations, private response methods are accompanied by several disadvantages. For example, many procedures assume literacy or place cognitive burdens on respondents to read, interpret, write, or type responses themselves; consequently, these often produce the greatest improvements in reporting among higher-educated groups (Hanck et al. 2008; Lindstrom et al. 2010; Mensch et al. 2003; Plummer et al. 2004b; Potdar and Koenig 2005). General problems with self-administered methods include misunderstandings and inconsistency of responses, as there is little opportunity for clarification of questions and cross-checking of answers (Cleland et al. 2004; Wight and West 1999). Furthermore, most of these methods offer limited possibilities to include complex skip patterns, multiple response categories, and extensive lines of questioning. A major drawback of existing surveys, such as the DHS, is that questions are restricted to a few items about recent sexual partnerships and behaviors. Important details about the context of these behaviors and how they change over time within relationships are omitted. Most private response methods do not enhance, and some further limit, the scope of information collected.
The opposite approach to removing the interviewer from the response process is to employ qualitative techniques, which encourage more, rather than less, interaction between the interviewer and respondent. Qualitative methods, such as in-depth interviews and participant observation, reduce social desirability bias in two ways. First, the interview procedure is a longer, conversational experience. The interviewer takes time to demonstrate sufficient interest in and empathy with the respondent, leading to a high degree of trust and rapport (Plummer et al. 2004a; Wight and West 1999). Second, in contrast to structured questionnaires that use scripted questions within defined topic areas, sensitive subjects are discussed in the context in which they occurred. Gathering greater details about individual experiences and their circumstances desensitizes respondents to discussing these topics (Wight and West 1999). In a less judgmental and nonstigmatizing environment, respondents feel more comfortable voicing socially proscribed behavior (Corbin and Morse 2003; Poulin 2010; Tawfik and Watkins 2007). When evaluated in comparison with data gathered from face-to-face questionnaires, qualitative techniques elicit very comprehensive, high-quality data (Plummer et al. 2004a; Wight and West 1999). Nevertheless, they are particularly time-consuming and difficult to make representative, both in terms of sampling and consistency of information, and are therefore usually infeasible for large samples (Belli and Callegaro 2009; Cleland et al. 2004). As a hybrid of in-depth and structured interviews, the life history calendar could potentially be used with large samples to obtain detailed, high-quality information on sexual relationships and behaviors.
The Relationship History Calendar
Life history calendars were developed as a means to collect retrospective information on the life course by emphasizing context and change over time (Axinn et al. 1997; Elder et al. 2003). Calendars are generally placed within a standard survey instrument and, through face-to-face interviewing, respondents report detailed information on changes across a variety of life course domains for various reference periods, from several months to many years before the survey (Belli and Callegaro 2009). The RHC is designed to gather retrospective information on the romantic and sexual relationships and other important life course domains of youth for the 10 years before the survey. We chose a reference period of 10 years so we could gather full relationship histories for most of the young adults ages 18–24 in our sample. In addition, we included not only sexual relationships but those that are nonsexual (which we term “romantic”). Nonsexual relationships are ignored in most surveys in sub-Saharan Africa, although these relationships may be particularly prevalent among young people (Bankole et al. 2007), and analysis of their dynamics could provide useful information about the transition into first and subsequent sexual experiences.
The RHC is a fold-out grid with units of time in months and years noted across the top of the grid. Life domains, such as schooling and relationships, are represented as timelines that extend across the 10-year reference period. The RHC records information in monthly intervals, as opposed to years, because many relationships survive for less than one year, and we wanted to elicit changes in relationship dimensions and behaviors over the course of each relationship. Figure 1 shows a truncated version of the RHC.2
The top portion of the RHC records information on life course domains that are particularly significant for the transition to adulthood, including residence, schooling, and employment histories, and, for female respondents, their pregnancy and birth histories. The bottom portion records detailed information on each romantic and sexual relationship. Here, our particular interest was to collect relationship, or couple-level, measures. The relationship is an important context in which sexual decisions are negotiated and enacted, and this level of analysis tends to be overlooked by many researchers (Giordano 2003; Manlove et al. 2007).
Within each relationship, the RHC records partner characteristics, including ethnicity and highest level of schooling, as well as attributes that vary over time, such as the partner’s year in school and residence. The RHC also records relationship dimensions that are likely to influence sexual behaviors among youth, including relationship type, duration, emotional attachment, aspirations for marriage, and exchanges of money and gifts between partners. Finally, the RHC elicits information on sexual and reproductive behaviors, including details on coital frequency, consistency of condom use, and type of contraceptives used. For male respondents, information on the pregnancies and births of each of their partners is also recorded in the relationship history section.3
Across all domains of the RHC, information is filled in by month of occurrence according to precoded responses, and a line is drawn to indicate the number of months in which that state, characteristic, or behavior continued, thus producing time-varying information. To accurately report the occurrence and timing of each relationship, respondents reference the dates of public and personal events known as “landmarks” (such as national elections or the death of a parent) as well as the timing of their schooling, migration, and other relationship trajectories.4 The flexible interview procedure also allows for cross-checking to resolve inconsistencies in event dating and sequencing. In addition to facilitating recall, these procedures could lead to more honest reporting; sustaining consistently socially desirable responses could be difficult for respondents, given the amount of detail elicited on each relationship and the implicit corroboration between events across multiple domains (Balán et al. 1969).
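The month-by-month recording described above can be pictured as a small data structure. The following Python sketch is only an illustration, not the study's actual data-entry system: the `Spell` class, the domain labels, and the response codes are hypothetical names chosen for this example. It expands drawn lines (spells) into a 120-month timeline and rejects overlapping entries within a domain, loosely mirroring the cross-checking that the interview procedure allows.

```python
from dataclasses import dataclass

MONTHS = 120  # 10-year reference period recorded in monthly intervals


@dataclass
class Spell:
    """One drawn line on the calendar: a precoded state held over a run of months."""
    domain: str  # hypothetical domain label, e.g. "schooling"
    code: str    # hypothetical precoded response
    start: int   # first month index (0 = earliest month on the grid)
    end: int     # last month index, inclusive


def expand(spells, domain, months=MONTHS):
    """Turn the spells of one domain into a month-by-month list (None = nothing recorded)."""
    timeline = [None] * months
    for s in spells:
        if s.domain != domain:
            continue
        if not (0 <= s.start <= s.end < months):
            raise ValueError(f"spell outside reference period: {s}")
        for m in range(s.start, s.end + 1):
            if timeline[m] is not None:
                raise ValueError(f"overlapping spells in {domain!r} at month {m}")
            timeline[m] = s.code
    return timeline


# Illustrative entries only; these are not study data.
spells = [
    Spell("schooling", "secondary", 0, 35),
    Spell("schooling", "none", 36, 119),
    Spell("condom_use", "always", 100, 104),
    Spell("condom_use", "sometimes", 105, 110),
]
schooling = expand(spells, "schooling")
condom = expand(spells, "condom_use")
```

In a fuller version, each relationship would carry its own set of timelines, so that time-varying measures such as condom use are stored per partner rather than per respondent.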
Because the RHC interview is akin to an in-depth interview, it could also reduce social desirability bias in ways similar to qualitative techniques. Interviewers are trained to take time to develop significant rapport before beginning the RHC and broaching the topic of romantic and sexual relationships. The RHC interview is flexible and conversational in nature. The order of questions is left up to the trained interviewer, with life course domains and calendar months helping to structure the questions (Axinn et al. 1999; Belli et al. 2001; Freedman et al. 1988). Respondents’ relationship histories are discussed in the sequence and detail with which respondents feel most comfortable. The structure of the questioning also minimizes the potential embarrassment of sensitive questions about sexual behavior by embedding them within the more innocuous context of young people’s relationships as well as in conjunction with life domains of schooling, work, and residence. Studies have found that calendar interviews are interesting and enjoyable experiences, which increases respondent motivation (Belli and Callegaro 2009; Belli et al. 2007; Dijkstra et al. 2009; Freedman et al. 1988).
Survey Instruments and Measurements
The objective of our analysis is to compare the quality of reporting on sexual behavior in the RHC with data collected via the conventional face-to-face approach used in the DHS and other surveys. We employ an experimental design and randomly assign respondents to receive either the RHC or a modified DHS questionnaire (described below). With respect to the conventional approach, the DHS and similar questionnaires ask respondents to report their age at first sexual intercourse (in years only) and the total number of sexual partners they had in their lifetimes and/or the last year. Subsequently, respondents are asked for a limited amount of detail on the characteristics of and sexual behaviors within several of their partnerships, usually the first sexual relationship and, as in the 2003 Kenya DHS, up to three relationships in the last year. On the DHS, questions on sexual activities are limited to condom use at first sex for the first sexual partner and condom use at last sex for partners in the last year (Central Bureau of Statistics et al. 2004). Questions are scripted and ordered, and each sexual relationship is discussed one at a time.
We base our benchmark survey instrument, the Sexual Partnership Questionnaire (SPQ), on the Kenya 2003 DHS questionnaire with some modifications to allow for comparison with the RHC. The SPQ begins with questions about age at first sex and the number of lifetime sexual partners and sexual partners in the last year. For all sexual relationships in the last year, the SPQ elicits details regarding partner characteristics and relationship dimensions for the first month and last month of the relationship, as well as details on sexual activities within the first month of sex and last month of the relationship. There are several additional questions about the respondent’s first sexual relationship if it did not take place in the last year.
The RHC and SPQ instruments begin with an identical introductory section, which consists of scripted questions on demographic characteristics of respondents, including age, education, marital status, ethnicity, place of birth (rural/urban), and household economic status.5 All of these background measures are used as control variables in the analysis.
After the details of each romantic and sexual relationship are recorded, the RHC instrument includes a follow-up section in which interviewers probe to ensure complete reporting of all relationships in the last year and the last 10 years. If not all relationships were recorded in detail, respondents are asked to provide the actual number of partners they had and the reason why they did not discuss them. Of all RHC respondents, 8.7% did not provide full details of all relationships in the last 10 years, the main reasons being that respondents were not comfortable discussing them (38%), could not remember them in detail (21%), and did not have enough time to complete the interview (21%). It is important to consider these explanations as possible reasons for differences in reporting of the numbers of sexual partners across instruments. Respondents are also asked to report the total number of sexual partnerships in their lifetimes on the RHC follow-up. Finally, there are several additional questions about the respondent’s first sexual relationship if it did not take place in the last 10 years.6
The RHC and SPQ instruments produce the same set of variables on sexual behavior, collected in two different ways, which allows us to make comparisons across instruments. The outcomes that we consider include measures of sexual activity over the course of a respondent’s lifetime and recently (in the last year). Age at first sex is reported directly on the SPQ and is calculated as the respondent’s age in years in the first month of sexual intercourse across all relationships on the RHC. The number of sexual partners in the lifetime and last year are reported directly on the SPQ and are obtained from the RHC follow-up. We use a dichotomous variable to record whether respondents had at least one sexual partner in their lifetimes or last year or did not engage in sex during these time frames. Multiple sexual relationships are tallied from the number of sexual partners in the lifetime and in the last year. We construct a dichotomous variable for inconsistent use of condoms, coded 1 if a respondent did not always use condoms in the first month of sexual intercourse in all relationships in the last year, and 0 otherwise.
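The variable definitions in this paragraph can be sketched in a few lines of Python. This is a hedged illustration, not the authors' code, and it rests on several assumptions that the text does not confirm: that a birth month is available for computing age, that "multiple" means two or more partners, that the condom-use indicator equals 1 whenever any relationship in the last year has a first-month response other than "always", and that respondents with no relationships in the last year are left uncoded (`None`). All function names and response codes are hypothetical.

```python
def age_at_first_sex(birth_year, birth_month, sex_year, sex_month):
    """Completed age in years during the first month of sexual intercourse.

    With month-level data only, the birthday is treated as reached in the
    birth month itself (an assumption of this sketch).
    """
    age = sex_year - birth_year
    if sex_month < birth_month:  # birthday not yet reached in that year
        age -= 1
    return age


def ever_had_sex(n_partners):
    """1 if at least one sexual partner in the time frame, else 0."""
    return int(n_partners >= 1)


def multiple_partners(n_partners):
    """1 if two or more sexual partners in the time frame (assumed cutoff), else 0."""
    return int(n_partners >= 2)


def inconsistent_condom_use(first_month_codes):
    """1 unless condoms were 'always' used in the first month of sex in every
    relationship in the last year; None if no relationships are listed."""
    if not first_month_codes:
        return None
    return int(any(code != "always" for code in first_month_codes))
```

For example, under these assumptions `inconsistent_condom_use(["always", "sometimes"])` evaluates to 1, and `multiple_partners(1)` evaluates to 0.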
We compare the interview experience across instruments using information from a short exit interview, which was identical for the RHC and SPQ. The exit interview elicited information about respondents’ comfort level discussing their sexual relationship histories, the overall enjoyment of the interview, and acceptability of the time taken for the interview. We also elicited information on the interviewers’ assessments of each respondent’s experience. Respondents might feel pressure to offer positive reviews, and therefore interviewer assessments might be more objective. Conversely, reports of respondent displeasure may reflect poorly on interviewers’ performance, and therefore interviewers might actually give more positive reports. Gathering details on perceptions of both respondents and interviewers allows us to assess the consistency between reports. In addition, interviewers were asked to assess the level of rapport that was achieved with the respondent. Each of these measures was assessed on a three-point Likert scale.
The RHC was pretested in peri-urban areas outside Kisumu in February 2007, and the field experiment was conducted in Kisumu in June and July 2007. The sample includes 1,275 18- to 24-year-olds. Enumeration areas (EAs) mapped by the Government of Kenya’s Central Bureau of Statistics were used as primary sampling units. Of the urban EAs, 45 were randomly chosen for the survey, and every other household in each EA was selected for inclusion in the study. One eligible respondent was selected randomly from each household. Randomization of the survey instruments occurred at the interviewer level: each interviewer alternated between the RHC and the SPQ across the successive respondents whom he or she interviewed. Each respondent was compensated 200 Kenyan shillings (US $2.80) for the interview regardless of instrument type.7 The survey team was particularly concerned with maintaining a high response rate and, therefore, attempted to contact selected respondents at least three times. The overall response rate for the study was 94.9%, with no significant differences by sex of the respondent or instrument type.8
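The sampling and assignment steps in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the fieldwork protocol itself: the size of the urban EA frame (200), the starting household for the every-other-household rule, the instrument given to each interviewer's first respondent, and all identifiers are hypothetical, since the text does not specify them.

```python
import random

rng = random.Random(2007)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: the source does not give the number of urban EAs.
urban_eas = [f"EA{i:03d}" for i in range(200)]
chosen_eas = rng.sample(urban_eas, 45)  # 45 EAs chosen at random, without replacement


def select_households(households):
    """Take every other household, starting with the first (assumed start)."""
    return households[::2]


def pick_respondent(eligible_members, rng):
    """Choose one eligible household member (ages 18-24) at random."""
    return rng.choice(eligible_members)


def assign_instruments(respondents, first="RHC"):
    """Alternate instruments across one interviewer's successive respondents."""
    order = ("RHC", "SPQ") if first == "RHC" else ("SPQ", "RHC")
    return [(r, order[i % 2]) for i, r in enumerate(respondents)]


households = [f"HH{i}" for i in range(10)]
sampled = select_households(households)
assignments = assign_instruments(["r1", "r2", "r3", "r4"])
```

Alternating by interview order within interviewer keeps each interviewer's caseload roughly balanced across instruments, which is what allows interviewer effects to be separated from instrument effects in the later regressions.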
For the field experiment, 10 interviewers—five women and five men—were hired, with all but one in the age range of respondents. They were trained to administer both the RHC and the SPQ using detailed questionnaire manuals. We were particularly concerned that respondents tally and describe all types of romantic and sexual relationships for the RHC and all types of sexual relationships for the SPQ, and these aspects were stressed. In addition, because the RHC does not use scripted questions, its manual gave examples of how to ask each question and probe for consistent responses. Finally, general interviewing exercises were practiced, as well as rapport-building techniques that could be used in the more conversational RHC interviews. Training took place for eight days, which included practice RHC and SPQ interviews in the nonsample area.
At the end of each day, a group of checkers, including the principal investigators, went over all completed RHC and SPQ questionnaires to look for missing or inconsistent responses. If these issues could not be readily resolved by the interviewer, the interviewer returned to the respondent the next day for clarification. While such detailed checking occurs in many survey projects, it was particularly useful to ensure accurate recording of the information in the RHC questionnaires. This process was time-consuming for the RHC questionnaires, particularly at the beginning of fieldwork when interviewers were refining their skills.
Finally, because both instruments were administered in face-to-face interviews, two issues regarding interviewer effects are particularly relevant. First, while our objective for training interviewers to administer both instruments was to minimize interviewer effects, this strategy could result in contamination of interviewing styles across instruments. In particular, differences in social desirability bias across instruments could be reduced if RHC rapport-building techniques were also applied in SPQ interviews. This contamination would result in conservative estimates of the differences between reporting across the RHC and SPQ, and this possibility should be taken into account when interpreting the results. To control for other potential interviewer effects, we include a full set of interviewer dummy variables in the regressions.9 Second, survey researchers are often concerned that young females will not disclose their sexual activities to male interviewers (McCombie and Anarfi 2002). Therefore, we initially aimed to match interviewers and respondents by sex. This was difficult in the field, however, and some respondents were interviewed by opposite-sex interviewers.10 Because the sex of the interviewer is associated with the sex of respondents and may also affect reporting of sexual behaviors, we estimate a separate set of regressions in which the sex of the interviewer replaces the interviewer dummy variables as a control variable, and we report those findings separately.
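To make the interviewer controls concrete, the sketch below builds indicator (dummy) columns from a list of interviewer identifiers. It is an illustration only: the interviewer IDs are hypothetical, and the choice to omit one interviewer as the reference category (the usual practice when a regression includes an intercept) is an assumption, since the text does not describe how the dummy set was specified.

```python
def interviewer_dummies(interviewer_ids, reference=None):
    """Return (column_names, rows) of 0/1 indicators, omitting a reference category."""
    if not interviewer_ids:
        raise ValueError("no interviews supplied")
    levels = sorted(set(interviewer_ids))
    if reference is None:
        reference = levels[0]  # assumed default: first ID in sorted order
    if reference not in levels:
        raise ValueError(f"unknown reference interviewer: {reference!r}")
    kept = [lv for lv in levels if lv != reference]
    rows = [[int(iid == lv) for lv in kept] for iid in interviewer_ids]
    return kept, rows


# Ten interviewers, as in the field experiment; the IDs themselves are hypothetical.
ids = [f"int{k:02d}" for k in range(1, 11)]
interviews = ids + ["int03", "int07"]  # one row per completed interview
columns, design = interviewer_dummies(interviews)
```

With 10 interviewers this yields 9 indicator columns. Replacing them with a single interviewer-sex indicator, as in the separate regressions described above, trades finer control over individual interviewer styles for an estimate of the interviewer-sex effect itself, which the full dummy set would absorb.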
Descriptive statistics of the characteristics of our sample by sex of respondent and instrument type are shown in Table 1. Overall, we observe few significant differences between the RHC and SPQ samples, which suggests that randomization by instrument type was successful. The exceptions are age and place of birth: RHC respondents are significantly older than SPQ respondents among both males and females (females at the .10 level), and there is a marginally significant difference in place of birth for male respondents. Finally, we note that more SPQs than RHCs were administered among females, although this does not appear to produce systematic differences in observed characteristics by instrument.11
Expectations Regarding Misreporting by Sex
We expect social desirability bias and resultant misreporting to be greatest for those behaviors that are deemed least socially acceptable, and for these behaviors to vary by sex. For young males, being sexually active, engaging with multiple partners, and using condoms are socially acceptable, and respondents who satisfy these expectations have little incentive to misreport. In contrast, those who have had no or only a few sexual partners or who have unprotected sex should be the most likely to misrepresent their activities. Therefore, if the RHC reduces social desirability bias, we expect to see higher levels of reporting among males in the left tails of the distributions of lifetime and recent sexual partners compared with the SPQ.12 Conversely, there should be no difference in reporting across instruments in the upper tails of the distributions. This also implies that, on average, males will report fewer lifetime and recent sexual partners on the RHC than on the SPQ. The RHC should also elicit a higher percentage reporting inconsistent condom use and older ages at first sex compared with the SPQ.
While abstinence was traditionally the norm for young women, participating in a monogamous sexual relationship no longer appears to be socially unacceptable in Kisumu.13 Therefore, females who are sexually abstinent or in monogamous relationships should be least likely to misreport their behavior, and it follows that there should be no differences across instruments in the left tails of the distributions of reports of ever having had sex or having had sex in the last year. In contrast, social desirability bias should be most manifest and the differences across instruments largest in the right tails, such that female RHC respondents should report greater numbers of lifetime and recent sexual partners on average than SPQ respondents. It is also socially proscribed for young women to use condoms inconsistently and to initiate sexual activity at very young ages. Therefore, the RHC should encourage higher levels of reporting of inconsistent condom use and younger ages at first sex than the SPQ.
Aggregate Differences in Sexual Behavior
The first part of our analysis examines aggregate differences in sexual behaviors reported on the RHC and SPQ by sex. The results of significance tests along with p values are reported in Table 2. For all the lifetime measures of sexual activity for males, the RHC figures are statistically significantly lower than the SPQ figures, as expected, with the exception of age at first sex, which does not differ across instruments. Reports of ever having had sex are five percentage points lower among males interviewed with the RHC, who also report one fewer lifetime sexual partner on average than males interviewed with the SPQ. Reports of multiple lifetime partnerships are almost 10 percentage points lower on the RHC. With respect to measures of recent sexual activity, the percentage of male RHC respondents who report having been sexually active in the last year (69.7%) is significantly lower than that of SPQ respondents (81.6%), again in the expected direction. For the remainder of recent measures, there are no statistically significant differences across instruments. On the whole, half the comparisons for young men are consistent with our hypotheses.
Among young women, there are no statistically significant differences across the RHC and SPQ in the percentages having engaged in sexual activity in their lifetimes or in the past year, as expected. Respondents who were interviewed with the RHC report lower ages at first sex than the SPQ respondents (marginally significant), as expected. There are no significant differences across instrument type in the mean number of lifetime sexual partners or the percentage with multiple lifetime sexual partners. The remainder of the measures of recent sexual activity show significant differences in the expected directions. The magnitude of the difference in mean number of partners in the last year is not very substantial; however, the percentage reporting multiple partners on the RHC (13.5%) is almost three times as large as the SPQ figure (5.0%). Finally, reporting of inconsistent condom use is marginally significantly higher for female respondents on the RHC than on the SPQ. Thus, for young women, over half the comparisons are consistent with our hypotheses.
These findings suggest that the RHC improves reporting on aspects of sexual behavior where social desirability bias is most pronounced. Furthermore, the RHC did not produce estimates that differed significantly in unexpected directions for any measure that we assessed. However, despite randomization, several background characteristics differed significantly across RHC and SPQ samples, and these could also be correlated with sexual behavior outcomes. To account for these differences across samples, we conducted regression analyses of the sexual behavior measures in Table 2 with instrument type as the main independent variable (RHC = 1; SPQ = 0) and included controls for each of the background characteristics in Table 1. An ordinary least squares regression was carried out for age at first sex, negative binomial regressions for the number of lifetime and recent partners, and logit regressions for all dichotomous outcomes. Table 3 presents the coefficients on the RHC dummy variable for two specifications for each measure: first, the bivariate relationship, and second, adjusting with control variables.14
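To make the bivariate specification concrete: for a dichotomous outcome, the logit coefficient on a single dummy regressor (here RHC = 1; SPQ = 0) equals the log odds ratio from the corresponding 2 × 2 table. The sketch below illustrates only that identity. The function name and the counts are hypothetical and are not taken from the study's tables; the adjusted specification with controls would require a full regression package.

```python
import math

def log_odds_ratio(yes_rhc, n_rhc, yes_spq, n_spq):
    """Log odds ratio of a dichotomous outcome, RHC versus SPQ.

    In a logit model with one dummy regressor and an intercept, the
    maximum likelihood coefficient on the dummy equals this quantity.
    """
    odds_rhc = yes_rhc / (n_rhc - yes_rhc)
    odds_spq = yes_spq / (n_spq - yes_spq)
    return math.log(odds_rhc / odds_spq)

# Hypothetical counts for illustration only (not the study's data):
# 27 of 200 RHC respondents and 10 of 200 SPQ respondents report the outcome.
b = log_odds_ratio(27, 200, 10, 200)
odds_ratio = math.exp(b)
```

A positive coefficient indicates that the outcome is reported more often on the RHC than on the SPQ; exponentiating it recovers the odds ratio.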
The findings in Table 3 largely corroborate those in Table 2. For males, the coefficients on the RHC dummy variable remain stable with the addition of the controls, and the same measures retain their significance.15 For females, the coefficients become larger and reach significance for the regressions of age at first sex, the number of partners in the last year (marginally significant), and inconsistent condom use after controlling for background characteristics. The coefficient for multiple partners in the last year is highly significant in both regressions. In addition, the differences across instruments with respect to the likelihood of having had sex ever or in the past year remain insignificant, as expected. Thus, once we control for observed characteristics, the results are at least as strong as the unconditional estimates (especially for young women). It is also interesting to note that, given our concern about females underreporting their sexual activities to male interviewers in particular, we find that sex of the interviewer does not have a large or significant effect for any of the sexual behavior outcomes among females, although there are a few differences for males (not shown in the table).16
Sexual Partner Distributions
Thus far, we have attributed differences in reporting of the aggregate measures of sexual behavior to decreases in social desirability bias produced by the RHC. There could be additional explanations for the observed differences across instrument type, including interview fatigue and recall error. In an attempt to disentangle the separate effects of fatigue, recall error, and social desirability bias, we explore the distributions of the number of lifetime and recent sexual partners reported across instruments.
While a major benefit of the RHC is the scope of data collected on sexual relationships and behaviors, eliciting such detailed reports could lead to respondent fatigue. Indeed, we find that RHC interviews lasted longer (57 min, on average) than SPQ interviews (33 min). After providing extensive information on a few sexual partners, RHC respondents could become “test wise” and conceal subsequent partners to shorten the remainder of the interview (Hart et al. 2005). Longer interviews could also fatigue interviewers, who could be motivated to record fewer relationships on the RHC to decrease their workload (Cleland et al. 2004). Interview fatigue, whether on the part of the respondent or the interviewer, would therefore result in underreports of the total number of lifetime or recent sexual partners.
Recall error is also an issue for any type of retrospective reporting (Smith and Thomas 2003). As noted, the RHC tries to reduce this error by helping respondents accurately recall the occurrence and timing of past relationships. If this type of error is reduced, respondents interviewed with the RHC should provide more precise figures regarding the numbers of lifetime and recent sexual partners, although it is not clear whether recall systematically biases reports in one direction or the other or differentially for males and females.
If the differences in the mean number of reported sexual partners across instruments were primarily driven by interview fatigue or recall error, we would expect to find the largest differences in the right tails of the distributions because of the greater burden of reporting, and among males, who have the largest numbers of sexual partners to discuss (Morris 1993).17 If fatigue dominated, this would result in lower reports on the RHC for both males and females compared with the SPQ, while a reduction in recall error with the RHC could produce differential reporting in either direction. In contrast, if differences in the mean number of sexual partners were being driven by reductions in social desirability bias, we would expect to see lower reporting on the RHC compared with the SPQ in the left tails of the distributions for males and higher levels of reporting on the RHC in the right tails for females, as noted earlier. Importantly, fatigue and recall error are less problematic in the left tails of the distributions because providing details about a very small number of partnerships is generally not tiresome and remembering a few salient relationships is relatively straightforward.
We investigate these potential types of misreporting in Table 4 and in Figs. 2 and 3, which present cross-tabulations and histograms of the number of recent and lifetime sexual partners for males and females by instrument type. Chi-square tests are used to compare the percentage of respondents with sexual partners in both tails of the distributions. Designations for the right and left tails are not obvious; therefore, we draw cutoffs for the right tails that are large enough to carry out statistical comparisons across instruments (at least five observations) but are at the same time small enough to be considered a tail (less than 15% of the cases). For males, 10+ lifetime partners satisfies these criteria and 15+ partners is included as a more stringent cutoff; 5+ partners is used as the cutoff for partners in the last year. For females, 5+ lifetime partners and 2+ partners in the last year are used as cutoffs. Because the left tails are bounded at zero, we show percentages of zero, one, and two sexual partners for each measure, except for recent partners for females, where zero and one partner are shown for the left tail.
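The cutoff rule and the tail comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run for Table 4: the function names and all data are hypothetical, the five-observation minimum is applied to a single sample, and the chi-square statistic is the Pearson version without continuity correction (the text does not say which variant was used).

```python
import math

def is_valid_right_tail(counts, cutoff, min_obs=5, max_share=0.15):
    """Check the two stated criteria for a right-tail cutoff.

    counts: reported numbers of partners, one entry per respondent.
    The tail (counts >= cutoff) must hold at least min_obs respondents
    and make up less than max_share of all cases.
    """
    n_tail = sum(1 for c in counts if c >= cutoff)
    return n_tail >= min_obs and n_tail / len(counts) < max_share

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Rows are instruments; columns are in-tail versus not-in-tail.
    Returns (statistic, p value) with 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical lifetime partner counts for 100 respondents (illustration only).
sample = [0] * 30 + [1] * 40 + [2] * 20 + [5] * 6 + [12] * 4
ok_at_5 = is_valid_right_tail(sample, 5)    # 10 cases, 10% of the sample
ok_at_10 = is_valid_right_tail(sample, 10)  # only 4 cases, below the minimum

# Hypothetical tail comparison: 27 of 200 versus 10 of 200 respondents in the tail.
stat, p_value = chi_square_2x2(27, 173, 10, 190)
```

Checking both criteria before testing avoids comparing cells that are too sparse for the chi-square approximation or so large that they no longer describe a tail.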
Focusing on the right tails of the distributions for males, we find no statistically significant differences in reports of 10+ and 15+ lifetime partners as well as 5+ partners in the last year across the RHC and SPQ. A comparison of the left tails finds that male RHC respondents report higher percentages of limited sexual activity (no, one, or two lifetime partners and no recent partners) than SPQ respondents, and these differences are statistically significant at each level (with the exception of one lifetime partner). These findings support the argument that differences in the percentage reporting any sexual activity and the mean number of sexual partners across instruments are being primarily driven by a reduction in social desirability bias rather than by interview fatigue or recall error. The striking difference in the percentages reporting no partners versus one partner in the last year—where the RHC elicits higher reporting of no partners and lower reporting of one partner compared with the SPQ—is particularly noteworthy. A plausible interpretation is that males overreport having one sexual partner on the SPQ in order not to appear recently sexually inactive, while they are more truthful in disclosing their abstinence on the RHC. It is also easier to fabricate a sexual partnership on the SPQ because there are fewer details to report.
Turning to female reports, we find no statistically significant differences in the right tail of lifetime sexual partners but significantly higher percentages of 2+ partners in the last year reported on the RHC than on the SPQ, as shown in Table 2. These results support the view that social desirability bias has been decreased with the RHC for recent but not lifetime partner reports. The difference in recent reports could also be attributed to more precise recall of the number of partners with the RHC; however, given the small numbers overall, we expect they are not difficult to remember. There are no significant differences across instruments in the left tails of the distributions, as expected. Overall, the results in Table 4 and Figs. 2 and 3 show that the greatest differences in reporting across instruments appear in the tails of the distributions where social desirability was hypothesized to have the greatest impact, while fatigue and recall error do not appear to play major roles in explaining these differences.18
Taken together, the results in Tables 2 through 4 suggest that the RHC reduces social desirability bias and increases respondents’ willingness to report multiple types of socially unacceptable sexual behaviors. It is likely that the RHC interview fosters significant rapport between interviewer and respondent and creates a comfortable, enjoyable environment in which to disclose these activities. We explore these possibilities in the next section.
Exit Interview Findings
In the final part of our analysis, we present perceptions of the interview experience by respondents and interviewers by instrument type. Results are shown in Table 5. A great majority of respondents (approximately 80%) report feeling very comfortable discussing their sexual behaviors in both RHC and SPQ interviews, and there are no significant differences in comfort levels by instrument type.19 Interviewers also indicate very high levels of comfort among respondents on both instruments, and they report statistically significantly higher levels for male respondents interviewed with the RHC compared with the SPQ. The high levels of comfort reported on both instruments could reflect the particular context of Kisumu, where a history of high HIV prevalence has attracted much research and program attention. Individuals may have become accustomed to openly discussing these issues with investigators or in their social networks.
Approximately 85% of both male and female respondents found the RHC interview very enjoyable, compared with 73% of male and 66% of female SPQ respondents. These differences are highly significant for both sexes. Interviewer perceptions confirm the significantly higher levels of enjoyment experienced by RHC respondents. In terms of rapport, interviewers report that the majority of RHC interviews were characterized by significant rapport, while moderate to no rapport was apparent in the majority of SPQ interviews. These differences are highly significant for both sexes. In addition, we find that interviewers judge these measures of the interview experience to be less positive than do respondents. This suggests that interviewers are less inclined to misrepresent their performance and that respondents may indeed overstate their contentment.
As noted, RHC interviews are longer than SPQ interviews, on average; we were therefore concerned about fatigue. The results in Table 5 show that respondents believe that the duration of RHC interviews is significantly less acceptable than that of SPQ interviews. On this measure, interviewers report generally similar levels of respondent acceptance; however, the difference across instruments is significant only for females. Interestingly, only a small percentage report that the time taken to complete the RHC was not acceptable at all. Some observers have noted that fatigue could have more to do with a lack of respondent interest or rapport than the length of the interview (Gross and Mason 1953). By this definition, RHC interviews could actually be less fatiguing than SPQ interviews despite their longer duration, since the former were much more enjoyable. Interviewer motivation could be affected in a similar manner. Although we did not systematically question interviewers about their own enjoyment or opinion about the length of the interview, they agreed that the RHC was much more enjoyable for them to administer, perhaps because they played a meaningful role in the conversation and direction of the interview, as has been found in other life history calendar projects (Belli and Callegaro 2009; Dijkstra et al. 2009).
Finally, as a robustness check, we conducted logit regression analyses of the exit interview outcomes with instrument type as the main independent variable (RHC = 1; SPQ = 0) and background characteristics included as controls. We constructed dichotomous dependent variables, with the category very/significant coded 1 and somewhat/moderate and not/none coded 0. The results, reported in Table 6, corroborate the significant differences across instruments found in Table 5 and show that, on the whole, the RHC generates significantly greater rapport and respondent enjoyment than the SPQ, while the length of the interview is significantly less acceptable to RHC respondents.
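The dichotomous coding used for these regressions can be sketched as a simple recode. The label strings below are hypothetical stand-ins: the text names the response categories, but the values stored in the data file are an assumption made here for illustration.

```python
# Hypothetical label strings; the coded values in the study's data are not given.
TOP_CATEGORIES = {"very", "significant"}
LOWER_CATEGORIES = {"somewhat", "moderate", "not", "none"}

def dichotomize(response):
    """Code the top category as 1 and the two lower categories as 0."""
    label = response.strip().lower()
    if label in TOP_CATEGORIES:
        return 1
    if label in LOWER_CATEGORIES:
        return 0
    # Fail loudly rather than silently coding an unknown label as 0.
    raise ValueError(f"unrecognized response category: {response!r}")

coded = [dichotomize(r) for r in ["Very", "somewhat", "none", "significant"]]
```

Raising an error on unrecognized labels keeps missing or miscoded responses from being counted in the reference category.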
Although much progress has been made in the measurement and understanding of sexual behavior, big gaps remain, especially the need for high-quality, comprehensive data. We designed the Relationship History Calendar to gather detailed retrospective information on sexual relationships and behavior that cannot be gleaned using existing survey approaches, including standard face-to-face survey questionnaires and many computer-assisted and other self-administered techniques. The rich data collected with the RHC can be tapped to examine the dynamic nature of sexual behaviors by using event-history techniques (Kabiru et al. 2010; Xu et al. 2010), and the relationship can be explored as an important context in multilevel modeling (Clark et al. 2010; Luke et al. forthcoming). The inclusion of time-varying information on other important life course domains, such as migration and schooling, can also be used to investigate the factors driving sexual behaviors of youth.
Gathering high-quality data on sensitive sexual behaviors, particularly among young people, is notoriously difficult using conventional survey approaches. We conducted a methodological experiment in Kisumu, Kenya, to assess the quality of reporting on the RHC compared with a standard face-to-face instrument. The results suggest that the RHC improves reporting on multiple measures of sexual behavior for young men and women. Importantly, in contrast to assessments of ACASI, we did not find significant differences between the RHC and the face-to-face instrument in unexpected directions. After examining fatigue and recall error as alternative explanations for the differences we observe across instruments, we conclude that enhanced reporting is most likely a function of decreases in social desirability bias brought about through administration of the RHC.
Similar to many private response methods, the RHC did not produce significant improvements in reporting on all measures of sexual behavior in comparison with the standard survey approach. It is plausible that the RHC is most successful with respect to behaviors that are deemed least socially acceptable in Kisumu. For young men, the RHC elicited higher reporting of sexual abstinence as well as lower numbers of lifetime sexual partners. Limited or no sexual experience in a setting where the overwhelming majority of young men have had multiple sexual encounters could be most embarrassing, and the RHC may afford an environment in which to discuss such behaviors openly. Young women interviewed with the RHC reported higher numbers of sexual partners and multiple partners in the last year as well as higher levels of inconsistent condom use and lower ages at first sex. In a context where many of their peers are sexually active and approximately one-third are married, abstinence may not be as stigmatizing as engaging in multiple sexual partnerships or initiating sex at earlier ages, and the RHC may increase respondents’ willingness to report these behaviors.
Using exit interview data, we also find that the RHC interview fosters substantially higher levels of rapport and respondent enjoyment than the standard face-to-face interview, and these appear to be the mechanisms through which social desirability bias is reduced. Indeed, although the RHC interview lasts longer than the standard interview, the time taken to build rapport and discuss individuals’ experiences in some depth likely contributes to both respondent and interviewer motivation, resulting in higher-quality reports.
Although the RHC appears to produce rich, high-quality data from young people in Kisumu, further studies should examine the extent to which the Kenyan experience can be generalized to other populations and age groups. As noted, Kisumu is the epicenter of a mature HIV/AIDS epidemic, where numerous studies and interventions have been carried out; consequently, our population could be relatively comfortable discussing these issues. For example, there could be greater differences in comfort levels, and possibly wider differences in reporting, across instruments in other contexts or among younger adolescents, who may be more inhibited. The applicability of the RHC should also be tested in older populations. Older age groups have higher levels of sexual experience, and, combined with generally longer time exposed to the risk of sexual activity, it may be quite time-consuming to collect full sexual histories from them. Future work should also investigate the accuracy of monthly data on sexual behavior and determine the optimal reference period for collecting such retrospective information.
Finally, an important issue to consider for scale-up of the RHC to larger surveys relates to the management of field logistics and quality control measures. Like all face-to-face survey projects, interviewer selection and training are crucial to success, but perhaps more so with the RHC, where it is imperative to have motivated interviewers who build excellent rapport with respondents. In addition, labor and financial inputs, such as close editing of questionnaires or financial compensation for respondents, may not be feasible for all projects, and the necessity of these inputs to the RHC’s success should be assessed. Furthermore, while the length of the RHC interview is not excessive compared with other studies like the DHS, its inclusion does increase the time of the interview, particularly for respondents with many partners. Slight modifications of the RHC format could help defray some of these costs to the research team and the burden on respondents. For example, future projects could collect fewer details on relationship dimensions or life course domains or shorten the reference period to five or three years, which would be easier to adapt to respondents with longer sexual histories. A particular innovation would be to place such truncated relationship histories in repeated waves of longitudinal surveys to create full histories over time. The benefits of the RHC demonstrated in this study, if verified in other settings, appear substantial enough to warrant experimenting with the methodology in large-scale studies, including national surveys carried out by organizations such as the DHS.
Funding for this research was provided by a grant from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (R21-HD 053587), as well as supplementary funding from the Population Studies and Training Center, Department of Sociology, and UTRA at Brown University; the African Population and Health Research Center; and the Population Research Center at the University of Chicago. The authors gratefully acknowledge the role of research team members Caroline Kabiru, Rachel Goldberg, Hongwei Xu, Aidan Jeffery, Rena Otieno, Salome Wawire, Alena Davidoff-Gore, and Rohini Mathur. We thank the data management staff at the African Population and Health Research Center; Michael White, Catherine Andrezjewski, Holly Reed, and Justin Buszin for information regarding calendar design; Hongwei Xu and Sanyu Mojola for research assistance; and Kelley Smith for editorial assistance. We also thank the interviewers and respondents in Kisumu. Rachel Goldberg, Dennis Hogan, Caroline Kabiru, Kaivan Munshi, and Michael White provided helpful feedback on the paper.
There is no “gold standard” against which to compare respondent self-reports of sexual behavior to determine their quality (Catania et al. 1990; Fenton et al. 2001). Therefore, our strategy is to use data collected from a conventional face-to-face survey approach as the benchmark against which to compare reporting on the RHC.
DHS calendars generally collect five-year monthly information on pregnancies/births, contraceptive use and source, reason for contraceptive discontinuation, and marriage/union status. Age (by month) at sexual debut, frequency of sexual intercourse, the type and number of sexual partners, consistency of condom use, and how these characteristics and behaviors vary within each relationship are not recorded (Ali et al. 2003). All of this information, with the exception of contraceptive source and reason for discontinuation, is included on the RHC. Relationship information is included in calendars designed by Yoshihama et al. (2005) and Martyn (2009) as well, although information is collected at yearly intervals and relationship dimensions are fairly limited.
This referencing process is believed to map onto the structure of autobiographical memory to result in higher-quality retrospective reports than standard survey questioning techniques (Anderson and Conway 1993; Belli 1998). For example, filling out calendar information for the period “year in school” could jog the respondent’s specific memory about the first romantic or sexual partner (known as parallel retrieval), and thinking about the first partner may prompt memory about a later partner (sequential retrieval within domain) (Belli et al. 2007).
An index for economic status was constructed from 14 items relating to household assets, housing characteristics, and utilities and infrastructure. Principal components analysis was used to generate standardized weight scores, which were summed to produce index scores. These were then ranked and divided into wealth quintiles.
In the RHC sample, 12.9% of respondents commenced sexual activity prior to the 10-year reference period.
We compensated respondents for the time and effort required to complete the RHC and SPQ interviews. While compensation may in itself be an inducement for more truthful reporting of sexual experiences, funds were applied equally and should not lead to differential reporting across instruments.
These and all other results not shown are available from the first author upon request.
Interviewer effects could arise if some interviewers performed better than others, for example. If these interviewers happened to complete more RHC interviews, then the RHC could produce improved reporting because it reduces social desirability bias and/or because the best interviewers administered it.
Interviewers covered each enumeration area in teams of two to four, each with at least one male and one female interviewer. It was often the case that a female interviewer, for example, was interviewing a respondent when a male team member selected the next eligible respondent, who was female. In this case, the research team proceeded as follows: if a same-sex interviewer was not available, respondents were asked if they felt comfortable talking to an opposite-sex interviewer and, if so, were asked to proceed with the interview. If not, an appointment was made for a same-sex interviewer to return at a later time. In total, 18.6% of female respondents were interviewed by males, and 23.5% of male respondents were interviewed by females.
Given that the principal investigators, study director, and supervisors encouraged interviewers to complete at least four questionnaires per interviewer per day in the field, it appears that when time was available at the end of each day, interviewers—particularly female interviewers—completed an additional SPQ interview.
Although the numbers of lifetime and recent sexual partners are bounded at zero, for ease of exposition, we use the term “left tail” to refer to reports in the lower ends of the distributions.
Table 2 shows that approximately 85% of young women aged 18–24 have initiated sexual activity. In addition, among younger adolescent girls (ages 15–19) in Kisumu, Mensch et al. (2003) estimated that approximately 45% had had sex.
Because a small number of respondents were divorced, separated, or widowed, we constructed a dichotomous measure of ever married as the control variable for marital status.
One alternative explanation for the lower numbers of lifetime sexual partners reported by males on the RHC in Tables 2 and 3 is that, given the great amount of detail being collected on each relationship, male respondents may have been under the impression that the researchers were primarily interested in more serious types of relationships on the RHC or they intended to reveal details about more serious ones only. We stressed during training that all types of partners should be reported on the RHC and SPQ. In addition, we find that males reported more casual and other types of recent sexual partners (one-night stands and commercial sex workers) on the RHC (38.8% had at least one of these types) than on the SPQ (28.0%), which helps rule out this explanation. Females also reported more of these less serious types of partners on the RHC (13.7%) than on the SPQ (11.7%).
For male respondents, we find that the expected number of lifetime partners is 0.09 higher for those interviewed by a male compared with a female interviewer, and the expected number of partners in the last year is 0.15 larger when interviewed by a male, both of which are statistically significant. For male respondents, a male interviewer also increased the odds of reporting ever having sex by 1.9 times compared with a female interviewer, which is significant at the .10 level.
Greater fatigue among male than among female respondents is also supported by the finding that 10 of the 12 respondents who did not report details of all of their relationships in the last 10 years on the RHC because of lack of time were male.
Our conclusion that fatigue and recall error are not major explanations for differential reporting of the actual numbers of sexual partners across instruments does not imply that they do not affect retrospective reporting or that the RHC does not have the potential to reduce these types of error for other measures. The conversational flexibility of the RHC interview combined with the use of effective retrieval cues could motivate and aid respondents to recall more precise timing of relationship transitions and more details of relationship dimensions than the SPQ format, for example (Belli and Callegaro 2009; Belli et al. 2007). Alternatively, fatigue could result in fewer relationship dimensions being reported in detail on the RHC.
Results in Table 5 are similar in terms of significance levels if measures are constructed dichotomously, with the category very/significant coded 1 and somewhat/moderate and not/none coded 0.