How to survey hard-to-reach populations
A recent workshop organised by ISER’s Renee Luthra publicised new developments in Respondent Driven Sampling (RDS), an innovative sampling method that uses social networks to find and survey members of hard-to-reach populations and to obtain quantitative data that is representative of them.
The workshop, sponsored by Norface and the ESRC, brought leading statisticians working on RDS together with survey methodologists and practitioners from a variety of social science disciplines, with a view to encouraging the exchange and dissemination of new findings. These included the following:
- RDS is now used in a variety of sociological and epidemiological settings, to research diverse topics from the prevalence of HIV in Africa to the employment status of immigrants in Sweden. RDS is an efficient, relatively low-cost method to achieve interviews where no sampling frame is present. Most studies that use RDS achieve their desired sample size and cost less than studies using other sampling techniques, such as targeted screening.
- Despite these strengths, RDS estimates rely on a variety of assumptions that are likely to be violated in real studies. A review of 12 RDS studies in the Dominican Republic suggests that central RDS assumptions are frequently violated, with variation across studies in the kind and severity of the violation. The authors develop a variety of diagnostic techniques that can be implemented during data collection to monitor and correct violations while the study is still in the field.
- Recent validation work in Uganda, comparing population estimates drawn from a household census with estimates from RDS, reveals that RDS estimates of HIV prevalence, mean population age, and mean population wealth are biased, and that the weights produced by standard RDS software do not adequately correct for these biases. In a parallel qualitative study, some of the age bias was linked to different cultural definitions of the target population: male heads of household.
- In another validation exercise, which compared the results of an RDS study of Polish immigrants in Oslo and Reykjavik with available registry data on the equivalent population, RDS estimates were found to be fairly accurate measures of gender and family structure but were biased in terms of unemployment. This bias was related to differences in the referral times of employed and unemployed respondents.
- To avoid study failure and poor estimates, formative research is necessary. Formative research consists of multiple methods to assess the appropriateness of RDS for the research question, including qualitative interviews and focus groups with the target population, informational interviews with local organizations, and preliminary testing of the questionnaire and survey materials. The use of formative research helps to identify cleavages in population networks, population understanding of network questions, and the willingness of potential respondents to participate in an RDS study.
- A new statistical software package with improved estimators for RDS data is currently being developed. The new estimators are more robust to violations of RDS assumptions; in particular, the package will allow researchers to model homophily in respondent referral chains.
- The general consensus of both expert presenters and workshop participants was that although RDS is a useful technique for sampling hard-to-reach populations, it is necessary to conduct extensive formative research to assess the appropriateness of the method. Due to potentially large design effects, it is also necessary to exercise caution about the accuracy of the resulting population estimates.
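To illustrate why large design effects call for caution, the sketch below applies the standard survey-sampling definition of a design effect: the ratio of an estimate's variance under the actual design to its variance under simple random sampling with the same sample size. The sample size, proportion, and design effect of 4 are hypothetical values chosen for illustration only; they are not figures reported at the workshop, and the function names are our own.

```python
import math


def effective_sample_size(n, design_effect):
    """Sample size of a simple random sample with the same precision."""
    if design_effect <= 0:
        raise ValueError("design_effect must be positive")
    return n / design_effect


def proportion_standard_error(p, n, design_effect=1.0):
    """Approximate standard error of an estimated proportion.

    Uses the simple-random-sampling variance p(1 - p)/n, inflated by
    the design effect.
    """
    return math.sqrt(design_effect * p * (1 - p) / n)


# Hypothetical study: 400 respondents, estimated proportion 0.25.
n, p = 400, 0.25
srs_se = proportion_standard_error(p, n)                      # design effect 1
rds_se = proportion_standard_error(p, n, design_effect=4.0)   # assumed value

print(f"Effective sample size: {effective_sample_size(n, 4.0):.0f}")
print(f"Standard error under simple random sampling: {srs_se:.4f}")
print(f"Standard error with design effect 4:         {rds_se:.4f}")
```

Under these assumed numbers, 400 RDS interviews carry about as much information as 100 interviews from a simple random sample, and the standard error doubles.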
Renee Luthra said:
“Governments around the world use targeted programmes designed to reach populations for which no sampling frame is present, for instance recent immigrants, drug users, homeless men and women, or individuals engaged in high risk sexual behavior. Millions of pounds are spent annually on social programmes such as those designed to aid immigrant integration, drug cessation, provide shelter, or encourage safer sex. In order to evaluate the effectiveness of such programmes, Government needs ways to survey these population members and estimate prevalence rates that are representative and unbiased. RDS presents one important sampling and estimation strategy to fulfill these needs, however, it is important that policy makers are aware of the strengths and weaknesses of the method and understand the accuracy of estimates drawn from RDS sampling and estimation designs.”
She added that there were some definite challenges surrounding RDS:
“Despite the increasing popularity of the method and considerable optimism about its potential to provide population estimates on hard to reach populations, recent research has uncovered several possible weak points in the methods, especially weaknesses in variance estimations. The accuracy of RDS is impacted by the underlying social network, the distribution of traits within this network, and the recruitment dynamic. Particular challenges are highly clustered and balkanized populations, homophily in referral chains, and the need for large samples to overcome design effects.”
RDS methods are currently used in a variety of health and epidemiological settings and are sponsored by major organisations. RDS uses the social networks of hidden populations to recruit respondents. First, a few initial target population members are recruited and administered the questionnaire. These initial respondents are then trained to recruit additional members of the target population. Each respondent is offered an incentive to complete the questionnaire as well as a secondary incentive to recruit other target population members. The sample thus grows through recruitment directed entirely by the respondents themselves. Population members with large social networks therefore face a higher probability of selection; to correct for this, social network data is collected and used to adjust for differential probabilities of referral.
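The network-size adjustment described above can be sketched in a few lines. This is a minimal illustration, assuming an inverse-degree weighting scheme of the kind often called the RDS-II (Volz-Heckathorn) estimator; the article does not say which estimator any particular study used. The function name and the four-person sample are hypothetical.

```python
def inverse_degree_estimate(traits, degrees):
    """Estimate the population proportion with a trait.

    Each respondent is weighted by 1 / degree, where degree is the
    self-reported size of their network within the target population.
    Well-connected respondents are more likely to be recruited, so
    they receive less weight.
    """
    if len(traits) != len(degrees):
        raise ValueError("traits and degrees must have the same length")
    if any(d <= 0 for d in degrees):
        raise ValueError("every degree must be positive")
    weights = [1.0 / d for d in degrees]
    weighted_total = sum(w * t for w, t in zip(weights, traits))
    return weighted_total / sum(weights)


# Hypothetical sample: 1 = has the trait, 0 = does not.
# The two respondents with the trait report much larger networks.
traits = [1, 1, 0, 0]
degrees = [10, 10, 2, 2]

naive = sum(traits) / len(traits)
adjusted = inverse_degree_estimate(traits, degrees)

print(f"Unweighted sample proportion: {naive:.3f}")
print(f"Inverse-degree estimate:      {adjusted:.3f}")
```

In this made-up sample the unweighted proportion is 0.5, while the adjusted estimate is about 0.167, because the respondents with the trait were the ones most likely to be reached through referral. The adjustment is only as good as the reported network sizes and the recruitment assumptions behind it.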
The organisers believe the workshop has served as a first step by bringing together statisticians, survey methodologists, and researchers using RDS in an effort to disseminate knowledge and improve implementation and reporting. The resulting papers are also being prepared for review as a special issue of the Journal of the Royal Statistical Society in order to reach a wider audience.