Yesterday AAPOR released the report of its Task Force on Non-Probability Sampling, which was summarized in a session today at the annual conference in Boston. The task force was co-chaired by Reg Baker and J. Michael Brick, who both presented.
In 2010, the AAPOR report on opt-in panels said that panels were not reliable for point estimates. One of the follow-ups was to explore non-probability samples in more detail. “The mission was to examine the conditions under which various survey designs that do not use probability samples have scientific merit, that is, the methods produce study estimates that have desirable statistical properties for making inferences to a larger population.” The report, by design, did not focus on probability samples under suboptimal conditions, did not equate non-probability sampling with the online mode, and did not include social media sampling.
The report looked at convenience samples, sample matching, network sampling, estimation and weight adjustment methods, and measures of quality and fit for purpose. Some conclusions:
- Unlike probability sampling, there is no single framework that adequately encompasses all non-probability sampling.
- Researchers and other data users may find it useful to think of different approaches as falling on a continuum of expected accuracy.
- Transparency is essential, and it is all the more important given the differences among types of non-probability methodologies.
- Making inferences for any probability or non-probability survey requires some reliance on modeling assumptions.
- The most promising models deal with challenges in both sampling and estimation.
- Model-based methods may be used infrequently because of the requirement of statistical expertise.
- Fit for purpose is an important concept for judging survey quality.
- Research into non-probability methods should focus on the sampling techniques. “We heard the most vocal criticism of non-probability samples from other suppliers who use non-probability samples!” said Michael.
- Total Survey Error may not be the right framework for evaluating non-probability samples.
- Evidence of accuracy varies by domain (e.g., elections, healthcare, etc.).
- Appropriateness for making inferences rests on the assumptions underlying the model and deviations from those assumptions affect specific estimates.
Gary Langer commented that this might be an effort to find the usability of non-probability sampling rather than impartially evaluate it. To this, Reg pled “Guilty as charged,” saying that the task force asked if there were items worth exploring to find value from non-probability sampling. Gary had also argued that election polls are in effect models of a population that does not yet exist and are therefore not true surveys. To Gary’s criticism of the lack of a theoretical foundation, Michael pointed out that sample matching is a model with a theoretical foundation.
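The session only named sample matching in passing, but the basic idea is straightforward: for each case in a probability-based target (or reference) sample, select the opt-in respondent who most closely resembles it on a set of covariates. The sketch below is purely illustrative and not the task force's procedure; the covariates, sample sizes, and nearest-neighbor rule are all assumptions, and it leans on scikit-learn for the matching step.

```python
# A minimal sample-matching sketch (illustrative, not the task force's method):
# match each record in a hypothetical probability-based target sample to the
# closest opt-in panelist on a few standardized covariates.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Stand-ins for standardized covariates (e.g., age, education, household size).
target = rng.normal(size=(1_000, 3))   # hypothetical target/reference sample
panel = rng.normal(size=(20_000, 3))   # hypothetical opt-in panel

nn = NearestNeighbors(n_neighbors=1).fit(panel)
_, idx = nn.kneighbors(target)         # nearest panelist for each target case

matched = panel[idx[:, 0]]             # matched sample, same size as the target
print(matched.shape)                   # (1000, 3)
```

In practice the matching variables and distance metric carry the modeling assumptions Michael referred to; the theoretical justification rests on the matched sample resembling the target on the covariates that matter.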
Reg said, “There will never be a non-probability model that works for every single topic. The models must reflect the interests in the survey. For instance, for a healthcare survey of ten waves with 100,000 interviews, we developed a model to allow use of a panel. We parallel tested and it works really well. You can afford to do that for a survey with 100,000 interviews but you can’t afford to do it for a one-off survey with 1,500 responses. It is hard for me to conceive of a model that will produce good estimates for all those different topics because the adjustment is designed to balance the covariates that matter.” Michael added, “There will not be a common crank to drag out good estimates.”
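Reg's point about balancing the covariates that matter is, at bottom, a weighting problem. One common adjustment is raking (iterative proportional fitting), sketched below under entirely hypothetical assumptions: two made-up categorical covariates and assumed population margins, not anything from the report.

```python
# A minimal raking sketch: adjust respondent weights so the weighted margins of
# two hypothetical covariates match assumed population targets. Illustrative only.
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical categorical covariates for an opt-in sample.
sex = rng.choice([0, 1], size=n, p=[0.6, 0.4])         # over-represents category 0
age = rng.choice([0, 1, 2], size=n, p=[0.5, 0.3, 0.2])

# Assumed population margins the weights should reproduce.
targets = {"sex": np.array([0.49, 0.51]), "age": np.array([0.35, 0.35, 0.30])}
covs = {"sex": sex, "age": age}

w = np.ones(n)
for _ in range(50):                     # iterate until the margins converge
    for name, x in covs.items():
        current = np.array([w[x == k].sum() for k in range(len(targets[name]))]) / w.sum()
        w *= (targets[name] / current)[x]   # scale each respondent's weight

for name, x in covs.items():
    achieved = np.array([w[x == k].sum() for k in range(len(targets[name]))]) / w.sum()
    print(name, np.round(achieved, 3))  # should closely match the targets
```

The sketch also illustrates Reg's caveat: raking only balances the margins you feed it, so estimates for topics driven by other covariates are not protected by this adjustment.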
Steve Gittleman said, “Digital data is sliding down the street like an iceberg. It has no sample whatsoever. We need to get into it before the next three years: Google and Facebook, those little guys, can’t be ignored.” Reg said, “For those who got up at 8 am to hear it, the Census is moving to Big Data. If you want to stretch your mind, the Internet of Things is coming, and there are a lot of people in Steve’s part of the industry who think that Big Data will make surveys easier in the beginning but less relevant in the end.”
A comment from Barry Feinberg makes a fitting conclusion. “Probability surveys are the real way to have external validity, but some who use non-probability surveys claim that they have external validity and so we need to examine that claim. The alchemists attempted to turn base metals into gold and silver, to turn a sow’s ear into a golden purse. They did not succeed. But they did succeed at discovering experimental techniques that led to chemistry and modern medicine. This is an alchemist’s situation, and I don’t believe we can turn non-probability surveys into probability samples, but we might discover something worthwhile along the way.”