Some researchers believe that most sampling bias from convenience samples can be removed by weighting. Post-stratification weighting is typically done once the survey is complete, to make the results conform more closely to the national totals for key demographic categories. Cell weighting is the simplest form of such weighting: responses are grouped into demographic cells, such as age and gender (men under 55, men 55+, women under 55, women 55+), and a weight is calculated for each cell so that the weighted sample reflects the target population’s proportions.
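As a minimal sketch of cell weighting, the weight for each cell can be computed as the cell's share of the target population divided by its share of the sample. The sample counts, the target proportions and the function name `cell_weights` below are all hypothetical, chosen only to illustrate the arithmetic.

```python
from collections import Counter


def cell_weights(sample_cells, target_shares):
    """Return one weight per cell: target share divided by sample share."""
    counts = Counter(sample_cells)
    n = len(sample_cells)
    return {cell: target_shares[cell] / (counts[cell] / n) for cell in counts}


# Hypothetical sample of 100 respondents, each tagged with its cell.
sample = (
    ["men under 55"] * 40
    + ["men 55+"] * 10
    + ["women under 55"] * 30
    + ["women 55+"] * 20
)

# Hypothetical target population proportions (illustrative, not real data).
target = {
    "men under 55": 0.30,
    "men 55+": 0.20,
    "women under 55": 0.30,
    "women 55+": 0.20,
}

weights = cell_weights(sample, target)
for cell, weight in weights.items():
    print(f"{cell}: {weight:.2f}")
```

In this made-up sample, men 55+ are undersampled (10% of respondents against a 20% target), so each of them counts double, while the oversampled men under 55 are weighted down.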

Weighting is too often presented as a simple problem of arithmetic: upweight certain respondents to compensate for undersampling them. In fact, weighting is an editorial process, and the factors that organizations choose to weight on differ dramatically. Of surveys whose results were published in press releases, 43% were weighted, almost always by at least age, gender and geographic region. In such cases, responses are weighted differently for each combination of age, gender and region. More extensive weighting attempts to bring even more variables into alignment with the target population: race/ethnicity, education and household income. Certain organizations champion unique weighting variables as part of proprietary methods to improve representativeness: for instance, YouGov weights by news interest and Mktg Inc. by panel tenure. Most organizations that weight presume that weighting by additional factors produces greater representativeness.

While cell weighting is still commonplace (it is used in Google Consumer Surveys, for instance), it is gradually giving way to raking. Cell weighting calculates a separate weight for each cell. Its key weakness is that it requires knowing the target population's share for every cell: the percentage of people who are female, Hispanic, 55+ and living in the South, for instance.

Raking, in contrast, is an iterative process: each cycle recalculates the weights so that the weighted totals for one attribute line up with the target population. The next cycle does the same for a different attribute, taking the weights output by the previous cycle as its input; the process repeats, often dozens of times, until the weighted totals converge on the target population for all weighted attributes.

The advantage of raking is that it needs only the target population's totals for each attribute on its own; there is no need to know the detailed cross-tabulation for each demographic cell or subgroup.
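The raking loop described above can be sketched in a few lines. This is an illustrative implementation of iterative proportional fitting, not any vendor's actual method; the function names, the sample counts and the target margins are all hypothetical. Note that only the gender totals and the age totals are supplied, never the gender-by-age table.

```python
def rake(respondents, margins, max_iter=100, tol=1e-9):
    """Rake weights to marginal targets (iterative proportional fitting).

    respondents: list of dicts, e.g. {"gender": "men", "age": "55+"}
    margins: {attribute: {category: target share}}, shares summing to 1
    Returns one weight per respondent; the weights sum to len(respondents).
    """
    n = len(respondents)
    weights = [1.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for attr, targets in margins.items():
            # Current weighted total for each category of this attribute.
            totals = {}
            for r, w in zip(respondents, weights):
                totals[r[attr]] = totals.get(r[attr], 0.0) + w
            # Rescale so this attribute's weighted totals match its targets.
            for i, r in enumerate(respondents):
                factor = targets[r[attr]] * n / totals[r[attr]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        # Stop once a full pass over all attributes barely changes anything.
        if max_change < tol:
            break
    return weights


def weighted_share(respondents, weights, attr, category):
    """Weighted proportion of respondents falling in one category."""
    hit = sum(w for r, w in zip(respondents, weights) if r[attr] == category)
    return hit / sum(weights)


# Hypothetical sample of 100 respondents (illustrative counts).
sample = (
    [{"gender": "men", "age": "under 55"}] * 40
    + [{"gender": "men", "age": "55+"}] * 10
    + [{"gender": "women", "age": "under 55"}] * 30
    + [{"gender": "women", "age": "55+"}] * 20
)

# Hypothetical target margins, one attribute at a time.
margins = {
    "gender": {"men": 0.5, "women": 0.5},
    "age": {"under 55": 0.6, "55+": 0.4},
}

weights = rake(sample, margins)
print(round(weighted_share(sample, weights, "gender", "men"), 4))
print(round(weighted_share(sample, weights, "age", "55+"), 4))
```

Each pass fixes one attribute and slightly disturbs the others, which is why the loop keeps cycling until no adjustment factor differs meaningfully from 1.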

With raking, weights often differ from respondent to respondent, reflecting each respondent's particular combination of demographic characteristics. As a result, the weights on different respondents can vary dramatically. Widely divergent weights reduce the effective sample size. If you interview 400 people but 80% of them are men, and the target population is half men, your reweighted total sample is effectively 256 respondents. Weighting can therefore produce much wider margins of sampling error (for probability samples).
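The 256 figure is consistent with Kish's approximation for effective sample size, (sum of weights)² / (sum of squared weights), if the target population is assumed to be 50% men; both the formula and the 50/50 target are assumptions here, since the text names neither. The margin-of-error helper is likewise a hypothetical illustration, using the standard simple-random-sample formula at 95% confidence with a proportion of 0.5.

```python
import math


def effective_sample_size(weights):
    """Kish approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of sampling error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)


# 400 respondents, 80% men; assumed target population is 50% men.
men, women = 320, 80
w_men = 0.5 / 0.8    # 0.625: men are oversampled, so weighted down
w_women = 0.5 / 0.2  # 2.5: women are undersampled, so weighted up
weights = [w_men] * men + [w_women] * women

n_eff = effective_sample_size(weights)
print(n_eff)  # 256.0
print(f"unweighted n=400: +/-{margin_of_error(400):.1%}")
print(f"effective n={n_eff:.0f}: +/-{margin_of_error(n_eff):.1%}")
```

Under these assumptions the margin of error grows from about 4.9 to about 6.1 percentage points, even though the number of interviews is unchanged.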

Once upon a time, weighting was presumed to make up for any errors in convenience sampling. But researchers now realize that post-stratification weighting is not a magical fix. How the responses were sampled still matters.

The assumption implicit in weighting is that the people we did survey in a particular demographic group are representative of the people in that group whom we did not survey. Yet this assumption is unlikely to hold true in many non-probability samples. For instance, respondents 80 years old and up who take a survey most likely differ in key ways, such as technology usage, health and mental alertness, from those who didn’t take the survey. Because of such differences, Krosnick has found that while post-stratification weighting is very effective for probability samples, it is unlikely to be effective for non-probability samples. In contrast to Krosnick, most commercial researchers do use post-stratification weights on non-probability samples. They do so in the hope that it does no harm, in the belief that it improves quality, and because it redistributes demographic reporting to match the target population. For instance, a survey in which 1% of respondents are 80 years old or older, when weighted, reports as if that group made up 5% of respondents, matching the 5% of U.S. adults who are that age.


This is an excerpt from the free Researchscape white paper, “Improving the Representativeness of Online Surveys”.