Compare the estimates from the address sample to the estimates from the RDD sample. To do this, use the address full-sample and RDD full-sample weights to produce the two estimates. A simple review of the data should provide an initial assessment of whether the estimates differ and whether the differences are large enough to be of concern.

After reviewing the above differences, you may want to conduct a formal test to see if the estimates are statistically different. Remember that statistical significance is not particularly meaningful for a sample as large as HINTS: relatively small differences can be statistically significant without being substantively meaningful.

The method for conducting formal significance tests depends on the type of analysis being conducted. For descriptive analyses, you can use the procedures described in Rizzo et al. (2008) pertaining to Goal 1 (e.g., Chapter 3). This involves generating separate estimates and standard errors for each sample frame and conducting a z test:

(Xr – Xa)/sqrt[V(Xr) + V(Xa)]

where Xr is the estimate for the RDD sample, Xa is the estimate for the address sample, and V(Xr) and V(Xa) are the variance estimates for the RDD and address samples, respectively.
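The z test above can be sketched in a few lines of Python. This is a minimal illustration, not part of the HINTS procedures: the function name and the input values are hypothetical, and the two-sided p-value from the standard normal distribution is an added convenience that the text does not prescribe. The estimates and variances are assumed to have been produced beforehand with each sample's own full-sample and replicate weights.

```python
import math

def mode_z_test(est_rdd, var_rdd, est_addr, var_addr):
    """Return (z, two-sided p-value) for (Xr - Xa) / sqrt[V(Xr) + V(Xa)]."""
    # Standard error of the difference, using the sum of the two variances
    se_diff = math.sqrt(var_rdd + var_addr)
    z = (est_rdd - est_addr) / se_diff
    # Two-sided p-value under a standard normal reference distribution
    # (an assumption added for illustration)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical inputs: proportions and their variance estimates
z, p_value = mode_z_test(est_rdd=0.40, var_rdd=0.0004,
                         est_addr=0.36, var_addr=0.0005)
print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
```

Given the caution above about large samples, the size of the difference Xr - Xa deserves as much attention as the p-value.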

If you are conducting a multivariate analysis that is concerned with the relationship between two variables, include in the regression a dichotomous variable, Si, that represents the type of sample: Si = 0 if respondent i is from the RDD sample and Si = 1 if respondent i is from the address sample. Also include an interaction term between sample type and the variable of interest. For example, if one were looking at the relationship between age and whether someone looked for cancer information, the regression would include a term for age, a term for sample type, and an interaction between the two. A statistically significant interaction suggests that the relationship between age and looking for cancer information differs by mode. One can then review the magnitude of this difference and the implications it might have for the particular analysis being conducted.
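As an illustration of the coding just described, the following sketch builds the regression predictors for the age example: an intercept, age, the sample-type indicator Si, and the age-by-sample-type interaction. The function name, the sample-type labels, and the respondent values are hypothetical. Fitting the model and estimating its standard errors with the combined replicate weights would be done in survey analysis software and is not shown here.

```python
def design_row(age, sample_type):
    """Return [intercept, age, S, age * S] for one respondent.

    S is 0 for the RDD sample and 1 for the address sample.
    """
    s = {"RDD": 0, "Address": 1}[sample_type]  # hypothetical labels
    return [1.0, float(age), float(s), float(age) * s]

# Hypothetical respondents: (age, sample type)
respondents = [(34, "RDD"), (61, "Address"), (47, "Address")]

# One row of predictors per respondent
X = [design_row(age, sample_type) for age, sample_type in respondents]

for row in X:
    print(row)
```

The coefficient on the last column is the interaction term; a test of whether it differs from zero is the test for a mode difference in the age relationship.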

One source of a significant interaction is that responses differ because the mode of communication is different (self-administered mail survey vs. interviewer-administered telephone survey). Keep in mind, however, that there may be differences across modes for other reasons. For example, the address sample contains households that do not have landlines, while the telephone sample does not contain these households. Hence, it is possible that a significant interaction is due to different responses between households that have landlines and households that do not.

To conduct the multivariate analysis, weights should be created using the procedures described in Rizzo et al. (2008: Chapter 4, https://hints.cancer.gov/docs/HINTS_Data_Users_Handbook-2008.pdf), where the two sets of replicate weights are combined into a single set of replicates. The procedure in Rizzo describes how to do this when conducting tests between two survey years. To test for mode effects, use the same procedure but treat the two sample types as if they were two different years. The weights created by applying the procedure in Rizzo to the two 2007 samples should be used only for testing for mode effects. They should not be used to calculate estimates of totals, because their sum over the RDD sample plus the address sample is two times the number of adults in the United States in 2006. (The American Community Survey results for 2007 were not available at the time the weights for HINTS 2007 were calculated.) These weights should also not be used to estimate quantities that involve sums across both of the 2007 samples, because they do not maintain the correct relative relationships among the weights of RDD cases, the weights of address-sample cases with landlines, and the weights of address-sample cases without landlines.