HINTS Data Errors, Remediation, and Recommendations

This page describes known HINTS data errors, the steps taken to remediate them, and the resulting recommendations for data users. Further details are in the documentation included with the data files on the data downloads page. If you have questions, please contact us and we will be happy to assist you.

HINTS 5 Cycle 3 (2019) Web Respondent Sample

An update to this dataset and its supporting documentation was posted to the data downloads section of the HINTS Web site in March 2021, after it was discovered that 35 variables were affected by coding errors in the missing values for participants who completed the survey online as part of the push-to-web pilot study. For some of the open-ended questions in the survey, invalid skips were coded as 0 instead of -9, and other related, derived variables required minor revisions to their missing value assignments. In addition, for the four-item N6 matrix (“How much do you think that each of the following can influence whether or not a person will develop cancer?”), web respondents who chose “Don’t Know” should have been assigned a value of 4 but were incorrectly set to missing (-9). The invalid-skip and derived-variable errors are corrected in the newly posted HINTS 5 Cycle 3 data; the N6 error is not correctable and remains in the revised dataset. We strongly encourage HINTS 5 Cycle 3 data users to consult the H5C3 Survey Overview & Data Analysis Recommendations document, contained in the data file, for guidance on how to revise analyses to address the issues with the data. Updated variable formats now alert data users to this issue. See the HINTS 5 Cycle 3 methodology report Section 4.2 and Appendices I and J for further information and guidance.
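As a rough illustration of why the N6 issue cannot be corrected, the sketch below flags the cases in which a -9 is ambiguous: for web respondents it may represent either a “Don’t Know” answer or a true invalid skip. The function name, its arguments, and the status labels are hypothetical and are not taken from the HINTS data files; the official guidance is in the documents cited above.

```python
INVALID_SKIP = -9  # missing-value code described above


def n6_status(value, completed_online):
    """Classify one N6 response (hypothetical helper, not a HINTS variable).

    value            -- recorded code for a single N6 item
    completed_online -- True if the respondent was in the web sample
    """
    if value == INVALID_SKIP and completed_online:
        # Could be a "Don't Know" answer (intended code 4) or a real skip.
        return "ambiguous_web_minus9"
    if value == INVALID_SKIP:
        return "invalid_skip"
    return "as_recorded"


# Made-up example responses: (recorded value, completed online?)
examples = [(-9, True), (-9, False), (4, False), (2, True)]
for value, online in examples:
    print(value, online, n6_status(value, online))
```

A flag of this kind only identifies the affected cases; it does not recover which web respondents actually chose “Don’t Know.”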

HINTS 4 Cycle 3 (2013) and HINTS 4 Cycle 4 (2014)

An error occurred in which 5-year American Community Survey (ACS) estimates were used as the source of the population totals in the calibration step of the weighting. The correct source was the 1-year ACS estimates. The 5-year estimates are based on an average of the ACS for the previous 5 years, while the 1-year estimates are based on the results of the ACS for that particular year. The HINTS estimates most affected by this error are population totals or counts (e.g., the total number of adults who have searched for information about cancer from any source). These totals are, on average, about 2 percent lower when the weights are calibrated to the 5-year estimates than when they are calibrated to the 1-year estimates.

Linear and logistic regressions using the incorrect weights are affected less than population totals because the error appears in both the numerator and the denominator, which tends to cancel it out. We ran several types of analyses comparing results from the weights containing the error with results from a corrected set of weights. These analyses covered percentages, regression estimates, and trends. None of the results differed substantively when the corrected weights were used. Virtually all of the percentages, regression coefficients, and significance tests did not differ at the first decimal place.
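The cancellation described above can be sketched with a small numerical example. It assumes, purely for illustration, that the error shrinks every weight by the same 2 percent; the actual calibration error need not be uniform across respondents, and the weights and outcomes below are made up rather than drawn from HINTS.

```python
def weighted_total(weights, indicator):
    """Estimated population count: sum of weights for units with the trait."""
    return sum(w for w, y in zip(weights, indicator) if y)


def weighted_proportion(weights, indicator):
    """Estimated share: weighted count divided by the sum of all weights."""
    return weighted_total(weights, indicator) / sum(weights)


# Made-up weights and a 0/1 outcome for five hypothetical respondents.
correct_weights = [1000.0, 2500.0, 1800.0, 3200.0, 1500.0]
outcome = [1, 0, 1, 1, 0]

# Assumption: the error shrinks every weight by the same 2 percent.
shrunken_weights = [0.98 * w for w in correct_weights]

total_correct = weighted_total(correct_weights, outcome)
total_shrunken = weighted_total(shrunken_weights, outcome)
prop_correct = weighted_proportion(correct_weights, outcome)
prop_shrunken = weighted_proportion(shrunken_weights, outcome)

# The total drops by about 2 percent; the proportion is unchanged.
print(f"total: {total_correct:.0f} vs {total_shrunken:.0f}")
print(f"proportion: {prop_correct:.4f} vs {prop_shrunken:.4f}")
```

Under this simplified assumption the total changes in proportion to the weights while the ratio estimate does not move, which is consistent with totals being the most affected estimates.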

Our advice to users who have completed analyses using HINTS 4 Cycle 3 or HINTS 4 Cycle 4 but have not yet published them is to rerun the analyses with the correct weights, which are in the updated datasets on the HINTS website. For results that have already been published, we do not advise doing anything except in two scenarios:

  1. If the results rely on reporting population counts or totals.
  2. If a small change in the statistical significance of a result would affect your conclusions, for example, if a result is significant at close to the 5% level (if that is the criterion used in the analysis).

In both cases, we advise re-running the analysis and deciding if the results differ enough to merit reporting an erratum to the journal.
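A minimal sketch of how the two scenarios might be checked after re-running an analysis with both sets of weights is shown below. The helper names and all input numbers are hypothetical placeholders, and the 5% threshold is only an example of a criterion an analysis might use; whether a difference merits an erratum remains a judgment call.

```python
def relative_change(old, new):
    """Relative difference of the corrected estimate from the original one."""
    return (new - old) / old


def decision_flips(p_old, p_new, alpha=0.05):
    """True if a result is significant under one weight set but not the other."""
    return (p_old < alpha) != (p_new < alpha)


# Scenario 1: a reported population total (placeholder values, in millions).
change = relative_change(old=98.0, new=100.0)
print(f"total changed by {change:.1%}")

# Scenario 2: p-values close to the threshold (placeholder values).
print(decision_flips(p_old=0.048, p_new=0.052))  # decision changes
print(decision_flips(p_old=0.001, p_new=0.002))  # decision unchanged
```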