Health Quality Council of Alberta Report is Junk Science
robertgerst | Jun 02, 2010
Originally Published at the Quality Council of Alberta
The Health Quality Council of Alberta report, The Urban and Regional Emergency Department Patient Experience Report 2009, represents a significant step backward for the Quality movement in Alberta.
The methodology is so flawed that none of the facts or statistics offered can, or should, be taken as reliable indicators of anything, let alone the performance of Alberta’s health care system. Albertans, if they are to receive useful information concerning the status of their health care system, will need something better than bad science offered as a report on Quality.
Background
The Health Quality Council of Alberta (HQCA), which has no association with the QCA, was created by the Alberta Government to promote patient safety and health service quality on a province wide basis, primarily through the lens of the Alberta Quality Matrix for Health. Among its responsibilities is to measure, monitor and assess patient safety and health service quality.
The Urban and Regional Emergency Department Patient Experience Report 2009 is part of the HQCA’s efforts to meet these responsibilities to the public.
The Technical Issues
Unfortunately, numerous technical and methodological issues make the facts, data and statistics in the report, and the conclusions drawn from them, unreliable and therefore useless for any meaningful practical application. Four foundational elements of the methodology employed, and the errors associated with each, are the focus here. The errors in these foundational elements corrupt essentially all of the data and evidence offered in the report.
The Basis of Comparison: the Baseline
Using the 2007 results as a baseline, the purpose of the 2009 study was to monitor changes in the performance of the twelve urban and regional emergency department sites with the greatest crowding pressures, longest wait times and poorest patient experience.
This opening sentence conveys the first of the fundamental errors made: the basis of comparison, namely the baseline. All of the conclusions in the report referencing change (whether things are getting better or worse) rest on a comparison with a single data point, that for 2007. Thus the foundation for assessing change in the report is what statistics and business students have long understood to be a fallacy of reasoning and the source of many bad jokes in statistics classrooms and business offices across the country: What do you call any two data points? An executive trend!
The point is that no conclusion concerning trends or changes should ever be drawn from two data points. Proper comparisons over time require comparison to a base rate that includes measures of variation recorded over time, such as that presented on a run or control chart. A baseline means just that: a baseline, not a base point! Drawing comparisons with a single data point ignores naturally occurring variation (or assumes it doesn’t exist). As such, there is no statistical or scientific way to assess whether the differences between 2007 and 2009 highlighted in the report are the result of real differences in the performance of the system or simple random fluctuations in the data.
Because of this, no scientific or meaningful conclusions can be drawn from any of the time-based comparisons contained in the report.
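To make the baseline point concrete, here is a minimal Python sketch of the arithmetic behind an individuals (XmR) control chart. The series of values is invented purely for illustration and does not come from the Report; the names xmr_limits and is_signal are likewise hypothetical. The sketch assumes the usual XmR convention of placing limits at the mean plus or minus 2.66 times the average moving range.

```python
from statistics import mean


def xmr_limits(values):
    """Return (lower, center, upper) natural process limits for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to estimate variation")
    # Moving range: absolute difference between each consecutive pair of points
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    # 2.66 is the conventional XmR constant (3 / 1.128)
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


# Hypothetical history of a survey percentage, invented for illustration only
baseline = [46.0, 49.0, 45.0, 48.0, 47.0, 44.0, 48.0, 47.0]
lower, center, upper = xmr_limits(baseline)


def is_signal(value):
    """True if a new value falls outside the limits set by the baseline."""
    return value < lower or value > upper


print(f"limits: {lower:.1f} to {upper:.1f}, center {center:.2f}")
print("48% is a signal:", is_signal(48.0))
```

With this invented history, a value of 48 sits well inside the limits, so it could not be distinguished from routine variation. With only a single 2007 value in hand, limits of this kind cannot be computed at all.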
Response Rate
A broader issue that affects all of the data and conclusions in the Report (time based or not) is the response rate of those surveyed. According to the Report: “In total, 4,942 patients completed the questionnaire for an overall response rate of 45%.” The Report highlights the good effort of the HQCA in attempting to increase this response rate. Nevertheless, a response rate of 45% means that no inferences can or should be made from data arising from the research.
Sample size and response rate are among the more widely misunderstood aspects of survey research. A survey is an attempt to make inferences about a large population based on data taken from a relatively small sample of that population. Certain requirements demanded by statistical theory, such as a probability sample, allow these inferences to be made.
A lack of response violates the requirements of statistical theory and means that, in principle, no inferences can be made. The problem cannot be made to go away by increasing sample size. In What is a Survey?, Robert Ferber, Chair, and other members of the Section on Survey Research Methods of the American Statistical Association put it plainly:
A low response rate does more damage in rendering a survey’s results questionable than a small sample, since there is no valid way of scientifically inferring the characteristics of the population represented by the nonrespondents (emphasis added).
This is the problem with The Urban and Regional Emergency Department Patient Experience Report 2009. It may be true, to pick one item in the Report, that 85% of respondents rated their overall care as excellent, but that does not mean there is any scientific or statistical basis to infer that the 85% figure in any way represents how the broader population would rate their overall care. The balance of the population could rate their health care as excellent or as terrible or as anything in between. This means all of the data and statistics reported in The Urban and Regional Emergency Department Patient Experience Report 2009 apply only to those completing the questionnaire, and in no way represent the broader population of those using the emergency departments in the sample frame.
Because of this, no inferences about the general experience of Albertans should be made from any of the data in the Report.
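The range of possibilities left open by nonresponse can be worked out directly. The sketch below is a rough illustration, not a method taken from the Report: it uses the 45% response rate and the 85% figure quoted above, assumes nothing at all about how nonrespondents would have answered, and ignores sampling error. The function name population_bounds is hypothetical.

```python
def population_bounds(response_rate, respondent_share):
    """Lowest and highest possible share among everyone surveyed.

    The low bound assumes no nonrespondent would have said "excellent";
    the high bound assumes every nonrespondent would have.
    """
    answered_yes = response_rate * respondent_share
    nonresponse = 1.0 - response_rate
    return answered_yes, answered_yes + nonresponse


low, high = population_bounds(0.45, 0.85)
print(f"possible share rating care as excellent: {low:.1%} to {high:.1%}")
```

Under those assumptions, the 85% figure is compatible with anything from roughly 38% to roughly 93% of everyone who was sent the questionnaire.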
Statistical Significance
The treatment of statistical significance is the third foundational component in which serious errors are made. Generally the Report uses Chi-Square, t-tests and Cramer’s V as statistical tests for designating whether a relationship can be termed statistically significant. The Report doesn’t discuss why it would be desirable to designate a relationship as statistically significant or not, and therein lies the error.
Statistical significance is generally misinterpreted as a measure of practical or scientific importance (real world significance). Under this misinterpretation, a statistically significant difference between two numbers is taken to be something to which people need to pay attention or to act upon. Thus, to take another sample from the Report, we are told:
About 5 in 10 (46% in 2007, 48% in 2009) visited the emergency department because it was the best place to deal with their medical problem; this difference between years is statistically significant.
Why mention that the difference between years is statistically significant? It may be statistically significant, but it is clearly unimportant. As Edward Tufte has pointed out, confusing statistically significant with the everyday meaning of significant is a bad pun, and bad science (see Beautiful Evidence, Graphics Press). It gives the appearance of scientific method by using the language of science, but without an appreciation for the thinking behind the tools and techniques.
Generating statistically significant findings is easy; it can be done simply by increasing sample size. In contrast, findings of real and practical importance are hard to come by. Because of this, statements made throughout the Report concerning the statistical significance of results should not be confused with the practical importance of those results.
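The effect of sample size on significance can be seen with a short calculation. The sketch below applies a standard two-proportion z-test (normal approximation) to the same 46% versus 48% difference at several sample sizes. The sample sizes are hypothetical and chosen only for illustration, and this is not necessarily the test the Report used; the function name two_proportion_p_value is an assumption of this example.

```python
import math


def two_proportion_p_value(p1, p2, n1, n2):
    """Two-sided p-value for the difference between two sample proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal
    return math.erfc(abs(z) / math.sqrt(2))


# Same two-point difference, hypothetical sample sizes per year
for n in (500, 5_000, 50_000):
    p = two_proportion_p_value(0.46, 0.48, n, n)
    print(f"n = {n:>6} per year: p-value = {p:.3g}")
```

The difference stays at two percentage points throughout; only the sample size changes, and with it the verdict on significance.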
Correlation and Causation
Roughly half of the Report deals with what is termed Composite Variables and specific patient experience questions. Regression analysis is used to build models in which a group of variables (such as questions in the survey) are used to ‘predict’ another variable – usually something taken to be an ‘outcome’. For example, the first regression model described in the Report uses facility cleanliness, pain management, wait time and other composite variables to predict the overall (global) rating of emergency department care.
Apart from repeating the misinterpretation of statistical significance, the Report, in using regression analysis, appears to confuse correlation with causation. It is one thing to build a regression model that describes the correlations within a specific data set and quite another to say those correlations somehow reflect causal relations in the real world.
In fairness to the Report and its authors, no explicit claims of causation are made. Yet there are numerous examples where the wording seems to imply that causal relationships have been uncovered. Take, for example, the statement introducing this section of the Report:
This analysis (and subsequent multivariate analysis) suggests that these variables are valid, reliable and have significant predictive power with respect to patient rating of overall care and quality and other outcome variables. (Emphasis added)
It may be that variables have significant predictive power within the data set, but that doesn’t mean that this predictive power exists in the real world, nor does it mean that there is any causal relationship. On the last point, for example, it may be true that facility cleanliness influences the overall rating people give their emergency care experience, but it might also be true that an individual’s overall experience influences how the level of cleanliness was recalled. This is the well-documented Halo Effect (described in a book of the same name). The same mistake is made all too frequently with studies that claim to identify the ten factors that produce employee engagement or the twelve drivers of customer satisfaction.
Because of this, the various models in the report should be interpreted as interesting correlations, in some cases worthy of further study, but not as causal models that could be used to improve patients’ overall experience with emergency care.
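A small simulation shows how a halo effect alone can produce a strong correlation. In the sketch below, all data are simulated and every name is hypothetical. By construction, the overall impression is generated first and cleanliness has no influence on it; recalled cleanliness is simply the overall impression plus noise.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Overall impression is drawn independently of anything about cleanliness
overall = [rng.gauss(0, 1) for _ in range(2000)]

# Recalled cleanliness is driven by the overall impression (the halo) plus noise
recalled_cleanliness = [o + rng.gauss(0, 1) for o in overall]

r = pearson(recalled_cleanliness, overall)
print(f"correlation between recalled cleanliness and overall rating: {r:.2f}")
```

A regression fitted to these simulated data would show cleanliness "predicting" the overall rating, even though making the facility cleaner would change nothing in this simulated world.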
Conclusions
We need to make it clear that our criticism of The Urban and Regional Emergency Department Patient Experience Report 2009 is limited to that Report alone and is not to be taken as a critique of the HQCA.
It is important that faulty conclusions, misinterpretation of data, poor research/statistical technique and other forms of evidence corruption not be allowed to drive change in Alberta’s health care system.
There has been more than enough of that already.