Manual Therapy

Volume 16, Issue 6, December 2011, Pages 563-572

Original article
Clinical relevance vs. statistical significance: Using neck outcomes in patients with temporomandibular disorders as an example

https://doi.org/10.1016/j.math.2011.05.006

Abstract

Statistical significance has been used extensively to evaluate the results of research studies. Nevertheless, it offers only limited information to clinicians. Assessing clinical relevance can facilitate the translation of research results into clinical practice. The objective of this study was to explore different methods of evaluating the clinical relevance of results, using as an example a cross-sectional study that compared neck outcomes between subjects with temporomandibular disorders and healthy controls. Subjects were compared for head and cervical posture, maximal cervical muscle strength, endurance of the cervical flexor and extensor muscles, and electromyographic activity of the cervical flexor muscles during the CranioCervical Flexion Test (CCFT). The clinical relevance of the results was evaluated based on the effect size (ES), minimal important difference (MID), and clinical judgement. The results of this study show that it is possible to have statistical significance without having clinical relevance, to have both statistical significance and clinical relevance, to have clinical relevance without having statistical significance, or to have neither statistical significance nor clinical relevance. The evaluation of clinical relevance in clinical research is crucial to simplify the transfer of knowledge from research into practice. Clinical researchers should present the clinical relevance of their results.

Introduction

Most research, both in general and in health, has relied on statistical significance to demonstrate the effectiveness of an intervention, differences among groups in some variables of interest, or associations between variables. Statistical significance is based on hypothesis testing (Kirk, 1996). The null hypothesis states that there is no difference between groups or that an independent variable does not have an effect on the dependent variable. The alternative hypothesis states that groups are different or that an independent variable does have an effect on the dependent variable. After the research is conducted, the statistical analysis yields the “p” value, which indicates the strength of the evidence against the null hypothesis. In practice, the “p” value is compared against a preset threshold, so statistical significance analysis provides only a dichotomous answer: a result either is or is not statistically significant (in other words, there either is or is not enough evidence against the null hypothesis) (Sterne and Smith, 2001). Therefore, statistical significance does not offer an indication of how important the result of the study is (Thompson, 1999, Ogles et al., 2001, Millis, 2003).

Statistical significance can also provide misleading results. A statistically significant difference between groups can be found when the sample size is large and the intersubject variability is low, even though the difference between groups is too small to be considered clinically important (Millis, 2003). Some authors have argued that tests of statistical significance are not generally useful and that confidence intervals (CIs) and measures of effect size should instead be the main focus of research findings, since they provide more complete information about the magnitude of the association between variables, changes after a treatment, or differences between groups (Olejnik and Algina, 2000, Sterne and Smith, 2001). For example, CIs contain all of the information provided by a significance test and, in addition, give a range of values within which the true difference is likely to lie. This information helps researchers and clinicians understand the “magnitude of the effect” and offers a richer source of information than the simple yes/no dichotomy of hypothesis testing (McNeely and Warren, 2006).
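The dependence of the “p” value on sample size can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the study: the group means, the shared standard deviation, the sample sizes, the function name `compare_groups`, and the use of a large-sample normal (z) approximation in place of a t test.

```python
import math
from statistics import NormalDist


def compare_groups(mean_a, mean_b, sd, n_per_group, alpha=0.05):
    """Compare two equal-sized groups that share one SD.

    Uses a large-sample z approximation (an assumption of this sketch).
    """
    diff = mean_a - mean_b
    se = sd * math.sqrt(2.0 / n_per_group)        # standard error of the difference
    z = diff / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))    # two-sided p value
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    ci = (diff - z_crit * se, diff + z_crit * se)  # (1 - alpha) confidence interval
    d = diff / sd                                  # standardized effect size
    return {"diff": diff, "p": p, "ci": ci, "d": d}


# Hypothetical numbers: a 1-unit difference on a scale with SD = 10.
small = compare_groups(51.0, 50.0, sd=10.0, n_per_group=50)
large = compare_groups(51.0, 50.0, sd=10.0, n_per_group=5000)

for label, res in (("n = 50", small), ("n = 5000", large)):
    print(label, f"p = {res['p']:.4g}", f"d = {res['d']:.2f}",
          f"CI = ({res['ci'][0]:.2f}, {res['ci'][1]:.2f})")
```

With these hypothetical inputs the standardized effect size is the same in both runs, while only the larger sample crosses the conventional 0.05 threshold. The confidence interval reports the same decision together with the plausible size of the difference.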

Conversely, a clinically relevant result might be neglected if statistical significance is not attained because of small sample sizes and high intersubject variability. The assessment of clinical relevance (also called clinical significance) indicates whether the results are meaningful or not. In this way, the evaluation of clinical relevance can provide more useful results for health care clinicians as well as for clients receiving care, facilitating the transfer of knowledge into clinical practice (Musselman, 2007). Some authors in the areas of education (Kirk, 1996, Carnine, 1997) as well as health research (Millis, 2003, Musselman, 2007) have urged that research findings be reported in language that is familiar to practitioners. With the advancement of health care and the introduction of evidence-based practice, researchers need to provide information about their research that can be used in clinical practice and that can demonstrate an impact on health care and clinical decisions. “p” values alone are insufficient to meet these requirements; clinical researchers therefore need to present the clinical relevance of their results to help busy clinicians with interpretation.

Several methods to determine clinical relevance have been developed in order to provide clinicians, clients, and policy makers with standards of meaningful change. The most commonly used are “distribution-based methods” and “anchor-based methods”. Distribution-based methods are based on the statistical distribution and the psychometric properties of the outcomes. The calculation of the effect size, the minimal important difference (MID), and the standard error of measurement are examples of distribution-based methods to evaluate clinical relevance. Anchor-based methods involve the clients’ perspective in the assessment of clinical relevance and are used prospectively.
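As a rough illustration of the distribution-based quantities named above, the following Python sketch computes a standardized effect size (Cohen's d with a pooled standard deviation), the standard error of measurement (SD × √(1 − reliability)), and a MID taken as half a standard deviation. The half-SD rule is one common rule of thumb and is an assumption here, not necessarily the MID definition used in the study; the data, the reliability value of 0.90, and the function names are likewise hypothetical.

```python
import math
from statistics import mean, stdev


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))


def cohens_d(a, b):
    """Standardized mean difference (Cohen's d)."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), e.g. with a test-retest ICC."""
    return sd * math.sqrt(1.0 - reliability)


def mid_half_sd(sd):
    """MID approximated as 0.5 SD (a rule-of-thumb assumption)."""
    return 0.5 * sd


# Hypothetical outcome scores for illustration only.
controls = [30, 32, 34, 36, 38]
patients = [26, 28, 30, 32, 34]

sd = pooled_sd(controls, patients)
d = cohens_d(controls, patients)
sem = standard_error_of_measurement(sd, reliability=0.90)
mid = mid_half_sd(sd)
print(f"pooled SD = {sd:.2f}, d = {d:.2f}, SEM = {sem:.2f}, MID = {mid:.2f}")
```

Cohen's conventional benchmarks (about 0.2 small, 0.5 medium, 0.8 large) are often used to read d, although such benchmarks are generic and do not replace clinical judgement.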

Clinical relevance is generally evaluated as a result of an intervention; however, it can also be assessed in other types of research such as cross-sectional studies. In these studies, patients and controls are assessed on certain variables of interest, and it is important to know whether the differences found between groups are in fact clinically meaningful. A score on a given outcome in a cross-sectional study is interpreted by comparing the values obtained with those found in a reference population. Unfortunately, most of the outcomes used in clinical research lack “normative or reference values” to establish “normality” of health status. Thus the interpretation of clinical relevance for results in this type of research is uncertain and difficult for the general practitioner. Therefore, other methods for assessing clinical relevance in cross-sectional studies need to be used in the absence of normative values for the outcomes of interest. In addition, information regarding clinical relevance for neck outcomes is lacking, and clinicians have difficulty interpreting results from research studies. Thus, the objectives of this paper were (1) to explore and analyze different methods of evaluating the clinical relevance of results, using as an example a cross-sectional study that compared neck outcomes between subjects with temporomandibular disorders (TMD) and healthy controls, and (2) to discuss different issues regarding clinical relevance and statistical significance when interpreting these results.

Sample data

The data used for this example were obtained from a large study investigating the involvement of the cervical spine in patients with TMD. Details regarding this study are described elsewhere (Armijo-Olivo, 2010, Armijo-Olivo et al., 2010b). The general description of the sample is as follows.

Results

Mean differences between groups and their 95% confidence intervals for the variables of interest, as well as values for clinical relevance based on different methods (i.e., effect size and MID), are presented in Tables 1, 2, and 3.
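One way to read a mean difference, its 95% confidence interval, and a MID together is sketched below in Python. This is only an illustrative decision rule under stated assumptions, not the procedure used in the study: a result is treated as statistically significant when the interval excludes zero and as clinically relevant when the absolute mean difference reaches the MID. The function name `classify_result` and all numbers are hypothetical.

```python
def classify_result(mean_diff, ci_low, ci_high, mid):
    """Cross a CI-based significance check with a MID-based relevance check."""
    significant = ci_low > 0 or ci_high < 0   # 95% CI excludes zero
    relevant = abs(mean_diff) >= mid          # difference reaches the MID
    if significant and relevant:
        return "statistically significant and clinically relevant"
    if significant:
        return "statistically significant, not clinically relevant"
    if relevant:
        return "clinically relevant, not statistically significant"
    return "neither"


# Hypothetical results, all judged against a MID of 3.0 units.
examples = [
    (1.0, 0.6, 1.4),    # narrow CI, small difference
    (5.0, 2.0, 8.0),    # CI excludes zero, large difference
    (4.0, -1.0, 9.0),   # wide CI crossing zero, large difference
    (0.5, -2.0, 3.0),   # CI crossing zero, small difference
]
labels = [classify_result(diff, low, high, mid=3.0) for diff, low, high in examples]
for label in labels:
    print(label)
```

In practice the effect size and clinical judgement would be weighed alongside such a rule rather than replaced by it.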

A summary of the specific results of each one of the variables is as follows.

Discussion

This study provides an example of how to evaluate the clinical relevance of research results, using data from a cross-sectional study that compared several outcomes used to evaluate neck musculoskeletal function between patients with TMD and healthy subjects. The results of this study show that it is possible to have statistical significance without having clinical relevance, to have both statistical significance and clinical relevance, to have clinical relevance without having

Conclusion

The evaluation of clinical relevance in clinical research is crucial to simplify the transfer of knowledge from research into practice. Clinicians and researchers need to be aware of the importance of the research results and should abandon the simplistic approach of interpreting results through statistical significance alone. This paper encourages researchers to assess and present the clinical relevance of their research results in addition to the statistical significance analysis. In addition, editors of

Acknowledgements

Alberta Provincial CIHR Training Program in Bone and Joint Health, Canadian Institutes of Health Research, Government of Chile (MECESUP Program), University Catholic of Maule, Physiotherapy Foundation of Canada through an Alberta Research Award and the University of Alberta. Authors would like to thank Martha Funabashi, Larissa Costa, and Anelise Silveira for their constructive feedback.

References (44)

  • J. Lemieux et al. Three methods for minimally important difference: no relationship was found with the net proportion of patients improving. Journal of Clinical Epidemiology (2007)
  • F. Lobbezoo et al. Impaired health status, sleep disorders, and pain in the craniomandibular and cervical spinal regions. European Journal of Pain (2004)
  • B.M. Ogles et al. Clinical significance: history, application, and current practice. Clinical Psychology Review (2001)
  • S. Olejnik et al. Measures of effect size for comparative studies: applications, interpretations, and limitations. Contemporary Educational Psychology (2000)
  • L.E. Olson et al. Reliability of a clinical test for deep cervical flexor endurance. Journal of Manipulative and Physiological Therapeutics (2006)
  • A. Peolsson et al. Age- and sex-specific reference values of a test of neck muscle endurance. Journal of Manipulative and Physiological Therapeutics (2007)
  • S. Armijo-Olivo. Relationship between cervical musculoskeletal impairments and temporomandibular disorders: clinical and electromyographic variables. Faculty of Rehabilitation Medicine (2010)
  • Armijo-Olivo S, Rappoport K, Fuentes J, Gadotti IC, Major P, Warren S, et al. Head and cervical posture in patients...
  • Armijo-Olivo S, Silvestre R, Fuentes J, da Costa BR, Gadotti IC, Warren S, et al. Electromyographic activity of the...
  • Armijo-Olivo S, Silvestre R, Fuentes J, da Costa BR, Warren S, Major P, et al. Patients with temporomandibular...
  • S.A. Armijo-Olivo et al. The association between neck disability and jaw disability. Journal of Oral Rehabilitation (2010)
  • J.L. Callahan et al. Making subjective judgments in quantitative studies: the importance of using effect sizes and confidence intervals. Human Resource Development Quarterly (2006)