Clinical relevance vs. statistical significance: Using neck outcomes in patients with temporomandibular disorders as an example

doi:10.1016/j.math.2011.05.006

Manual Therapy

Volume 16, Issue 6, December 2011, Pages 563-572

https://doi.org/10.1016/j.math.2011.05.006

Abstract

Statistical significance has been used extensively to evaluate the results of research studies. Nevertheless, it offers only limited information to clinicians. The assessment of clinical relevance can facilitate the interpretation of the research results into clinical practice. The objective of this study was to explore different methods to evaluate the clinical relevance of the results using a cross-sectional study as an example comparing different neck outcomes between subjects with temporomandibular disorders and healthy controls. Subjects were compared for head and cervical posture, maximal cervical muscle strength, endurance of the cervical flexor and extensor muscles, and electromyographic activity of the cervical flexor muscles during the CranioCervical Flexion Test (CCFT). The evaluation of clinical relevance of the results was performed based on the effect size (ES), minimal important difference (MID), and clinical judgement. The results of this study show that it is possible to have statistical significance without having clinical relevance, to have both statistical significance and clinical relevance, to have clinical relevance without having statistical significance, or to have neither statistical significance nor clinical relevance. The evaluation of clinical relevance in clinical research is crucial to simplify the transfer of knowledge from research into practice. Clinical researchers should present the clinical relevance of their results.

Introduction

Most research, in health and in other fields, has relied on statistical significance to demonstrate the effectiveness of an intervention, differences among groups in some variables of interest, or associations between variables. Statistical significance is based on hypothesis testing (Kirk, 1996). The null hypothesis states that there is no difference between groups or that an independent variable does not have an effect on the dependent variable. The alternative hypothesis states that groups are different or that an independent variable does have an effect on the dependent variable. After the research is conducted, the statistical analysis provides a “p” value, which indicates the strength of the evidence against the null hypothesis. Thus, statistical significance analysis provides only a dichotomous answer: the result either is or is not statistically significant (in other words, there either is or is not enough evidence against the null hypothesis) (Sterne and Smith, 2001). Therefore, statistical significance does not indicate how important the result of the study is (Thompson, 1999, Ogles et al., 2001, Millis, 2003).

Statistical significance can also provide misleading results. A statistical difference between groups could be found if the sample size is large and the intersubject variability is low, even though the difference between groups is too small to be considered clinically important (Millis, 2003). Some authors have argued that tests of statistical significance are not generally useful and that confidence intervals (CIs) and measures of effect size should instead be the main focus of research findings, since they provide more complete information regarding the magnitude of the association between variables, changes after a treatment, or differences between groups (Olejnik and Algina, 2000, Sterne and Smith, 2001). For example, CIs contain all of the information provided by a significance test in addition to a range of values within which the true difference is likely to lie. This information helps researchers and clinicians understand the “magnitude of the effect” and offers a richer source of information than the simple yes/no dichotomy of hypothesis testing (McNeely and Warren, 2006).
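The dependence of the “p” value on sample size can be illustrated with a minimal Python sketch. It is not taken from the study: the numbers are hypothetical, the function name `two_group_summary` is invented for illustration, and a normal approximation with equal group sizes and a common standard deviation is assumed.

```python
import math

def two_group_summary(mean_diff, sd, n_per_group, z_crit=1.96):
    """Normal-approximation test and 95% CI for the difference between two
    equal-sized groups that share a common standard deviation (assumption)."""
    se = sd * math.sqrt(2.0 / n_per_group)        # standard error of the difference
    z = mean_diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))        # two-sided p-value
    ci = (mean_diff - z_crit * se, mean_diff + z_crit * se)
    d = mean_diff / sd                            # standardized effect size
    return {"p": p, "ci": ci, "d": d}

# Hypothetical data: the same 1-unit difference (SD = 10) at two sample sizes.
small = two_group_summary(mean_diff=1.0, sd=10.0, n_per_group=20)
large = two_group_summary(mean_diff=1.0, sd=10.0, n_per_group=2000)

for label, r in (("n=20 per group", small), ("n=2000 per group", large)):
    print(f"{label}: p={r['p']:.4f}, "
          f"95% CI=({r['ci'][0]:.2f}, {r['ci'][1]:.2f}), d={r['d']:.2f}")
```

Under these assumptions the standardized effect size is identical in both scenarios, while only the larger sample yields p < 0.05 and a CI that excludes zero.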

A result can be clinically relevant but might be neglected if statistical significance was not attained because of small sample sizes and high intersubject variability. Clinical relevance (also called clinical significance) assessment indicates whether the results are meaningful or not. In this way the evaluation of clinical relevance can provide more interesting results for health care clinicians as well as clients receiving care, facilitating the transfer of knowledge into clinical practice (Musselman, 2007). Some authors in the areas of education (Kirk, 1996, Carnine, 1997) as well as health research (Millis, 2003, Musselman, 2007) have urged that research findings be reported in language that is familiar to practitioners. With the advancement of health care and the introduction of evidence-based practice, researchers need to provide information about their research that can be used in clinical practice and that demonstrates an impact on health care and clinical decisions. “p” values alone are insufficient to meet these requirements because they provide only limited information; clinical researchers therefore need to present the clinical relevance of their results to help busy clinicians with interpretation.
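Because statistical significance and clinical relevance are separate questions, a result can be labeled on both axes independently. The following Python sketch shows one way to do so. The decision rules are assumptions made for illustration (significance taken as a 95% CI that excludes zero, relevance taken as a point estimate that reaches a chosen threshold), and the function name `classify_result` and all numbers are hypothetical.

```python
def classify_result(mean_diff, ci_low, ci_high, mid):
    """Label a between-group difference on two independent axes.

    Assumed rules (not from the source): 'significant' means the 95% CI
    excludes zero; 'relevant' means |mean_diff| reaches the threshold `mid`.
    """
    statistically_significant = ci_low > 0 or ci_high < 0
    clinically_relevant = abs(mean_diff) >= mid
    return statistically_significant, clinically_relevant

# Hypothetical numbers chosen only to reach each of the four combinations.
examples = {
    "significant, not relevant": classify_result(1.0, 0.4, 1.6, mid=5.0),
    "significant and relevant": classify_result(8.0, 4.0, 12.0, mid=5.0),
    "relevant, not significant": classify_result(6.0, -1.0, 13.0, mid=5.0),
    "neither": classify_result(1.0, -3.0, 5.0, mid=5.0),
}

for label, flags in examples.items():
    print(f"{label}: significant={flags[0]}, relevant={flags[1]}")
```

A stricter rule could require the whole CI, rather than the point estimate, to lie beyond the threshold; which rule is appropriate remains a matter of clinical judgement.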

Some methods to determine clinical relevance have been created in order to provide clinicians, clients and policy makers with standards of meaningful change. The most commonly used methods to determine clinical relevance are “distribution-based methods” and “anchor-based methods”. Distribution-based methods are based on the statistical distribution and the psychometric properties of the outcomes. The calculation of the effect size, the minimal important difference (MID), and the standard error of measurement are examples of distribution-based methods to evaluate clinical relevance. Anchor-based methods involve the clients’ perspective in the assessment of clinical relevance and are used prospectively.
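The distribution-based quantities named above can be sketched in Python as follows. This excerpt does not show which formulas the authors used, so common textbook definitions are assumed here: Cohen's d with a pooled standard deviation, the standard error of measurement as SD × √(1 − reliability), and half a standard deviation as one frequently cited distribution-based threshold. The function names and all numbers are hypothetical.

```python
import math

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), e.g. with a test-retest ICC."""
    return sd * math.sqrt(1.0 - reliability)

def half_sd_threshold(sd):
    """Half a standard deviation, one assumed distribution-based threshold."""
    return 0.5 * sd

# Hypothetical group summaries (not data from the study).
d = cohens_d(mean_a=50.0, sd_a=10.0, n_a=30, mean_b=45.0, sd_b=10.0, n_b=30)
sem = standard_error_of_measurement(sd=10.0, reliability=0.91)
mid = half_sd_threshold(sd=10.0)

print(f"Cohen's d = {d:.2f}, SEM = {sem:.2f}, half-SD threshold = {mid:.2f}")
```

These quantities depend only on the sample distribution and the reliability of the measure; they do not incorporate the clients' own judgement of what counts as a meaningful difference, which is what anchor-based methods add.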

Clinical relevance is generally evaluated as a result of an intervention; however, clinical relevance can also be assessed in other types of research such as cross-sectional studies. In these studies, patients and controls are assessed on certain variables of interest and it is important to know if the differences found between groups are in fact clinically meaningful. The interpretation of a score on a certain outcome in a cross-sectional study is performed by comparing the values obtained with those found in a reference population. Unfortunately, most of the outcomes used in clinical research lack “normative or reference values” to establish “normality” of health status. Thus the interpretation of clinical relevance for results in this type of research is uncertain and difficult for the general practitioner. Therefore, other methods for assessing clinical relevance in cross-sectional studies need to be used in the absence of normative values for the outcomes of interest. In addition, information regarding clinical relevance for neck outcomes is lacking and clinicians have difficulty interpreting results from research studies. Thus, the objectives of this paper were (1) to explore and analyze different methods to evaluate the clinical relevance of the results using a cross-sectional study as an example comparing different neck outcomes between subjects with temporomandibular disorders (TMD) and healthy controls and (2) to discuss different issues regarding clinical relevance and statistical significance when interpreting these results.

Sample data

The data used for this example were obtained from a large study investigating the involvement of the cervical spine in patients with TMD. Details regarding this study are described elsewhere (Armijo-Olivo, 2010, Armijo-Olivo et al., 2010b). The general description of the sample is as follows.

Results

Mean differences and 95% confidence intervals between groups in the variables of interest, as well as values for clinical relevance based on different methods (i.e., effect size and MID), are described in Tables 1, 2, and 3.

A summary of the specific results for each variable is as follows.

Discussion

This study shows an example of how to evaluate the clinical relevance of research results using data obtained from a cross-sectional study comparing several outcomes used for evaluating neck musculoskeletal functioning in patients with TMD when compared with healthy subjects. The results of this study show that it is possible to have statistical significance without having clinical relevance, to have both statistical significance and clinical relevance, to have clinical relevance without having

Conclusion

The evaluation of clinical relevance in clinical research is crucial to simplify the transfer of knowledge from research into practice. Clinicians and researchers need to be aware of the importance of the research results and should abandon the simplistic approach of interpreting statistical significance alone. This paper encourages researchers to assess and present the clinical relevance of their research results in addition to the statistical significance analysis. In addition, editors of

Acknowledgements

Alberta Provincial CIHR Training Program in Bone and Joint Health, Canadian Institutes of Health Research, Government of Chile (MECESUP Program), University Catholic of Maule, Physiotherapy Foundation of Canada through an Alberta Research Award and the University of Alberta. The authors would like to thank Martha Funabashi, Larissa Costa, and Anelise Silveira for their constructive feedback.

References (44)

S. Armijo-Olivo et al. Reduced endurance of the cervical flexor muscles in patients with concurrent temporomandibular disorders and neck disability. Manual Therapy (2010)
S.L. Armijo-Olivo et al. Is maximal strength of the cervical flexor muscles reduced in patients with temporomandibular disorders? Archives of Physical Medicine and Rehabilitation (2010)
A. Barber. Upper cervical spine flexor muscles: age related performance in asymptomatic women. Australian Journal of Physiotherapy (1994)
L. Blizzard et al. Validity of a measure of the frequency of headaches with overt neck involvement, and reliability of measurement of cervical spine anthropometric and muscle performance factors. Archives of Physical Medicine and Rehabilitation (2000)
M.S. Cooke et al. The reproducibility of natural head posture: a methodological study. American Journal of Orthodontics and Dentofacial Orthopedics (1988)
C. Faulkner et al. The value of RCT evidence depends on the quality of statistical analysis. Behaviour Research and Therapy (2008)
K. Grimmer. Measuring the endurance capacity of the cervical short flexor muscle group. Australian Journal of Physiotherapy (1994)
G.H. Guyatt et al. Methods to explain the clinical significance of health status measures. Mayo Clinic Proceedings (2002)
R.E. Kirk. Effect magnitude: a different focus. Journal of Statistical Planning and Inference (2007)
H. Lee et al. Neck muscle endurance, self-report, and range of motion data from subjects with treated and untreated neck pain. Journal of Manipulative and Physiological Therapeutics (2005)
J. Lemieux et al. Three methods for minimally important difference: no relationship was found with the net proportion of patients improving. Journal of Clinical Epidemiology (2007)
F. Lobbezoo et al. Impaired health status, sleep disorders, and pain in the craniomandibular and cervical spinal regions. European Journal of Pain (2004)
B.M. Ogles et al. Clinical significance: history, application, and current practice. Clinical Psychology Review (2001)
S. Olejnik et al. Measures of effect size for comparative studies: applications, interpretations, and limitations. Contemporary Educational Psychology (2000)
L.E. Olson et al. Reliability of a clinical test for deep cervical flexor endurance. Journal of Manipulative and Physiological Therapeutics (2006)
A. Peolsson et al. Age- and sex-specific reference values of a test of neck muscle endurance. Journal of Manipulative and Physiological Therapeutics (2007)
S. Armijo-Olivo. Relationship between cervical musculoskeletal impairments and temporomandibular disorders: clinical and electromyographic variables. Faculty of Rehabilitation Medicine (2010)
Armijo-Olivo S, Rappoport K, Fuentes J, Gadotti IC, Major P, Warren S, et al. Head and cervical posture in patients...
Armijo-Olivo S, Silvestre R, Fuentes J, da Costa BR, Gadotti IC, Warren S, et al. Electromyographic activity of the...
Armijo-Olivo S, Silvestre R, Fuentes J, da Costa BR, Warren S, Major P, et al. Patients with temporomandibular...
S.A. Armijo-Olivo et al. The association between neck disability and jaw disability. Journal of Oral Rehabilitation (2010)
J.L. Callahan et al. Making subjective judgments in quantitative studies: the importance of using effect sizes and confidence intervals. Human Resource Development Quarterly (2006)
