Abstract
BACKGROUND: Auscultation is a fundamental part of the physical examination, but its utility has been questioned due to the low inter-rater concordance. We therefore sought to evaluate the concordance of the discrimination of lung sound recordings between experienced physiotherapists.
METHODS: Lung sound recordings were selected and validated by an expert panel when Fleiss κ concordance was > 0.75. Eleven recordings were played for subject recognition using a portable computer in their workplace. Results were analyzed using Fleiss κ when looking for concordance between physiotherapists. Univariate regression was performed to determine if there was an association with clinical training, years of experience, academic accomplishment, or university affiliation.
RESULTS: Sixty-nine physiotherapists with a median of 4 years of working experience (interquartile range 2–6 y) completed the study. There was moderate concordance (κ = 0.562; 95% CI 0.462–0.605) for overall lung sound recording discrimination. For continuous and noncontinuous lung sound recordings, discrimination concordance was substantial (κ = 0.63 and κ = 0.76, respectively). A bivariate analysis revealed that years of experience presented an inverse association with stridor recognition.
CONCLUSIONS: Concordance between physiotherapists in discriminating recorded lung sounds was moderate. The ability to recognize stridor was inversely associated with years of work experience.
Introduction
Auscultation of the lung is a noninvasive and fundamental part of the physical examination process. It was first described approximately 200 years ago and has been widely used with only minor changes.1 Auscultation assesses air flow throughout the entire airway evaluating the frequency (ie, pitch) and amplitude (ie, loudness) of different sounds that determine the character of a particular lung sound. By doing so, auscultation provides relevant and current clinical information about the patient quickly and at the bedside.2,3 However, its utility has been questioned due to poor inter-provider reliability.3–7 Recognition of lung sounds is influenced by years of experience and mode of training, but also by the lack of consensus around nomenclature for lung sounds.8 Many investigators have worked to reach a common vocabulary for respiratory sounds,9–12 and the emergence of computerized lung sound analysis, which provides for a more precise description of respiratory sounds, has resulted in progress in this area.13,14
Some studies16–18 found fair to moderate inter-observer agreement between health professionals (eg, physicians, nurses, physiotherapists) for the recognition of lung sounds. However, these studies were designed to test clinical scores that were limited to wheezes as the only lung sound. In addition, results could be biased because each subject's experience in such a study design was different and could be influenced by confounding factors such as environmental noise, patient anxiety, or collaboration. Other authors have attempted to provide a more objective experience using recorded videos from children and infants, but the inter-observer variation was large, leading the authors to conclude that there was a need for more objective measures.19,20 It is vital to remain cognizant of this lack of concordance between health professionals in our workplace, given that auscultation findings are routinely communicated between providers and are used to diagnose diseases and define specific therapies.21,22 We performed this study to evaluate the degree of concordance between physiotherapists in recognizing standardized recorded lung sounds in an unbiased and controlled setting.
QUICK LOOK
Current knowledge
Auscultation of the respiratory system is universally used for diagnosis, but inter-rater concordance on lung sound recognition among health professionals is low, both with patients and with lung sound recordings. This poor agreement has been attributed to differences in training modality and amount of experience.
What this paper contributes to our knowledge
The reported concordance of physiotherapists on lung sound recognition was moderate. No subject variables were associated with better lung sound recognition; the only association found was for stridor, whose recognition was inversely related to clinical experience. Better methods are needed to test concordance of real-world lung sounds between health professionals and to develop tools to improve training in lung auscultation.
Methods
Subjects
Subjects were all physiotherapists who work closely with physicians in diverse settings. These professionals have responsibilities similar to those of respiratory therapists in other countries, but in Chile they are titled physiotherapists. All subjects were licensed to work in pediatric respiratory care and performed their duties at least 3 times a week. They were recruited from pediatric ICUs, pediatric in-patient units, chronic and rehabilitation care facilities, emergency departments, and cardiopulmonary rescue units, all located in 3 central areas in Chile (ie, Metropolitan, Valparaiso, and Coquimbo regions). Physiotherapists with visual or hearing impairments were excluded from the study. Each subject provided informed consent to participate in this study, which was approved by the ethics committee of the Universidad Católica de Chile, Santiago, Chile. Physiotherapists were classified according to their clinical training as general, respiratory, pediatric respiratory, and intensive care pediatric respiratory physiotherapists. They were also categorized according to years of postgraduate clinical experience. Their academic titles were separated into 4 categories: master's degree, postgraduate diploma, postgraduate courses, or no postgraduate academic achievements. Places where they practiced were classified as affiliated with a university facility or not affiliated. Finally, subjects were asked to select their level of familiarity with the recently published lung sound nomenclature.12
Lung Sound Recordings
Recorded lung sounds used in this study were selected from a bank of sounds previously recorded from children with common diseases (eg, bronchiolitis, pneumonia) at the Catholic University of Chile Hospital in Santiago. The respiratory sounds were recorded from the posterior right lower lobe using contact sensors (EMT25C, Siemens-Elema, Solna, Sweden) while patients breathed through a mouthpiece connected to a pneumotachograph (Validyne, Northridge, California). Sound signals were filtered and amplified, and a fast Fourier analysis was applied. The signals were digitized (DT 2831-G, Data Translation, Marlboro, Massachusetts) at a rate of 10,240 samples/s with a resolution of 12 bits. Customized software was used for data acquisition and analysis (RALE, Respiration Acoustic Laboratory Environment, Manitoba, Canada).23,24 This computer program allowed the identification of wheezes, rhonchus, fine and coarse crackles, stridor, and normal lung sounds according to their acoustic characteristics. Sixteen recordings from pediatric patients were selected according to the duration and quality of the recording and the presence of adventitious sounds. An expert panel of 5 pediatric respiratory physiotherapists performed an unbiased assessment of the recordings. The professionals were all affiliated with a university as a teaching professor, with > 10 years of clinical experience and at least 5 published manuscripts over the past 5 years. Their responses to the 16 recorded sounds were analyzed using multirater Fleiss κ as a measure of concordance (κ = 0.6, 95% CI 0.59–0.68). Eleven recorded lung sounds that reached κ > 0.75 concordance (ie, substantial agreement) were included in the study and named as follows: fine crackles 1 and 2, coarse crackles 1 and 2, wheezing 1 and 2, rhonchus 1 and 2, stridor 1, and normal lung sounds 1 and 2.
The 11 lung sound recordings were played for the physiotherapists in a quiet room, and subjects were allowed to listen to each recording up to 5 times before selecting 1 of 6 possible answers. Before listening to any of the recordings, subjects were briefly coached on the study and how to choose their responses. They were also instructed as to the time of the respiratory cycle at the beginning of each of the 11 recordings. All recordings were played on a portable computer using Windows Media Player software and Philips SHL3060WT speakers. No additional clinical information was provided to any of the subjects until after the end of the study.
Statistical Analysis
All demographic data are presented as n (%) values or as median (interquartile range) values. The intra-observer reliability of recorded lung sounds was examined using Cohen's κ.25 The inter-observer concordance of recorded lung sounds was analyzed using Fleiss κ. Concordance results were categorized as follows: slight (0.01–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80), and perfect (0.81–1.0), according to the classification described by Landis and Koch.24 After calculating agreement for each individual recorded lung sound, Fleiss κ was calculated for continuous sounds combined (ie, wheeze, rhonchus, and stridor) and for noncontinuous sounds combined (ie, fine and coarse crackles). For association analysis, univariate logistic regression models were performed using clinical experience, academic exposure, and place of employment as predictors. Results are reported as odds ratios with 95% confidence intervals.
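The agreement statistic and the categorization described above can be illustrated with a minimal Python sketch. It is not the Stata analysis used in the study. The function names `fleiss_kappa` and `agreement_band` and the toy ratings are hypothetical, and the label returned for κ ≤ 0 is an assumption, because the categories listed in the text start at 0.01.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss kappa for items that were each rated by the same number of raters.

    `ratings` holds one inner list of category labels per recording.
    At least 2 raters per recording are required.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean share of agreeing rater pairs per item.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement, from the overall share of each category.
    shares = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(s ** 2 for s in shares)
    if p_expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1 - p_expected)


def agreement_band(kappa):
    """Map kappa to the categories listed in the text."""
    if kappa <= 0:
        return "none"  # assumption: the text gives no label below 0.01
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "perfect"


# Toy data (not from the study): 3 recordings, each labeled by 3 raters.
toy = [
    ["wheeze", "wheeze", "wheeze"],
    ["stridor", "stridor", "wheeze"],
    ["normal", "normal", "normal"],
]
kappa = fleiss_kappa(toy)
print(round(kappa, 3), agreement_band(kappa))
```

With real data, each inner list would hold the answers that all subjects gave for one recording.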
Sample size was calculated as sample size = a × v × 5, as suggested by Duffau,25 with a being the number of possible dichotomous answers (correct or wrong) and v being the number of independent variables (ie, 6 lung sounds: normal breath sound, wheezing, rhonchus, fine crackles, coarse crackles, and stridor). The analysis revealed that a total of 60 subjects was needed, and an additional 15% was considered necessary for anticipated loss of data, resulting in a total of 69 subjects. The statistical package used was Stata 14.2 SE (StataCorp, College Station, Texas).
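The sample size arithmetic can be checked with a short sketch. It reads the formula as a × v × 5, an interpretation that reproduces the stated 60 subjects. The function name `required_sample` and its parameter names are hypothetical, and rounding the 15% allowance up to whole subjects is an assumption.

```python
import math


def required_sample(answers, variables, per_cell=5, loss_pct=15):
    """Return (base size, size after adding the allowance for data loss)."""
    base = answers * variables * per_cell
    # Assumption: the allowance is rounded up to whole subjects.
    extra = math.ceil(base * loss_pct / 100)
    return base, base + extra


# 2 possible answers (correct or wrong) and 6 lung sound categories.
base, total = required_sample(answers=2, variables=6)
print(base, total)  # 60 69
```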
Results
A total of 69 physiotherapists were enrolled over a period of 3 months. The largest group of subjects was pediatric respiratory physiotherapists (39%), followed by intensive care pediatric respiratory physiotherapists (30%). They had a median of 4 years of experience (interquartile range 2–6 y). In terms of academic achievement, 12% had a postgraduate master's degree and 48% had a postgraduate diploma; 39 subjects (56%) were affiliated with a university. Familiarity with the current lung sound nomenclature was reported by 23 (33%) subjects (Table 1).
Table 1. Subject Characteristics
Concordance among subjects was κ = 0.562 (95% CI 0.462–0.605) for all lung sound recordings together. For individual lung sound recordings, concordance was considered perfect for normal breath sound 1, normal breath sound 2, fine crackles 2, and coarse crackles 2; substantial for fine crackles 1, rhonchus 1, rhonchus 2, and stridor; and moderate for coarse crackles 1, wheezing 1, and wheezing 2. When recordings of the same lung sound were pooled (eg, all recordings of wheezing sounds), agreement was considered perfect for normal lung sounds (n = 2) and fine crackles (n = 2); substantial for coarse crackles (n = 2), rhonchus (n = 2), and stridor (n = 1); and moderate for wheezing (n = 2). Agreement for noncontinuous sounds (n = 4) and for continuous sounds (n = 5) was substantial (Table 2).
Table 2. Concordance by Lung Sound Recordings
The univariate logistic regression analysis for the recognition of individual lung sound recordings showed no significant association with subject characteristics (ie, clinical training, years of experience, academic achievements, university affiliation, and current knowledge of lung sound nomenclature) for normal lung sounds, fine crackles, coarse crackles, wheezing, or rhonchus (data not shown). An inverse association was found between recognition of stridor and years of experience (odds ratio = 0.88, 95% CI 0.81–0.96, P = .002) and between recognition of stridor and being a respiratory physiotherapist (odds ratio = 0.3, 95% CI 0.09–0.77, P = .01) (Table 3). A post hoc bivariate analysis that included both years of experience and academic title found that stridor recognition remained inversely associated with years of experience (odds ratio 0.89, 95% CI 0.82–0.98, P = .01).
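To give a sense of scale for an odds ratio below 1, the following sketch compounds a per-unit odds ratio over several units. It assumes that the reported odds ratio of 0.88 applies to each additional year of experience and that the log-odds change linearly with years, as in a standard logistic model; neither detail is stated explicitly in the text. The function name `odds_ratio_over` is hypothetical.

```python
import math


def odds_ratio_over(or_per_unit, units):
    """Odds ratio implied over `units` steps when log-odds change linearly."""
    return math.exp(units * math.log(or_per_unit))


# Assumption: 0.88 is the odds ratio per additional year of experience.
for years in (1, 5, 10):
    print(years, round(odds_ratio_over(0.88, years), 2))
```

Under these assumptions, a difference of 5 years of experience corresponds to roughly half the odds of recognizing stridor.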
Table 3. Univariate Regression Analysis for Stridor Recognition
Discussion
This study reports satisfactory results in discrimination concordance of recorded lung sounds between physiotherapists. This stands in contrast to previous reports that found high levels of discordance between health professionals.4,6–8,16,17 We found these results encouraging because in our health system important decisions are often made solely on the basis of history and physical exam findings. The good performance of the physiotherapists in this study can be attributed to a variety of reasons. First, this study was designed to provide each subject with an identical experience to eliminate potential bias, so we applied stringent criteria for the selection of lung sound recordings. We also intentionally chose single adventitious sounds and not a combination of sounds in any of the administered recordings. In addition, the recordings were validated by an expert panel that demonstrated substantial concordance among the selected lung sounds. To our knowledge, this is the first report assessing discrimination concordance between professionals from a Spanish-speaking country where implementation of the lung sounds nomenclature was confusing for clinicians. For noncontinuous lung sounds, Cruz26 proposed using the word crepitaciones (ie, crackles) to replace the old and ambiguous estertores (ie, rales). Despite such efforts to teach a standard nomenclature, most of the study subjects were not formally aware of the updated nomenclature. Nevertheless, their good degree of concordance suggests that they were trained on the basis of a common language.
Some authors have demonstrated good inter-observer reliability for discrimination between lung sounds. Gajdos et al17 published substantial inter-observer agreement (κ = 0.73) between health professionals in a study involving 180 infants with a known diagnosis of bronchiolitis. In that study, however, subjects were asked only to evaluate the presence or absence of wheezing as part of a respiratory score, and not to recognize the other lung sounds.
Margolis et al15 reported inter-observer agreement between physicians in a study done in infants suspected of lower respiratory tract illness. Subjects evaluated 56 out of 377 infants, reaching substantial agreement for wheezing (κ = 0.7), but agreement was poor for adventitious sounds (κ = 0.3). Both studies treated wheezing as a broad term, and subjects did not follow a common language or a given nomenclature, which may explain the higher degree of concordance for wheezing. Further, studies in children are challenging. If they are auscultated by > 2 subjects, their respiratory status may change while being evaluated sequentially. Auscultators will be challenged by many uncontrolled variables such as crying, coughing, shallow breathing, and environmental noise. To improve real-life auscultation assessment, studies assessing concordance in pediatric auscultation should consider the use of equipment that allows 4–6 observers to listen to a patient simultaneously.
To simulate a typical clinical experience, some authors have performed lung sound recognition using video recordings. In such studies, however, the degree of concordance was not any better. When assessing for wheezing, Bekhof et al19 reported only fair inter-observer variation (κ = 0.36) in 27 dyspneic children filmed in the emergency room. Jensen et al18 reported inconsistent inter-observer reliability when assessing for wheezing, crackles, and stridor (κ = 0.34–0.54) in 30 premature infants at 36–40 weeks postmenstrual age. Melbye et al27 reported poor to fair agreement (κ < 0.4) for wheezing, crackles, and rhonchi using 20 videos from 10 adults and 10 children.
When we analyzed the results by grouped lung sound recordings, concordance for each group tended to fall between the values obtained for its individual recordings. This finding contrasts with the results reported by Melbye et al,27 who found that concordance improved when they grouped the sounds into broader categories. Further analysis of grouped lung sound recordings suggests that fine crackles (κ = 0.84) are more easily recognized than coarse crackles (κ = 0.69). This has been previously described and explained by the different waveforms of crackles.29 Kiyokawa et al28 tested recorded lung sounds in an optimized environment, adding artificial sounds to normal breath sounds. They postulated that the frequency range of breath sounds is wide and that breath sounds are therefore more easily confused with coarse crackles than with fine crackles. They also suggested that discrimination of lung sounds is more difficult when other adventitious sounds with a broad spectrum of frequencies are added. This could explain the poor performance of auscultation for the diagnosis of pneumonia.29
Our study found better concordance for noncontinuous sounds (κ = 0.76) than for continuous sounds (κ = 0.63). We attribute this to the fact that there is still no agreement in the literature about using the terms “rhonchus” or “low-pitched wheezing” when analyzing continuous sounds. This is relevant when we consider that asthma is often diagnosed in children on the basis of recurrent wheezes or rhonchi. Looking to obtain an objective measure, Puder et al13 performed computerized wheeze detection in 120 sleeping infants. They reported that this was a feasible and noninvasive way to detect wheezing. Adoption of a standardized nomenclature by all health professionals would likely improve the ability of care providers to discriminate between lung sounds.
As previously reported,7,17 we did not find that years of experience improved lung sound discrimination (data not shown); the only exception was stridor, which was better recognized by younger physiotherapists than by older physiotherapists. This is interesting because our study sample was mainly composed of young professionals who trained recently using similar nomenclature. We hypothesize that modern training elements (eg, lung sound recordings, video recordings, group sessions for clinical auscultation) may help improve lung sound discrimination.
There are several limitations to our study. Most selected subjects were young professionals with less than a decade of experience; in addition, subject hearing capacity was not formally tested. The lung sound recordings presented the lung sounds in an ideal fashion compared to real-world settings. The absence of other clinical information, such as respiratory symptoms or signs on physical exam, could have negatively influenced the ability to identify lung sounds from the recordings. However, the primary focus of this study was lung sound discrimination; thus, the lack of clinical information could be considered a strength of the study. These data should not be extrapolated to bedside auscultation, but they may be used to test learning skills among different health professional trainees. In addition, this study presented a multiple-choice test to the subjects, which could have biased the concordance in comparison with an open-ended test.
Conclusions
The discrimination concordance of lung sound recordings between physiotherapists studied in Chile was acceptable. This concordance by lung sound was considered perfect for normal breath sounds and fine crackles; substantial for coarse crackles, rhonchus, and stridor; and moderate for wheezing. Concordance for noncontinuous and continuous sounds grouped together was substantial. Years of experience was inversely related to recognition of stridor only.
There is a need for further studies to investigate the discrimination of lung sounds in live auscultation so that the skill can be better taught to trainees. In addition, focus needs to be placed on the new standardized nomenclature for lung sounds in order to improve providers' ability to communicate.
Acknowledgment
We thank Albert Faro MD from the Cystic Fibrosis Foundation, Bethesda, Maryland, for his review of this manuscript.
Footnotes
- Correspondence: Pablo José Bertrand Navarrete MD, División de Pediatría, Unidad de Enfermedades Respiratorias Pediátricas, Facultad de Medicina, Pontificia Universidad Católica de Chile, Diagonal Paraguay 241, Santiago, Chile E-8331010. E-mail: pbertrand@med.puc.cl.
- Copyright © 2020 by Daedalus Enterprises