Abstract
BACKGROUND: Spirometry is an apparently simple test, yet the recommended criteria for acceptability and reproducibility can be difficult to fulfill. This study aimed (1) to prospectively assess the number of tests that meet the American Thoracic Society/European Respiratory Society (ATS/ERS) 2005 acceptability and repeatability criteria in the routine practice of an experienced technician at a referral hospital's lung function laboratory, (2) to identify the most common errors, and (3) to explore patient characteristics possibly associated with failure to meet standards.
METHODS: We prospectively evaluated 257 consecutive spirometries supervised by the same technician, who gave priority to achieving a minimum of 3 correct maneuvers within a maximum of 8 attempts. We recorded FVC, FEV1, expiratory time (TE), back-extrapolated volume (VE), end-of-test volume (VEOT), number of maneuvers with and without errors, and errors (VE > 0.15 L or 5% of FVC, TE < 6 s, and VEOT ≥ 0.025 L for ≥ 1 s).
RESULTS: Two hundred fifteen spirometries (83.7%, 95% CI 78.6–87.7%) met the ATS/ERS 2005 criteria. Acceptability criteria were met in 73.9% (95% CI 71.2–76.3%) of the maneuvers and repeatability criteria in 90.7% (95% CI 86.5–93.6%) of the tests. A mean ± SD of 4.5 ± 1.9 maneuvers was performed per subject, of which 3.3 ± 1.4 were acceptable. TE and VEOT errors were the most common.
CONCLUSIONS: About 16% of the subjects failed to fulfill all the ATS/ERS 2005 criteria for spirometry even though they were coached by a qualified and regularly trained technician in a hospital lung function laboratory. The fact that the ATS/ERS 2005 criteria cannot be met by all patients in optimal technical conditions should be further considered and explored.
Introduction
Spirometry is used to diagnose and follow chronic respiratory diseases such as asthma and COPD, to assess lung function in medicolegal cases and before thoracic surgery, and to screen for lung disease in smokers and patients with respiratory symptoms. Successful use of this test requires accurate equipment, a thoroughly trained technician who explains the test and coaches the patient, the understanding and cooperation of the person being tested, and physicians well acquainted with the interpretation of the results. Technically, to ensure the clinical value of the information obtained from spirometry, the criteria for the spirometric variables internationally accepted are at present those proposed jointly by the American Thoracic Society and European Respiratory Society (ATS/ERS) in 2005 (Table 1).1 These criteria define the limits for acceptability and repeatability of the spirometric variables if they are to have their full value in both the clinical management of patients and epidemiologic or pharmacologic studies. Although most spirometric testing is done in the lung function laboratories (LFLs) of hospitals and health-care centers for patients with respiratory problems, the technical information that provides the basis for recommendations has been obtained mainly from large epidemiologic surveys. Thus, the thresholds for acceptability and reproducibility were set near the 90th percentile observed in a large population-based survey.2 The repeatability thresholds were validated in a retrospective chart review of 18,000 consecutive adult patients3 and later used to evaluate the quality of spirometry in another large study,4 in which technicians' overall success rates for achieving quality spirometry ranged from < 70% to 88%. The authors concluded that a 90% success rate was achievable in adults and recommended remedial attention to the training for technicians when this level of success is not achieved in 8 of every 10 subjects tested. 
One recent hospital-based study looked directly at adherence to criteria in 2 LFLs, observing successful compliance rates of ∼60% at the start of study and an improvement of up to 90% in one of the LFLs after a program of feedback and training.5 Thus, despite the efforts to improve technicians' success, it seems that we can expect a failure rate of at least 10%. Such failures are possibly related to patient factors, but we have been unable to find a study attempting to focus on the characteristics of the patient who fails the test, particularly in the clinical setting. More detailed information on failure to meet specific standards in relation to patient characteristics would potentially help identify specific threats to compliance and underline the value of meeting spirometric standards, encouraging new approaches to motivate and train technicians who experience difficulties in clinical LFLs.
The aim of this study was to prospectively assess the number of tests and patients who meet the ATS/ERS 2005 recommendations1 in the routine practice of a highly experienced technician at a referral hospital's LFL, to identify the most common errors that prevent meeting recommendations, and to explore patient characteristics possibly associated with failure to reach these standards.
QUICK LOOK
Current knowledge
Spirometry is a common test used to diagnose and track respiratory dysfunction in disease. The criteria for acceptability and reproducibility are defined by a joint statement of the American Thoracic Society and European Respiratory Society (ATS/ERS). Despite the relative simplicity of spirometry, meeting the acceptability and reproducibility criteria in a routine clinical setting can be difficult.
What this paper contributes to our knowledge
In an ideal setting, despite coaching by a qualified, trained laboratory technician, about 16% of the subjects failed to fulfill all the ATS/ERS 2005 criteria for spirometry. The most frequent errors were seen in expiratory time and end-of-test volume. Further research to minimize these errors is warranted.
Methods
This prospective observational study was carried out in the LFL of a 644-bed teaching and referral hospital. The LFL's work load was ∼8,500 lung function studies/y. From May 1 to 30, 2010, we recorded the spirometry of 257 consecutive patients (≥ 18 y old) who required the test for evaluation of lung capacity (symptomatic patients), screening for functional disorders (in individuals at risk of respiratory disease, such as smokers), or risk assessment before thoracic surgery. Patients were excluded for the following reasons: physical or mental impairment preventing performance of a forced expiration, chest pain, pneumothorax, hemoptysis, frequent unstable angina, recent retinal detachment, tracheostomy, facial hemiparesis, or mouth malformation that impeded mouth closure. Tests done for medicolegal assessment were also excluded. The study was approved by the hospital's ethics committee, which did not require informed consent from patients because only standard spirometric procedures were performed at the request of the attending physician. The data obtained were treated in such a way as to protect confidentiality and preclude patient identification.
Patient variables considered were age, sex, height, and weight. Spirometric data were FVC, FEV1, and errors in expiratory time (TE), back-extrapolated volume (VE), and end-of-test volume (VEOT) (see Table 1). We also recorded the highest FVC and FEV1 achieved on maneuvers with and without errors and the patient's diagnosis and previous experience with spirometry. The spirometric results were expressed as absolute values and as percentages of reference values.6
We used a Datospir-600 spirometer with a meteorologic station incorporated and a Fleisch-type number 3 pneumotachometer (Sibel SA, Barcelona, Spain). The equipment was connected to a computer that provided automatic messages about errors in VE, TE, or VEOT but did not give other information regarding test quality. The equipment was calibrated daily with a 3-L syringe with maneuvers performed at high, medium, and low flows. Before and after the study, 2 healthy volunteers with no respiratory disease performed the tests to check that the spirometer was working properly. The data from these tests were not included.
During the test, subjects were comfortably seated in a chair with a backrest; nose clips were used. The technician directing the test explained the procedure to each subject and encouraged maximum effort and performance on each maneuver, requiring a minimum of 3 and allowing a maximum of 8 maneuvers. To meet the ATS/ERS 2005 recommendations,1 both acceptability and repeatability requirements (see Table 1) had to be satisfied. During the execution of the maneuvers, the technician visually checked the fulfillment of these criteria with no assistance from the computer. If any errors were spotted, the subject was retrained to improve performance. The attendant technician had > 30 y of experience in research and clinical pulmonary function testing, performed ∼4,100 pulmonary function studies yearly, including spirometry, and is an author (JG) of the study.
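The acceptability and repeatability rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the software or procedure used in the study: the class and function names are hypothetical, the VE limit is assumed to be the greater of 0.15 L and 5% of FVC, the VEOT field is read as the volume change during the final second of expiration, and repeatability is taken as a difference of at most 0.15 L between the 2 largest FVC values and between the 2 largest FEV1 values among acceptable maneuvers.

```python
from dataclasses import dataclass

# Thresholds quoted in the text for the ATS/ERS 2005 criteria.
VE_ABS_LIMIT_L = 0.15    # back-extrapolated volume, absolute limit (L)
VE_REL_LIMIT = 0.05      # back-extrapolated volume, fraction of FVC
TE_MIN_S = 6.0           # minimum expiratory time (s)
VEOT_LIMIT_L = 0.025     # volume change tolerated in the final second (L)
REPEATABILITY_L = 0.15   # maximum gap between the 2 largest values (L)
MIN_ACCEPTABLE = 3       # acceptable maneuvers required per test


@dataclass
class Maneuver:
    fvc: float   # forced vital capacity, L
    fev1: float  # forced expiratory volume in 1 s, L
    ve: float    # back-extrapolated volume, L
    te: float    # expiratory time, s
    veot: float  # volume change during the final 1 s, L (assumed reading)


def maneuver_errors(m):
    """Return the list of acceptability errors for one maneuver."""
    errors = []
    # Assumption: the limit is whichever of 0.15 L or 5% of FVC is greater.
    if m.ve > max(VE_ABS_LIMIT_L, VE_REL_LIMIT * m.fvc):
        errors.append("VE")
    if m.te < TE_MIN_S:
        errors.append("TE")
    if m.veot >= VEOT_LIMIT_L:
        errors.append("VEOT")
    return errors


def repeatable(values):
    """True if the 2 largest values differ by no more than 0.15 L."""
    top = sorted(values, reverse=True)[:2]
    return len(top) == 2 and top[0] - top[1] <= REPEATABILITY_L


def spirometry_meets_criteria(maneuvers):
    """True if a test has >= 3 acceptable maneuvers that are repeatable."""
    acceptable = [m for m in maneuvers if not maneuver_errors(m)]
    return (
        len(acceptable) >= MIN_ACCEPTABLE
        and repeatable([m.fvc for m in acceptable])
        and repeatable([m.fev1 for m in acceptable])
    )


# Hypothetical session: 3 acceptable maneuvers and 1 ended too early.
session = [
    Maneuver(fvc=4.00, fev1=3.20, ve=0.10, te=7.2, veot=0.010),
    Maneuver(fvc=3.95, fev1=3.15, ve=0.09, te=6.8, veot=0.005),
    Maneuver(fvc=3.90, fev1=3.10, ve=0.12, te=6.5, veot=0.000),
    Maneuver(fvc=3.50, fev1=3.00, ve=0.10, te=4.0, veot=0.050),
]
print(maneuver_errors(session[3]))         # ['TE', 'VEOT']
print(spirometry_meets_criteria(session))  # True
```

The values in the example session are invented for illustration and are not study data.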
Statistical Analysis
The number of spirometries to be collected was calculated based on a usual work load of 375 spirometries per technician per month at the time of the study. Assuming a maximum uncertainty for the achievement of the ATS/ERS 2005 recommendations1 (ie, an estimate of 50% of tests performed according to recommendations and a 5% risk of α error), we calculated that the sample size needed for the study was 257 spirometries.
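A sample size calculation of this general kind can be sketched as follows. The sketch assumes the standard formula for estimating a proportion with a finite population correction. The text does not state the precision (margin of error) that was used, so the margin is left as a hypothetical parameter and no attempt is made to reproduce the figure of 257. The function name is illustrative.

```python
import math
from statistics import NormalDist


def sample_size_finite(population, margin, p=0.5, alpha=0.05):
    """Sample size for estimating a proportion in a finite population.

    population: number of tests available (375 per month in the text)
    margin:     desired half-width of the CI (an assumption; not given)
    p:          anticipated proportion (0.5 gives maximum uncertainty)
    alpha:      two-sided alpha error
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Sample size for an infinite population.
    n0 = z**2 * p * (1 - p) / margin**2
    # Finite population correction.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


# Example with an assumed margin of 5 percentage points.
print(sample_size_finite(population=375, margin=0.05))
```

Narrower margins require larger samples, so the required number approaches the full monthly work load as the margin shrinks.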
Categorical variables are presented as absolute and relative frequencies, whereas the continuous variables are described as means ± SD; 95% CI values were calculated. We used the Pearson chi-square test and the Fisher exact test to compare categorical variables. A t test was used to compare means with 2 categories if normally distributed (or the Mann-Whitney U test in case of non-normally distributed data). Logistic regression was performed to test for associations between independent variables that were statistically significant in the bivariate analysis. The significance level was set at P < .05. All analyses were performed with statistics software (SPSS 19, IBM, Armonk, New York).
Results
We evaluated 257 spirometry tests consisting of a total of 1,155 maneuvers. The physical, diagnostic, and spirometric data are shown in Table 2. A total of 136 subjects (56.0%) had airway obstruction (asthma or COPD). The best and the highest spirometric values were very similar. Nearly half of the subjects performed only 3 maneuvers in total (Fig. 1A), and 3 acceptable maneuvers had been achieved within 3 attempts by nearly 64% of the subjects (Fig. 1B). Ninety-nine spirometries (38.5%, 95% CI 32.8–44.6%) had an error in at least one maneuver, the most common being TE and VEOT errors (Fig. 2). In 13 tests (5.1%), all maneuvers had a TE error; in 10 of these 13 tests, this was the only error, and in 7 of them, the FEV1 and FVC values were within the reference range.
In 215 of the 257 spirometries (83.7%, 95% CI 78.6–87.7%), the ATS/ERS 2005 acceptability and repeatability criteria were achieved. Acceptability criteria were met in 854 of the 1,155 maneuvers (73.9%, 95% CI 71.2–76.3%), and the spirometric repeatability criteria were met in 233 of the 257 tests (90.7%, 95% CI 86.5–93.6%). In 24 subjects (9%), it was not possible to obtain 2 maneuvers that satisfied the repeatability criteria even though all the maneuvers were acceptable. The FEV1 repeatability criterion was met in 252 spirometries (98.1%), and the FVC repeatability criterion was met in 237 (92.2%). Of the 5 spirometries that failed to meet the repeatability criterion for FEV1, 4 (80%) were performed by subjects with airway obstruction; of the 20 not meeting the FVC criterion, 14 were performed by subjects with airway obstruction (70%).
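The test-level proportions and their 95% CIs can be recomputed from the counts given above. The text does not name the interval method. The sketch below assumes a Wilson score interval, which happens to give the reported limits for the 2 test-level proportions; this is an observation about the arithmetic, not a statement about how the authors' software computed them. The function name is illustrative.

```python
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * (p * (1 - p) / n + z**2 / (4 * n**2)) ** 0.5 / denom
    return center - half, center + half


# Counts taken from the text: tests meeting all criteria, and tests
# meeting the repeatability criteria, out of 257 spirometries.
for label, k in [("all criteria", 215), ("repeatability", 233)]:
    lo, hi = wilson_interval(k, 257)
    print(f"{label}: {k / 257:.1%} (95% CI {lo:.1%} to {hi:.1%})")
# all criteria: 83.7% (95% CI 78.6% to 87.7%)
# repeatability: 90.7% (95% CI 86.5% to 93.6%)
```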
Differences between the best and highest values of FEV1 and FVC in the total study group were 0.057 ± 0.044 and 0.086 ± 0.078 L, respectively; the differences decreased to 0.052 ± 0.035 and 0.076 ± 0.057 L when spirometries fulfilled the criteria. A mean of 4.5 ± 1.9 maneuvers was performed per subject, and a mean of 3.3 ± 1.4 maneuvers per subject was acceptable.
Table 3 shows the relationships between quality criteria and subject characteristics. Subjects who had performed spirometry previously had fewer errors and required fewer maneuvers. TE errors were responsible for most defects. Male gender was related to a larger number of acceptable maneuvers and a need for fewer attempts in total. Subjects with spirometric results within the reference range made more errors. Age was related only to TE errors; subjects with this error (mean ± SD age, 49.7 ± 19.4 y, 95% CI 44.7–54.7 y) were significantly younger than those without the error (58.1 ± 16.5 y, 95% CI 55.8–60.4 y, P = .001). When subjects were stratified by age under and over 60 y, there were no statistically significant differences in the number of maneuvers, number of acceptable maneuvers, overall number of errors in the maneuvers, or the number of maneuvers with an error. Finally, subjects meeting the criteria performed fewer maneuvers and made fewer errors of any kind.
Eighty-four subjects had obstructive disease in the group with previous spirometry experience (61.8% of this group had asthma or COPD). This percentage was larger than in the group of 52 subjects new to spirometry (48.6%), although the difference did not reach statistical significance (P = .051). The rate of airway obstruction did not differ between tests meeting or not meeting spirometry quality criteria: 121 subjects with an obstructive disease were able to meet the criteria (accounting for 56.3% of the error-free spirometries) in comparison with 15 subjects without obstruction (53.6% of the error-free spirometries) (P = .84). Logistic regression showed that none of the independent variables studied (sex, age, previous spirometry experience, obstructive disease, and normal spirometry) were associated with the fulfillment of the repeatability criterion.
Discussion
The present study of the level of fulfillment of the ATS/ERS 2005 recommendations1 in the spirometric testing of hospital patients coached by an experienced LFL technician found that 83.7% of 257 spirometries satisfied both the acceptability and repeatability criteria. This success rate underlines the demanding level posed by the recommendations, in particular those referring to the end of the forced-expiration maneuver and, to a lesser degree, repeatability. TE and VEOT errors were the most common in this series; particularly frequent was a maneuver lasting < 6 s (present in 23.3% of all maneuvers). The total number of errors was related to sex and previous experience with spirometry, suggesting a learning effect. The error rate was lower in spirometries outside the reference range, possibly because most abnormal findings corresponded to subjects with airway obstruction, who tend to prolong expiration. In 13 tests, the TE error was present in all the performed maneuvers, emphasizing the fact that prolonging expiration beyond 6 s presents the main problem in the fulfillment of the recommendations, a fact worth taking into consideration especially when coaching patients who have not yet developed obstructive disease.
When Malmstrom et al7 analyzed 36,800 spirometries obtained in 6 clinical trials carried out in 232 centers in 31 countries, only 37% of the spirometries fulfilled the earlier ATS recommendations of 1994.8 Pérez-Padilla et al9 reported fulfillment of the ATS/ERS 2005 recommendations1 in 89% of the spirometries in a large epidemiologic study. However, they attained this success rate by repeating the spirometry on several occasions until satisfactory results had been achieved by patients who did not reach the standards on earlier attempts. Recently, Enright et al4 developed a strict spirometry quality program for the World Trade Center Worker and Volunteer Medical Screening Program, achieving adherence to the ATS/ERS 2005 recommendations in 80% of the tests, with technician success rates ranging from 71% to 88%. Borg et al5 demonstrated the efficiency of programs created to improve and maintain a high quality of spirometry, increasing repeatability in one laboratory to 97%, from an initial 59% to 87% over a 5-y study period.
The repeatability requirements may be easier to fulfill, as suggested by a retrospective chart review in which 90% of patients were able to reproduce an FEV1 within 120 mL and an FVC within 150 mL.10 The authors of that study also observed that the differences between the 2 best FEV1 and FVC measurements were 0.058 ± 0.060 and 0.072 ± 0.076 L, respectively, results that were very similar to the figures observed in our study of an experienced technician in a routine LFL setting. Only 24 of our subjects' spirometries failed to meet the repeatability criterion of a < 0.15-L difference between the 2 best FVC and FEV1 values. When this error occurs, it could be related to significant air-flow obstruction because 80% of the subjects with no repeatable FEV1 measurements and 70% of those with no repeatable FVC values had air-flow obstruction. It has been reported that such obstruction can affect FVC repeatability, and for this reason, FEV6 has been suggested as a substitute for FVC.11 We noted, however, that the observed differences between the best FVC and FEV1 and the highest FVC and FEV1 did not affect the final outcome of the test.
A limitation of this exploratory study was its focus on a single center and a single technician conducting the spirometries, which limits the generalizability of the results. An observer's high level of expertise and long experience may not be common in clinical practice outside tertiary hospital LFLs, and the results of this study might not be replicable in centers with a high turnover of technical personnel but without strict spirometry quality assurance programs. However, this design might also be considered as a strength because a potential source of confounding due to variability in coaching was probably reduced in the interest of estimating error rates that may persist even in best-case clinical scenarios.
The fact that there are patients who cannot achieve the ATS/ERS 2005 recommendations1 even when the tests are carried out in a specialized laboratory with coaching by a qualified expert, often patients who are already familiar with the procedure, should be taken into consideration and explored further in clinical settings. Whether the criteria needed to achieve error-free maneuvers should apply to everyone or should be modified in certain situations (eg, primary care, which might be considered a preliminary screening setting) should also be considered. Moreover, given that the most frequent error is TE < 6 s, it might at some point become appropriate to reassess whether this error should classify a maneuver as unacceptable if it continues to cause difficulty. If this criterion were not in force, a higher number of spirometries would be counted as acceptable. In the present study, 10 more would have been counted, bringing the success rate to 87.5%, even in clinical conditions without several repeated attempts, as were allowed for in the study by Pérez-Padilla et al.9 We note that this criterion was not included in the earlier ERS recommendations,12 but we emphasize that the present study was not designed to analyze its clinical relevance and are aware that the data we present do not allow for the proper evaluation of its inclusion.
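The arithmetic behind the 87.5% figure can be checked directly from the counts reported in the Results. The short sketch below uses illustrative variable names; the counts (257 tests, 215 meeting the criteria, and 10 tests in which a TE error was the only error) come from the text.

```python
def success_rate(successes, total):
    """Percentage of tests that meet the criteria."""
    return 100 * successes / total


TOTAL_TESTS = 257        # spirometries evaluated
MET_CRITERIA = 215       # tests meeting all ATS/ERS 2005 criteria
TE_ONLY_FAILURES = 10    # tests in which TE < 6 s was the only error

observed = success_rate(MET_CRITERIA, TOTAL_TESTS)
without_te_rule = success_rate(MET_CRITERIA + TE_ONLY_FAILURES, TOTAL_TESTS)
print(f"{observed:.1f}% -> {without_te_rule:.1f}%")  # 83.7% -> 87.5%
```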
We also note that our findings are inconsistent with a common belief that older patients have more difficulty performing spirometry because of their age per se or because they have more altered lung function. In fact, we found that spirometry-naive subjects, who were usually younger, were the ones with more errors. Our prospectively gathered data are consistent with the recent findings of a larger but retrospective study reporting that patients over the age of 80 y did not have statistically significant differences in spirometry quality compared with patients 40–50 y old.13
We conclude that in clinical settings with widely varying patients, the rate of failure to fulfill all the ATS/ERS 2005 criteria1 may exceed 10%, as it did in this study designed to observe a best-case scenario. Our observations that the most common errors are related to TE and VEOT and that those errors are more frequently seen in females and in spirometry-naive subjects with values within the reference range suggest that training might encourage technicians to give particular attention to such individuals during tests. However, these results should be verified in similar best-case scenarios before training approaches are redesigned and tested.
Acknowledgments
We are grateful to Sera Tort for editorial assistance with the preparation of this manuscript and to Mary Ellen Kerans for advice on English usage in some versions of the paper.
Footnotes
- Correspondence: Jordi Giner Donaire MSc RN, Respiratory Department, Hospital de la Santa Creu i Sant Pau, Sant Antoni Maria Claret 167, 08025 Barcelona, Spain. E-mail: jginer{at}santpau.cat.
This study was supported by Spanish Government research grant FIS PS09/00686 and Catalan Society for Pulmonology grant SOCAP 2009.
Mr Giner has participated in workshops and courses organized by Sibel SA and Sonmedica SA. Dr Rigau is an employee of Sibel SA. The other authors have disclosed no conflicts of interest.
- Copyright © 2014 by Daedalus Enterprises