Abstract
Pulmonary function testing is often considered the basis for diagnosis in many categories of pulmonary disease. Although most of the testing methodologies are well established and widely employed, there are still many questions regarding how tests should be performed, how to ensure that reliable data are produced, what reference values and rules should be used, and how pulmonary function tests (PFTs) should be interpreted to best support clinical decision making. This conference was organized around a set of questions aimed at many of these issues. Each presenter was asked to address a specific topic regarding what tests should be done, how those tests should be performed to answer a particular clinical question, and how to relate test results to an accurate diagnosis and appropriate treatment of the patient. These topics included testing of adults and children, with concentration on important disease entities such as COPD, asthma, and unexplained dyspnea. Special emphasis was given to discussing reference values, lower limits of normal, interpretive strategies to optimize disease classification, and those factors directly affecting data quality. Established techniques for spirometry, lung volumes, diffusing capacity, exercise testing, and bronchial challenges were compared and contrasted with new technologies, and with technologies that might be part of pulmonary function laboratories in the near future.
- pulmonary function testing
Introduction
Pulmonary function tests form the basis for clinical decision making not only for patients who have pulmonary disease, but for a wide range of subjects who have symptoms of dyspnea, who need surgery that involves the thorax or abdomen, or who might require screening because they are at risk.1–4 There are, however, many gaps in the evidence base for individual tests. In many instances, patients who need pulmonary function testing fail to get appropriate evaluation. Similarly, poor data quality or inappropriate reference equations frequently confound interpretation of PFTs, resulting in misclassification of the patient's condition.5 All of these issues are amenable to quality improvement when addressed by the medical directors and technologists who staff the pulmonary function laboratory. This conference was organized to recognize the problems that increase diagnostic misclassification and to propose solutions that could improve the clinical value of PFTs.
Toward this goal, the conference presenters were given a series of questions to answer, and/or specific problematic areas to address. Table 1 lists these questions/problems. Pulmonary function testing as conducted in most laboratories or clinical settings is largely influenced by the available technology. Computerization of spirometers, gas analyzers, and related devices allows rapid, repeated testing, with instantaneous results for most parameters. One outcome of these technologic improvements has been standardization of how specific tests should be performed, including how pulmonary function data are reported, and how an individual patient's values should be compared to expected norms. Physicians and technologists involved in pulmonary diagnostics have the advantage of several decades of thoughtful consideration by the American Thoracic Society (ATS), European Respiratory Society (ERS), and other professional organizations, which have produced widely accepted standards.6–10 But despite sophisticated diagnostic instruments and recommended testing guidelines, pulmonary function testing often fails to answer the clinical questions asked.
In attempting to summarize the responses to the questions posed for this conference, 4 major themes seemed to emerge:
Specific Tests and Testing Methodologies
Quality Issues Related to PFTs
Predicted Values
Interpretation and Interpretive Strategies
Specific Tests and Testing Methodologies
DLCO
Measurement of diffusing capacity (DLCO) has many indications, including differential diagnosis in restrictive and obstructive diseases, lung transplant candidacy, disability assessment, evaluating medication toxicity, and predicting exertional hypoxemia, among others. The DLCO test, although represented by a relatively simple maneuver, remains somewhat problematic because of multiple sources of variability.11 These sources include equipment, software, test gases, reference equations, testing procedures, and atmospheric conditions, along with patient characteristics.
The ATS/ERS guidelines9 provide exact recommendations for most aspects of test performance. These include rapid inspiration of at least 85% of the best vital capacity (VC); breath-hold without excessive pressure for 8–12 seconds, followed by unhesitating exhalation to clear anatomic and equipment dead space; and appropriate sampling of alveolar gas. In addition to acceptability of at least 2 maneuvers, the resulting DLCO values should be repeatable within 3 mL CO/min/mm Hg or 10% of the largest acceptable value, whichever is greater. There is some evidence that the repeatability criteria are too lenient and that attention to testing details can achieve repeatability of 2–2.5 mL CO/min/mm Hg.
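The repeatability rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and its parameters are hypothetical, and the choice to accept the session when any two acceptable maneuvers agree (rather than, say, only the two largest) is an assumption made for the example.

```python
from itertools import combinations


def dlco_repeatable(values, abs_limit=3.0, rel_limit=0.10):
    """Check DLCO repeatability for one test session.

    values: DLCO results (mL CO/min/mm Hg) from acceptable maneuvers only.
    The tolerance is the greater of abs_limit and rel_limit times the
    largest acceptable value, as worded in the text. Assumption: the
    session passes if ANY two acceptable maneuvers agree within tolerance.
    """
    if len(values) < 2:
        return False  # at least 2 acceptable maneuvers are required
    tolerance = max(abs_limit, rel_limit * max(values))
    return any(abs(a - b) <= tolerance for a, b in combinations(values, 2))


# Low DLCO: the 3-unit absolute limit governs.
print(dlco_repeatable([12.0, 16.0]))
# High DLCO: 10% of the largest value (4.4) governs.
print(dlco_repeatable([40.0, 44.0]))
```

Passing a stricter `abs_limit` (for example 2.0 or 2.5) shows how many sessions would fail under the tighter repeatability that careful testing is reported to achieve.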
Equipment has been a major source of variability between and within labs. Equipment from various manufacturers has shown changes in accuracy as large as 20% over a 3 month interval. Because multiple transducers are involved (flow sensors, gas analyzers), along with complex breathing circuits, problems arising within individual components are sometimes difficult to troubleshoot. The gas analyzer(s) used to measure CO and the tracer gas have to remain linear in relation to each other. The use of a single detector for both gases, as in the case of rapidly responding infrared analyzers, has helped to reduce but not eliminate this requirement. Simple steps like assuring that an accurate temperature is used for standard temperature and pressure dry (STPD) corrections are sometimes overlooked, as is the maintenance of gas conditioning devices in systems that require such.
The most straightforward approach to reducing these problems is a comprehensive quality control program aimed at monitoring DLCO. In addition to equipment maintenance, quality control necessarily involves testing the system using either biologic controls (normal healthy subjects) or some device that can simulate an expected DLCO value. Results from biocontrols or simulators can be analyzed to determine coefficients of variability or repeatability. ATS/ERS guidelines suggest that this should be done at least weekly, but a substantial proportion of laboratories have yet to implement such a program.
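As a rough illustration of how biocontrol or simulator results might be analyzed, the following Python sketch computes a coefficient of variation and flags a new result that drifts from the historical mean. The function names and the 2-standard-deviation flagging rule are assumptions chosen for the example; they are not taken from the ATS/ERS guidelines.

```python
from statistics import mean, stdev


def coefficient_of_variation(results):
    """Sample coefficient of variation (%) of repeated DLCO results
    from a single biologic control or simulator setting."""
    if len(results) < 2:
        raise ValueError("need at least 2 results")
    return 100.0 * stdev(results) / mean(results)


def out_of_control(new_value, history, n_sd=2.0):
    """Flag a new control result that lies more than n_sd sample standard
    deviations from the historical mean (n_sd=2 is an assumed rule)."""
    return abs(new_value - mean(history)) > n_sd * stdev(history)


# Hypothetical weekly biocontrol results, mL CO/min/mm Hg
history = [30.0, 31.0, 29.0, 30.0]
print(round(coefficient_of_variation(history), 2))
print(out_of_control(33.0, history))
```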
Interpretation of diffusing capacity tests should include appropriate adjustments for factors such as hemoglobin (Hb) and carboxyhemoglobin (COHb), and should be based on reference values that represent the population being tested. Selection of reference values is somewhat problematic because many predicted sets are dated, and some use body weight as a variable. Interpretation of DLCO must also consider what constitutes a clinically important change, and recent evidence suggests that this might be as large as 20–25% under typical laboratory conditions.
Lung Volumes
Unlike spirometry and diffusing capacity, the indications for performing lung volume measurements are limited.12 This short list includes patients who have reduced FVC values but are not overtly obstructed. A normal FVC excludes restrictive disease in almost every instance, but when VC is low, TLC may be useful to determine whether a true restrictive ventilatory pattern is present. Restrictive lung disease combined with obstruction is rare. However, lung volumes (% predicted TLC) can be used to adjust the severity classification represented by the FEV1 (% predicted). In most cases, this adjustment reduces the severity, both for COPD and asthma. TLC may be normal even though the FVC and FEV1 are proportionately reduced (ie, the FEV1/FVC is preserved). This phenomenon has been called the “nonspecific” ventilatory pattern. Hyatt and co-workers described this pattern and suggested that it was common in subclinical asthma and in obesity.13 Subsequent serial studies showed that, although some subjects converted from “nonspecific” to obstructed or restricted (about 15% in either case), a majority of patients (64%) continued to show the pattern over time. Lung volumes may also be useful in explaining dyspnea related to obesity. TLC and VC fall with increasing body mass index (BMI), but tend to remain within the limits of normal. Functional residual capacity (FRC) and expiratory reserve volume (ERV) decrease exponentially, with the greatest changes in the overweight (BMI > 25 kg/m2) to obesity (BMI ≥ 30 kg/m2) classes.
Lung volumes (TLC, RV) tend to be increased in obstruction, but air-trapping and hyperinflation are poorly defined.14 Air-trapping is usually associated with an increased RV. However, when the upper limit of normal for RV is defined as the 95th percentile, the upper limit as a % predicted is significantly higher than the often-used 120%. Inspiratory capacity has been suggested as an index of hyperinflation that can be measured by spirometry. However, inspiratory capacity is most useful when it is related to other lung volumes, most importantly to the TLC. There is increasing evidence that reduction in hyperinflation by various therapeutic modalities (bronchodilators, surgery, et cetera) may play a significant role in reducing dyspnea. The value of measuring lung volumes, especially RV, before and after bronchodilator, has yet to be fully defined but certainly merits consideration and further study.
Body plethysmography has emerged as the standard (gold?) for measurement of lung volumes. The role of gas dilution techniques is still not completely resolved. At least one recent investigation has suggested that good correspondence between helium dilution and volume measurement by CT scan, even in subjects who are obstructed, can be interpreted as evidence that body plethysmography systematically overestimates lung volume, and that the degree of overestimation correlates with the severity of obstruction.15 Another study suggests that there is a bias between methods, but that the bias is relatively constant and not affected by the degree of obstruction.16 What is clear is that single-breath methods (such as employed during the single-breath DLCO) do underestimate lung volumes in subjects who have obstructed airways. Imaging technologies may be able to provide an estimate of lung volume, but the cost and risk of radiation exposure make them less than ideal.
Pediatrics
For more than 30 years there have been substantial efforts to standardize pulmonary function testing in adults. Because pulmonary function in children is also an important tool for managing lung disease, many of the standards for adult patients have been “adapted” for the pediatric population. The 2005 ATS/ERS standards in fact include some recommendations for patients less than 10 years of age.7 However, testing in children differs significantly with regard to which tests are most appropriate, acceptability and repeatability criteria, coaching, and reference values.17
Unlike laboratories that test only adults, pediatric PFT laboratories must ensure that the testing areas are child-friendly and that equipment, such as mouthpieces, nose clips, and chairs, is conducive to keeping the patient relaxed and attentive. Because of the increased elastic recoil of the child's lungs, emptying occurs rapidly and the FEV1 may not be an ideal index of airway function. FEV0.75 has been suggested as more appropriate for young children. While the FEF25–75% is not recommended for use in adults (because of the wide variability in healthy subjects), it may be useful for pediatric patients. Because the child's lung has greater elasticity and empties more rapidly, there is less variability in the FVC (and in the forced expiratory time or FET). The FEV1/FVC ratio has presented problems because FVC is often incomplete in children. New predicted values for children as young as 3 years, along with a better understanding of how the FEV1/FVC changes from childhood through adolescence into adulthood, should make it a more useful index for diagnosing obstruction.
The acceptability and repeatability criteria used for adults are not entirely suitable for children. The back extrapolated volume (BEV) limit used for adults is less than 5% of the VC or 150 mL; for children, limits of 80 mL and 12.5% (a larger proportion of the VC) have been recommended. An expiratory time of 3 seconds for patients less than 10 years old was included in the 2005 spirometry standards, but this may not be appropriate because most children exhale their VC and reach a plateau in less than 3 seconds. Adolescents and young adults also can exhale completely in 3 seconds or less, so rejection of efforts based on the 3 second “rule” is probably not warranted. Computer displays or recorders need to be able to scale data appropriately to allow visual inspection of efforts when small volumes are measured. The repeatability criterion of 150 mL (adults and children) is undoubtedly too generous for young children, in whom it may be much greater than 5% of the VC. The criteria for assessing bronchodilator response (12% and 200 mL increase in FEV1 or FVC) similarly overestimate the degree of response that might be expected in young children with small vital capacities.
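The effect of a fixed 200 mL requirement on small lungs can be seen with a short calculation. The Python sketch below applies the combined criterion (12% and 200 mL) to a hypothetical adult and a hypothetical young child; the function name and the patient values are invented for illustration.

```python
def bronchodilator_response(pre_l, post_l, pct_limit=12.0, abs_limit_l=0.200):
    """Apply the combined criterion from the text: an increase of at least
    pct_limit percent AND at least abs_limit_l liters in FEV1 (or FVC).

    pre_l, post_l: pre- and post-bronchodilator values in liters.
    """
    change_l = post_l - pre_l
    change_pct = 100.0 * change_l / pre_l
    return {
        "change_l": change_l,
        "change_pct": change_pct,
        "positive": change_pct >= pct_limit and change_l >= abs_limit_l,
    }


# Hypothetical adult: FEV1 2.50 L rising to 2.85 L.
adult = bronchodilator_response(2.50, 2.85)
# Hypothetical young child: FEV1 0.80 L rising to 0.95 L. The percent
# change is large, but the 200 mL requirement is not met.
child = bronchodilator_response(0.80, 0.95)
print(adult["positive"], child["positive"])
```

For the child in this example, 200 mL would represent a 25% increase from baseline, which is the sense in which the adult criterion demands an unrealistically large response.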
Robust reference values for pediatric patients are available with the National Health and Nutrition Examination Survey (NHANES) III predicteds, and these have been recently extended to include children as young as 3 years. Use of a statistically valid lower limit of normal (LLN) (z-scores) is one area in which pediatric lung function testing has outstripped adult testing, which still reports results as percent of predicted.
Airway Resistance
The physiology of airway resistance has been well described, and the partitioning of resistance across the respiratory system corresponds to changes that occur with obstructive lung disease. However, measurement of airway resistance (and derived parameters) has not always been viewed as a complement to spirometry.18
The availability of body plethysmography allows measurement of the pressure drop across the airways and relates this pressure to flow at the mouth. Because lung volume (VTG) can be measured in the same session, it is useful to express resistance and its reciprocal, conductance, as specific for the lung volume at which they are measured (sRaw and sGaw, respectively). Because sRaw encompasses lung volume, it is not as useful as sGaw (specific conductance), which represents conductance (Gaw, the reciprocal of Raw) per unit of lung volume. sGaw and FEV1 measure slightly different aspects of airway function, hence either one can change without significant alteration in the other. Airway resistance and specific conductance can be useful adjuncts to spirometry for both bronchodilator and bronchial challenge studies, in which the functional properties of various portions of the airways are changed. Specific conductance (sGaw) also has the advantage of not requiring a deep inspiration, which may affect bronchomotor tone during spirometric maneuvers. A disadvantage of using Raw or sGaw to monitor airway function is that these values are quite variable in healthy subjects, and relatively large changes (eg, 35–60%) are required to detect a significant response.
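The relationships among Raw, Gaw, sRaw, and sGaw can be made concrete with a brief Python sketch. It assumes the conventional definitions (sRaw as the product of Raw and VTG, sGaw as Gaw divided by VTG); the function names and the example values are hypothetical.

```python
def specific_airway_values(raw, vtg):
    """Derive specific resistance and specific conductance.

    raw: airway resistance, cm H2O/L/s
    vtg: thoracic gas volume at which raw was measured, L
    Returns (sraw, sgaw).
    """
    if raw <= 0 or vtg <= 0:
        raise ValueError("raw and vtg must be positive")
    gaw = 1.0 / raw      # conductance is the reciprocal of resistance
    sraw = raw * vtg     # specific resistance: resistance times lung volume
    sgaw = gaw / vtg     # specific conductance: conductance per unit volume
    return sraw, sgaw


def percent_change(baseline, post):
    """Percent change from baseline, eg, in sGaw after a bronchodilator."""
    return 100.0 * (post - baseline) / baseline


sraw, sgaw = specific_airway_values(raw=2.0, vtg=3.0)
print(sraw, round(sgaw, 3))
print(round(percent_change(0.20, 0.28), 1))
```

Note that sGaw is simply the reciprocal of sRaw, so the two carry the same information; the preference for sGaw is a matter of how the value behaves and is reported, not of additional measurement.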
The forced oscillation technique (FOT) offers an alternative means of measuring resistance in the pulmonary system. By applying a perturbed flow at the mouth and measuring the resulting pressures across a range of frequencies, the resistive characteristics of the airways, lung parenchyma, and of the chest wall can be measured. FOT allows the impedance of the respiratory system (ZRS) to be measured, along with resistance (RRS) and reactance (XRS). Unlike Raw measured by plethysmography, resistance (and the associated variables) measured by FOT represents contributions from both the lung and chest wall. The measurements are relatively easy to perform, requiring only tidal breathing, and are applicable to a wide range of subjects, including children who are too young to perform spirometry. Although the methodology for FOT is somewhat complex, it can be used for studying bronchodilator or bronchochallenge response, the viscoelastic properties of the lung, and distinguishing inspiratory from expiratory mechanics in obstructive lung disease.
Resistance can also be assessed using an interrupter technique (Rint). This method uses airway occlusion during tidal breathing to estimate alveolar pressure, and then relates this to flow at the mouth. Rint requires careful attention to proper technique and repeated measurements, but has been demonstrated as useful in children suspected of having airway obstruction.
Exercise Testing
Spirometry, lung volumes, and diffusing capacity tests remain the standards for evaluating subjects whose chief complaint is dyspnea. But static tests may not provide an explanation of the patient's symptoms or correlate with the individual's level of impairment. In these circumstances, some form of exercise evaluation is required.19
A number of exercise testing protocols have evolved, usually to answer a specific question regarding exercise intolerance. Stair climbing is a simple form of exercise dating back 50 years. The number of stairs climbed has been shown to relate inversely to such things as postoperative complications, but the test is limited by variability and lack of standardization. Exercise challenge tests are commonly used to look for exercise-induced bronchospasm (EIB). Patients walk or jog at a high heart rate or ventilatory level for 6–8 min, followed by spirometry to assess airway narrowing; a fall of 10–15% in FEV1 is considered diagnostic of EIB. Shuttle walk tests have also been used in a variety of settings to evaluate exercise intolerance. The incremental shuttle walk test (ISWT) sets a pace for the subject and then increases the pace at a fixed rate until the subject's symptoms occur; the distance walked is reported. A refinement is the endurance shuttle walk test (ESWT), which has the subject walk at 85% of the ISWT pace, and walking time is reported.
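The FEV1 criterion for EIB amounts to a percent-fall calculation from the pre-exercise baseline. The sketch below is a minimal illustration; the function names are hypothetical, and using the lowest post-exercise FEV1 and a default threshold of 10% (the lower end of the 10–15% range quoted above) are assumptions made for the example.

```python
def fev1_fall_percent(baseline_l, post_exercise_l):
    """Largest percent fall in FEV1 after exercise.

    baseline_l: pre-exercise FEV1 in liters.
    post_exercise_l: FEV1 values (liters) from serial post-exercise
    spirometry; the lowest value is used (an assumption).
    """
    lowest = min(post_exercise_l)
    return 100.0 * (baseline_l - lowest) / baseline_l


def suggests_eib(baseline_l, post_exercise_l, threshold_pct=10.0):
    """True if the fall meets the chosen threshold (10-15% per the text)."""
    return fev1_fall_percent(baseline_l, post_exercise_l) >= threshold_pct


# Hypothetical patient: baseline 3.00 L, lowest post-exercise FEV1 2.55 L.
print(round(fev1_fall_percent(3.00, [2.90, 2.55, 2.70]), 1))
print(suggests_eib(3.00, [2.90, 2.55, 2.70]))
```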
The most commonly used test appears to be the 6-minute walk test (6MWT), which has been standardized by the ATS. The primary outcome variable is the distance covered in 6 min (6MWD) on a closed course of at least 30 meters length. Unlike the shuttle walk tests, the 6MWT is a self-paced (constant work load) effort in which the subjects typically do not reach their peak V̇O2. Reference values from several sources are available to estimate 6MWD as well as the LLN. The minimal clinically important difference (MCID) for 6MWD varies with disease entity, but in COPD estimates range from 25–80 meters. The 6MWT is included in some multi-dimensional metrics such as the BODE (body mass index, air-flow obstruction, dyspnea, exercise capacity) index. Unfortunately, a reduced 6MWD is largely non-specific as to the etiology that might be causing exercise intolerance.
Cardiopulmonary exercise testing (CPET) with measurement of exhaled gases has numerous indications. Although it seldom is able to diagnose specific causes of exercise intolerance, it is useful in discriminating cardiovascular versus pulmonary limitations to work. Other indications include assessment of work capacity in various lung disease states, as well as in congestive heart failure or evaluation for heart transplant. It is also widely used to assess perioperative risk in thoracic and abdominal surgery. The complexity of the testing as well as the expertise required to safely perform CPET has limited its use to a certain extent. But when combined with blood gas sampling it provides a complete picture of cardiopulmonary and exercise physiology.
Pulse oximetry has assumed a significant role in exercise testing, often used to answer the question “Does the patient need oxygen for exertion?” Several large studies that have compared pulse oximetry to conventional blood gas measurements during exercise have shown that both false negatives (normal SpO2 despite desaturation) and false positives (falling SpO2 with normoxia) are not infrequent. Pulse oximetry before and after the 6MWT is suggested by the ATS guideline, but many clinicians monitor SpO2 during the test and thus compromise the design and end point of the walk.20 A separate procedure (“O2 titration”) should be considered for patients at risk of desaturation.
In the future, exercise testing may include evolving parameters such as heart rate recovery (HRR), which has been shown to correlate with overall cardiovascular status. Technology will most likely also influence exercise testing with the ability to monitor physiologic parameters remotely (eg, wireless, Bluetooth, et cetera) and to assess such basic functions as cardiac output noninvasively via impedance cardiography.
Quality Issues Related to PFTs
Technologists
There are 3 key elements to obtain high quality pulmonary function data: accurate and precise instrumentation, a patient/subject capable of performing acceptable and repeatable measurements, and a motivated technologist to elicit maximum performance from the patient. In the realm of standardization, the technologist has received the least attention.
The most important asset of the successful pulmonary function technologist is motivation. In the United States, pulmonary function technologists are quite often respiratory therapists, but a desire to do something other than the usual respiratory care duties does not equal motivation. Upon transitioning to a position in a pulmonary function lab, the technologist or therapist has to be considered a student. In this role, those with a motivation to excel as a professional may be expected to perform better than someone who pursues sufficient mastery just to meet expectations.21 In addition to motivation, some minimum qualifications for pulmonary function technologists have been suggested. The most recent ATS/ERS guidelines recommend 2 years of college, with emphasis on health related science, particularly in the area of pulmonary physiology and pathology.6 The question of whether a credential, either as a respiratory therapist (RRT) or as a pulmonary function technologist (CPFT or RPFT) relates to better testing performance in the laboratory remains unanswered. Individuals who earn a CPFT or RPFT designation demonstrate a minimum level of competency at the entry or advanced levels, respectively.22 Aptitude testing may be an alternative to the simple resume + interview. Testing for cognitive aptitude (ability to learn), conscientiousness, and critical thinking skills may be better than traditional methods of selecting someone to work in a pulmonary function laboratory.
Training of personnel in the laboratory setting typically requires a one-on-one type of mentoring in order for the individual to become proficient in the basic tests and procedures employed by the lab. As his/her skills improve, the mentor or instructor can gradually assign more responsibility to the trainee. Key to this training is adequate and appropriate feedback. Computerized PFT systems provide much of the information needed to assess the quality of the data obtained, and as such serve in the feedback role. More important, however, is for each laboratory to have a mechanism by which all technologists get continued feedback related to their performance. Several large studies have documented that performance tends to remain high when there is feedback, but may fall in its absence.23
Although we have credentialing for practitioners in pulmonary function labs in the United States, accreditation for the labs themselves is lacking. Australia and New Zealand implemented a voluntary accreditation program that has had good success and has been used as a model in a few other countries. Although there is not yet a mandate to accredit individual laboratories, the benefits to the public, to third-party payers, and to those conducting research seem obvious.
Office Spirometry
During the last decade there has been a great deal of interest in the detection of COPD with spirometry as the primary tool. Various organizations have proposed making spirometry widely available in primary care practice settings. In the United States, only about 25% of patients with newly diagnosed COPD have had spirometry. Other countries have fared somewhat better, but there is still not widespread adoption of spirometry for making the diagnosis of airway obstruction. National and international efforts to raise awareness of COPD have included spirometry workshops, testing sessions, and related initiatives.1 Despite these efforts, including placing spirometers into primary care offices, spirometry remains an infrequent testing modality.
Multiple studies have concluded that obtaining good quality spirometry is problematic, despite rather sophisticated software that grades patient effort and implements quality guidelines as recommended by the ATS/ERS standards. The spirometers themselves may or may not contribute to the problem. Although the ATS recommends daily calibration (or calibration checks), very few spirometers are sold with the required 3 L syringe. Modern spirometers designed for physicians' offices use disposable flow sensors, but only a small number of these have been validated for accuracy over extended intervals. The software accompanying these spirometers is often not user-friendly, and older systems may not provide the NHANES III reference equations recommended for interpretation.
Because spirometry is an effort-dependent test, the patient's ability to cooperate sometimes becomes a limiting factor. However, trained technologists can achieve a 90% success rate in a wide range of subjects. Very young children and the elderly may not be able to perform the required maneuvers to meet ATS/ERS recommendations, but modified standards and novel approaches (such as using the FEV6) should permit acceptable and repeatable measurements to be obtained. In order to get the best results, those performing the test need to know how to operate the spirometer correctly and need a certain level of skill in communicating the proper technique to the patient. Several studies have demonstrated that the training for those performing spirometry must be supported by ongoing feedback in order to obtain acceptable measurements over a long interval.23,24 The final component of spirometry, interpretation of the results, has also been a barrier to widespread utilization. The ability to make an appropriate conclusion based on spirometric results has been hampered by widely accepted “rules of thumb” such as GOLD's recommendation to use a fixed ratio (FEV1/FVC) of 0.70 to define airway obstruction, and the oft-cited “80% of predicted” as the lower limit of normal for all spirometry parameters. These rules have been shown to misclassify elderly patients as having obstruction when they do not, and to miss younger subjects who may have early onset of airway abnormality.5 Because spirometry is often performed without administration of a bronchodilator, asthma is frequently misdiagnosed as COPD.
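The misclassification produced by the fixed 0.70 ratio can be illustrated by comparing it with an age-appropriate lower limit of normal. In the Python sketch below, the function name is hypothetical and the LLN values for the two patients are invented for illustration; they are not taken from any published reference equation.

```python
def classify_ratio(fev1_l, fvc_l, ratio_lln, fixed_cutoff=0.70):
    """Compare the fixed-ratio rule with an LLN-based rule for obstruction.

    ratio_lln: lower limit of normal for FEV1/FVC for this patient,
    supplied by the caller from a reference equation.
    """
    ratio = fev1_l / fvc_l
    return {
        "ratio": ratio,
        "obstructed_fixed": ratio < fixed_cutoff,
        "obstructed_lln": ratio < ratio_lln,
    }


# Hypothetical elderly patient: ratio 0.68, illustrative LLN 0.64.
elderly = classify_ratio(1.70, 2.50, ratio_lln=0.64)
# Hypothetical young adult: ratio 0.75, illustrative LLN 0.78.
young = classify_ratio(3.60, 4.80, ratio_lln=0.78)
print(elderly["obstructed_fixed"], elderly["obstructed_lln"])
print(young["obstructed_fixed"], young["obstructed_lln"])
```

In this example the fixed ratio labels the elderly patient as obstructed although the ratio is above the LLN, and it passes the younger patient whose ratio is below the LLN.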
Use of spirometry in primary care will continue to be problematic unless high quality testing (to diagnose and treat COPD or asthma) is tied to reimbursement.25 Using simple spirometry (ie, FEV1) or even peak flow (PEF) to rule out airway abnormality in the majority of patients, followed by referral for more sophisticated pre/post-bronchodilator studies in those remaining, may be the best alternative.
Predicted Values
Reference Equations
Reference equations are a key component of PFTs, because patients infrequently have serial tests for purposes of comparison.26 Restrictive lung disease is diagnosed as a reduction in TLC, but this can be problematic because the degree of reduction must be judged from a reference or normal value. Predicted values for TLC are often hampered by the fact that the reference populations are typically small and the techniques used may not be reflective of those in current practice. Because RV is not always reduced in proportion to TLC, a low FVC cannot be relied on to diagnose restriction. A more workable definition of restriction is a reduction in VC without an increase in FRC or RV.
Spirometric “restriction” (ie, a reduced FVC with a normal or increased FEV1/FVC) requires further evaluation to distinguish whether true restriction, mixed obstruction and restriction, or a nonspecific ventilatory pattern is responsible for the patient's clinical presentation.12 The non-specific pattern, which relies on appropriate predicted values for its definition, appears to correlate with obstruction and obesity in some patients. Over time some individuals (about one third) with the nonspecific pattern progress to frank obstruction or restriction. Restriction may even be present in patients who have asthma (FEV1 and FVC are similarly reduced and both respond to bronchodilator therapy).
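The logic for separating these patterns can be sketched as a simple decision function. This is a simplified illustration of the reasoning in this section, not a validated interpretive algorithm; the function name, the returned labels, and the example values are hypothetical, and each LLN must come from an appropriate reference equation.

```python
def ventilatory_pattern(ratio, ratio_lln, fvc, fvc_lln, tlc=None, tlc_lln=None):
    """Classify a ventilatory pattern from spirometry and, if available, TLC.

    ratio: FEV1/FVC; fvc and tlc in liters; *_lln are lower limits of normal.
    """
    obstructed = ratio < ratio_lln
    low_fvc = fvc < fvc_lln
    if tlc is None or tlc_lln is None:
        # Spirometry alone cannot confirm restriction.
        if obstructed:
            return "obstruction"
        if low_fvc:
            return "spirometric restriction: measure lung volumes"
        return "normal spirometry"
    restricted = tlc < tlc_lln
    if obstructed and restricted:
        return "mixed obstruction and restriction"
    if obstructed:
        return "obstruction"
    if restricted:
        return "restriction"
    if low_fvc:
        # Reduced FVC, preserved ratio, normal TLC
        return "nonspecific pattern"
    return "normal"


# Hypothetical patient: low FVC, preserved ratio, TLC within normal limits.
print(ventilatory_pattern(0.80, 0.70, 2.8, 3.2, tlc=5.5, tlc_lln=4.8))
```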
The NHANES III reference set for spirometry is recommended for use in the United States, and the values for whites have been implemented in other countries with populations derived from European ancestry. A recent study has validated the appropriateness of the NHANES III equations, but with some important exceptions.27 The spirometric reference values for blacks of African origin are systematically lower than for whites. However, they may not be comparable in black subjects from other ethnic backgrounds. Similarly, reference values for Mexican-Americans are slightly higher than those of Hispanics of non-Mexican origin. The same study suggested that reference values for Chinese-Americans could be estimated as 0.88 times the white predicteds, but this adjustment may not be valid for Asians of different ethnicities. The Global Lungs Initiative (GLI, sponsored by the ERS) proposes to generate all-age equations for each of the major ethnic groups from a large pooled database of healthy subjects.28
Diffusing capacity (DLCO) is one of the few PFTs that measure a non-mechanical property of the lungs (arterial blood gases is another). Unfortunately, various reference equations produce markedly different results in terms of identifying and quantifying gas exchange abnormality. The equations of Miller et al represent a large population living near sea level, with adjustments for current and former smokers available.29 Some newer studies performed using equipment meeting current ATS/ERS recommendations have been published, but the choice of DLCO reference equations remains an issue because of the importance of diffusing capacity measurements for clinical management.
Lower Limit of Normal
Unlike most other laboratory tests in which the expected or normal values remain the same throughout life (in adults), pulmonary function parameters vary with sex, age, height, and ethnicity. Expected normal values are typically calculated using a regression equation and expressed as the “predicted.” The patient's actual value is then compared to his/her predicted, and often expressed as a percentage. However, this approach does not tell the interpreter much about the range of normal, and fixed cutoffs (such as 80% of predicted) have been used to judge normality versus abnormality. As far back as 1991, the ATS guidelines have recommended using a statistically valid LLN based on the lowest 5th percentile of a healthy non-smoking population.30 Despite this recommendation, many laboratories in the United States and elsewhere continue to use fixed cutoffs.
For spirometry, very good reference sets are available for a limited number of ethnic groups, mainly whites. In the United States, the NHANES III reference set is recommended for whites, African-Americans, and Mexican Americans. One of the strengths of this set, as well as for other recently published normals, is that LLN values can be readily calculated based on the lowest 5th percentile.31 Most modern study authors report the residual standard deviation (RSD) or standard error of estimate (SEE) for normative data, so that the LLN can be calculated as the predicted value − 1.645 × RSD. Using this approach prevents the age- or sex-related bias that is introduced when a fixed percentage is used to define normality. Because the lowest 5th percentile represents healthy subjects who are defined as “abnormal” for clinical purposes (ie, false positive), it is imperative to interpret patient values near the LLN with caution.
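The LLN calculation, and the bias of a fixed percentage, can be illustrated with a short Python sketch. It assumes that residuals are normally distributed with a constant RSD, so that the 5th percentile lies 1.645 RSD below the predicted value; the function names and the numeric values are hypothetical.

```python
def lower_limit_of_normal(predicted, rsd):
    """5th percentile LLN, assuming normally distributed residuals."""
    return predicted - 1.645 * rsd


def z_score(measured, predicted, rsd):
    """Number of RSDs by which the measured value differs from predicted."""
    return (measured - predicted) / rsd


def percent_predicted(measured, predicted):
    return 100.0 * measured / predicted


# Hypothetical values: predicted FEV1 4.00 L with an RSD of 0.50 L.
lln_young = lower_limit_of_normal(4.00, 0.50)
# Same RSD, but a smaller predicted value (eg, an older, shorter subject).
lln_old = lower_limit_of_normal(2.50, 0.50)

# The LLN expressed as percent of predicted differs between the two,
# which is why a single fixed cutoff such as 80% introduces bias.
print(round(percent_predicted(lln_young, 4.00), 1))
print(round(percent_predicted(lln_old, 2.50), 1))
```

With a constant RSD, the LLN is a smaller fraction of predicted when the predicted value itself is smaller, so a fixed 80% cutoff would label some healthy subjects with small predicted values as abnormal.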
Almost all pulmonary function systems are computerized, so it is very easy to calculate predicted values, even if the regression equations are complex or require look-up tables. Modern statistical methods are providing more precise equations to estimate the limits of normal for pulmonary function variables. The lambda, mu, sigma (LMS) technique, which has been used to generate growth charts for children, can be applied to pulmonary function data in healthy subjects. This method allows modeling of not only the predicted value, but of the changing variability and skewness of the data because of age and height. As a result “all-age” equations are becoming available, which can reliably estimate a mean predicted value and LLN from childhood into adolescence and throughout adulthood.28
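The way an LMS model yields an LLN and a z-score can be sketched as follows. The formulas are the standard LMS (Box-Cox) transformations; the L, M, and S values passed in are hypothetical, and in a real all-age equation each of them would itself be a smooth function of age and height, which this sketch does not attempt to model.

```python
import math

# Sketch only: L (skewness), M (median), and S (coefficient of variation) are
# hypothetical inputs here; published equations derive them from age, height, sex.


def lms_value_at_z(L: float, M: float, S: float, z: float) -> float:
    """Return the measurement that corresponds to z-score z under an LMS model."""
    if abs(L) < 1e-12:  # limiting case L -> 0 (log-normal distribution)
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


def lms_z_score(measured: float, L: float, M: float, S: float) -> float:
    """Return the z-score of a measured value under an LMS model."""
    if abs(L) < 1e-12:
        return math.log(measured / M) / S
    return ((measured / M) ** L - 1.0) / (L * S)


# Hypothetical parameters for one subject.
L, M, S = 0.8, 3.50, 0.12
lln = lms_value_at_z(L, M, S, -1.645)  # 5th percentile
print(f"predicted (median) {M:.2f} L, LLN {lln:.2f} L")
print(f"z-score of a measured 3.00 L: {lms_z_score(3.00, L, M, S):.2f}")
```

Because S and L are allowed to change with age and height, the distance between the predicted value and the LLN is not fixed, which is how the method captures changing variability and skewness.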
These new techniques for estimating the LLN have caused some re-evaluation of outcomes (morbidity, mortality, et cetera) related to pulmonary function variables, particularly in patients who have COPD. Although many guidelines for diagnosing COPD still rely on fixed cutoffs, many clinicians are now considering limits of normal that better differentiate not only the presence of disease, but its severity as well. COPD is the principal target of this re-evaluation, and it continues to be under-diagnosed and misclassified in many patients.
There is a paucity of reference studies for predicting normal lung volumes and diffusing capacity (DLCO).26 Many of those in use pre-date the testing recommendations of the ATS or ERS, and were derived from very small (< 150) numbers of healthy subjects. DLCO equations in particular reflect the differing methodologies and procedural techniques historically used. As a result, a patient's DLCO, when expressed as a percent of predicted, may span a wide range depending on the reference set chosen. The efforts of the Global Lungs Initiative (ERS) have demonstrated that many of these problems can be overcome by collating raw data from different studies and applying advanced statistical techniques.28 These efforts need to be extended to measures of lung volumes and DLCO.
Interpretation and Interpretive Strategies
Direct and Indirect Bronchial Challenges
Asthma is characterized by airway hyper-responsiveness (AHR), airway inflammation, and variable airway obstruction, but the diagnosis of asthma is often based on clinical signs and symptoms. These clinical findings (cough, wheeze, shortness of breath) are nonspecific and may represent diseases other than asthma. Pulmonary function measurements are commonly within normal limits, even when the patient does have asthma.32
One method of confirming the diagnosis of asthma is simply to perform pre- and post-bronchodilator spirometry. Even in subjects who have “normal” FEV1 and FEV1/FVC before bronchodilator, improvement after treatment demonstrates reversibility of airway obstruction and supports the diagnosis of asthma.
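The pre- and post-bronchodilator comparison can be expressed as a simple calculation. The sketch below assumes the commonly cited criterion of an increase of at least 12% and at least 200 mL from baseline; those thresholds are an assumption brought in for illustration and are exposed as parameters, and the FEV1 values are hypothetical.

```python
# Sketch only: thresholds (12% and 200 mL) are an assumed criterion, not stated
# in the text above; the FEV1 values in the examples are hypothetical.


def bronchodilator_response(pre_l: float, post_l: float,
                            min_percent: float = 12.0,
                            min_ml: float = 200.0) -> tuple[float, float, bool]:
    """Return (percent change, change in mL, whether both thresholds are met)."""
    if pre_l <= 0:
        raise ValueError("pre-bronchodilator value must be positive")
    change_ml = (post_l - pre_l) * 1000.0
    percent = 100.0 * (post_l - pre_l) / pre_l
    return percent, change_ml, (percent >= min_percent and change_ml >= min_ml)


for pre, post in ((3.00, 3.40), (1.20, 1.35)):
    percent, change_ml, significant = bronchodilator_response(pre, post)
    print(f"FEV1 {pre:.2f} -> {post:.2f} L: {percent:+.1f}%, {change_ml:+.0f} mL, "
          f"meets both thresholds: {significant}")
```

The second example shows why both a relative and an absolute threshold are usually applied together: a small baseline FEV1 can produce a large percentage change from a volume change that is within test-to-test variability.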
AHR is a complex, but characteristic, feature of asthma. AHR can be conceptualized as having “persistent” and “variable” components. The persistent component includes structural changes (eg, smooth muscle hypertrophy, mucosal thickening). The variable component has been ascribed to inflammatory changes in the airway. There is likely considerable overlap between these factors. Methacholine and histamine, which act directly on smooth muscle, can be viewed as agonists that mediate through the structural component.33 Indirect agonists, such as exercise, mannitol, or hypertonic saline, act on the variable component, causing mast cells to release the mediators associated with the inflammatory response.34
Methacholine challenge testing shows considerable variability in the dose required to induce a 20% fall in FEV1. Management of asthmatics using inhaled corticosteroids reduces AHR as measured by methacholine, and histological changes accompany improved symptoms and reduced exacerbations. Mannitol challenge is a new tool that provides an indirect stimulus to detect AHR. It appears to do this by causing osmolar changes in the mucosa of the airways. Comparing methacholine to mannitol, responses from asthmatic subjects overlap, but with significant variability. This variability indicates that the provocative agents may be acting on the different factors (structural vs variable) causing AHR. These findings suggest that perhaps both direct and indirect challenges may be needed to confirm the diagnosis, if one challenge is negative but the pre-test probability of asthma is high.
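The provocative dose or concentration that produces a 20% fall in FEV1 is rarely hit exactly by one of the administered steps, so it is usually estimated by log-linear interpolation between the last two steps. The sketch below shows that interpolation; the function name and the example values are hypothetical, and the same arithmetic applies whether the steps are expressed as concentrations (PC20) or cumulative doses (PD20).

```python
import math

# Sketch only: example step sizes and FEV1 falls are hypothetical.


def provocative_level_20(level_1: float, level_2: float,
                         fall_1: float, fall_2: float) -> float:
    """Interpolate (on a log scale) the dose or concentration giving a 20% fall in FEV1.

    level_1 / fall_1: second-to-last step and its percent fall in FEV1 (< 20)
    level_2 / fall_2: last step and its percent fall in FEV1 (>= 20)
    """
    if not (0 < level_1 < level_2):
        raise ValueError("levels must be positive and increasing")
    if not (fall_1 < 20.0 <= fall_2):
        raise ValueError("the two steps must bracket a 20% fall")
    log_level = math.log10(level_1) + (
        (math.log10(level_2) - math.log10(level_1)) * (20.0 - fall_1) / (fall_2 - fall_1)
    )
    return 10.0 ** log_level


# Hypothetical challenge: 10% fall at 1 mg/mL, 30% fall at 4 mg/mL.
print(f"PC20 = {provocative_level_20(1.0, 4.0, 10.0, 30.0):.2f} mg/mL")
```

Because the interpolation is done on a logarithmic scale, the estimate in this example lies at the geometric midpoint of the two steps rather than the arithmetic midpoint.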
Other biomarkers may be useful in supporting the diagnosis of asthma; these include measurement of exhaled NO (FENO) and analysis of sputum eosinophils. Measuring FENO is noninvasive and standardized, but its usefulness in the clinical setting is still actively debated. Elevated FENO appears to be a sensitive indicator of eosinophilic airway inflammation, but may be within normal limits if the patient has been using inhaled corticosteroids or has very mild asthma.35 Elevated sputum eosinophils have been found to correlate with positive responses to mannitol, but not to methacholine, consistent with airway inflammation rather than structural changes. In the patient with asthmatic symptoms, responses to either direct or indirect stimuli modify the pre-test probability, but do not make the diagnosis of asthma.
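How a challenge result modifies, rather than settles, the probability of asthma can be illustrated with likelihood ratios. The sketch below uses the standard odds form of Bayes' theorem; the sensitivity, specificity, and pre-test probability are hypothetical values chosen only to show the arithmetic and do not describe any particular challenge agent.

```python
# Sketch only: sensitivity, specificity, and pre-test probability are hypothetical.


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    return sensitivity / (1.0 - specificity), (1.0 - sensitivity) / specificity


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


lr_pos, lr_neg = likelihood_ratios(sensitivity=0.60, specificity=0.95)
pre_test = 0.70  # clinician's estimate before the challenge
print(f"after a positive challenge: {post_test_probability(pre_test, lr_pos):.2f}")
print(f"after a negative challenge: {post_test_probability(pre_test, lr_neg):.2f}")
```

With these illustrative numbers a negative result still leaves a probability of roughly one half, which is the situation in which a second challenge of the other type might be considered.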
Phenotypes in COPD
Phenotypes have historically been viewed as disease attributes that describe important differences in patients who have the disorder. The current definition of the term includes the idea that a particular phenotype relates to not just pathophysiology but to clinical outcomes.36 For COPD, phenotyping allows estimation of morbidity and mortality as well as guiding therapeutic decisions. Pulmonary function variables such as FEV1 explain less than 25% of the symptoms that patients report; multi-dimensional tools such as the BODE index include phenotypes that add to the physiologic parameters measured by PFTs.
One phenotype of special importance is bronchodilator response; for many clinicians lack of a response in an adult who smokes rules out asthma (in favor of COPD). Several studies, including the large UPLIFT protocol, have looked at bronchodilator responsiveness in patients who have COPD. In general, PFT phenotypes alone were not able to differentiate asthma from COPD. Cluster analysis (identifying patients who have similar symptoms, et cetera) may be more useful for making clinical decisions than simple pathophysiologic models. Although the GOLD definition of COPD requires post-bronchodilator spirometry, this may not be the sole answer. Reversibility may not be evident from a single test, and almost all available reference sets include only pre-bronchodilator values. In terms of outcomes, the highest FEV1 attainable correlates with mortality, whether pre- or post-bronchodilator.
Air-flow limitation (post-bronchodilator) by itself does not necessarily equal COPD. When exacerbations are correlated with FEV1 values, there seems to be evidence that even small decreases in pulmonary function are associated with clinically important outcomes. DLCO has been used to further characterize air-flow limitation due to emphysematous changes, rather than asthma or bronchitis. Imaging technologies, such as CT, have suggested that it is possible to have emphysema (with a low DLCO) but not have frank obstruction demonstrated by spirometry. Even though a low DLCO suggests emphysema, a normal finding does not rule it out. Normal spirometry and lung volumes with a low DLCO are likely to be found in pulmonary vascular disease, but emphysema has to be considered as well.
Measurement of lung volumes is not required to make the diagnosis of COPD, but may be useful in the context of phenotyping.12 Patients who have a decreased inspiratory capacity appear to have worse outcomes, as do those who have an increased RV. The role of PFTs in identifying patients who may be at increased risk for lung cancer or for having alpha-1 antitrypsin deficiency is just beginning to be established.
PFTs and Therapeutic Decisions
Pulmonary function and cardiopulmonary exercise tests are widely used to evaluate patients who might be candidates for resectional surgery.37 The American and British thoracic societies have each developed algorithms to assess perioperative risk when thoracotomy, pneumonectomy, or lobectomy are likely. Each of these algorithms proceeds from spirometry and DLCO to CPET. Other measures of exercise capacity, such as the 6MWT, have been used in specific patient populations. Unfortunately, most of these approaches have not been prospectively validated, and some suffer from less than ideal research methodologies. There is little evidence to support preoperative pulmonary function testing in nonthoracic surgical candidates. Spirometry may be useful for confirming the presence and severity of COPD in patients who need major abdominal surgery.38
Spirometry and DLCO have been used to evaluate the effects of both chemo- and radiation therapy. However, most of the published literature consists of case studies, which identify the sequelae of the therapies rather than predict their risks. Some studies suggest that PFTs are helpful in predicting risk, while others indicate that they are not. In both cases, the investigations have been seriously limited because there is no widely accepted definition of “toxicity.”
Pulmonary function testing may or may not be useful in the management of patients who are hospitalized. Spirometry plays no role in managing infections such as community-acquired pneumonia. Evaluation of hospitalized patients with COPD would seem to be an obvious indication, but the ATS/ERS 2004 guidelines make no recommendations about using spirometry to determine the need for hospitalization or for management once a patient is admitted. The most recent update of the GOLD guidelines does suggest using spirometry to grade severity of COPD exacerbations, with hospital admission as a possible course. Neither guideline offers advice on using spirometry to manage inpatients or to decide when to discharge them. It should not be surprising that very few hospitalized COPD patients receive spirometry. Both the Global Initiative for Asthma (GINA) guidelines2 and the National Heart, Lung, and Blood Institute (NHLBI) Expert Panel Report 3 (EPR-3)39 recommend spirometry for managing exacerbations of asthma that require emergency treatment or hospitalization, including specific values for FEV1 and PEF as indications for admission and response to treatment. These recommendations are based on expert opinion, with minimal supporting literature. There have been a few studies of the use of spirometry and PEF in emergency departments, but their general conclusion is that most asthmatics do not get these objective measures as part of management.
Physicians who use spirometry to manage COPD or asthma vary in whether lung function measurements actually influence their management decisions. In several studies, spirometry did affect medication changes as well as decisions about ordering additional diagnostic tests.40,41 However, the majority of patients are treated without management being based on spirometry. Whether pulmonary function tests actually change outcomes, whether or not they are used for management decisions, remains unclear.
Pulmonary function testing does play a role in selecting patients for lung transplantation. The Lung Allocation Score (LAS), used to assess transplant benefit and wait-list urgency, includes FVC % predicted and 6MWD in the composite score. The International Society for Heart and Lung Transplantation recognizes pulmonary function measures in specific disease categories, including the BODE index for COPD, FEV1 for cystic fibrosis, and DLCO in interstitial pulmonary fibrosis. The use of composite indices, such as the BODE index, should result in better models for predicting risk.
PFT Lab of the Future
Pulmonary function laboratories in the future will continue to produce results that bridge a wide range of physiologic variables related to the lungs. To date, the goals of most PFTs have been to define abnormality and to quantify the extent of disease, with emphasis on what type of disease might be causing the patient's complaint. This approach will probably not be sufficient in the future.42 The ultimate goal of each PFT should be to predict outcomes, such as quality of life, morbidity/mortality, or the risks and benefits of a particular intervention. This will require not only very accurate and repeatable tests, but integrating the results into clinical decision making.
There are a number of paths that need to be followed in order for these goals to be achieved. Standards and guidelines have improved the equipment and procedures that are used in the PFT lab. Similar emphasis should be placed on assuring the competence of individuals working in pulmonary function laboratories, and on the quality of the labs themselves. Voluntary efforts in this regard have not matured, but third-party reimbursement for sophisticated and sometimes expensive tests will likely demand formal accreditation. Interpretation of PFTs needs to be timely, based on the best possible reference values, and performed by clinicians who understand pulmonary physiology.10 Advanced software can assist in making valid interpretive statements, but it is ultimately the clinician who is responsible for linking test results to patient outcomes. Unfortunately, in many hospitals the role of PFT interpreter is determined by budgetary constraints rather than excellence in providing answers to the clinical questions being asked. Pulmonary function interpretation needs to be integrated into clinical decision making in a timely fashion. Additional outcome-based studies are needed so that PFT results can be used for individual patient management (risk/benefit of therapy, prognosis, or similar analyses).
With changes in healthcare and the potential addition of millions of patients, pulmonary function labs will have to expand services, potentially with lower reimbursement, and do so without sacrificing quality.43 This type of expansion can result in many more cases of false positives (low lung function but no disease). This makes it even more important that increased diagnostic testing leads to improved outcomes.
Tests that may play a role in the future PFT lab involve some technologies that are not new. These include the FOT,18 enhanced use of transcutaneous PO2 and PCO2 sensors (for adults as well as children), uptake of NO to measure DLNO (membrane transfer factor),44 impedance cardiography to measure cardiac output,19 and analysis of exhaled NO (FENO) to assess airway inflammation.35 Some technologies are on the horizon that are “brand new” and may have a role in the PFT lab of the future. These are listed in Table 2. There are substantial barriers to getting these new technologies into practice, not the least of which is reimbursement. Pulmonary function testing currently provides cost-effective diagnostics, but to do so in the future will require that the test results be related to beneficial outcomes.
Summary
Pulmonary function testing is an important diagnostic specialty, particularly with regard to diseases such as COPD, asthma, and interstitial lung disease. As new and better therapies for a wide array of pulmonary disorders become available, pulmonary function testing should play an increasing role in identifying and quantifying the physiologic changes attributable not just to the disease, but to the treatment as well. This will require attention to the details of laboratory quality assurance, highly motivated and well-trained physicians and technologists, and a willingness to explore new and better ways of improving patient outcomes.
Footnotes
- Correspondence: Gregg L Ruppel MEd RRT RPFT FAARC, Pulmonary Function Laboratory, Saint Louis University Hospital, 3635 Vista Avenue, St Louis MO 63104. E-mail: ruppelgl{at}slu.edu.
- Mr Ruppel presented a version of this paper at the 48th Respiratory Care Journal Conference, “Pulmonary Function Testing,” held March 25–27, 2011, in Tampa, Florida.
- Mr Ruppel has disclosed relationships with Medical Graphics, Gilead Sciences, Biomedical Systems, and GlaxoSmithKline. Dr Enright has disclosed no conflicts of interest.
- Copyright © 2010 by Daedalus Enterprises Inc.