Abstract
BACKGROUND: The 2017 American Thoracic Society/European Respiratory Society (ATS/ERS) diffusing capacity of the lung for carbon monoxide (DLCO) standards specify a control rule for assessing biologic quality control (BioQC) but provide limited guidance on how to establish expected values for control rule variables. This study aimed to determine expected values for DLCO BioQC using coefficient of variation (CV) and to determine whether the mean ± 2 SD control rule yields the same precision as mean ± 12% of the mean.
METHODS: DLCO BioQC data were collected from a multi-center inhaled medication study. This descriptive study spanned 42 months ending in 2018. The annual DLCO CV was based upon 10 DLCO values separated by at least 5 d. The root mean square CV (RMSCV) was computed for each year and Friedman test evaluated within subject annual CV changes. Ninetieth percentile values were computed for annual control rule limits/mean DLCO.
RESULTS: Of 217 BioQCs, the study's first year had 168 subjects with fewer in subsequent years. Annual CV values from RMSCV were 5.3, 4.5, and 4.6% in years 1, 2, and 3, respectively. No significant change was seen in the CV for the 24 subjects with data for all 3 years (P = .07). The 90th percentile values of 2 SD/mean DLCO were 15, 12.4, and 11% in years 1, 2, and 3, respectively.
CONCLUSIONS: A DLCO BioQC CV ≤ 6% is achievable across multiple sites, technologists, and brands of equipment. This CV value assures that measurements for control rule variables emerge from an expected range. A control rule of mean ± 2 SD appeared to yield similar results as the mean ± 12% of the mean rule reported in the 2017 ATS/ERS DLCO standards.
Introduction
The diffusing capacity of the lung for carbon monoxide (DLCO) in a single-breath-hold maneuver measures the lung's ability to transfer gas molecules from alveolar space into pulmonary vasculature.1 DLCO is routinely measured in pulmonary function testing and is a noninvasive surrogate of oxygen uptake in the lung. However, calibration and quality control (QC) are vital to ensure the accuracy and precision of DLCO measurements.1,2 Equipment and performance errors may affect medical decisions, resulting in harm to patients through incorrect or delayed treatment.3 Thus, clinical pulmonary function laboratories (PFLs) require strong QC programs to assure their systems perform as expected.
Four components of planning a QC strategy include (1) selecting appropriate control materials (subjects in the case of a biologic QC [BioQC]), (2) determining expected values for control materials, (3) timing of control measurements, and (4) control rules to identify out-of-control situations.3 The 2017 American Thoracic Society/European Respiratory Society (ATS/ERS) DLCO standards recommend conducting BioQC on a weekly basis to assess equipment variation, technologist performance, and variations in human physiology. PFLs recruit healthy biologic control subjects from laboratory personnel or others who can perform DLCO maneuvers on a routine basis.1 The Clinical and Laboratory Standards Institute (CLSI) states that control materials should have an established target mean and SD and recommends 10 measurements made on separate days to establish the initial target value.3 The 2017 ATS/ERS DLCO standards do not specify how to establish the target mean DLCO but imply technologists can use the mean of 6 values. The standards provide limited guidance for the second QC strategy step, establishing expected values for control materials. Established values need both accuracy and precision. Scientists use the coefficient of variation (CV) to assess precision when the measurement's mean and SD depend upon each other in a single sample4 and the root mean square CV (RMSCV) to estimate a population's precision.5 A defined CV assures that measurements for DLCO BioQC occur in an expected range, confirming the DLCO system is in-control and the BioQC performed the procedure consistently.4
Several studies of DLCO data from healthy subjects found a CV or RMSCV ranging from 5–10%.6–9 These studies employed a variety of time frames, laboratories, testing systems, along with various levels of technologist experience and technical oversight as shown in Table 1. A separate study explored the target mean DLCO by using either a single value, the mean of 6 values, or mean of all values after assuring the equipment passed DLCO simulator testing with QC oversight. Study data utilized the mean of the first 2 acceptable efforts that varied by ≤ 15%. The authors concluded the first 6 values established an appropriate mean.10 Most PFLs lack DLCO simulator equipment and access to external QC oversight to assure accuracy. Therefore, technologists unknowingly using inaccurate equipment could establish an errant mean DLCO in the absence of a defined CV that assesses the control's expected values.
Next, QC programs need to establish control rules to identify out-of-control conditions. The Association for Respiratory Technology and Physiology (ARTP) recommends computing control limits using the mean ± 2 SD,11 which is consistent with guidelines from the CLSI.3 Additionally, blood gas QC programs use this strategy, making it familiar to clinicians.
The 2017 ATS/ERS DLCO standards recommend an alternative BioQC control rule of mean ± 12% of the DLCO mean. This recommendation was based upon a single study with data derived from 162 different sites. It included 288 BioQC subjects without self-reported lung disease, who could be current or former smokers. Data were collected over periods of 6 months to 5 y and included only one equipment brand for measuring DLCO. The researchers found a mean intersession variability of 12.3% at the 90th percentile.10
The 2017 ATS/ERS DLCO standards lack a measure of precision to confirm that measurements used in control rules occur in an expected range and thus fail to confirm that the system is in-control before computing a BioQC's target mean DLCO. Further, the control rule of mean ± 12% of the mean in the 2017 ATS/ERS DLCO standards was based upon a single study,1 but the mean ± 2 SD control rule is being used in guidelines11 and clinical practice. Therefore, this study's first aim was to establish the precision for expected values in DLCO biologic control materials using CV. The second aim was to compare two control rules: the mean ± 2 SD and the 2017 ATS/ERS DLCO standards' mean ± 12% of the mean.1
QUICK LOOK
Current knowledge
The 2017 American Thoracic Society/European Respiratory Society (ATS/ERS) diffusing capacity of the lung for carbon monoxide (DLCO) standards recommend conducting weekly biologic quality controls (QCs) (BioQCs) but do not specify how to establish the BioQC's target mean DLCO. Further, the standards do not specify coefficient of variation (CV) values to confirm control rule measurements are derived from an expected range.
What this paper contributes to our knowledge
Technologists can achieve a CV ≤ 6% for the initial DLCO BioQC mean across a wide range of sites, personnel, and equipment. The mean ± 2 SD control rule appears to provide results similar to those of the mean ± 12% of the mean control rule recommended in the 2017 ATS/ERS DLCO standard.
Methods
This descriptive study was a secondary analysis of data collected as part of a large, global pharmaceutical study, NCT02242487. The BioQC data were collected over the course of 42 months, concluding in 2018. The DLCO was measured in 217 BioQC participants from 114 PFLs located in North America, Europe, and Israel. Five manufacturers of DLCO equipment were included. This study received institutional review board approval from Rush University Medical Center, Rush University, Chicago, Illinois, ORA number 19032007.
On-site training was provided to all sites by 5 prequalified trainers with > 10 y of experience in PFLs. Additionally, trainers held Registered Respiratory Therapist (RRT) and/or Registered Pulmonary Function Technologist (RPFT) credentials or equivalent based upon residing country. Training included verification of DLCO equipment settings, medical gas accuracy, test method, and equipment function. Acceptable syringe DLCO check and DLCO simulation studies with Hans Rudolph DLco Simulator with EasyLab Software (Hans Rudolph, Shawnee, Kansas) were confirmed prior to BioQC test performance. Simulation data were in-control when DLCO variables (DLCO, inspiratory vital capacity, alveolar volume, expired carbon monoxide, and expired tracer gas concentrations) were within 10% of the expected values. Each trainer completed slow vital capacity and DLCO measurements as a mock BioQC subject during the site training to confirm the technologists' adherence to the 2005 ATS/ERS DLCO standards12 and consistent results between the trainer's DLCO result and historical values. BioQC data including spirometry and DLCO were submitted weekly for central review throughout the study. Data collection was not controlled for diurnal variation as assessment for a suspected out-of-control situation can occur at any point in the day. Reviewers with 10–40 y of experience and RRT and/or RPFT credentials used a standardized process to evaluate the data. BioQC data were accepted if calibration, syringe DLCO check, and DLCO simulations were in-control and met the 2005 ATS/ERS test performance and repeatability standards.12 The first two acceptable and repeatable DLCO trials were averaged.
The conceptual model used for this study appears in Figure 1. Sites submitted 10 d of BioQC testing to evaluate whether the measurements met an expected CV ≤ 7%. If the initial CV was > 7%, troubleshooting was completed, and an additional 10 d of BioQC data were collected. Demographic data were extracted from reports for BioQC participants' age, sex, height, and weight.
All acceptable data were compiled in an Excel spreadsheet (Microsoft, Redmond, Washington), and BioQC values obtained fewer than 5 d apart were deleted so that retained values were separated by at least 5 d, allowing for variability. Measures taken over a few months provide better estimates of SD as they integrate the changes that occur with environmental factors, routine maintenance, and other sources of variation such as physiology.3 Analysis was conducted on individuals where there was a minimum of 10 DLCO BioQC measures within a 12-month period. A CV was computed for each participant for each year of data using the first 10 measurements meeting the criteria described above by dividing the SD by the mean. Next, we computed an average CV value for each year's subjects using RMSCV as shown in Equation 1. When pooling precision data for a group, the best unbiased estimate of the population variance is SD² and not SD.5,13 Thus, we squared each CV, averaged the squared values, then computed the square root to generate the RMSCV.

\[ \mathrm{RMSCV} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} CV_i^{2}} \qquad (1) \]

where CV_i is the CV of an individual BioQC and n = the total number of subjects in the year.
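The data preparation and precision steps described above can be sketched in Python as follows. This is a minimal illustration and not the authors' spreadsheet workflow: the function names, the sample SD (n − 1 denominator), the rule of keeping a value only when it falls at least 5 d after the last retained value, and all example numbers are assumptions made for the sketch. The 7% threshold is the troubleshooting trigger named in the conceptual model.

```python
import math
import statistics
from datetime import date


def space_measurements(records, min_gap_days=5):
    """Keep (date, DLCO) records that fall at least min_gap_days after the last kept one.

    Assumption: spacing is judged against the most recently retained record.
    """
    kept = []
    for day, dlco in sorted(records):
        if not kept or (day - kept[-1][0]).days >= min_gap_days:
            kept.append((day, dlco))
    return kept


def coefficient_of_variation(values):
    """Return one subject's CV in percent: SD divided by the mean, times 100."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator); an assumption
    return sd / mean * 100.0


def rmscv(cvs):
    """Square each CV, average the squares, then take the square root (Equation 1)."""
    return math.sqrt(sum(cv ** 2 for cv in cvs) / len(cvs))


def needs_troubleshooting(cv, threshold=7.0):
    """Flag a subject whose initial CV exceeds the 7% threshold used in this study."""
    return cv > threshold


# Hypothetical DLCO values (mL/min/mm Hg), 10 spaced sessions per subject.
subject_a = [23.1, 24.0, 22.8, 23.6, 24.3, 23.0, 23.9, 22.5, 23.4, 24.1]
subject_b = [30.2, 28.7, 31.5, 29.9, 27.8, 30.8, 29.1, 31.0, 28.4, 30.5]

cvs = [coefficient_of_variation(s) for s in (subject_a, subject_b)]
year_rmscv = rmscv(cvs)
flags = [needs_troubleshooting(cv) for cv in cvs]
```

Because the squares are averaged before the root is taken, the RMSCV is never smaller than the arithmetic mean of the individual CVs, so subjects with high variability weigh more heavily in the pooled estimate.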
Additionally, the control limit in the 2017 ATS/ERS DLCO standard, which was ≤ 12% of the mean,1 was compared with the 2 SD control limit in the current study. Therefore, this study's control limit of 2 SD was divided by the mean as indicated in Equation 2 to allow direct comparisons between studies.

\[ \text{Control limit (\% of mean)} = \frac{2 \times SD}{\bar{x}} \times 100 \qquad (2) \]

where SD = standard deviation and \bar{x} = the DLCO mean, both from the first 10 values of each year.
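A short Python sketch may help make the two control rules concrete. It is illustrative only: the function names and the baseline values are hypothetical, and the sample SD is assumed. It computes both sets of limits from the same baseline, expresses the 2 SD limit as a percentage of the mean as in Equation 2, and checks a new BioQC value against each rule.

```python
import statistics


def control_limits_2sd(baseline):
    """Lower and upper limits for the mean +/- 2 SD rule."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)  # sample SD; an assumption
    return mean - 2 * sd, mean + 2 * sd


def control_limits_pct(baseline, pct=12.0):
    """Lower and upper limits for the mean +/- pct% of the mean rule."""
    mean = statistics.mean(baseline)
    half_width = mean * pct / 100.0
    return mean - half_width, mean + half_width


def two_sd_pct_of_mean(baseline):
    """The 2 SD control limit expressed as a percentage of the mean (Equation 2)."""
    return 2 * statistics.stdev(baseline) / statistics.mean(baseline) * 100.0


def in_control(value, limits):
    """True when a new BioQC value lies within the given limits."""
    low, high = limits
    return low <= value <= high


# Hypothetical baseline of 10 DLCO values (mL/min/mm Hg) for one BioQC subject.
baseline = [23.1, 24.0, 22.8, 23.6, 24.3, 23.0, 23.9, 22.5, 23.4, 24.1]
new_value = 25.9

limits_sd = control_limits_2sd(baseline)
limits_pct = control_limits_pct(baseline)
status = (in_control(new_value, limits_sd), in_control(new_value, limits_pct))
```

For a subject with a low CV, the 2 SD limits are narrower than the fixed 12% band, so the same new value can be flagged by one rule and accepted by the other.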
Data Analysis
Descriptive statistics characterized demographic variables, DLCO, CV, and RMSCV for each year of participation. We tested the relationship between each year's DLCO mean and SD through Spearman rho or Pearson r based upon the distribution's normality. Additionally, Friedman test evaluated the intra-subject variation in CV across the 3 y using α = 0.05. The 90th percentile was computed for each year's 2 SD control limit's percentage of mean DLCO to directly compare this study's results with the 90th percentile values from a prior study.10 Data analysis was performed using SPSS Version 26.0 (IBM, Armonk, New York).
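For readers without access to SPSS, the Friedman statistic and the 90th percentile can be approximated with the Python standard library. The sketch below rests on stated assumptions and is not a reproduction of the study's analysis: the function names and data are hypothetical, no tie correction is applied to the Friedman statistic, and the percentile uses the (n + 1)p interpolation rule. Statistical packages differ in their percentile conventions, so small differences from other software are possible.

```python
import statistics


def average_ranks(row):
    """Rank one subject's values from 1 to k, giving tied values their average rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1  # mean of the 1-based positions in the tie group
        for index in order[pos:end + 1]:
            ranks[index] = average
        pos = end + 1
    return ranks


def friedman_q(rows):
    """Friedman statistic for n subjects (rows) by k years (columns), no tie correction."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for column, rank in enumerate(average_ranks(row)):
            rank_sums[column] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


def percentile_90(values):
    """90th percentile using the (n + 1)p interpolation rule."""
    return statistics.quantiles(values, n=10, method="exclusive")[-1]


# Hypothetical annual CVs (%) for four subjects in years 1, 2, and 3.
annual_cvs = [
    [5.1, 4.2, 3.9],
    [4.4, 4.6, 3.1],
    [6.0, 3.8, 3.5],
    [3.9, 3.7, 4.0],
]
q = friedman_q(annual_cvs)

# Hypothetical 2 SD limits as a percentage of the mean for one year's subjects.
limit_percentages = [6.8, 7.4, 8.1, 8.9, 9.5, 10.2, 10.8, 11.6, 12.9, 14.7, 15.3]
p90 = percentile_90(limit_percentages)
```

The Friedman statistic works on within-subject ranks rather than raw CVs, which is why it suits repeated annual measurements whose distribution may not be normal.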
Results
Data from 217 participants were reviewed, of which 168 (77%) completed 10 or more DLCO BioQCs in their first year, and 89 (41%) and 28 (13%) did so in years 2 and 3, respectively. Most participants were female (65.5%) with a mean age of 46.3 y and median DLCO of 23.6 mL/min/mm Hg. DLCO equipment from 5 manufacturers was included: Vmax (Vyaire Medical, San Diego, California), n = 67, 39.9%; ndd EasyOne Pro (ndd Medical Technologies, Andover, Massachusetts), n = 34, 20.2%; Jaeger (Vyaire Medical, San Diego, California), n = 29, 17.3%; MGC Diagnostics Corporation (St. Paul, Minnesota), n = 29, 17.3%; and KoKo, formerly NSpire Health (Longmont, Colorado), n = 9, 5.4%. Complete demographic data appear in Table 2.
Moderate correlations between the mean and SD emerged in the first 2 years, with rs = 0.503, P < .001, n = 168; rs = 0.493, P < .001, n = 89; and r = 0.373, P = .051, n = 28 in years 1, 2, and 3, respectively. The median CV for DLCO BioQC was near 4%, and the RMSCV values were consistently < 6% across all years. These data are summarized in Table 3. Friedman test showed no significant change in CV (medians 4.33, 3.76, and 3.38% for years 1, 2, and 3, respectively) for the 24 participants with data for all 3 y, Q = 5.25, degrees of freedom = 2, n = 24, P = .07. The discrete data from all participants over 3 y appear in Figure 2.
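The reported P value can be cross-checked by hand. Assuming the Friedman statistic was referred to a chi-square distribution with 2 degrees of freedom (the usual large-sample approximation; the source does not state which method the software applied), the tail probability has a closed form:

```latex
\[
P = \Pr\left(\chi^{2}_{2} \ge Q\right) = e^{-Q/2} = e^{-5.25/2} = e^{-2.625} \approx 0.072
\]
```

This agrees with the reported P = .07.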
The control limit's percentage of DLCO mean was computed for each year. The resulting 90th percentile values were 15, 12.4, and 11% for years 1, 2, and 3, respectively.
Discussion
This study showed that a CV ≤ 6% establishes the expected precision for DLCO biologic control materials among a diverse population of subjects, sites, and equipment. Moderate correlation between the mean and SD DLCO measures supported the use of CV to assess precision. The CV values remained stable over the course of the study. Further, this study's mean DLCO ± 2 SD control rule yielded similar results as the mean ± 12% of the mean control rule recommended in the 2017 ATS/ERS DLCO standards.
Ideally BioQCs achieve low CV values to limit the amount of equipment and procedural error that could impact each person's test results. Lower CV values can be obtained by a single individual testing using the same brand of equipment in the same lab. A retrospective observational study of persons referred to a university clinic evaluated intra-session DLCO. Subjects with normal lung function (n = 821) had a CV = 3.09%.14 Another study (n = 1) had a healthy BioQC test weekly for 3 y with the same instrument and yielded a CV = 5.4%.6 Studies including multiple laboratories (3, 5, and 33 laboratories) yielded higher CVs near 5.0,7 4.5–9.8,8 and 6.0%,9 respectively. The highest CV range occurred in the study that included the most diverse equipment. The current study's CV falls in a similar range as prior studies but is notable for achieving this relatively low value despite a greater number of laboratories, 5 different equipment brands, and diverse geographic locations. Thus, a CV ≤ 6% should be achievable for all laboratories given the RMSCV findings from years 1–3. The Canadian Diagnostic Accreditation Program standards15 use a BioQC DLCO CV ≤ 5%. This lower CV likely resulted from employing QC practices with test quality oversight and mentoring for many years. Thus, tighter control values could be consistently achieved in a single laboratory and potentially improve earlier detection of equipment errors.
The stability of the CV over time has been questioned. This study and one other using SD to assess variability10 showed a lack of significant change over 3 or more years. Only 24 subjects in the current study had sufficient data for a 3-y comparison. These subjects had median CVs ≤ 4.3% across all 3 y, suggesting they had experience to effectively maintain their equipment. Figure 2 also shows the impact of experience. Note that most persons with a high CV in year 1 were able to decrease their values in years 2 and 3. Thus, there appears to be a trend toward lowering CV with experience; however, our data did not permit us to confirm this observation with statistics. In any case, technologists should strive to achieve the lowest CV possible and re-evaluate their values annually.
The 2017 ATS/ERS DLCO standards were not clear about how to establish the mean DLCO. Both CLSI3 and ARTP11 recommend that in the absence of historical QC data 10 measurements on separate days create a reasonable initial mean. However, use of measurements made over the first several months gives a better estimate because it accounts for longer-term sources of variability.3 In this study, a CV > 7% prompted a review of the technical procedure and/or equipment for error correction. Using a measure of precision to establish an expected value combined with consistent feedback from central review likely influenced the relatively low CV (5%) across the study.
After establishing the expected values, technologists create control rules to determine out-of-control conditions. QC rules should result in a low false-positive rate while detecting large enough error conditions that affect patient care decisions.3 The ARTP 2020 standards11 and McCormack16 recommend using the mean ± 2 SD as the control rule. The second aim of this study was to compare the use of ± 2 SD from the mean as an equivalent control rule to the current 2017 ATS/ERS DLCO standards. The 2017 standards were based upon a study that found values at the 90th percentile differed from the mean by 12.25% when the mean was an average of the first 6 measurements. The difference from the mean fell to 10.91% at the 90th percentile when the mean was an average of all measurements.10 Our data using 2 SD from the mean from years 2 and 3 were similar to the Hegewald et al10 study with values 12.4% and 11%, respectively, at the 90th percentile. Thus the control rule mean ± 2 SD in the current study appears to be consistent with prior work while using a familiar QC method.
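The link between the CV findings and the control rule comparison can be stated as a simple identity. It is offered here as a reading aid and is not part of the authors' analysis: for a single BioQC, the 2 SD control limit expressed as a fraction of the mean is twice that subject's CV.

```latex
\[
\frac{2\,SD}{\bar{x}} = 2\,CV
\qquad\Longrightarrow\qquad
\frac{2\,SD}{\bar{x}} \le 12\% \iff CV \le 6\%
\]
```

Read this way, the 90th percentile values of 15, 12.4, and 11% for years 1, 2, and 3 correspond to subject-level CVs of 7.5, 6.2, and 5.5%, respectively.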
Personnel who perform BioQCs may have concerns about their test results being a violation of their medical privacy. Thus, affordable mechanical devices to accomplish the QC goal would be beneficial. A DLCO simulator verifies the accuracy and precision of the DLCO system, and its use allows early detection of instrumentation problems. For example, at the start of an inhaled insulin study, 25% of study sites failed a DLCO simulator test. However, simulator testing does not address procedural or patient variability. Patients can account for 30–60% of intra-subject variability.17 One study found that BioQC was as good as a DLCO simulator.6 This study, however, was conducted with a single machine and biologic control. Further research is needed to clarify whether BioQC alone is sufficient.
Although QC practices were implemented in some research studies for the past 20 years, they remain underutilized in clinical PFL testing.18 One pharmaceutical study tracking the training intensity needed for conducting mechanical QC (syringe loops, syringe DLCO, and DLCO simulations) and BioQC showed poor understanding of QC procedures.19 In another study of 15 PFLs, most did not have standardized BioQC procedures, and initially 43% of the machines had unacceptable accuracy.20 Technologists need to utilize principles of measurement science in their QC practices. The CLSI (https://clsi.org/. Accessed February 15, 2023) provides a well-established framework and vocabulary to guide QC in laboratory practices. These CLSI standards need to be integrated into laboratory education to assure test results accurately reflect patients' physiology, which may guide better treatment decisions, further drug development, or measure the impact of occupational exposures.
Methods for decreasing variability in DLCO have been reported in multiple studies. These include using a DLCO simulator,6,17,20,21 same brand of PFT equipment,6,7,9,22 standardizing test protocols,9,10,22 providing staff education with a written test and return demonstration,9,10,22 and review of DLCO results at a centralized site with feedback.9,10,22 Numerous authors strongly endorse a central PFT test oversight with technologist feedback as a vital part of a quality assurance program.9,10,16,18,22–27 Use of oversight practices reduced variability in this study and two others.9,22 Not all countries have centralized test oversight by an accreditation organization. A senior technologist using a standardized process can perform this important duty in a single lab or health care system.
Technologists must be aware of other sources of DLCO variability such as the technologist-subject interaction, circadian rhythms, potential underlying lung disease, fluctuating hemoglobin values, and the unique testing procedures imposed by different devices.8 Also, variability due to the ambient environment must be considered, including barometric pressure, relative humidity, and room temperature. Temperature alone will cause 0.67% error in DLCO for every 1°C increase.16 One study found 36–70% of the DLCO variability was due to instrumentation errors.21 Clinicians should not forget that reducing variability ultimately results in improving patient care.28 A suggested DLCO QC checklist appears in Table 4.
To move QC into the forefront of diagnostic testing, QC education needs to be enhanced in both training programs and clinical settings. Additionally, manufacturers could assist by creating device software to help monitor and generate BioQC data with flags for out-of-control BioQC results. Also, regulatory oversight would require that all PFLs meet quality standards and enable better patient care decisions.
Limitations
This study had several limitations. Variation in sex and age was limited because the majority of participants were female (65%) with a mean age of 46.3 y. Also, there was a decline in the participation numbers each year due to normal attrition and study closure. The CV remained stable across 3 years; however, the subjects included in the analysis had a lower mean CV than the whole population and could have been more experienced to achieve this lower CV. Thus, the impact of changing CV among less experienced technologists remains unanswered. Lastly, this study used the 2005 ATS/ERS DLCO standards;12 however, the use of 2017 ATS/ERS DLCO standards1 would not likely increase the CV.
Conclusions
Health care providers need accurate data to deliver the best possible patient care. Thus, DLCO QC standards need sufficient rigor to reflect the presence or absence of a diffusion defect. A DLCO BioQC CV ≤ 6% is achievable across multiple sites, multiple technologists, and different equipment brands. Evaluating the CV assures systems are in-control prior to establishing a control rule based upon a target DLCO mean. It does not appear to matter whether the control rule for out-of-control status is the mean ± 12% of the mean or the mean ± 2 SD as the results from both methods were similar.
The relatively low CV in this study was achievable because a team of experts, who provided training and ongoing monitoring, troubleshot systems whenever a BioQC had a CV > 7%. Laboratory oversight by qualified personnel has been shown to improve DLCO precision.
Acknowledgments
The authors thank Laurel Stewart MSc RRT for her assistance with data management during her graduate studies at Rush University.
Footnotes
- Correspondence: Ellen A Becker PhD RRT RRT-NPS RPFT AE-C FAARC, 600 S Paulina St, Suite 765, Chicago, IL 60612. E-mail: Ellen_Becker{at}rush.edu
Ms Blonshine and Mr Blonshine disclose relationships with Acorda Therapeutics. The remaining authors have disclosed no conflicts of interest.
- Copyright © 2023 by Daedalus Enterprises