Abstract
BACKGROUND: Pulse oximetry is the mainstay of patient oxygen monitoring. Measurement error from pulse oximetry is more common for those with darker skin pigmentation, yet this topic remains understudied, and evidence-based clinical mitigation strategies do not currently exist. Our objectives were to measure the rate of occult hypoxemia, defined as arterial oxygen saturation (SaO2) < 88% when pulse oximeter oxygen saturation was between 92–96%, in a racially diverse critically ill population; to analyze degree, direction, and consistency of measurement error; and to develop a mitigation strategy that minimizes occult hypoxemia in advance of technological advancements.
METHODS: We performed a multi-center retrospective cohort study of critically ill subjects.
RESULTS: Among 105,467 paired observations from 7,693 subjects, we found occult hypoxemia was more common among minority subjects. The frequency of occult hypoxemia was 7.9% among Black subjects versus 2.9% among white subjects (P < .001). Pulse oximeter measurement errors were inconsistent throughout a patient encounter, with 67% of encounters having a range of intra-subject measurement errors > 4 percentage points. In 75% of encounters, the intra-subject errors were bidirectional. SaO2 < 88% was less common at higher pulse oximeter oxygenation ranges (4.1% and 1.8% of observations among Black and white subjects, respectively, at a pulse oximeter range of 94–98%). Although occult hypoxemia was further reduced at oxygen saturation range 95–100%, hyperoxemia (partial pressure of arterial oxygen > 110 mm Hg) became more common, occurring in 42.3% of Black and 46.0% of white observations.
CONCLUSIONS: Measurement error in pulse oximetry is common for all racial groups, but occult hypoxemia occurred most commonly in Black subjects. The highly variable magnitude and direction of measurement error preclude an individualized mitigation approach. In advance of technological advancements, we recommend targeting a pulse oximetry saturation goal of 94–98% for all patients.
Introduction
In emergency departments, clinical wards, ICUs, operating rooms, and procedural suites, pulse oximeters are indispensable to assessing oxygenation for the acutely and seriously ill. Pulse oximeters use the absorption of red- and near-infrared wavelength light (at 660 nm and 940 nm, respectively) in pulsatile arterial blood to estimate arterial oxygen saturation (SaO2).1 Most approved oximeters have a reported accuracy specification between 1–2%, with blood co-oximetry serving as the accepted standard.2 However, many factors encountered during routine medical care impact the reliability of pulse oximetry.3,4
For over 35 years, it has been known that pulse oximeters are comparatively inaccurate for individuals with dark skin pigmentation.4-12 A recent study brought this issue to international attention13 and invigorated clinicians, investigators, and regulatory agencies to take action. In the study by Sjoding and colleagues,13 the investigators found that pulse oximeter readings frequently overestimated the SaO2 in hospitalized subjects. Inaccuracies were disproportionately present in subjects of color such that occult hypoxemia, defined as SaO2 < 88% among subjects with pulse oximeter saturation between 92–96%, was 3 times more likely in Black compared to white subjects.13 Subsequent studies14,15 have confirmed that occult hypoxemia is more common among patients of color and is associated with poor clinical outcomes. However, no previous study has used data related to pulse oximeter measurement error to develop an intervention designed to minimize disparities.
Our goals for this study were several-fold. First, given evidence that suggested certain pulse oximeters may be more accurate in patients of color,7 we examined the frequency of occult hypoxemia across different pulse oximeters in a racially diverse multi-center cohort of acutely ill subjects. We then examined whether the observed bias was the result of dynamic changes in hospitalized subjects by conducting analyses in steady-state conditions. Second, once confirmed, we sought to develop a health system-wide approach to mitigate racial disparities in occult hypoxemia and to avoid the potential risk of occult hyperoxemia, defined as an arterial oxygen partial pressure > 110 mm Hg,16 in advance of needed technological advancements. We conducted a series of analyses to understand magnitude, direction, and consistency of measurement bias to inform the development of a systematic approach to mitigate the development of occult hypoxemia, especially for patients with darker skin pigmentation, for whom Black race is often used as a proxy.
QUICK LOOK
Current Knowledge
Pulse oximetry is the standard for oxygen monitoring of critically ill patients, but inaccuracies have been reported for dark-skinned patients as a result of pulse oximeter measurement error. This is a serious limitation of commonly used respiratory monitoring devices that has been known for over 3 decades and is the source of disparities in occult hypoxemia, which is disproportionately experienced by Black patients.
What This Paper Contributes to Our Knowledge
Occult hypoxemia was common for Hispanic and Asian/Pacific Islander subjects in addition to Black subjects. In addition, this study demonstrates that the direction and magnitude of measurement error preclude a patient-based approach to minimize disparities (ie, applying a “correction factor” would not accurately target arterial oxygen saturation). Rather, a threshold-based approach for patients of all ethnic backgrounds, in which oxygen saturations are targeted between 94–98%, is the most informed approach to minimize harms from occult hypoxemia related to pulse oximeter measurement error.
Methods
Study Design
We conducted a multi-center, observational cohort study of critically ill subjects hospitalized at Penn Medicine, located in Philadelphia, Pennsylvania, between January 1, 2019–January 15, 2021. The study was reviewed and approved as a quality improvement project by the institutional review board of the University of Pennsylvania (number 844981).
Study Population
We included patients admitted to ICUs at the Hospital of the University of Pennsylvania and Penn Presbyterian Medical Center.
Measurements of Oxygen Saturation
At the Hospital of the University of Pennsylvania, SpO2 was measured using Covidien Nellcor oximeters and OxiMax technology disposable finger sensors (Medtronic, Mansfield, Massachusetts). At Penn Presbyterian Medical Center, SpO2 was measured using Masimo pulse oximeters and LNCS disposable sensors (Masimo, Irvine, California). SaO2 was measured at both sites by multi-wavelength co-oximeters (ABL90 [Danaher, Washington, District of Columbia] at the Hospital of the University of Pennsylvania; GEM 4000 [Instrumentation Laboratory, Bedford, Massachusetts] at Penn Presbyterian Medical Center).
In a pilot study that preceded this work, we examined 40 pulse oximetry measurements at the time of arterial blood gas (ABG) at Penn Presbyterian Medical Center. ABGs were analyzed by multi-wavelength co-oximetry using a broad-spectrum spectrometer. We determined the presence of dyshemoglobins, including carboxyhemoglobinemia (COHb) and methemoglobinemia (MetHb), during the pilot. We did not observe evidence of dyshemoglobinemia, as none of the readings were out of normal range (average COHb was 0.99, and average MetHb was 0.83). As we observed bias similar in frequency to that observed by Sjoding et al13 in the pilot, we did not assess for dyshemoglobinemia in our current study.
Data Collection
We retrospectively abstracted clinical information from the electronic health record. We limited our study to critically ill patients. Consistent with prior studies, we paired SpO2 with SaO2 measurements via ABG obtained within 10 min of each other.13 We also extracted subject age, sex, race, ethnicity, and hemoglobin on the day of ABG. Subject race and ethnicity were self-reported or identified by clinical staff if clinical condition precluded subject self-report. Racial and ethnic study categories were based on National Institutes of Health recommendations17 for reporting in clinical research and related guidance.
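The pairing step described above can be sketched as follows. This is only an illustrative sketch and not the code used in the study: the function name `pair_measurements`, the input layout, and the choice to take the SpO2 reading nearest in time to each ABG are assumptions, since the text specifies only that paired values were obtained within 10 min of each other.

```python
from datetime import datetime, timedelta


def pair_measurements(spo2_readings, abg_readings, window_minutes=10):
    """Pair each ABG SaO2 with one SpO2 reading taken within the time window.

    spo2_readings: list of (timestamp, spo2_percent) tuples
    abg_readings:  list of (timestamp, sao2_percent) tuples
    Returns a list of (abg_timestamp, spo2_percent, sao2_percent) tuples.
    """
    window = timedelta(minutes=window_minutes)
    pairs = []
    for abg_time, sao2 in abg_readings:
        # Keep only SpO2 readings within the window on either side of the ABG.
        candidates = [
            (abs(t - abg_time), spo2)
            for t, spo2 in spo2_readings
            if abs(t - abg_time) <= window
        ]
        if candidates:
            # Assumption: use the reading closest in time to the ABG draw.
            _, spo2 = min(candidates, key=lambda c: c[0])
            pairs.append((abg_time, spo2, sao2))
    return pairs
```

An ABG with no SpO2 reading inside the window is simply left unpaired in this sketch.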
Outcomes
The primary outcome was frequency of occult hypoxemia, defined as SaO2 < 88% when SpO2 was within one of 3 specified normal ranges.13 For our primary analyses, we examined frequency of occult hypoxemia when pulse oximeter oxygen saturation was between 92–96%.13 We subsequently examined how frequently SaO2 was < 88% at the higher goal SpO2 ranges 94–98% and 95–100%. To understand how frequently occult hypoxemia occurs at the subject level, we reported frequency of occult hypoxemia at paired observational level and at subject level. An instance of occult hypoxemia identified at paired observational level for a given subject’s hospitalization would result in the subject being categorized as having experienced occult hypoxemia. The secondary outcome was frequency of arterial hyperoxemia, defined as partial pressure of arterial oxygen > 110 mm Hg on ABG.16
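As an illustration of how the observation-level and subject-level frequencies relate, the following minimal sketch applies the definition above to a list of paired observations. The function name, the inclusive range boundaries, and the choice of denominators (observations with SpO2 in the range, and subjects with at least one such observation) are assumptions made for this example; the text does not state them explicitly.

```python
def occult_hypoxemia_rates(observations, spo2_range=(92, 96), sao2_cutoff=88):
    """Return (observation-level rate, subject-level rate) of occult hypoxemia.

    observations: list of (subject_id, spo2_percent, sao2_percent) tuples
    Occult hypoxemia: SaO2 below the cutoff while SpO2 is inside spo2_range.
    """
    low, high = spo2_range
    # Assumption: the SpO2 range is inclusive at both ends.
    in_range = [obs for obs in observations if low <= obs[1] <= high]
    if not in_range:
        return None, None

    flagged = [obs for obs in in_range if obs[2] < sao2_cutoff]
    observation_rate = len(flagged) / len(in_range)

    # A subject counts as affected if any in-range observation was flagged.
    subjects = {obs[0] for obs in in_range}
    affected_subjects = {obs[0] for obs in flagged}
    subject_rate = len(affected_subjects) / len(subjects)
    return observation_rate, subject_rate
```

Because a single flagged pair is enough to classify a subject, the subject-level rate is always at least as high as the proportion of subjects whose every observation is flagged, and it typically exceeds the observation-level rate when subjects contribute many pairs.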
Steady-State Subgroup Analysis
Because measurements displayed on pulse oximeters might not immediately reflect an abrupt change in arterial saturation given the delay associated with signal averaging algorithms, we identified a subset of samples that had a 60-min period of SpO2 stability. Stability was defined as having all SpO2 readings within 3 percentage points of each other during the 30 min before and after paired measurements. In sensitivity analyses, we further restricted oxygen saturation range to be within 1 or 2 percentage points in the 30 min preceding, and after, paired samples.
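The stability criterion can be expressed as a simple check on the spread of SpO2 readings around a paired sample. The sketch below is illustrative only: the function name, the interpretation of "within 3 percentage points of each other" as maximum minus minimum ≤ 3, and the handling of windows with no readings are assumptions.

```python
from datetime import datetime, timedelta


def is_steady_state(pair_time, spo2_readings, max_spread=3, half_window_minutes=30):
    """Check whether SpO2 was stable in the window around a paired sample.

    pair_time:     timestamp of the paired SpO2/SaO2 measurement
    spo2_readings: list of (timestamp, spo2_percent) tuples
    max_spread:    largest allowed difference between any two readings
    """
    half_window = timedelta(minutes=half_window_minutes)
    values = [v for t, v in spo2_readings if abs(t - pair_time) <= half_window]
    if not values:
        # Assumption: without readings, stability cannot be established.
        return False
    return max(values) - min(values) <= max_spread
```

Setting `max_spread` to 2 or 1 corresponds to the stricter sensitivity analyses.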
Statistical Analysis
We conducted a series of analyses to understand magnitude, direction, and consistency of intra-subject measurement errors during a subject’s hospitalization. These analyses were used to determine whether one could reliably use the initial error measured between paired SpO2 and SaO2 values to correct all subsequent SpO2 measurements. We calculated the range and SD of errors measured from all paired samples available throughout a subject’s hospitalization to quantify the intra-subject variability in SpO2 measurement error. We used Bland-Altman plots to visually compare distribution and SD of errors measured in all available sample pairs between Black and white subjects. Bland-Altman plots depict paired differences between SpO2 and SaO2 measurements as a function of the mean value of the 2 paired oxygen saturation readings. In addition, for each subject encounter, we counted the numbers of paired observations that either overestimated or underestimated SaO2 to examine directionality and consistency. For an alternative mitigation strategy, we also examined how frequently SaO2 was < 88% at progressively higher-goal SpO2 ranges, including 94–98%18 and 95–100%. We also measured rate of hyperoxemia across these SpO2 goal ranges.
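To make the intra-subject quantities concrete, the following sketch computes the error range, SD, and direction counts for one encounter, together with the coordinates plotted in a Bland-Altman plot. It is an illustration rather than the study's analysis code (the analyses were run in Stata and Tableau); the function names and the sign convention (error = SpO2 − SaO2, so positive values mean the pulse oximeter overestimated SaO2) are assumptions.

```python
from statistics import stdev


def error_summary(pairs):
    """Summarize measurement error for one subject encounter.

    pairs: non-empty list of (spo2_percent, sao2_percent) tuples
    """
    # Assumed sign convention: positive error means SpO2 overestimates SaO2.
    errors = [spo2 - sao2 for spo2, sao2 in pairs]
    overestimates = sum(1 for e in errors if e > 0)
    underestimates = sum(1 for e in errors if e < 0)
    return {
        "range": max(errors) - min(errors),
        # The sample SD needs at least 2 values; report 0.0 for a single pair.
        "sd": stdev(errors) if len(errors) > 1 else 0.0,
        "overestimates": overestimates,
        "underestimates": underestimates,
        "bidirectional": overestimates > 0 and underestimates > 0,
    }


def bland_altman_points(pairs):
    """Return (mean of the 2 readings, SpO2 - SaO2) for each pair."""
    return [((spo2 + sao2) / 2, spo2 - sao2) for spo2, sao2 in pairs]
```

A large `range` or a `bidirectional` encounter indicates that the error observed in a first paired sample would not carry over as a fixed offset to later readings.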
We report counts and percentages. We used chi-square and Kruskal-Wallis tests when comparing categorical and continuous (respectively) variables between groups. We used P ≤ .05 to signify statistical significance. We used multivariable logistic regression to adjust for age, sex, and hemoglobin on the day of ABG to examine the association between self-reported race and occult hypoxemia.7,19 We used Stata 15.1 (StataCorp, College Station, Texas) and Tableau Desktop Professional Edition Version 2020.1.4 (Tableau Software, Seattle, Washington) to perform analyses.
Results
We examined 105,467 paired observations from 7,693 unique subjects (Table 1). Since there were no differences in results when measurements were compared across hospitals, the data were combined and the results reported accordingly. Specifically, at 2 hospitals using different pulse oximeters and disposable sensors and different blood gas analyzers, we found that occult hypoxemia was more common among Black subjects compared with white subjects.
As presented in Table 1, at the level of paired observations, occult hypoxemia was more common among Black subjects than among white subjects (7.9% vs 2.9%, P < .001). At the subject level, this meant that 22% of Black subjects and 11% of white subjects experienced occult hypoxemia during their hospitalization. Among other races and ethnicities, the frequency of occult hypoxemia ranged between 1.5–4.8% (Table 1). Occult hypoxemia was more common at lower SpO2 saturations and for Black subjects (Tables 1 and 2). For example, at SpO2 92%, SaO2 was < 88% in 17.9% and 6.6% of observations from Black and white subjects, respectively. As presented in Table 3, the association between self-reported race and occult hypoxemia persisted after adjustment for age, sex, and hemoglobin.
Steady State Analyses
We examined 25,678 pulse oximetry measurements, from 5,262 subjects, for which the paired ABG sample was obtained while SpO2 was in “steady-state.” As shown in Figure 1, the frequency of occult hypoxemia and magnitude of difference between Black and white subjects (6.2% vs 1.2%) were similar to the primary analyses, which included all paired observations when SpO2 was in the 92–96% range.
Mitigation Strategies
Comparing multiple paired samples from individual subjects, we found large differences in magnitude of errors in SpO2 measurements. As shown in Table 4, errors varied by < 4 percentage points in saturation in only 33% of all subjects. Consistent with the above findings, the intra-subject SD of measurement errors was large and was greater in Black compared to white subjects (Fig. 2). Furthermore, as shown in Figure 3, directionality of intra-subject measurement errors was inconsistent such that in > 3 of 4 subjects errors were bidirectional. The findings from these 3 analyses were confirmed when performed using the steady-state cohort.
When we evaluated paired samples from progressively higher pulse oximeter oxygenation ranges, the frequency of SaO2 being < 88% decreased, whereas the frequency of hyperoxemia increased (Fig. 4). At SpO2 range 94–98%, SaO2 was < 88% in 4.1% and 1.8% of observations from Black and white subjects, respectively. This change was associated with a modest increase in frequency of occult hyperoxemia. Although frequency of SaO2 being < 88% was further reduced at SpO2 range 95–100%, the rate of occult hyperoxemia rose to 42.3% and 46.0% in Black and white subjects, respectively. Notably, disparity between races persisted even at the highest SpO2 range, with SaO2 being < 88% in 2.5% of observations from Black subjects compared to 1.4% of observations from white subjects (P < .001).
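The trade-off between the two outcomes across SpO2 goal ranges can be tabulated with a short routine such as the one below. This is a hedged sketch, not the study's code: the function name, the tuple layout, and the inclusive range boundaries are assumptions, and the values in the usage note are invented solely to show the call.

```python
TARGET_RANGES = [(92, 96), (94, 98), (95, 100)]


def tradeoff_by_range(observations, ranges=TARGET_RANGES):
    """Frequency of SaO2 < 88% and of PaO2 > 110 mm Hg within each SpO2 range.

    observations: list of (spo2_percent, sao2_percent, pao2_mm_hg) tuples
    Returns a dict keyed by (low, high); the value is None for an empty range.
    """
    results = {}
    for low, high in ranges:
        # Assumption: ranges are inclusive at both ends.
        subset = [obs for obs in observations if low <= obs[0] <= high]
        n = len(subset)
        if n == 0:
            results[(low, high)] = None
            continue
        hypoxemic = sum(1 for _, sao2, _ in subset if sao2 < 88)
        hyperoxemic = sum(1 for _, _, pao2 in subset if pao2 > 110)
        results[(low, high)] = {
            "n": n,
            "hypoxemia": hypoxemic / n,
            "hyperoxemia": hyperoxemic / n,
        }
    return results
```

For example, `tradeoff_by_range([(92, 86.0, 55), (98, 97.0, 120)])` returns one entry per range, each reporting the share of in-range observations meeting either definition. Running the routine separately for each racial group would reproduce the kind of comparison summarized in Figure 4.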
Discussion
In this retrospective observational study, we found that occult hypoxemia was disproportionately more common among minority groups. Whereas occult hypoxemia was most common among Black subjects, it was also more common among Latinx, Asian, and Pacific Islander subjects compared to white subjects. Though attenuated, we found these disparities were present among subjects who had reached a steady state of oxygen stability by pulse oximeter. Our findings support the notion that patients with darker skin pigmentation are vulnerable to pulse oximeter mismeasurement. Lastly, we found that frequency of occult hypoxemia increased as measured SpO2 decreased, suggesting that targeting a higher SpO2 may reduce the occurrence of clinically unrecognized hypoxemia.
Our study adds to a growing body of literature that confirms that pulse oximeters frequently overestimate SaO2, especially for individuals with dark skin pigmentation.4-7,13-15,19 Given the widespread use of pulse oximetry, technological advancements are urgently needed. However, until such time that technological advancements have been achieved and are accessible to centers who care for patients of color who are at greatest risk for occult hypoxemia, health systems must consider practice changes designed to mitigate these disparities. The prevailing mechanism used to explain the tendency of pulse oximeters to overestimate true SaO2 is that darkly pigmented skin absorbs more light in red and infrared wavelengths, leading these devices to erroneously interpret increased light absorbance as greater oxygen saturation. From this framework, errors in oxygen saturation estimated by pulse oximeter would be anticipated to be positive (ie, greater than the true arterial saturation) and if consistently positive could perhaps be mitigated by assuming a lower SaO2 than that displayed by pulse oximeter. In addition, one would expect that in a given patient the degree of pigmentation would impact all samples similarly and cause a consistent error. However, our study demonstrates that this is not the case. Rather, among all races, and especially at lower oxygen saturations, we found estimated bias in oxygen saturation by pulse oximeters was inconsistent and often erroneous in both positive and negative direction. As a result, beyond melanin, other factors may play a role, including poor perfusion, motion degradation, and poorly fitting or functioning pulse oximeter devices.19 Whereas measurement error related to dark skin pigmentation may be a major driver of disparities, future mechanistic studies designed to examine the aforementioned factors, and the potential interaction between such factors and skin pigment, are needed to inform and aid necessary technological advances. 
Further, we encourage future studies to consider how disparities in the quality of oxygen monitoring, from clinician selection of appropriately fitting oxygen sensors to clinician monitoring of pulse oximeter waveform quality, may be influenced by patient race.
The inconsistencies in magnitude and direction of errors in pulse oximetry preclude an individualized approach to setting oxygenation targets. Our study reveals that the measurement error over the course of a hospitalization was not a fixed one, and therefore, an assessment of error observed in a single paired sample cannot be applied as a correction factor for subsequent SpO2 measurements.
In contrast, we found that raising the oxygen saturation range from 92–96% to 94–98% was associated with a lower frequency of occult hypoxemia with only a modest increase in hyperoxemia. Although further elevation of oxygen saturation range to 95–100% further reduced the frequency of occult hypoxemia, this incremental benefit appeared to be outweighed by the increase in hyperoxemia. Of note, raising oxygenation range to 94–98% mitigated risk of occult hypoxemia in Black subjects but did not eliminate it.
Based on our analyses, until more accurate pulse oximeters are widely available, we recommend targeting a pulse oximetry saturation range of 94–98% as the preferred mitigation strategy to reduce the prevalence of occult hypoxemia in patients who are currently targeted to an oxygen saturation goal of 92–96%. We have adopted this strategy within our health system. This recommendation aligns with established guideline recommendations18 and may reduce prevalence of occult hypoxemia independent of race, with only a modest increase in risk of exposure to hyperoxemia. Of note, these recommendations do not apply to patients currently targeted to a goal oxygen saturation of 88–92% (eg, those with chronic lung disease) since systematically raising FIO2 in these patients may cause acute hypercapnia. Based on our results, we discourage clinicians from relying on episodic ABG results as a mitigation strategy, a practice that could lead to differential rates of ABG sampling based on race, with unclear benefit.
Although recent trials20 suggest that a modest increase in the oxygen saturation goal to the level proposed herein will not prolong duration of mechanical ventilation, we acknowledge that further study is necessary to directly assess this and other potential consequences of adopting the strategy we suggest. Likewise, we acknowledge that raising the goal oxygen saturation threshold modestly increases the risk of hyperoxemia. Although recent studies20-22 suggest risks associated with hyperoxia may have been overstated, further study is warranted.
Our study further highlights the urgent need for more accurate pulse oximeter technology. Whereas approval of these devices relies on internal validation by the medical device companies themselves,2 disparities related to pulse oximeter measurement error are unlikely to be fully mitigated until better technologies are available. In addition, conditions during validation studies tend to be idealized simulations of clinical stability that fail to account for factors that can lead to fluctuations in patient oxygen saturations.2,3,19,23-25
The racial disparities we observed in this study are likely a result of inadequate requirements for accounting for skin pigment variation in the device approval process. The FDA requires validation studies for new pulse oximeters to enroll a minimum of 10 subjects and at least 2 of these subjects be patients with dark skin pigmentation. If sample sizes are larger than 13, at least 15% of enrolled patients must have dark skin pigmentation.26 Whereas enrolling patients with dark skin color is a necessary step to address disparities, it is insufficient if race and ethnicity considerations are not incorporated into a rigorous analytical plan designed to assess for accuracy and performance in both idealized and dynamic clinical conditions. We call upon medical device manufacturers and federal regulatory bodies to develop new standards for pulse oximeter measurement devices that explicitly take patient racial and ethnic background into account during device development, recruitment, data analysis, and results reporting to avoid disparities such as these in the future. A statement that acknowledges this limitation of pulse oximeters is not enough.27
Our study has limitations. As a retrospective observational study, we are unable to confirm that our recommended strategy will mitigate disparities. Empirically derived, confirmatory studies designed to test its effectiveness are needed. In addition, we acknowledge as a limitation that we used self-reported race as a proxy for skin color and did not assess important patient characteristics, including patient skin pigment, presence of vascular disease and other comorbid conditions, body mass index, tissue perfusion, and other underlying disease that might be prevalent in different proportions among subjects of different races. Additionally, because fluctuations in pulse oximetry readings may be due to periodic breathing in patients who are hypoxic, it is possible that at least some of the variability in measurement error was due to this phenomenon rather than to the pulse oximeter sensor. Prospective studies, designed to account for these potential confounders in addition to subject motion, ambient light, and fingernail polish,3 are warranted. Although we performed our study in 2 hospitals with different patient populations using different pulse oximeters, our study was done in a single health system and may not generalize to centers that use different oximeters. However, given the consistency of results across multiple studies, including one19 that included reusable finger clips, disposable adhesive finger, and disposable adhesive forehead sensors, this is unlikely. While skin tone differences are likely to account for disproportional errors in pulse oximetry, further study is required. Further, confirmatory studies powered to permit robust analyses across all racial and ethnic minority groups including Indigenous, Asian, Native Hawaiian and other Pacific Islanders, and Hispanic peoples, are warranted. Finally, we could not assess the clinical impact of occult hypoxemic events. 
However, recent work14,15 revealing an association between unrecognized hypoxemic episodes and poor clinical outcomes substantiates our intuition and supports the notion that strategies to mitigate risk of occult hypoxemia are necessary in advance of technological advances.
Conclusions
In this retrospective observational study conducted across 2 hospitals, we found that occult hypoxemia was more common among all subjects of color and Black subjects in particular. Our data provide evidence for hospitals to implement an interim mitigation strategy of raising oxygen saturation goals to 94–98% for all patients, as this should have a favorable impact on reducing hypoxemic events without significantly increasing exposure to hyperoxemia. As this strategy does not eliminate disparity in occult hypoxemia experienced by Black patients, additional studies and innovations in pulse oximetry technology are urgently needed to eliminate risk across all patients.
Footnotes
- Correspondence: Christopher F Chesley MD, 3400 Spruce Street, 839 West Gates Building, Philadelphia, PA 19104. E-mail: christopher.chesley{at}pennmedicine.upenn.edu
See the Related Editorial on Page 1633
The authors have disclosed no conflicts of interest.
Dr Chesley received funding from the American Thoracic Society Fellowship in Health Equity.
Drs Mikkelsen and Fuchs are co-senior authors.
- Copyright © 2022 by Daedalus Enterprises