Pulse Oximeter Performance, Racial Inequity, and the Work Ahead
===============================================================

* Olubunmi E Okunlola
* Michael S Lipnick
* Paul B Batchelder
* Michael Bernstein
* John R Feiner
* Philip E Bickler

## Abstract

It has long been known that many pulse oximeters function less accurately in patients with darker skin. Reasons for this observation are incompletely characterized and potentially enabled by limitations in existing regulatory oversight. Based on decades of experience and unpublished data, we believe it is feasible to fully characterize, in the public domain, the factors that contribute to missing clinically important hypoxemia in patients with darkly pigmented skin. Here we propose 5 priority areas of inquiry for the research community and actionable changes to current regulations that will help improve oximeter accuracy. We propose that leading regulatory agencies should immediately modify standards for measuring accuracy and precision of oximeter performance, analyzing and reporting performance outliers, diversifying study subject pools, thoughtfully defining skin pigmentation, reporting data transparently, and accounting for performance during low-perfusion states. These changes will help reduce bias in pulse oximeter performance and improve access to safe oximeters.

* pulse oximeter
* racial bias
* oxygen saturation
* medical device

## Introduction

The majority of the world’s poor populations has no access to reliable pulse oximetry. Whereas this challenge is often thought of as a matter of device availability in low- and middle-income countries, the inequity in pulse oximetry is more complex. Pulse oximetry is receiving unprecedented attention for its essential role in COVID-19 patient management as well as for new studies reconfirming the technology’s inaccuracy in patients with darkly pigmented skin.
Retrospective data from work published earlier this year by Sjoding et al1 identified large enough errors in pulse oximeter performance, for subjects who self-identified as Black, to miss clinically important hypoxemia.2 As a result, the United States Food and Drug Administration (FDA) received a congressional inquiry requesting a “review of the interaction between a patient’s skin color and the accuracy of pulse oximetry measurements.”3 Based on decades of experience with pulse oximeter testing and development at the UCSF Hypoxia Research Laboratory and at Clinimark laboratories, we believe it is feasible and necessary to fully characterize, in the public domain, the factors that may contribute to missing clinically important hypoxemia in patients with darkly pigmented skin. Here we propose 5 priority areas of inquiry, actionable changes to current regulations, and initial steps that will improve access to safe oximeters.

## Proposed Changes

### Recognize That Pulse Oximeter Errors in Patients With Darkly Pigmented Skin Are Real and Require Further Examination

Skin pigment has long been observed to impact oximeter performance. The earliest studies had significant limitations in design and reported mixed findings. To help clarify the impact of skin pigment on oximetry, in 2005 the UCSF Hypoxia Research Laboratory undertook a controlled, prospective study that reported a positive bias for pulse oximeters in darkly pigmented healthy test subjects in instruments from 3 different manufacturers.4 This study raised concerns that confirmatory tests used by the FDA to approve devices, which chiefly use light-skinned volunteers, may be insufficient. Although there is renewed interest in this issue due to the high prevalence of hypoxemia in patients with COVID-19, it is disappointing that in over 15 years since this study the magnitude and clinical importance of these errors remain largely neglected.

To begin to understand whether the current generation of pulse oximeters continues to show positive bias in subjects with dark skin, we analyzed data from 18 hypoxemia studies completed over the last 3 years. The studies included 9 pulse oximeter brands, employing both transmission and reflectance, with 3 sensor types (reusable finger clip, disposable adhesive finger, disposable adhesive forehead). All studies were conducted in accordance with International Organization for Standardization (ISO) and FDA guidelines on pulse oximeters and with more detailed techniques described previously.5 Subjects breathed an air/nitrogen mixture controlled to attain multiple stable levels of hypoxemia between 67% and 100% arterial oxygen saturation (SaO2). The data set included 3,778 data pairs (849 from subjects with darkly pigmented skin, 2,929 from subjects with lightly pigmented skin) from 491 subjects (108 with darkly pigmented skin, 383 with lightly pigmented skin). Each subject contributed similar numbers of points in the range of interest (± 2 per subject). These data are shown in Figure 1.

![Fig. 1.](https://rc.rcjournal.com/content/respcare/67/2/252/F1.medium.gif)

Fig. 1. A: Data from recent pulse oximeter performance studies at Clinimark laboratories, representing simultaneously collected pulse oximeter readings and CO-oximeter arterial blood analysis. B: Data from Sjoding et al.1 The red rectangle is the zone of “occult hypoxia” as described in the Sjoding letter. Data from the controlled laboratory studies show no points in this area. From Reference 1, with permission.

Two significant observations are evident in the comparison presented in Figure 1.
The first is that a small positive bias in oximeter readings still exists in individuals with dark skin pigmentation, similar to that observed by Bickler and Feiner in 2005 and 2007.4,6 The second is that the paucity of data points within the red rectangle in the Clinimark laboratory data, as compared to the number of data points within the red rectangle in Sjoding’s data,1 supports the notion that the performance of pulse oximeters in clinical environments is different from that in ideal laboratory conditions.7 Laboratory hypoxia study subjects were screened for good health. Additionally, the incidence of many confounding issues for pulse oximetry, including low perfusion, motion, marginal probe positioning, irregular heartbeat, or unusual pulse morphology, was low compared to a clinical setting. It is also important to note that, to date, the effects of skin pigment on pulse oximeter performance have been published for only a relatively small number of pulse oximeter models. More studies in both the clinical and laboratory setting are needed to understand the impact of skin pigmentation on SpO2 in currently manufactured oximeters.

In pulse oximetry, the ratio of absorbance of 2 different wavelengths of light, red and infrared, by oxyhemoglobin and deoxyhemoglobin is measured and mathematically translated into an SpO2 reading. Pulse oximeter calibration requires hypoxia testing in human subjects and may fail when based on a nonrepresentative study population. Errors generated in this way are likely amplified by low perfusion, motion, and other interfering factors.
Whereas there is greater absorbance of red light by melanin in darker skin, potentially resulting in positive bias in saturation estimates (ie, pulse oximeters reading higher than the true blood oxygen saturation), this effect is incompletely characterized.8 Further studies are necessary to quantify this effect, to determine if dark skin pigment is correlated with other factors that may impact oximeter performance, and to determine the impact of dark skin pigment when other factors that affect signal quality and processing may coexist (eg, low perfusion or low-quality oximeter software and hardware).

### Understand “Sjoding’s Outliers” and Conduct Real-World, Patient-Centered Trials of Performance

Based on the data presented in Figure 1, we suggest that factors including but not limited to oximeter design may explain what Sjoding reported. First, further characterizing the effect of skin pigmentation on the accuracy of pulse oximeter saturation readings during hypoxemia requires a well-controlled testing environment. SaO2 is prone to fluctuations over short periods of time, necessitating that the blood sampling and the pulse oximeter reading be done as close together as possible for accurate comparisons (ie, seconds, not minutes). The ability to control and maintain a steady state of hypoxemia from which to record measurements is a benefit of the laboratory environment. If a researcher in the clinical environment is unable to record an SpO2 reading from the pulse oximeter and a corresponding SaO2 within a narrow window, then the data should be interpreted with great caution. In the Sjoding study, some paired recordings were documented up to 10 min apart. It is also important to acknowledge that discordance between SpO2 and SaO2 values can be due to many factors besides skin pigmentation.
Such factors include variations in breathing, excessive motion, incorrectly applied probes, anemia, temperature, and low perfusion.9-11 All studies, whether in the clinical or laboratory setting, should record and account for these factors. Establishing standard data sets that include sufficient information to account for the potential impact of outliers on clinical performance is necessary for improving oximetry performance. Low perfusion is overdue for special consideration. Published data demonstrate that low perfusion causes error and dysfunction for some oximeters, and unpublished data show this error can be significant especially for some low-cost fingertip devices.10 Furthermore, the combination of low perfusion and darkly pigmented skin also produces a significant degree of discordance between SpO2 and SaO2 and is likely a factor in the clinical environment of Sjoding’s study. Currently, there are no requirements that low perfusion be accounted for when assessing pulse oximeter performance for clinical approval. We propose development of standardized protocols for measuring, reporting, and accounting for perfusion that can be incorporated into certification standards such as those by the FDA. This should include reporting performance thresholds during low perfusion. As an interim step, we propose inclusion of available perfusion data in all studies of pulse oximetry accuracy. For example, one might report the “perfusion index” or a comparable but standardized value, as measured by the study device or a reference device. Better characterizing the effect of low perfusion on oximeter performance is an essential step toward rectifying bias caused by skin pigment and may in fact be fundamentally related. 
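
The two interim steps discussed in this section, pairing readings within a narrow time window and reporting available perfusion data, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a validated protocol: the `Reading` and `Pair` record layouts, the `pair_within_window` function, and the 10-second default window are all hypothetical choices made for the example (the text asks only for seconds rather than minutes).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    """One timestamped saturation value (hypothetical record layout)."""
    t_s: float                               # measurement time, in seconds
    saturation: float                        # SpO2 or SaO2, in percent
    perfusion_index: Optional[float] = None  # only if the device reports it


@dataclass
class Pair:
    """An SpO2 reading matched to a reference SaO2 sample."""
    spo2: float
    sao2: float
    gap_s: float                             # time between the two, seconds
    perfusion_index: Optional[float]


def pair_within_window(spo2: List[Reading], sao2: List[Reading],
                       max_gap_s: float = 10.0) -> List[Pair]:
    """Match each SaO2 sample to the SpO2 reading nearest to it in time.

    SaO2 samples with no SpO2 reading within max_gap_s are dropped rather
    than paired with a stale value. The 10 s default is an assumption.
    """
    pairs = []
    for ref in sao2:
        nearest = min(spo2, key=lambda r: abs(r.t_s - ref.t_s), default=None)
        if nearest is None:
            continue
        gap = abs(nearest.t_s - ref.t_s)
        if gap <= max_gap_s:
            pairs.append(Pair(nearest.saturation, ref.saturation, gap,
                              nearest.perfusion_index))
    return pairs
```

Carrying `gap_s` and `perfusion_index` through to each pair means both can be reported alongside the accuracy results, so a reader can judge how much discordance might stem from timing or perfusion rather than from skin pigment.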

### Ensure Adequate Diversity for Study Subjects in Testing Protocols

Standard practice in clinical trials of pulse oximeters involves pooling data from all subjects to derive overall performance data, including the mean bias and the root mean square error (ARMS). Such an approach may hide outliers of performance in darkly pigmented individuals. Whereas all data are typically presented to regulatory agencies for device clearance, individual examiners may be tasked with raising concerns about these performance outliers.

The current standard for FDA approval of pulse oximeters follows the 510(k) clearance process, last updated in 2013, by which a manufacturer proves that its medical device meets the standards previously approved for devices in the same category.12 At present, the approval process for pulse oximeters requires at least 2 volunteers or 15% of the study group, whichever is higher, to be “darkly-pigmented” but does not provide specific details about exactly how to meet this requirement.12 Many pulse oximeter studies, like the UCSF study in 2005, have used the Fitzpatrick phototype scale, a method of classifying human skin by phototype based on the presence of melanin. However, despite commonly being used for oximetry validation studies, the Fitzpatrick classification was not designed for this purpose and has many limitations, including potential for inter-operator variability and subjectivity.13 Using race to define patients with dark skin pigmentation, as was done in the Sjoding et al study,1 is also problematic on many levels. As such, we propose utilizing standardized and less subjective methods for classifying research subjects into categories by degree of skin pigmentation for the purpose of understanding the impact of skin pigment on oximeter performance.
These methods should also be used to determine skin pigment at the site of oximeter measurement (eg, ear or finger), including both surfaces of measurement for transmittance devices (eg, the nail bed and ventral aspect of the fingertip). There are several such methods employed by the dermatology community that can quantify skin color at the site of measurement.14 By objectively quantifying skin color, researchers can better distribute study subject representation to accurately account for the spectrum of skin pigmentation.

Additionally, we propose establishing standards requiring manufacturers to demonstrate that a pulse oximeter meets performance standards in patients with darker skin when data are analyzed alone for patients with dark skin pigmentation and not only when combined in an ARMS fashion with data from lighter-skin patients. Currently, pooled analyses are the common practice and the only requirement for FDA 510(k) approval.

We must consciously include more study subjects with darkly pigmented skin in pulse oximetry research for 2 additional reasons. First, the majority of device testing is done in high-income countries with populations that may be predominantly lighter skinned. Second, we acknowledge that due to the history of medical injustice as well as inequities that persist today in the United States, recruitment of Black subjects for research is negatively affected by mistrust of medical research establishments.15

### Harmonize Standards for Accuracy

The most widely accepted performance standards for pulse oximetry are specified by the FDA and ISO (Table 1). The FDA requires ARMS of < 3% for transmittance devices and 3.5% for reflectance devices, and ISO 80601 requires ARMS of 4%.12,16 During the COVID-19 pandemic, consumer-grade pulse oximeters have been widely used for clinical-care decision making, both in the home and in hospital settings.
We propose that all devices, whether reflectance or transmittance, should meet 3% ARMS to receive FDA 510(k) clearance or to claim compliance with ISO.

[Table 1.](http://rc.rcjournal.com/content/67/2/252/T1) Current and Proposed Requirements for Pulse Oximeter Performance Standards

### Address the Need for More Evaluation and Transparency for Inexpensive or Non-FDA-Cleared Oximeters

In recent years, there has been a surge in the number of pulse oximeters on the consumer market for less than US $100 and many for less than US $30. Whereas a lower price point may increase access, in our opinion it also increases the number of unsafe oximeters being used for clinical care, especially in low- and middle-income countries. Some inexpensive oximeters perform well enough in laboratory studies to meet FDA accuracy requirements for clinical care, but many do not.17 The poor performance of many inexpensive devices may be especially pronounced under conditions of low perfusion and dark skin pigmentation, among others. Consumers, clinicians, aid organizations, and health care systems may be unaware of the shortcomings of these devices due to lack of transparency by some manufacturers and laxity in certification requirements. This is especially noteworthy with the recent uptick in donations and utilization of oximeters during the COVID-19 pandemic.

We call for more testing of consumer-marketed pulse oximeters in human subjects in laboratories capable of rigorous, transparent analysis and compliance with ISO 17025 and ISO 14155, as well as sharing of such data. To promote this goal, we are working with multiple collaborators to launch the OpenOximetry.org Project to accelerate testing and dissemination of oximeter performance data. This includes independent testing of devices that do and do not currently have FDA 510(k) clearance and is based on our finding that some FDA-approved devices do not meet performance standards for FDA 510(k) certification.
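
The concern that pooled statistics can mask subgroup performance lends itself to a small worked example. The sketch below, in Python, computes mean bias and ARMS (the root mean square of the SpO2 minus SaO2 differences) for the pooled data and for each pigmentation group on its own. The function names, the group labels, and the use of `<=` against a 3% limit are assumptions made for illustration; in practice the groups would come from an objective measurement of skin color at the sensor site, and the exact pass criterion would follow the applicable standard.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def bias_and_arms(pairs: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (mean bias, ARMS) for (SpO2, SaO2) pairs, in percent saturation."""
    if not pairs:
        raise ValueError("at least one data pair is required")
    errors = [spo2 - sao2 for spo2, sao2 in pairs]
    bias = sum(errors) / len(errors)
    arms = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, arms


def stratified_report(records: Iterable[Tuple[str, float, float]],
                      arms_limit: float = 3.0) -> Dict[str, dict]:
    """Summarize accuracy for the pooled data and for each group alone.

    records holds (group label, SpO2, SaO2) tuples. The label "pooled" is
    reserved for the combined result. arms_limit and the <= comparison
    are illustrative assumptions, not a regulatory definition.
    """
    groups = defaultdict(list)
    pooled = []
    for group, spo2, sao2 in records:
        groups[group].append((spo2, sao2))
        pooled.append((spo2, sao2))
    report = {}
    for name, pairs in [("pooled", pooled), *sorted(groups.items())]:
        bias, arms = bias_and_arms(pairs)
        report[name] = {"n": len(pairs), "bias": bias, "arms": arms,
                        "meets_limit": arms <= arms_limit}
    return report
```

With synthetic data of 8 pairs from a light-skin group with errors of +1 or -1 and 2 pairs from a dark-skin group with errors of +4, the pooled ARMS is 2.0 and passes a 3% limit, while the dark-skin group analyzed alone has an ARMS of 4.0 and fails it. These numbers are invented to show the arithmetic and do not represent any device.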

A summary of our recommendations for improving performance and equity in pulse oximetry is presented in Table 1.

## Summary

We propose the following steps to reduce bias in pulse oximeter performance and improve access to safe oximeters: standardize the characterization of pulse oximeter errors in patients with darkly pigmented skin, closely examine the performance outliers and conduct patient-centered trials, standardize diversification of study subjects in testing protocols, harmonize standards for accuracy among leading regulatory agencies, and create mechanisms for evaluating and transparently sharing performance data for inexpensive or non-FDA-cleared oximeters. Such efforts will also help address barriers to adequate representation of populations with dark-skin pigment in the conduct of medical research, a longstanding and complex challenge not unique to pulse oximetry. Further, we solicit the commitment of device manufacturers, regulatory bodies, and the medical community to complete this work not because it is profitable but because it is morally just. Doing so will advance equitable health care across the globe.

## Footnotes

* Correspondence: Michael S Lipnick MD, 1001 Potrero Avenue, San Francisco, CA 94110. E-mail: Michael.lipnick{at}ucsf.edu
* The UCSF Hypoxia Research Laboratory, Clinimark, and Physio Monitor, LLC charge pulse oximeter manufacturers for performing validation studies, but no companies were involved with the writing or data analysis presented in this paper.
* Copyright © 2022 by Daedalus Enterprises

## References

1. Sjoding MW, Dickson RP, Iwashyna TJ, Gay SE, Valley TS. Racial bias in pulse oximetry measurement. N Engl J Med 2020;383(25):2477-2478.
2. Valbuena VSM, Barbaro RP, Claar D, Valley TS, Dickson RP, Gay SE, et al. Racial bias in pulse oximetry measurement among patients about to undergo ECMO in 2019–2020: a retrospective cohort study. Chest 2021. Published online September 27.
3. Warren E, Wyden R, Booker C. Letter to FDA re: bias in pulse oximetry measurements. [https://www.warren.senate.gov/imo/media/doc/2020.01.25%20Letter%20to%20FDA%20re%20Bias%20in%20Pulse%20Oximetry%20Measurements.pdf](https://www.warren.senate.gov/imo/media/doc/2020.01.25%20Letter%20to%20FDA%20re%20Bias%20in%20Pulse%20Oximetry%20Measurements.pdf). Accessed February 12, 2021.
4. Bickler PE, Feiner JR, Severinghaus JW. Effects of skin pigmentation on pulse oximeter accuracy at low saturation. Anesthesiology 2005;102(4):715-719.
5. Batchelder PB, Raley DM. Maximizing the laboratory setting for testing devices and understanding statistical output in pulse oximetry. Anesth Analg 2007;105(6 Suppl):S85-94.
6. Feiner JR, Severinghaus JW, Bickler PE. Dark skin decreases the accuracy of pulse oximeters at low oxygen saturation: the effects of oximeter probe type and gender. Anesth Analg 2007;105(6 Suppl):S18-S23.
7. Foglia EE, Whyte RK, Chaudhary A, Mott A, Chen J, Propert KJ, et al. The effect of skin pigmentation on the accuracy of pulse oximetry in infants with hypoxemia. J Pediatr 2017;182:375-377.e2.
8. Batchelder PB. Ear lobe spectra scans in subjects with varying skin pigment. Unpublished, 1995.
9. Biebuyck JF, Severinghaus JW, Kelleher JF. Recent developments in pulse oximetry. Anesthesiology 1992;76(6):1018-1038.
10. Louie A, Feiner JR, Bickler PE, Rhodes L, Bernstein M, Lucero J. Four types of pulse oximeters accurately detect hypoxia during low perfusion and motion. Anesthesiology 2018;128(3):520-530.
11. Schramm WM, Bartunek A, Gilly H. Effect of local limb temperature on pulse oximetry and the plethysmographic pulse wave. Int J Clin Monit Comput 1997;14(1):17-22.
12. Food and Drug Administration. Pulse oximeters - Premarket notification submissions (510[k]s) guidance for industry and Food and Drug Administration staff. [https://www.fda.gov/media/72470/download](https://www.fda.gov/media/72470/download). Accessed February 8, 2021.
13. Ware OR, Dawson JE, Shinohara MM, Taylor SC. Racial limitations of Fitzpatrick skin type. Cutis 2020;105(2):77-80.
14. Baquié M, Kasraee B. Discrimination between cutaneous pigmentation and erythema: comparison of the skin colorimeters dermacatch and mexameter. In: Humbert P, Maibach H, Fanian F, Agache P, editors. Agache’s measuring the skin. Cham, Switzerland: Springer International Publishing; 2016: 1-12.
15. Corbie-Smith G. The continuing legacy of the Tuskegee syphilis study: considerations for clinical investigation. Am J Med Sci 1999;317(1):5-8.
16. International Organization for Standardization. ISO 80601-2-61:2017. ISO. [https://www.iso.org/cms/render/live/en/sites/isoorg/contents/data/standard/06/79/67963.html](https://www.iso.org/cms/render/live/en/sites/isoorg/contents/data/standard/06/79/67963.html). Accessed February 15, 2021.
17. Lipnick MS, Feiner JR, Au P, Bernstein M, Bickler PE. The accuracy of 6 inexpensive pulse oximeters not cleared by the Food and Drug Administration: the possible global public health implications. Anesth Analg 2016;123(2):338-345.