Abstract
BACKGROUND: Few data are available on the use of spontaneous breathing trials (SBTs) in the neonatal population, despite advocacy of the practice in many neonatal ICUs. In this meta-analysis, we systematically reviewed the literature regarding the accuracy of SBTs as a predictor for extubation failure in premature infants.
METHODS: Following the PRISMA recommendations, scientific articles were collected in December 2019 and January 2020 using PubMed, LILACS, Web of Science, Scopus, Google Scholar, OATD, and BDTD databases. The risk of bias in the studies included herein was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 tool. The pooled sensitivity and specificity of the studies were estimated using a mixed logistic regression model of 2 levels and a normal bivariate model.
RESULTS: Six studies were included for qualitative and quantitative evaluation in this study. All SBTs were performed with endotracheal CPAP, with a total observation time of 3–5 min. The parameters for passing/failing the test were similar in 5 of the 6 studies and included bradycardia or desaturation during the test. The SBT showed a high pooled sensitivity (0.97, 95% CI 0.85–0.99), indicating proper identification of neonates “ready” for successful extubation. However, a low pooled specificity (0.40, 95% CI 0.24–0.58), with many false-positive cases, indicated inaccurate prediction of extubation failure. Heterogeneity of included studies was considerable for sensitivity and substantial for specificity.
CONCLUSIONS: The SBT in premature infants can accurately predict extubation success but not extubation failure. Therefore, even though it is an attractive, practical, and easy-to-perform bedside assessment tool, there is a lack of evidence to support its use as an independent predictor of extubation failure in premature infants. Its routine use should be evaluated and monitored carefully.
- infant, premature
- infant, low birth weight
- airway extubation
- respiration, artificial
- ventilator weaning
- intensive care, neonatal
Introduction
Advances in health care have led to increased survival of premature infants, many of whom require mechanical ventilation during the initial days after birth.1–2 Prolonged mechanical ventilation in premature infants is correlated with adverse outcomes, such as bronchopulmonary dysplasia, pneumonia, and neurodevelopmental impairment.3–4 Moreover, extubation failure is associated with increased death, length of hospital stay, and use of supplementary oxygen.5–6 Determining the ideal time to withdraw ventilatory support remains a major challenge for the clinical team. The decision to extubate relies primarily on clinical judgment, but various strategies have been proposed to help clinical teams assess the ideal time to extubate, minimize mechanical ventilation duration, and maximize the chances of success; one such strategy is a spontaneous breathing trial (SBT).7–8
The SBT, also called a readiness test, was developed for the adult population as an attempt to assess a patient's ability to breathe spontaneously with minimal or no support. It is a simple test, and it is performed to facilitate decision-making regarding timely extubation to minimize patients' exposure to invasive ventilation.9
In the adult population, the incorporation of an SBT in weaning protocols is a well-established, common practice, which has led to higher rates of successful extubation and a trend toward lower ICU mortality.7,10,11 However, few robust studies of SBTs have been carried out in the neonatal population, even though the test is used in many neonatal ICUs in Brazil and worldwide.12,13 A recent systematic review14 evaluated the accuracy of all extubation readiness tests in preterm infants, including the SBT, and concluded that there is a lack of evidence supporting the use of SBTs in preterm infants. Only 2 studies8,15 were included in the pooled estimation of sensitivity and specificity. Recently, an important study was published in this context.16
This information suggests that understanding the role of SBTs during the weaning process of premature infants is necessary to provide reliable assistance to the clinical team in decision making. This study aims to systematically review the available literature on SBT accuracy as a predictor of extubation failure in premature infants.
Methods
A systematic review of the literature was conducted to understand the SBT accuracy in premature infants. A protocol was developed in conformity with standard guidelines for systematic reviews of diagnostic studies and reported using recommended Preferred Reporting Items for a Systematic Review and Meta-Analysis of Diagnostic Test Accuracy Studies (PRISMA-DTA).17,18 No institutional review board approval was required because this was a systematic review of published data.
This study focused on the following question: What is the accuracy of SBTs as a predictor of extubation failure in premature infants? We included studies based on the following criteria: (1) they were analytic observational studies; (2) subjects were premature infants; and (3) they evaluated the accuracy of SBTs (ie, sensitivity and specificity), considering it a test to assess the infant's ability to breathe spontaneously through the endotracheal tube for a pre-established time and with well-defined criteria for the interruption of the test.
A systematic search for studies that evaluated the accuracy of SBTs as a predictor of extubation failure in premature infants was conducted using PubMed, SCOPUS, LILACS, and Web of Science databases. A gray-literature search was conducted using Google Scholar, OATD, and Biblioteca Digital Brasileira de Teses e Dissertações (BDTD). Publications were screened using the terms “premature infant,” “preterm,” “low birthweight,” “spontaneous breathing test,” “spontaneous breathing trial,” “extubation predict,” and “extubation readiness.” This search was performed between December 2019 and January 2020, without language restrictions. The list of all eligible studies was also scanned manually to identify additional studies for inclusion (see the supplementary materials at http://www.rcjournal.com).
Two independent investigators (RFT and RDA) further screened the searched studies on the basis of the paper's title and the abstract. Relevant studies were read in full and selected according to the eligibility criteria. Disagreements between the 2 reviewers were resolved by consensus or by a third reviewer (SBK).
Data Extraction
Two independent investigators (RFT and ACAC) extracted the data from the published reports using a predefined protocol. The number of subjects included and the inclusion criteria of each study were recorded, along with gestational age, birthweight, clinical diagnosis, and duration of mechanical ventilation. The decision to extubate was defined as the reference standard and was based on the clinical judgment of the team or weaning protocols routinely used in each study center. A note was made on the ventilator mode, settings, and arterial blood gas ranges when infants were deemed ready for extubation. Parameters used to perform the SBT, such as duration, ventilator mode, level of PEEP, and cutoffs for passed/failed assessment, were recorded. The primary definition and the time frame used to classify infants into extubation success or failure were recorded.
Assessment of Risk of Bias
The Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2)19 tool was used to grade the quality of each individual study. This assessment consists of 4 key domains: patient selection, index test, reference standard, and flow and timing (ie, the flow of subjects through the study and the timing of the index test and reference standard). The reference standard was defined by the decision to extubate based on the team's clinical judgment. The risk of bias and any applicability concerns in each study were classified as “low” if all signaling questions for a domain were answered in the affirmative, “high” if any signaling question was answered in the negative, or “unclear” when insufficient data were reported to permit a judgment.
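The classification rule described above can be sketched in a few lines. This is only an illustration of the stated rule; the function name `classify_domain` and the answer labels ("yes", "no", "unclear") are assumptions made for the example and are not taken from the review or from the QUADAS-2 tool itself.

```python
def classify_domain(answers):
    """Classify one QUADAS-2 domain from its signaling-question answers.

    `answers` is a list of strings, each "yes", "no", or "unclear"
    (hypothetical labels chosen for this sketch).
    """
    if not answers:
        raise ValueError("at least one signaling question is required")
    # Any negative answer makes the domain "high" risk.
    if any(a == "no" for a in answers):
        return "high"
    # All affirmative answers make the domain "low" risk.
    if all(a == "yes" for a in answers):
        return "low"
    # Otherwise the reported data were insufficient to judge.
    return "unclear"


# Hypothetical example: one unanswerable question, no negative answers.
print(classify_domain(["yes", "unclear", "yes"]))
```

Under this rule a single negative answer outweighs any number of affirmative ones, which is why a shared design feature can place the same domain at high risk in every included study.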
Data Analysis
Sensitivity, specificity, and predictive values were abstracted for each study. In the studies included, sensitivity referred to the proportion of neonates with successful extubation correctly identified by a passed SBT, whereas specificity referred to the proportion of neonates with failed extubation correctly identified by a failed SBT (Table 1). The individual accuracy of each SBT in predicting extubation success was estimated using the Youden index, which is a measure of a test's overall discriminative power assuming equal weight between sensitivity and specificity and ranges from 0 (ie, poor accuracy) to 1 (ie, perfect accuracy).
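As a minimal sketch of these definitions, the measures can be computed from a 2 × 2 table in which a "positive" test is a passed SBT and the target condition is successful extubation, as in this review. The function name and the counts in the example are hypothetical and are not data from any included study.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Accuracy measures for an SBT, using this review's convention.

    tp: passed SBT, extubation succeeded
    fp: passed SBT, extubation failed
    fn: failed SBT, extubation succeeded
    tn: failed SBT, extubation failed
    """
    sensitivity = tp / (tp + fn)  # successes correctly identified by a passed SBT
    specificity = tn / (tn + fp)  # failures correctly identified by a failed SBT
    ppv = tp / (tp + fp)          # probability of success after a passed SBT
    npv = tn / (tn + fn)          # probability of failure after a failed SBT
    youden = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "youden": youden,
    }


# Hypothetical counts, for illustration only.
measures = accuracy_measures(tp=45, fp=6, fn=5, tn=4)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

In this hypothetical table the test passes 9 of every 10 infants who go on to succeed, yet flags only 4 of the 10 who fail, so the Youden index stays low despite the high sensitivity.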
A meta-analysis was conducted applying a 2-level mixed logistic regression model that used an independent binomial distribution for true positive and true negative, conditioned to the specificity and sensitivity of each study. A normal bivariate model was used for logarithmic transformations of sensitivity and specificity between the studies. The individual and pooled sensitivity and specificity with 95% CI were presented in a forest plot, and their joint distribution was summarized using a weighted receiver operating characteristic curve. The heterogeneity was evaluated with the Cochran Q test and with I² statistics, with thresholds of 0–40% (might not be important), 30–60% (moderate), 50–90% (substantial), and > 75% (considerable). Potential sources of heterogeneity were explored by removing one study at a time and checking whether there was a significant change in the I² statistics. The potential publication bias was verified with a Deeks funnel plot and with a linear regression between the diagnostic odds ratio and the inverse of the square root of the sample size. All statistical analyses were performed using Stata 13.0 (StataCorp, College Station, Texas).
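To make the heterogeneity statistics concrete, the sketch below computes the Cochran Q and I² for a set of study-level estimates with known variances, using inverse-variance weights. It is a simplified univariate illustration, not the bivariate mixed model fitted in Stata for this review; the function names, the logit scale, the 0.5 continuity correction, and the example counts are all assumptions made for the example.

```python
import math


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction is added to both cells so that
    proportions of 0 or 1 remain finite (an assumption of this sketch).
    """
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def cochran_q_i2(estimates, variances):
    """Cochran Q and I-squared (as a percentage) with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared is the share of total variation beyond chance, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical per-study counts: (infants correctly classified, infants in group).
studies = [(40, 42), (30, 38), (55, 56), (20, 29)]
pairs = [logit_and_variance(e, n) for e, n in studies]
q, i2 = cochran_q_i2([p[0] for p in pairs], [p[1] for p in pairs])
print(f"Q = {q:.2f}, I2 = {i2:.1f}%")
```

A leave-one-out check of the kind described above would repeat the call with each study removed in turn and compare the resulting I² values.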
Results
The initial search listed 377 studies, 87 of which were collected from PubMed, 96 from Web of Science, 84 from SCOPUS, 6 from LILACS, 100 from Google Scholar, and 4 from OATD. Thirteen studies were considered potentially relevant and were analyzed completely. After a complete reading, 7 studies were excluded: 2 did not perform a structured SBT (ie, they did not establish passed/failed criteria in a readiness test),20,21 1 study had data that were inconsistent for the outcome,22 2 studies did not follow up on outcomes (re-intubation rate) of neonates who failed the SBT and considered them as a weaning failure,23,24 and 2 studies compared rates of extubation success between an SBT group and a control group that did not undergo the test.25,26 Finally, 6 studies satisfied the eligibility criteria and were included in our systematic review and meta-analysis.8,15,16,27–29 A flow diagram depicting the selection process of references at each stage is provided in Figure 1.
The studies included had highly variable samples of premature infants, both for gestational age and birthweight, and mostly included mechanically ventilated babies who were clinically stable and “ready” for extubation by clinical judgment. All articles used endotracheal CPAP to perform the SBT, with total observation times ranging from 3 min15,28,29 to 5 min.8,16,27 The criteria for pass/fail were similar in 5 of the 6 studies (bradycardia or desaturation during the test).8,15,27–29 One study16 evaluated the accuracy of 41,602 passed/failed SBT combinations with clinical criteria including apnea requiring stimulation, bradycardia, desaturation, and increased supplemental oxygen. The authors reported that none of the clinical event combinations were sufficient to predict extubation failure in extremely preterm infants.
For the study by Shalish et al,16 which evaluated the accuracy of various criteria for SBT pass/fail, we selected the test definition with the highest accuracy reported by the authors. The SBT showed good sensitivity and moderate to low specificity among the studies evaluated, with the Youden index ranging from 0 to 0.7. All studies demonstrated a high rate of successful SBTs, and the main outcome was re-intubation within 72 h. The characteristics of the studies are listed in Table 2.
Assessment of Risk of Bias
The 6 studies were evaluated, and the majority were classified as presenting a low risk of bias or low applicability concerns across the QUADAS-2 domains (Table 3).8,15,16,27–29 Despite this, the index test domain was considered to be at high risk of bias in all the studies. Because SBTs were only performed after the neonates were judged by the clinical team as “ready” to extubate, the index test results were interpreted with knowledge of the result of the reference standard.
SBT Accuracy
The meta-analysis of the 6 studies included in this systematic review showed high pooled sensitivity (0.97, 95% CI 0.85–0.99) for SBTs, properly identifying preterm infants as “ready” for a successful extubation. In contrast, this test had a low pooled specificity (0.40, 95% CI 0.24–0.58) and could not accurately identify preterm infants who would fail the extubation process (Fig. 2). Analysis of the weighted receiver operating characteristic curve showed a moderate accuracy of SBTs (area under the curve 0.73, 95% CI 0.68–0.76) as a predictor of extubation failure in preterm infants (Fig. 3). The heterogeneity for sensitivity was considerable (I² = 84.63% [73.49–95.77], P < .01) and was substantial for specificity (I² = 65.39% [35.10–95.69], P = .01). In the subgroup analysis, no particular study was considered as the potential source of heterogeneity. The funnel plots did not indicate a significant publication bias (P = .78) (Fig. 4).
Discussion
Although the SBT is well established in weaning adult intensive care patients, we found only 6 articles for this review that used a structured SBT in preterm infants and delimited parameters and criteria for passing or failing the test. The studies showed a good pooled sensitivity of SBTs in preterm infants (97%), with an optimal positive predictive value, but low pooled specificity (40%). These data indicate that almost all of the subjects who were successfully extubated were correctly identified by a passed test, but a significant proportion of infants who failed extubation were misclassified by the test. The large number of false positives casts doubt on its applicability for premature infants.
Our systematic review was different from a previously published review because it focused specifically on studies about the accuracy of SBT.14 Compared to the published review, which included 2 studies in their meta-analysis, we had 4 additional studies to increase the sample by an additional 400 infants.16,27–29 Our findings corroborated the findings of the previous analysis, with the pooled sensitivity increasing slightly from 95% to 97% and the pooled specificity decreasing further from 62% to 40%.
We considered that, in all studies, the test was only performed when infants were considered “ready” for extubation by the clinical team. This increases the probability of finding good sensitivity and an optimal positive predictive value, with a high SBT pass rate (between 78% and 98% in the included studies), followed by an expected low extubation-failure rate. These results confirmed that the clinical team involved in the respiratory management of premature infants should be able to identify correctly which infants have a good probability of being successfully extubated based only on clinical judgment.
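The dependence of predictive values on the underlying extubation success rate can be illustrated with a short calculation. The sketch below uses the pooled sensitivity (0.97) and specificity (0.40) from this review; the success rates of 0.8 and 0.5 are hypothetical values chosen only to show the effect, and the function name is an assumption.

```python
def predictive_values(sensitivity, specificity, success_rate):
    """Predictive values of a passed/failed SBT for a given success rate.

    Convention as in this review: a positive test is a passed SBT and
    the target condition is successful extubation.
    """
    tp = sensitivity * success_rate              # passed SBT, success
    fn = (1 - sensitivity) * success_rate        # failed SBT, success
    fp = (1 - specificity) * (1 - success_rate)  # passed SBT, failure
    tn = specificity * (1 - success_rate)        # failed SBT, failure
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Pooled estimates from this review; success rates are hypothetical.
for rate in (0.8, 0.5):
    ppv, npv = predictive_values(0.97, 0.40, rate)
    print(f"success rate {rate:.1f}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

With the same test characteristics, the positive predictive value is higher when most tested infants are already likely to succeed, which is consistent with the argument that preselection by the clinical team favors a good positive predictive value.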
Conversely, the test is not able to identify those infants who are not yet ready for extubation and have a higher risk of failing, even if considered eligible by the clinical team. This is a key point for the extubation readiness tests, which should effectively assist professionals in achieving safe and more accurate decision making, avoiding extubation attempts at the wrong time, and reducing the chances of re-intubation. In this case, the SBT could be useful in reinforcing a clinician's intent to extubate but adds little or no value in detecting possible failures, and therefore, it should not be used as a major determining criterion for the final decision to extubate.
Two other studies evaluated the consequences of incorporating routine SBTs into clinical practice, comparing outcomes between the SBT group and control group but reporting controversial results.25,26 Kamlin et al25 did not report significant differences between the groups in the extubation success rate, bronchopulmonary dysplasia incidence, or the total time of ventilatory support. In comparison, Andrade et al26 noted a 30% higher extubation success rate in the test group.
The accuracy of SBTs could be influenced considerably by total observation time, level of support used, and measurements performed during the test. Many questions about the best approach to perform SBTs are unanswered, even in the adult population.30 These observations suggest that, in the context of premature infants who are highly vulnerable and can fail extubation due to many reasons, including respiratory and nonrespiratory causes, establishing good predictor tests for extubation readiness is even more difficult.31,32
All of the studies used endotracheal CPAP to perform the test, with an equivalent PEEP preset by the clinical team on conventional mechanical ventilation. Several techniques are commonly used to conduct SBTs, including pressure support mode with or without PEEP, endotracheal CPAP, automatic tube compensation, and T-piece. Considerable debate exists regarding the technique that best simulates the patient's breathing after extubation.9,33 In the pediatric and adult populations, a systematic review with meta-analysis that compared the different techniques suggested that using pressure support mode to perform the SBT increases the rate of successful extubation.11 On the other hand, a prospective trial recommended against the use of the pressure support mode in the pediatric population because this could underestimate the respiratory effort after extubation, instead suggesting the use of the endotracheal CPAP mode in this population.34
The use of endotracheal CPAP in the neonatal population has been justified by the trigger asynchrony, due to leaks around the cuffless endotracheal tube and increased work of breathing. This mode may improve respiratory mechanics and cardiac function, while also providing minimal, but potentially important, support during the SBT. In addition, this mode is often the only one available for spontaneous ventilation in neonatal ventilators, especially in countries with limited resources.13,35,36
Regarding the test's observation time, the SBTs in the included studies lasted 3–5 min, much shorter than in the adult population, in which the test typically lasts 30 min. In this review, the highest accuracy was obtained in the studies of Kamlin et al15 (Youden index = 0.7) and Kaczmarek et al29 (Youden index = 0.63), both performed within 3 min. Due to the scarcity of data, the SBT observation time among premature infants was arbitrarily chosen, justified by the resistance imposed by endotracheal tubes with a small diameter. This criterion remains a gap that needs to be urgently filled because a time period that is too short may be insufficient to assess whether premature infants can support spontaneous breathing, whereas a time period that is too long may lead to fatigue and extubation failure.14
Measurements performed during the test are also key points. Most studies used the inability to maintain hemodynamic stability during the performance of the SBT, occurrence of bradycardia (ie, heart rate < 100 beats/min) for 10 s or 15 s, or desaturation ( < 85%) for 15 s, even with an increase in supplementary oxygen by 15%, as criteria for failure. The SBT protocols in adults and children use similar criteria, such as changes in heart rate, breathing frequency, systemic blood pressure, and peripheral oxygen saturation, in addition to signs of increased respiratory effort, anxiety, and agitation, with good accuracy.37–39
However, Shalish et al16 tested 41,602 possibilities between apnea with the need for stimulation, presence and duration of bradycardia, presence and duration of desaturation, and increased supplementary oxygen, but the authors did not find any combination of clinical events to define SBT pass/fail that could distinguish between extubation success and failure with sufficient accuracy. They only demonstrated that premature infants who failed extubation presented significantly more clinical events than those who were successfully extubated. The establishment of consistent criteria that could accurately define the test cutoff is essential for construction of a valid and reliable measurement instrument. The choice of the pass/fail criteria for SBT remains empirical and can be variable, depending on the neonatal ICU weaning protocol.
The accuracy of SBTs in premature infants was evaluated alone by Chawla et al,8 Kamlin et al,15 and Shalish et al,16 while Kaczmarek et al,29 Janjindamai et al,28 and Dassios et al27 associated the SBT with other respiratory variables. Kaczmarek et al29 and Dassios et al27 found a better specificity of the test when they associated SBTs with measured variability in respiratory parameters (63% to 75%) and the rate of respiratory muscle relaxation (22% to 83%). In the study by Janjindamai et al,28 despite an increase in specificity of the test when SBT results were associated with the mean breathing frequency or minute ventilation (0% to 17% or 33% to 67%), the test's accuracy remained low to moderate. However, only 1 neonate among the study's subjects failed the test, which might have affected the findings.
Most studies established the re-intubation rate within 72 h as the main outcome, based on bradycardia, apnea, a significant increase in work of breathing, and hypoxemia as criteria. Nonetheless, the ideal time to define extubation success is not yet validated in the neonatal population, especially for preterm infants. Some authors have emphasized the importance of monitoring re-intubation rates using a cumulative distribution curve over time. This is because reporting this rate at a single time point might provide an incorrect picture of the actual re-intubation rate, making rates difficult to interpret and to compare between studies. This time frame is suggested to be at least 7 d.6,40
The applicability of a standardized protocol to assess extubation readiness can be of great value to accelerate weaning because it simplifies the practice.41 However, there is not enough scientific evidence to state that the SBT is an independent predictor for extubation failure in preterm infants, despite its increased use in weaning protocols in the neonatal ICU.12,13
This review has some limitations. The studies included had considerable heterogeneity, probably due to a threshold effect related to the use of different observation times for the test between studies, which limits inference. Another limitation is the fact that the SBT was only performed when the infants were already deemed ready for extubation by the neonatal ICU clinical team, which may have incurred a selection bias of the “referral” type, increasing the chances of obtaining good sensitivity and positive predictive value rates.
Conclusions
Although the SBT is an attractive, practical, low-cost, and easy-to-apply bedside assessment tool, there is a lack of sufficient evidence to support its use as a predictor of extubation failure in premature infants. So far, the SBT in preterm infants has been reported as a sensitive test but not a very specific test. This indicates that failing the test may be associated with extubation failure but does not accurately predict it. Although we have focused on various aspects, the findings of this review need to be further explored due to the low number of studies included and the quality of the evidence presented. More studies are required to determine the optimal strategies for improving accuracy of the SBT as a predictor of extubation failure along with establishing the appropriate trial duration and better criteria for test failure.
Footnotes
- Correspondence: Raphaela Farias Teixeira MSc. E-mail: ftraphaelafarias{at}hotmail.com
Supplementary material related to this paper is available at http://www.rcjournal.com.
The authors have disclosed no conflicts of interest.
- Copyright © 2021 by Daedalus Enterprises