Introduction

Noninvasive ventilation (NIV) is now a basic respiratory support tool in adult patients and is increasingly used in children. Clinical experience and some randomized controlled studies in pediatric patients have shown the beneficial effects of NIV in selected patients [1, 2]. Failure of NIV and the consequent “delayed” intubation could increase patient morbidity and mortality, as suggested in adult studies [3]. Therefore, early predictors of NIV failure could be very helpful to physicians and could improve patient safety. Accordingly, some studies have tried to identify predictors of pediatric NIV failure [4–9].

Several pediatric studies reported the association of hypoxemic acute respiratory failure (ARF), acute respiratory distress syndrome (ARDS) and high oxygen requirements with NIV failure [4–6, 8, 10, 11]. Similarly, adult studies showed that ARDS diagnosis and severe hypoxemia were linked to NIV failure [12–14]. Furthermore, the PaO2/FiO2 (PF) ratio has been identified as an outcome predictor of NIV in hypoxemic adults [14].

Transcutaneous oxygen hemoglobin saturation (SpO2)/FiO2 (SF) ratio is a noninvasive, easily and continuously available figure that has been shown to be a reliable marker for PF ratio [16–18]. Accordingly, a low SF ratio would correspond to severe hypoxemic ARF. Furthermore, Spada et al. [19] found SF ratio to be useful for identifying NIV failure in adult patients.

Considering that PF ratio may predict NIV outcome and that SF ratio correlates well with PF ratio, our hypothesis was that such an objective, noninvasive and continuous variable could help predict NIV outcome. In addition, we also sought to develop a predictive model for NIV outcomes.

Methods

A multicenter prospective observational study was conducted in ten Spanish and two Portuguese pediatric intensive care units (PICUs). Median time performing NIV in the participating PICUs was 6.5 years (interquartile range 3.5–11.5). Four PICUs had performed NIV in 90–150 episodes, three PICUs in 150–500 episodes, and five PICUs in more than 500 episodes. The study period was between 15 January 2010 and 14 January 2011. This research project was approved by the Research Ethics Committee of the Hospital Universitario Central de Asturias, with a waiver of written consent.

Inclusion criteria

Children with ARF or acute-on-chronic respiratory failure without improvement despite medical treatment and:

  • severe dyspnea at rest (modified Wood’s Clinical Asthma Score ≥5 in asthma or bronchiolitis) [20], or

  • a respiratory rate (RR) more than two standard deviations (SD) above the normal range for the child’s age [21], or

  • venous \( P_{\mathrm{CO}_2} \) > 55 mmHg or arterial \( P_{\mathrm{CO}_2} \) > 50 mmHg, or

  • PF ratio under 250 and above 100 (above 150 if ARDS was suspected).

Exclusion criteria

Uncorrected cyanotic congenital heart disease, palliative use of NIV (do-not-intubate patients), ARDS with a PF ratio ≤150, unreliable SpO2 readings (for example, capillary refill time >3 s), contraindication to NIV, and previous invasive ventilation during the same PICU admission. Patients in whom heliox was used during NIV were also excluded.

Any of the following was considered a contraindication to NIV support: cardiorespiratory arrest, hemodynamic instability despite fluid load and vasoactive treatment (>10 µg/kg/min of dopamine), severe arrhythmia, Glasgow coma scale score <10, facial trauma or surgery, facial deformity (if a helmet was not available), vocal cord paralysis, undrained pneumothorax, need for endotracheal intubation to manage secretions or protect the airway, or active upper gastrointestinal tract bleeding.

Intubation criteria

NIV was withdrawn and patients were intubated when SpO2 was below 85 % or venous \( P_{\mathrm{CO}_2} \) above 65 mmHg or dyspnea worsened despite maximal NIV setting, or if any of the exclusion criteria appeared. Maximal NIV settings were considered as inspiratory positive airway pressure (IPAP) ≥25 cmH2O, or continuous positive airway pressure (CPAP) or expiratory positive airway pressure (EPAP) ≥12 cmH2O, with FiO2 1.

NIV strategy

CPAP or bilevel NIV were delivered using a nasal mask, face mask, full-face mask, nasopharyngeal tube, nasal prongs or helmet device. Following the Respiratory Group of the Spanish Society of Pediatric Intensive Care guidelines [22], CPAP or EPAP initial ventilator setting was 5 cmH2O. IPAP was started at 6–8 cmH2O to achieve tolerance and patient-ventilator synchrony. It was suggested that FiO2 should be as low as possible in order to maintain SpO2 between 93 and 98 %. Sedation was administered, if required, at the discretion of the physician in charge according to each PICU sedation protocol.

Monitoring

All patients were continuously monitored by means of electrocardiography, pulse oximetry and RR. Masimo pulse oximeters (Masimo Corporation, Irvine, CA) were used in four PICUs, Philips (Philips Healthcare, Eindhoven, Netherlands) in four, Ohmeda (GE Healthcare, United Kingdom) in two, and different types of pulse oximeters in two PICUs. Blood gas analysis was only performed when considered necessary by the attending physician.

Data collection

Patients with multiple NIV episodes were considered individually, since each episode requiring NIV presents new variables potentially affecting outcome. For each episode, the following variables were collected: age, gender, weight, ARF cause, underlying disease, paediatric risk of mortality (PRISM) III score within the first 24 h of PICU admission (PRISM III-24) [23], NIV duration (hours), NIV outcome, and mortality. Clinical data collected were RR, heart rate (HR), SpO2 and FiO2 before NIV was started. Initial FiO2 values were estimated based on the following equivalences: Nasal cannulae with oxygen flow rate of 1, 2, 3 and 4 lpm: FiO2 24, 28, 32 and 36 %, respectively; aerosol mask with oxygen flow rate of 8 lpm: FiO2 40 %; Venturi mask: FiO2 from 24 to 50 %; nonrebreathing mask with reservoir bag: FiO2 100 %. The same data as well as CPAP, EPAP and IPAP values were collected at 1, 2, 6, 12 and 24 h. RR, RR variation, HR, HR variation, and SpO2 were registered when figures remained stable for at least 2 min.
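The FiO2 equivalences and the SF ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual data-processing code: the function names and the device labels (`"nasal_cannula"`, `"aerosol_mask"`, `"venturi_mask"`, `"nonrebreather"`) are hypothetical, and only the numeric equivalences come from the text.

```python
from typing import Optional

# FiO2 equivalences listed in the text, expressed as fractions.
NASAL_CANNULA_FIO2 = {1: 0.24, 2: 0.28, 3: 0.32, 4: 0.36}  # keyed by flow in L/min
AEROSOL_MASK_8LPM_FIO2 = 0.40
NONREBREATHER_FIO2 = 1.00


def estimate_fio2(device: str, flow_lpm: Optional[int] = None,
                  venturi_setting: Optional[float] = None) -> float:
    """Estimate the pre-NIV FiO2 (fraction) from the oxygen delivery device.

    Device labels are hypothetical names chosen for this sketch.
    """
    if device == "nasal_cannula":
        if flow_lpm not in NASAL_CANNULA_FIO2:
            raise ValueError("only 1-4 L/min are listed for nasal cannulae")
        return NASAL_CANNULA_FIO2[flow_lpm]
    if device == "aerosol_mask":
        if flow_lpm != 8:
            raise ValueError("only 8 L/min is listed for the aerosol mask")
        return AEROSOL_MASK_8LPM_FIO2
    if device == "venturi_mask":
        # The text gives a range (24 to 50 %); the set value is used directly.
        if venturi_setting is None or not 0.24 <= venturi_setting <= 0.50:
            raise ValueError("Venturi setting must be between 0.24 and 0.50")
        return venturi_setting
    if device == "nonrebreather":
        return NONREBREATHER_FIO2
    raise ValueError(f"unknown device: {device}")


def sf_ratio(spo2_percent: float, fio2_fraction: float) -> float:
    """SF ratio: SpO2 (in %) divided by FiO2 (as a fraction)."""
    if not 0.21 <= fio2_fraction <= 1.0:
        raise ValueError("FiO2 must be a fraction between 0.21 and 1.0")
    return spo2_percent / fio2_fraction
```

For example, a child with SpO2 96 % on a Venturi mask set to 50 % would have `sf_ratio(96, estimate_fio2("venturi_mask", venturi_setting=0.5))`, that is, 192.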

Statistical analysis

The sample was described using either mean ± SD or median (interquartile range). Relative and absolute frequencies were used to describe categorical variables. The Nelson–Aalen estimator was used to estimate the incidence curve for NIV success and failure throughout the study. We analysed the sample as a whole and then made three subgroups according to the duration of NIV before failure occurred: less than 6 h (“early” failure), 6–24 h (“intermediate” failure) and more than 24 h (“late” failure). SpO2 values ≥98 % were not included in the statistical analysis because the SpO2–PaO2 correlation is lost above this value [18]. Variables with p < 0.1 in univariate analysis (avoiding collinearity) were entered into a forward stepwise multivariate analysis based on the likelihood ratio. Venous \( P_{\mathrm{CO}_2} \) was excluded due to the scarcity of these values. Age was included in the multivariate analysis to control for confounding by this variable. Receiver operating characteristic (ROC) curves were used to find cutoff values of the predictive models obtained. As usual, the area under the ROC curve (AUC) was used as a measure of predictive capacity. A value of p < 0.05 was considered statistically significant.
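The subgrouping rule, the SpO2 exclusion and the meaning of the AUC can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the analysis actually run: the function names are hypothetical, and the AUC is computed here by direct pairwise comparison (the probability that a randomly chosen failed episode scores higher than a randomly chosen successful one), which is equivalent to the area under the empirical ROC curve.

```python
from typing import Sequence


def classify_failure(hours_on_niv: float) -> str:
    """Assign a failed episode to the subgroups defined in the text."""
    if hours_on_niv < 6:
        return "early"
    if hours_on_niv <= 24:
        return "intermediate"
    return "late"


def spo2_usable_for_sf(spo2_percent: float) -> bool:
    """SpO2 values of 98 % or more were left out of the analysis."""
    return spo2_percent < 98


def auc(failure_scores: Sequence[float], success_scores: Sequence[float]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count one half).

    Higher scores are assumed to indicate a higher risk of failure.
    """
    if not failure_scores or not success_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for f in failure_scores:
        for s in success_scores:
            if f > s:
                wins += 1.0
            elif f == s:
                wins += 0.5
    return wins / (len(failure_scores) * len(success_scores))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of failed and successful episodes.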

Results

During the 12-month study, the total number of NIV episodes included was 390, in 369 patients. The main cause of ARF was acute bronchiolitis, which accounted for 42 % of the cases. Other frequent ARF causes were pneumonia and bronchospasm (Fig. 1). More than half of the episodes occurred in previously healthy children. The most frequent underlying conditions were cardiopathy and prematurity (Table 1 shows demographic data of the episodes included). Among children who were admitted to the PICU and received NIV more than once (N = 15), the median number of episodes was two (interquartile range 2–3). Only two of them did not have any underlying condition, while the remaining 13 had different diagnoses: cardiopathy (2), prematurity (2), cerebral palsy (2), neuromuscular disease (1), bronchopulmonary dysplasia (1), and “others” (5).

Fig. 1

Flowchart representing the outcome of the 390 NIV episodes and the different acute respiratory failure causes, with the corresponding NIV failure rates. ARF acute respiratory failure, NIV non-invasive ventilation, ARDS acute respiratory distress syndrome, CPE cardiogenic pulmonary edema, UARI upper airway respiratory infection (in chronic conditions). Note the high failure rate of ARDS episodes (statistically significant)

Table 1 Demographic and main baseline characteristics of the episodes; included data are expressed as median (interquartile range), or number (%)

One hundred and twenty-four episodes (31.8 %) were managed initially with CPAP. Twenty-eight of these 124 episodes (22.6 %) were switched to bilevel NIV because of increasing respiratory distress despite CPAP. The remainder (266 episodes) were treated with bilevel NIV from the start. Sedatives were used in 49.2 % of the cases; midazolam was the most frequently used drug. None of the failed episodes were considered to be related to the use of sedatives.

The overall success rate was 81.3 % (Fig. 1). A total of 15 deaths occurred (3.8 %); none of them were considered to be related to NIV use, according to the opinion of the physicians in charge. Six of them occurred in the success group, due to progressing underlying conditions (not related to ARF). The other nine deaths were due to progressing underlying diseases, therapeutic effort withdrawal (2), and pulmonary hemorrhage and multiorgan failure in one case each.

Similar intubation rates were observed for all ARF causes except for ARDS, in which five out of ten episodes required invasive ventilation (p = 0.010). Mean NIV duration in ARDS-diagnosed patients who were eventually intubated was 16 ± 7.5 h. One of the ARDS cases successfully treated with NIV eventually died (due to the underlying condition), whereas four of the five intubated ARDS episodes ended in death.

Failure analysis

Table 2 shows the main demographic and baseline characteristics of the success and failure groups. When considering the whole sample, the only variables independently linked to NIV failure were SF ratio at 1 h (10 units) (odds ratio, OR 0.942, confidence interval, CI 95 % 0.880–1.008; p = 0.086), age (OR 0.970, CI 95 % 0.951–0.990; p = 0.003), and PRISM III-24 score (OR 1.232, CI 95 % 1.085–1.398; p = 0.001). The predictive model is shown in Table 3. Fig. 2 shows the Kaplan–Meier curve of the probability of avoiding intubation during the first 72 h using the cutoff value obtained with this predictive model. Among the collected data, the children not included in the multivariate analysis because of missing data did not differ significantly from those included (data not shown).
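How the odds ratios above translate into a risk score can be sketched as follows. This is an illustrative sketch under several assumptions, not the model in Table 3: the coefficients are taken as the natural logarithms of the reported odds ratios, age is assumed to be expressed in months, and the intercept, which is not reported in this text, is left as a required argument. The function names are hypothetical.

```python
import math

# Odds ratios reported in the text for the whole-sample model.
OR_SF_1H_PER_10_UNITS = 0.942
OR_AGE = 0.970          # age unit assumed to be months (not stated in the text)
OR_PRISM_III_24 = 1.232


def linear_score(sf_1h: float, age: float, prism_iii_24: float,
                 intercept: float) -> float:
    """Linear predictor of a logistic model built from the reported ORs.

    `intercept` is NOT given in the text and must be supplied by the caller.
    """
    return (intercept
            + math.log(OR_SF_1H_PER_10_UNITS) * (sf_1h / 10.0)
            + math.log(OR_AGE) * age
            + math.log(OR_PRISM_III_24) * prism_iii_24)


def failure_probability(score: float) -> float:
    """Convert a logit-scale score into a probability."""
    return 1.0 / (1.0 + math.exp(-score))
```

Under these assumptions, a lower SF ratio, a younger age and a higher PRISM III-24 score all raise the score. If the cutoff of −1.310 quoted in the caption of Fig. 2 is on the logit scale, which is an assumption here, it would correspond to a predicted failure probability of roughly 21 %.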

Table 2 Demographic and baseline characteristics of the successful and failed episodes
Table 3 Predictive models of NIV outcomes. No predictive model could be calculated for episodes failing after 24 h upon NIV initiation
Fig. 2

Kaplan–Meier curve showing the probability of avoiding intubation during the first 72 h, based on the NIV failure predictive model calculated for the whole sample, which includes SF ratio at 1 h, age and PRISM III-24 (see Table 3). The solid line represents the NIV episodes considered at low risk of intubation (score ≤ −1.310 in this predictive model), while the dashed line represents the NIV episodes considered at high risk of intubation (score > −1.310 in this predictive model)

NIV failures occurred in the first 6 h in 27 cases, between 6 and 24 h in 26 episodes, and after 24 h in the remaining 20 cases (Fig. 1).

Episodes failing before 6 h

Multivariate analysis identified SF ratio at 1 h (OR 0.986, CI 95 % 0.974–0.997; p = 0.018) as the only variable independently linked to early NIV failure. The optimal cutoff value suggested to detect early NIV failures was 193. The predictive model is shown in Table 3.

Episodes failing between 6 and 24 h

Multivariate analysis identified PRISM III-24 (OR 1.224, CI 95 % 1.110–1.349; p < 0.001), RR decrease at 6 h (OR 0.929, CI 95 % 0.892–0.968; p < 0.001), and SF ratio at 6 h (10 units) (OR 0.923, CI 95 % 0.857–0.994; p = 0.035), as independent predictors of NIV failure between 6 and 24 h. The predictive model is shown in Table 3.

Episodes failing after 24 h

We found significant differences in SF ratio at 24 h (p = 0.029) and FiO2 at 24 h (p = 0.032). No variable was identified as an independent predictor of late NIV failure.

Discussion

An important and controversial issue related to NIV in children is the early prediction and identification of NIV failure. Consequently, any easily obtained, noninvasive, and reliable marker would be very helpful to physicians. To our knowledge, the present study is the largest one focusing on pediatric NIV and the first to assess SF ratio as an early outcome marker. Considering the challenges of placing arterial catheters in small children, coupled with the stress for the child, which will likely increase respiratory effort and oxygen consumption, noninvasive measurements are more suitable for this population. Blood gas analysis has been reported to be performed quite infrequently in previous studies [6]. Furthermore, we suggest three predictive models for NIV failure, showing acceptable AUC (Table 3). The positive predictive values were low and the negative predictive values very high in all three models. This is because the cutoff values were chosen with the aim of not missing any child at risk of failing the NIV trial. The low NIV failure rate also contributed to the low positive predictive values.
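The effect of a low failure rate on the predictive values can be made concrete with Bayes' rule. The sketch below uses the overall failure rate from this study (73 of 390 episodes) but a purely hypothetical sensitivity and specificity, since the actual values are in Table 3 and are not reproduced here. The function name is also hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple:
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Failure rate observed in the study: 73 failed episodes out of 390.
failure_rate = 73 / 390

# Hypothetical operating point favouring sensitivity (illustrative values only).
ppv, npv = predictive_values(sensitivity=0.90, specificity=0.60,
                             prevalence=failure_rate)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

With a cutoff tuned for high sensitivity and a failure rate below 20 %, most positive predictions are false alarms while a negative prediction is rarely wrong, which is the pattern described above.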

The major advantage of SF ratio is that it can be continuously monitored in a noninvasive manner. Spada et al. [19] reported that SF ratio during NIV could be useful to make clinical decisions in adult patients suffering from ARF with no underlying malignancy. Our data show that SF ratio might be useful to detect NIV failure throughout NIV delivery, especially in the first hours of noninvasive respiratory support, as suggested by the inclusion of SF ratio in all three of our models for NIV outcome prediction (Table 3).

The optimal cutoff value suggested to detect early NIV failures was 193 (SF ratio at 1 h, Table 3). This would correspond to a PF ratio of 188 or 155 according to Khemani’s equations [16, 18], which are similar to the values reported by Antonelli [13, 14]. Thus, if a hypoxemic patient is not able to achieve an SF ratio of around 190 or higher after 1 h of NIV, endotracheal intubation should be duly considered.
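The conversion from SF to PF ratio can be checked numerically. The two regression equations below are recalled from Khemani's published work and are not stated in this text, so their coefficients should be treated as assumptions and verified against references [16, 18] before any use: a linear form, SF = 76 + 0.62 × PF, and a reciprocal form, 1/SF = 0.00232 + 0.443/PF. The function names are hypothetical.

```python
def pf_from_sf_linear(sf: float) -> float:
    """Invert SF = 76 + 0.62 * PF (coefficients assumed, see lead-in)."""
    return (sf - 76.0) / 0.62


def pf_from_sf_reciprocal(sf: float) -> float:
    """Invert 1/SF = 0.00232 + 0.443/PF (coefficients assumed, see lead-in)."""
    denominator = 1.0 / sf - 0.00232
    if denominator <= 0:
        # The reciprocal form has no positive PF solution for very high SF.
        raise ValueError("SF ratio too high for the reciprocal equation")
    return 0.443 / denominator


cutoff_sf = 193  # early-failure cutoff reported in the text
print(pf_from_sf_linear(cutoff_sf))      # close to the 188 quoted in the text
print(pf_from_sf_reciprocal(cutoff_sf))  # close to the 155 quoted in the text
```

Under these assumed coefficients, the two equations give PF values near the 188 and 155 quoted above for an SF ratio of 193.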

It should be noted that SF ratio has been demonstrated to be a reliable surrogate for PF ratio as long as SpO2 is between 80 and 97 % [18]. When SpO2 is over 97 %, the oxyhemoglobin dissociation curve flattens and the reliability of the SF ratio is lost, potentially leading to inadvertent hyperoxygenation. Therefore, we believe that a target SpO2 between 92 and 97 % should be established in NIV-treated patients. Using SF ratio might help to detect worsening hypoxemia early, to identify patients on NIV who are prone to failure, and to avoid potential oxygen toxicity.

We found a high intubation rate in ARDS (50 %) when compared to other ARF causes. However, ARDS diagnosis was not included in any predictive model of failure, which might be related to the low number of ARDS episodes included in our study (n = 10). ARDS had already been identified as a major risk factor for NIV failure in pediatric and adult patients [12, 13]. Consequently, hypoxemic patients who are at risk of developing ARDS should be monitored very closely. Invasive ventilation remains the gold standard therapy in ARDS; therefore, it should not be delayed when necessary, as this could worsen the outcome and compromise patient safety. Lack of NIV effectiveness in ARDS may be related to the transient losses of EPAP (equivalent to positive end-expiratory pressure during invasive ventilation) due to the inevitable air leakage. This may produce lung derecruitment, which would impair gas exchange [24]. Of note, mean duration of NIV in ARDS episodes that required intubation was 16 h. This highlights the importance of monitoring these patients very closely, and not just in the first hours of admission. Furthermore, the high rate of deaths among ARDS episodes in our study (5/10) underlines how severely ill these patients are.

An important variable included in two of the suggested predictive models was PRISM III-24 score (Table 3). Higher values in mortality prediction models were described as NIV failure predictors in several previous studies [4, 6, 13, 14]. It seems reasonable that the most severely ill patients are those at greater risk of NIV failure, and thus, this factor should be taken into account when treating a child with NIV. Nevertheless, it is important to note that we analyzed PRISM III-24, which includes the most abnormal values from the first 24 h of PICU stay [23], and that the score may worsen after intubation. This should be taken into account when using this predictive model. However, it should be highlighted that in our study, the highest PRISM III-24 values usually occurred before initiating invasive ventilation. It should also be noted that severity scores that include arterial gas analysis may lose some information, as it may not be performed in all patients. Some authors suggest the use of noninvasive surrogates such as SF ratio to improve the clinical and research utility of the severity scales that include PF ratio [25, 26].

Age was identified as another important factor in NIV outcome, included in the general predictive model (Table 3). This confirms the findings of previous studies, which reported a higher failure rate in younger patients [2, 9, 12], and is consistent with the frequent difficulty in achieving patient-ventilator synchrony in infants and small children during bilevel NIV. Leaks [27] and inability to cycle off the ventilator due to their lack of strength are the main practical issues. It has been suggested that triggering and cycling off could be improved if neural diaphragmatic activity sensors were employed during NIV [28–30]. In our sample, with a median age of 6.6 months (Table 1), many patients could potentially benefit from such an advance, but additional trials are needed to confirm or refute this speculation.

RR decrease at 6 h was also included in the predictive model of episodes failing between 6 and 24 h. This figure had already been related to NIV outcome in previous studies [4, 6, 8]. The increased work of breathing is relieved by NIV support in those cases with a good evolution, and thus, RR decrease represents this respiratory improvement. It should be underlined that, although information about NIV progression at 6 h may seem to come “late”, almost two-thirds of the failures took place more than 6 h after NIV initiation. This highlights the need for continuous re-evaluation of patients on NIV, and not only during the very first hours.

Our study has several limitations. First, some SF ratio values could not be calculated due to high SpO2 (over 97 %). We compared the episodes with these missing values with those included in the statistical analysis and found no statistically significant difference. Second, NIV skills may vary from one PICU to another and also among physicians. However, we consider this bias to be minimal because initial NIV management was based on a pre-established common protocol and the majority of the participating PICUs have passed the standardized practical course on invasive and non-invasive mechanical ventilation promoted and run by the Spanish Pediatric Respiratory Group [31, 32]. Furthermore, the predictive model is intended to be of help to physicians with different skills in NIV. Third, we could not analyse \( P_{\mathrm{CO}_2} \) in the multivariate analysis due to the lack of values. Although it has been described as an outcome predictor in some previous studies [4], its determination usually requires repeated punctures. Invasive techniques produce crying and patient-ventilator asynchrony, which may contribute to NIV failure, limiting the utility of this measurement in NIV. Transcutaneous \( P_{\mathrm{CO}_2} \) might be useful as a surrogate of PaCO2. Fourth, different pulse oximeters were used, which may reduce the reproducibility of the results. However, previous studies analyzing the correlation of SF and PF ratios also used different types of pulse oximeters [18]. Fifth, some other factors may alter the SpO2–PaO2 correlation: \( P_{\mathrm{CO}_2} \), pH, temperature and 2,3-diphosphoglycerate. These were not taken into account to calculate SF ratio, but previous studies suggest that the clinical importance of these variations may be minor [18]. Mean airway pressure may also have an effect on the SpO2–PaO2 correlation, but this was not measured for obvious reasons.

On the other hand, our study has several strengths. First, it is the largest study focusing on clinical predictive factors of NIV failure in critically ill children. Second, it is a multicenter prospective study in Spanish and Portuguese PICUs, which allows the findings to be representative of both countries.

In summary, a noninvasive measurement like the SF ratio seems to be a very useful predictive marker of the short-term outcome in children receiving an NIV trial. The described predictive models of outcome for different periods after NIV initiation could help physicians decide whether to continue with NIV or switch to invasive ventilation. It would be of great interest if future research showed the utility of SF ratio as an NIV outcome predictor in the main causes of ARF in children.