Deb Sanjay Nag, Ankur Dembla, Pratap Rudra Mahanty, Shashi Kant, Abhishek Chatterjee, Devi Prasad Samaddar, Parul Chugh
Abstract
Key words: Laparotomy; Emergencies; Acute physiology and chronic health evaluation II; Morbidity; Mortality
Laparotomy remains one of the commonest emergency surgical procedures. Despite advances in surgical skills, antimicrobial agents, and postoperative care, the mortality has remained high (14.9%-19.4%)[1,2]. Only over the last few years have various perioperative quality improvement initiatives involving early interventions, intensive postoperative care, and consultant-led approaches ensured a decrease in the average mortality rate to 11.1% in some studies[3].
Early prognostic evaluation would aid in selecting the high-risk patients for aggressive treatment[3]. Awareness about risks could potentially contribute to the quality of perioperative care and optimum utilization of resources[4]. Regular audit and continuous improvement of clinical practice are essential to providing quality medical care[5]. The doctor is legally bound to discuss the prognosis and the possible outcomes of the available treatment modalities[6]. Estimating the risk preoperatively will help predict which patients would need aggressive treatment, which patients would need damage control surgery versus a definitive procedure, and who would benefit from postoperative intensive care and organ support[7].
An ideal scoring system should accurately predict outcomes, help determine who deserves more aggressive care, guide decisions on the extent of surgery, and be usable broadly across emergency laparotomies for various disease pathologies[8]. The scoring system should also be capable of analyzing risk-adjusted morbidity and mortality amongst various healthcare providers[9].
The Portsmouth modification of the Physiological and operative severity score for the enumeration of mortality and morbidity (P-POSSUM) and the acute physiology and chronic health evaluation II (APACHE-II) have been the most widely used scoring systems for emergency laparotomies. While P-POSSUM remains the tool of choice in the United Kingdom[6], disparities have been observed between APACHE-II and P-POSSUM in their discriminatory ability to predict mortality[10].
It has been suggested that preoperative assessment of individual risk would help the treating team and the patient make shared decisions[10]. Although P-POSSUM is the commonest scoring system used for audit purposes in the United Kingdom[10] for the National Emergency Laparotomy Audit, Enhanced Peri-Operative Care for High-risk patients, or Emergency Laparotomy Pathway Quality improvement Care (ELPQuiC), it needs 18 data points as compared to 12 data points for APACHE-II. In addition, it needs intraoperative details such as blood loss and peritoneal contamination, as well as histopathology reports to identify malignancy.
It is always better to have a single scoring system both to predict outcomes and to audit healthcare organizations. Therefore, it has been suggested that studies should “update the performance (primarily the calibration) of APACHE-II and P-POSSUM” and compare their ability to predict postoperative morbidity and mortality[6,10].
After approval from the institutional ethics committee, this single center prospective observational study was conducted from December 2013 to November 2014. All patients undergoing emergency laparotomy at the Tata Main Hospital, Jamshedpur, India, during this period were included in this study. Patients below 18 years of age, those with acute trauma, and those undergoing re-exploratory laparotomy or any laparotomy for vascular surgery were excluded from the study.
All patients were scored with APACHE-II on being posted for emergency surgery. The twelve components of the Physiologic Score and one component of the Operative Score for P-POSSUM were scored at the time of being posted for surgery. Four of the six components of the Operative Score for P-POSSUM were scored intraoperatively (category, number of procedures, blood loss, and peritoneal soiling), while one was scored postoperatively on availability of histopathology reports (malignancy).
The patients were followed up by telephonic interview for at least 30 d after discharge or until death (during admission or within 30 d after discharge). While postoperative mortality was the primary outcome that was analyzed, the following secondary outcomes were also compared: (1) Length of stay (LOS); (2) Need for postoperative ventilator support (any time during the postoperative period, either immediate based on the assessment of the anesthesiologist or later due to respiratory failure); (3) Need for postoperative inotropic support (inotropic support would be initiated if the patient remained hypotensive despite fluid resuscitation to maintain a mean arterial pressure ≥ 65 mmHg); (4) Acute kidney injury (AKI) (diagnosed based on the Kidney Disease: Improving Global Outcomes Acute Kidney Injury Work Group (2012) guidelines[11]); (5) Patients needing re-exploration; and (6) Cardiac morbidity (acute myocardial infarction or arrhythmias needing treatment).
A receiver operating characteristic (ROC) curve was used to measure diagnostic accuracy. The ROC curve plots “sensitivity” against “1–specificity”, and the area under the curve (AUC) was used to measure the “size” of the prediction[12]. AUC can range from 0.5 (no better than chance) to 1.0, and a result of 1.0 indicates perfect discriminatory ability. An AUC value > 0.8 is considered good, a value of 0.60-0.80 is considered moderate, and a value < 0.60 is regarded as poor. The optimal cut-off point was taken as the point on the ROC curve closest to the upper left corner, where sensitivity and specificity are jointly optimized. Statistical analysis was performed using the SPSS program for Windows, version 17.0 (Chicago, IL, United States). Continuous variables are presented as mean ± standard deviation (SD) or median (min-max), and categorical variables are presented as absolute numbers and percentages. Data were checked for normality before statistical analysis. Normally distributed continuous variables were compared using the unpaired t-test, whereas the Mann-Whitney U-test was used for those variables that were not normally distributed. Categorical variables were analyzed using either the chi square test or Fisher’s exact test.
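The ROC procedure described above can be illustrated with a minimal sketch. It is not the SPSS analysis used in the study; the function names and the ten-patient data set are hypothetical and chosen only to show how the AUC and the cut-off closest to the upper left corner of the ROC plot can be computed.

```python
from typing import List, Tuple


def roc_points(scores: List[float], died: List[int]) -> List[Tuple[float, float, float]]:
    """Return (cut-off, sensitivity, specificity) for every observed score.

    A patient is predicted to die when score >= cut-off.
    """
    positives = sum(died)
    negatives = len(died) - positives
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, died) if d and s >= cutoff)
        tn = sum(1 for s, d in zip(scores, died) if not d and s < cutoff)
        points.append((cutoff, tp / positives, tn / negatives))
    return points


def auc(scores: List[float], died: List[int]) -> float:
    """AUC as the probability that a randomly chosen non-survivor has a
    higher score than a randomly chosen survivor (ties count as 0.5)."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores: List[float], died: List[int]) -> float:
    """Cut-off whose ROC point lies closest to the upper left corner,
    i.e. to (1 - specificity, sensitivity) = (0, 1)."""
    return min(
        roc_points(scores, died),
        key=lambda p: (1 - p[1]) ** 2 + (1 - p[2]) ** 2,
    )[0]


# Hypothetical toy data (NOT study data): risk scores and outcome (1 = died).
toy_scores = [10, 12, 15, 18, 20, 22, 25, 26, 30, 33]
toy_died = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

print("AUC:", round(auc(toy_scores, toy_died), 3))
print("Optimal cut-off:", best_cutoff(toy_scores, toy_died))
```

The pairwise definition of AUC used here is equivalent to the area under the empirical ROC curve and avoids any numerical integration; the squared distance to the corner weights sensitivity and specificity equally, which is one of several possible optimality criteria.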
In previous studies of perforative peritonitis[13], the sensitivity of APACHE-II was found to be 87.5% at a cut off value of 16–20. For the sample size calculation, using a two-tailed alpha value (0.05) and a beta value (0.2), 150 patients would have been sufficient to detect a significant difference of 10% between the APACHE-II and P-POSSUM scoring systems in predicting postoperative mortality in patients undergoing emergency laparotomy. Thus, our sample size of 157 appears to be adequate to assess if there is any difference between the two scoring systems in predicting mortality.
A total of 159 patients met the inclusion criteria. Two patients sought referral to a higher center, were lost to follow-up, and were excluded from the study.
Of the total 157 studied patients, 89 had perforative peritonitis, 57 had intestinal obstruction, and 11 were operated on for other reasons that included pancreatitis (4), cholecystitis (2), ruptured liver abscess (1), liver hematoma (1), rectal prolapse (1), empyema gall bladder (1), and spontaneous hemoperitoneum because of thrombocytopenia (1).
The age of the patients ranged from 18 to 82 years. Of the 157 analyzed patients, 99 (63.1%) were male and 58 (36.9%) were female. Twenty-three (14.6%) of the total patients analyzed died, and 134 (85.4%) survived. The mean ± SD of LOS was 10.18 ± 8.24 d, and LOS ranged from 1 to 70 d. Sixty-three patients (40.1%) required postoperative ventilatory support, 48 (30.6%) required perioperative inotropic support, and 32 (20.4%) developed AKI in the postoperative period. Four out of the 157 analyzed patients required re-exploration. A total of eight patients developed postoperative cardiac morbidity.
The median age [interquartile range (IQR)] was 46 (30-60) years amongst the survivors and 60 (44-69) years amongst those who died in the postoperative period. The statistically significant P value (0.029) indicated that increasing age is associated with a higher risk of mortality. While 43.5% of the patients who died were males, 56.5% of the patients who died were females, indicating a statistically significant (P = 0.035) increased risk of mortality amongst female patients.
While the median APACHE-II score amongst the patients who died in the postoperative period was 31 (min-max 25-35), the median P-POSSUM Physiologic Score and Operative Score amongst them were 52 and 22, respectively (min-max 46-58 and 20-24, respectively). P < 0.001 signifies that higher scores are associated with a statistically significant increase in mortality.
For APACHE-II, the cut off value to predict mortality was found to be 24 using ROC analysis. In our studied patients, an APACHE-II score of < 24 was associated with a significantly lower mortality of 17.4% as compared to an APACHE-II score of ≥ 24, which was associated with a mortality of 82.6% (P < 0.001) (Table 1). Using ROC, at cut off value 24, the AUC [95% confidence interval (CI)] was 0.965 (0.928–1.000). Sensitivity, specificity, positive predictive value, and negative predictive value of APACHE-II were found to be 82.6%, 98.5%, 90.5%, and 97.1%, respectively.
In comparison, for P-POSSUM the cut off value to predict mortality was found to be 63 using ROC analysis. A P-POSSUM score of < 63 was associated with a significantly lower mortality of 8.7% as compared to a score of ≥ 63, which was associated with a mortality of 91.3% (P < 0.001) (Table 2). Using ROC, at cut off value 63, the AUC (95%CI) was 0.989 (0.974–1.000). Sensitivity, specificity, positive predictive value, and negative predictive value of P-POSSUM were found to be 91.3%, 99.3%, 95.5%, and 98.5%, respectively.
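The four diagnostic measures reported for each score follow directly from a 2 × 2 table of predicted versus observed outcome. The sketch below shows the arithmetic. The cell counts are an assumption: they are not stated in the text and were back-calculated from the reported percentages together with the 23 deaths and 134 survivors, so Tables 1 and 2 remain the authoritative source.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """2 x 2 table for 'score >= cut-off' as a predictor of death."""
    tp: int  # died, score at or above cut-off
    fn: int  # died, score below cut-off
    fp: int  # survived, score at or above cut-off
    tn: int  # survived, score below cut-off

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        """Positive predictive value."""
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        """Negative predictive value."""
        return self.tn / (self.tn + self.fn)


def as_percent(x: float) -> float:
    return round(100 * x, 1)


# ASSUMED counts, reconstructed from the reported percentages (not in the text).
apache = Confusion(tp=19, fn=4, fp=2, tn=132)    # cut-off 24
ppossum = Confusion(tp=21, fn=2, fp=1, tn=133)   # cut-off 63

for name, table in (("APACHE-II", apache), ("P-POSSUM", ppossum)):
    print(
        name,
        as_percent(table.sensitivity()),
        as_percent(table.specificity()),
        as_percent(table.ppv()),
        as_percent(table.npv()),
    )
```

With these assumed counts the computed values reproduce the percentages reported above for both scores.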
Using Pearson's Linear Correlation Coefficient, APACHE-II showed an overall predictive value of 95.5% with an odds ratio (OR) of 1.315, 95%CI of 1.193-1.448, and a P < 0.001. Similarly, P-POSSUM showed an overall predictive value of 98.1% with an OR of 1.364, 95%CI of 1.193-1.559, and a P < 0.001. Box-plots in R (Pearson Correlation Coefficient) using APACHE-II and P-POSSUM are depicted in Figures 1 and 2, respectively.
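To help interpret the reported odds ratios, the sketch below converts an OR and its 95%CI back to the log-odds scale and scales the OR to a multi-point rise in score. It assumes that each OR comes from a logistic regression and refers to a one-point increase in the score, which the text does not state explicitly; the function names are illustrative.

```python
import math
from typing import Tuple


def log_odds_from_or(or_value: float, ci_low: float, ci_high: float,
                     z: float = 1.96) -> Tuple[float, float]:
    """Recover the regression coefficient (log OR) and its approximate
    standard error from an OR and a symmetric-on-log-scale 95% CI."""
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def or_for_increase(or_per_point: float, points: float) -> float:
    """Odds ratio for a rise of `points` in the score, ASSUMING the
    reported OR is per one-point increase."""
    return or_per_point ** points


beta, se = log_odds_from_or(1.315, 1.193, 1.448)  # APACHE-II values from the text
print("log OR:", round(beta, 3), "SE:", round(se, 3))
print("OR for a 5-point rise:", round(or_for_increase(1.315, 5), 2))
```

Under that assumption, an OR of 1.315 per point compounds multiplicatively, so a 5-point rise in APACHE-II would correspond to roughly a fourfold increase in the odds of death.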
A multivariate logistic regression model was used to identify independent risk factors (APACHE-II and P-POSSUM) for mortality. A ROC curve, the graphic display of the relationship between “sensitivity” and “1–specificity” that measures diagnostic accuracy in terms of the true positives versus the false positives, is depicted for APACHE-II and P-POSSUM in Figure 3. AUC was 0.965 (using a cut-off value of 24) for APACHE-II and 0.989 (using a cut-off value of 63) for P-POSSUM. AUC can range from 0.5 to 1.0, and a result of 1.0 indicates a perfect discriminatory ability.
Although both the scores were significantly good in predicting postoperative mortality in patients undergoing emergency laparotomy, the AUC of P-POSSUM (0.989) appeared better than that of APACHE-II (0.965). However, on comparing the sensitivity and specificity of APACHE-II and P-POSSUM (Table 3), there appears to be no statistically significant difference between their ability to predict postoperative mortality. Except for APACHE-II's inability to predict re-exploration, both were able to predict all the secondary outcomes in a statistically significant manner (P < 0.001) (Table 4).
Table 1 Discriminating ability of APACHE-II
Emergency laparotomy “describes an exploratory procedure for which the clinical presentation, underlying pathology, anatomical site of surgery, and perioperative management vary considerably”[1]. The mere fact that over 400 different surgical procedures have been described as a part of emergency laparotomy reflects the diversity in pathology[1]. Often there is little time to optimize these patients, resulting in significant adverse outcomes. The unadjusted 30-d postoperative mortality rate was 14.6% at our hospital. A study published in 2011 from a 650-bed general hospital (Royal United Hospitals, Bath) serving a population of over half a million reported a 30-d mortality of 16.9% amongst 124 patients undergoing emergency laparotomy[2]. Like their study, we also excluded emergency vascular surgery, re-exploration, and simple appendectomy[2]. Similarly, the Emergency Laparotomy Network[1] covering 35 NHS hospitals reported a 30-d mortality of 14.9% amongst 1853 patients who underwent emergency laparotomy. Similar mortality rates after emergency laparotomy, of 20.2%[14] and 17%[15], were reported in 2017. Without adjusting for age, patient comorbidity, surgical presentation, and complexity of the involved pathology, we cannot be certain whether our 30-d postoperative mortality (14.6%) represents equivalent or better quality of care in comparison to that provided in European countries (14.9%-20.2%)[1,2]. However, there is increased understanding that standardization of care and quality improvement bundles can improve morbidity and mortality after emergency surgery[7]. The male preponderance in our study group (63.1%) and the statistically significant increased mortality amongst females (56.5% as compared to 43.5% in males) were in stark contrast to the UK Emergency Laparotomy Network observations[1]. However, there is some evidence supporting our observation. Similar studies in India have shown a male preponderance for patients undergoing emergency laparotomy (69.5%)[16]. Certain scoring systems, like the Mannheim Peritonitis Index, assign a higher risk to female patients[17], a risk also validated by our study.
While no mortality was observed in any of our patients who were less than 20 years of age, mortality increased from 11.11% in the 21-40 year age group to 13.33% in the 41-60 year age group, 23.68% in the 61-80 year age group, and 33.33% amongst those above 80 years of age. Amongst the patients analyzed by the Emergency Laparotomy Network, the risk of mortality increased by approximately 4% for each additional 10 years of age[1]. Increasing age has been identified as an independent risk factor, and an increase in mortality with age has been observed in most studies, thus validating the inclusion of age as a risk factor[2,18,19].
In our study, the mean LOS (± SD) was 10.18 (± 8.24) d. This was similar to the observations by the Emergency Laparotomy Network, in which the median [IQR (range)] postoperative length of stay for all patients was 11 d [6–21 (0–216)][1]. Although 30-d mortality after implementation of the ELPQuiC bundle indicated a reduction in the risk of death (14% to 10.5%), it had no bearing on the LOS, which remained at its median value of 11 d both before and after ELPQuiC[7]. A number of factors, including the survival of patients who would not previously have survived surgery and the availability of suitable discharge facilities, may explain the lack of reduction of LOS even with improved quality of care. A similar LOS, with a median [IQR (range)] of 13 [8–24 (1–176)] d following emergency laparotomy, has been reported by other studies as well[20]. While higher scores of APACHE-II and P-POSSUM do indicate some correlation with the LOS, the degree of correlation expressed by the Spearman Rank Correlation Coefficient is relatively small, 0.322 for APACHE-II and 0.374 for P-POSSUM.
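For readers unfamiliar with the Spearman Rank Correlation Coefficient quoted above, the following minimal sketch shows how it can be computed as the Pearson correlation of rank-transformed values, with tied values sharing their average rank. The data are hypothetical and serve only to exercise the functions; they are not taken from the study.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho = Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example (NOT study data): risk score versus length of stay in days.
toy_score = [8, 12, 15, 15, 21, 27, 30]
toy_los = [5, 9, 4, 11, 7, 20, 14]
print("Spearman rho:", round(spearman(toy_score, toy_los), 3))
```

Because the coefficient depends only on ranks, it is insensitive to the heavily skewed LOS distribution (range 1 to 70 d), which is a plausible reason for preferring it over Pearson's coefficient here.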
Table 2 Discriminating ability of P-POSSUM
In our studied patients, an APACHE-II score of < 24 was associated with a significantly lower mortality of 17.4%, as compared to a score of ≥ 24, which was associated with a mortality of 82.6%. At cut off value 24, the AUC (95%CI) was 0.965 (0.928-1.000). Studies so far have shown the ability of APACHE-II scores to predict mortality in patients undergoing emergency laparotomy, either for varied causes[21] or for perforative peritonitis[22]. Our study has demonstrated the strongest discrimination to date, with an AUC of 0.965 (as compared to 0.74-0.86 in other studies)[21,22].
In our study, the cut off value of P-POSSUM to predict mortality using ROC analysis was 63. A score of < 63 was associated with a significantly lower mortality of 8.7% as compared to a P-POSSUM score of ≥ 63, which was associated with a mortality of 91.3% (P < 0.001). Using ROC, at cut off value 63, the AUC (95%CI) was 0.989 (0.974–1.000). While our observations are similar to those of other studies[9,16,23,24], which demonstrate the ability of P-POSSUM to predict mortality, our AUC of 0.989 at the cut off value of 63 shows a fairly high degree of accuracy of P-POSSUM. Studies have shown that P-POSSUM is a poor predictor in trauma[9]; as we had excluded such cases, this possibly explains the higher predictive ability for mortality in our study.
While some studies have tried comparing APACHE-II and P-POSSUM across all surgeries[25], others have used them for specific pathologies[26,27]. To date, no adequately powered study has compared APACHE-II and P-POSSUM in their ability to predict mortality in patients undergoing emergency laparotomy. Our study can potentially fill this void in the published literature.
ELPQuiC[7] has used P-POSSUM as a scoring system to assess the impact of introducing quality improvement bundles, but our study shows that either of the scoring systems (APACHE-II or P-POSSUM) can be used as a tool for surgical audit and for assessing the impact of quality improvement initiatives on hospital mortality.
In the present study, both scoring systems were found to be accurate in predicting the mortality of patients, with patients having higher scores having a higher mortality. APACHE-II scores correlate well with mortality and are effective in the prediction of outcome. APACHE-II considers the acute physiology of the patient and can be completed before surgery. Therefore, it is very useful in the acute stratification of the patients into risk groups and in predicting which patients can be considered for more extensive procedures. However, the APACHE-II score does not consider the etiology of peritonitis or the nature of peritoneal contamination, which has an important bearing on the outcome. In comparison, the P-POSSUM system appears to be of value as the physiologic status is assessed just before the operation, or more accurately after full resuscitation, and it also takes the operative findings into consideration.
However, the P-POSSUM model also has its limitations. First, it does not include the patients who are managed conservatively and those who have refused or been denied surgery due to the significant associated risk of mortality. Second, the surgeon's estimate of operative variables such as blood loss or peritoneal contamination may be biased. Finally, the scores are not complete until the histopathology reports are available, which may significantly delay the scoring and assessment of the risk. In our study APACHE-II, possibly because it is a purely physiologic score, was a poor indicator of the need for re-exploration surgery (Spearman Rank Correlation Coefficient of 0.112; P = 0.112). P-POSSUM, which includes the intraoperative findings, is possibly a better predictor of the need for re-exploration (Spearman Rank Correlation Coefficient of 0.178; P = 0.026). This indicated that although P-POSSUM, unlike APACHE-II (which had no correlation), has some correlation with the possible need for re-exploration, the correlation was quite low.
Figure 1 Box-plots in Pearson correlation coefficient using APACHE-II.
Higher APACHE-II and P-POSSUM scores correlated well with our secondary outcomes, such as the postoperative need for inotropic support or ventilatory support and AKI. Patients who need postoperative organ support are best managed in a critical care setup. The ability of APACHE-II to identify these sicker patients (without relying on the intraoperative or histology findings, as P-POSSUM does) could allow us to plan better and to optimize the utilization of such scarce resources.
Because the ability of APACHE-II to predict mortality is similar to P-POSSUM, and the fact that APACHE-II does not need scoring for intra-operative findings and histopathology reports, APACHE-II can be used pre-operatively to assess the risk in patients undergoing emergency laparotomy.However, for audit purposes, either of the two scoring systems can be used.
Table 3 Sensitivity and specificity of APACHE-II and P-POSSUM
Table 4 Discriminating ability of APACHE-II and P-POSSUM in predicting the secondary outcomes
Figure 2 Box-plots in Pearson correlation coefficient using P-POSSUM.
Figure 3 Receiver operating characteristic curve for APACHE-II and P-POSSUM using the multivariate logistic regression model.
Various scoring systems have been used historically to predict outcomes in patients who are at increased risk of morbidity and mortality during their hospital stay. Emergency laparotomy, despite being one of the commonest surgical procedures, continues to have a high postoperative mortality. Doctors are legally bound to discuss with their patients and relatives the potential risk of complications and adverse outcomes. A robust scoring system enables us to quantify the risk and serves as a tool to measure risk-based outcomes and to audit clinical results and the impact of improvement initiatives.
The Portsmouth modification of the Physiological and operative severity score for the enumeration of mortality and morbidity (P-POSSUM) and the acute physiology and chronic health evaluation II (APACHE-II) have been the most widely used scoring systems for emergency laparotomies. P-POSSUM remains the tool of choice in the United Kingdom. However, it is subject to observational bias while quantifying intraoperative blood loss and peritoneal contamination. Besides, a delay in histopathology reports would delay the P-POSSUM score of the patient, and patients managed conservatively or refused surgery cannot be scored. In these circumstances, the APACHE-II score has the advantage of being available in the pre-operative period itself. It is always better to have a single scoring system to predict outcomes and audit healthcare organizations. However, to date no adequately powered study has compared P-POSSUM and APACHE-II in their ability to predict mortality in emergency laparotomies. This study aims to bridge this gap and assess if APACHE-II can be used as a single scoring system to predict outcomes and to audit outcomes across healthcare organizations.
The study was conducted to compare the ability of the APACHE-II and P-POSSUM scoring systems to predict postoperative mortality and to look for any correlation between these scoring systems and length of stay, requirement of postoperative ventilatory support, inotropic support, development of acute kidney injury (AKI), cardiac morbidity, and need for re-exploration. While the study showed that both APACHE-II and P-POSSUM can equally predict mortality, it also demonstrated comparability in predicting increased length of stay, need for postoperative ventilatory support, higher incidence of AKI, and increased risk of cardiac morbidity. However, P-POSSUM was a better predictor of the need for re-exploration as compared to APACHE-II. The study was successful in demonstrating that APACHE-II and P-POSSUM can be used interchangeably not only for predicting postoperative mortality but also for effectively predicting morbidity. With the advantage that the APACHE-II scoring can be done preoperatively, the study supports the use of APACHE-II as the single scoring system to predict outcomes and audit healthcare organizations for emergency laparotomies.
All patients undergoing emergency laparotomy at Tata Main Hospital (Jamshedpur, India) from December 2013 to November 2014 were included in the study. All patients were scored with the APACHE-II and P-POSSUM scoring systems. A receiver operating characteristic (ROC) curve was used to measure diagnostic accuracy. The ROC curve plots “sensitivity” against “1–specificity”, and the area under the curve (AUC) was used to measure the “size” of the prediction. The optimal cut-off point was taken as the point on the ROC curve closest to the upper left corner, where sensitivity and specificity are jointly optimized.
Out of a total of 159 patients who met the inclusion criteria, only 157 could be included in the study. For APACHE-II, the cut off value was found to be 24 for predicting mortality by ROC analysis. In comparison, for P-POSSUM, the cut off value was found to be 63 to predict mortality using ROC analysis. A multivariate logistic regression model was used to identify independent risk factors for mortality. The ROC curve, the graphic display of the relationship between “sensitivity” and “1–specificity” that measures diagnostic accuracy in terms of the true positives versus the false positives, showed that the AUC was 0.965 (using a cut-off value of 24) for APACHE-II and 0.989 (using a cut-off value of 63) for P-POSSUM. Both the scores were significantly good in predicting postoperative mortality in patients undergoing emergency laparotomy, and on comparing the sensitivity and specificity of APACHE-II and P-POSSUM, there appears to be no statistically significant difference between their ability to predict postoperative mortality. Except for APACHE-II's inability to predict re-exploration, both can predict all the secondary outcomes in a statistically significant manner.
This is possibly the first adequately powered study (alpha value 0.05, beta value 0.2) that has compared P-POSSUM and APACHE-II in predicting mortality in emergency laparotomies. A P-POSSUM score above 63 and an APACHE-II score above 24 are associated not only with a higher risk of mortality but also with a higher risk of postoperative morbidity. However, APACHE-II, being a physiologic score, was a poor indicator of the need for re-exploration after laparotomy. P-POSSUM is a significantly better predictor of the possibility of re-exploration. While P-POSSUM continues to be the most commonly used scoring system for audit purposes, using APACHE-II for risk-based outcome comparisons across hospitals and for assessing the impact of quality improvement initiatives would ensure that a single scoring system serves both individual patients' risk assessment and prognostication and, interchangeably with P-POSSUM, audit purposes.
This study demonstrates that compared to the more widely used P-POSSUM, which needs 18 data points, APACHE-II needs only 12 data points, is easily available for risk assessment in the preoperative period, and does not need subjective assessments (intraoperative blood loss or peritoneal contamination) or wait for histopathology reports.While this study was an adequately powered single center study, future research should focus on multi-center trials to strengthen the findings of our study.
World Journal of Clinical Cases, 2019, Issue 16