Introduction

The aim of glioblastoma surgery is to maximize tumor removal while preserving the patient’s functional integrity. Because guidelines for surgical decision making are not available, treatment decisions can be highly personalized, but this can also introduce variation in treatment and outcome. If the neurosurgeon considers a tumor unresectable, or if the patient is considered unfit or unmotivated for resective surgery, then a diagnostic biopsy can be done. More extensive tumor removal is associated with longer patient survival [1,2,3], whereas functional deficits from too extensive resections can result in poorer quality of life and shorter survival [4]. Before and during surgery, care teams use different techniques to optimize resections, owing to (1) varying access to image-guided navigation, fluorescence-guided microscopy, intraoperative MRI, or brain stimulation mapping; (2) different surgical schools of education, i.e. more oncological or more functional; and (3) diverging expert opinions on more aggressive or more conservative approaches. None of these techniques has been proven to prolong patient survival.

Patient survival after glioblastoma surgery varies considerably in reports from tertiary referral centers [1,2,3,5,6,7,8,9,10,11,12,13,14,15,16,17]. Likewise, patient survival may differ among the hospitals within a country.

The Dutch Society for Neurosurgery [18] established the Quality Registry for Neuro Surgery [19], starting with a consensus set of indicators for glioblastoma surgery in 2011. This registry provides feedback to all hospitals with neurosurgical units on clinical practice for self-assessment and quality-monitoring.

In this study, we measured variation in risk-standardized early mortality and late survival after glioblastoma surgery between all 14 hospitals that perform glioblastoma surgery in the Netherlands. Furthermore, we explored the association between survival and hospital characteristics, including case volume, academic setting and biopsy percentage, in addition to known prognostic patient characteristics, i.e. age and performance status.

Methods

We studied all 2409 patients who had first-time surgery for glioblastoma between January 1, 2011 and December 31, 2014 at all 14 hospitals engaged in glioblastoma surgery in the Netherlands. We collected data for patients who were 18 years or older at surgery and had a histopathological diagnosis of glioblastoma according to the WHO 2007 criteria [20].

Data collection

Neurosurgeons, nurse specialists in neuro-oncology and trained physician assistants prospectively entered patient data in the Quality Registry for Neurological Surgery. Demographic and clinical information consisted of age at diagnosis, gender, Karnofsky performance status before surgery, type of surgery (biopsy or resection), and dates of treatment, last follow-up and death. A surgical procedure was considered a biopsy when tissue was taken for diagnosis only, either by needle biopsy or open biopsy. For the patients who had died within one month after surgery, the cause of death was retrieved, if available, from the medical records or by contacting the primary care physician.

Treatment decisions for patients were made in multidisciplinary tumor board meetings in all hospitals. During resective surgery, image-guided navigation was customarily used. Fluorescence-guided resection, stimulation mapping, and ultrasonography were applied according to neurosurgeons’ preferences. Intraoperative MRI was not in use.

The dates of death were verified and updated against the information available from the National Cancer Registry (NCR). The NCR collects information on all newly-diagnosed cancer patients in the Netherlands following notification by the national pathology registry. Information on vital status is retrieved through yearly linkage with the Municipal Personal Records Database; for our analyses, this linkage was dated March 1, 2016. As a further data quality check, each hospital reviewed its data after the closure of patient inclusion.

Because this data was collected for evaluation of quality of care in accordance with the Dutch Quality Act for Healthcare [21], written informed consent was not needed. Ethical approval was waived because the study was not subject to the Medical Research Involving Human Subjects Act (WMO) [22] and the de-identified data had been collected from patients most of whom were no longer alive. After delivery by a trusted third party [23], de-identified patient data was analyzed. Four authors (PWH, AZW, VHO, PRO) had full access to this data and are responsible for the data analysis and reporting.

Outcomes and risk predictors

The main outcome measures to evaluate variation were specified in the consensus set of quality indicators of the registry: the risk-standardized early mortality and late survival. Early mortality was defined as the percentage of patients who had died of any cause within 30 days after surgery; late survival was defined as the percentage of patients who were alive at 2 years (730 days) after surgery.
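The two indicator definitions above can be illustrated with a minimal Python sketch. The `Patient` record and the function names are hypothetical and are not the registry's data model. The sketch also assumes that every patient is followed until death or at least to the horizon, so it ignores the censoring that the actual analysis accounts for.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout, not the registry's actual schema.
    days_followed: int  # days from surgery to death or last follow-up
    died: bool          # True if death was observed at days_followed


def early_mortality_pct(patients, horizon=30):
    """Percentage of patients who died of any cause within `horizon` days."""
    deaths = sum(1 for p in patients if p.died and p.days_followed <= horizon)
    return 100.0 * deaths / len(patients)


def late_survival_pct(patients, horizon=730):
    """Percentage of patients still alive at `horizon` days after surgery."""
    alive = sum(
        1
        for p in patients
        if p.days_followed >= horizon
        and not (p.died and p.days_followed == horizon)
    )
    return 100.0 * alive / len(patients)


# Illustrative cohort of four fictitious patients.
cohort = [Patient(12, True), Patient(400, True), Patient(800, False), Patient(900, True)]
print(early_mortality_pct(cohort), late_survival_pct(cohort))
```

A patient who dies on day 900 still counts as alive at 2 years, which is why the second indicator looks only at status on day 730.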

To account for risk differences in glioblastoma patients among hospitals, we used known patient-related predictors for survival as covariates, i.e. age at diagnosis and Karnofsky performance status [24, 25]. We also included the year of treatment for risk-standardization, because care decisions may have changed over the four-year timespan, although national treatment guidelines did not change [26]. Clinical management decisions, such as corticosteroid use [27], surgical technique and extent of resection [1,2,3], and participation in clinical trials [28, 29], were not included in risk-standardization, although they are associated with survival. Standard adjuvant treatment consisted of 60 Gy fractionated conformal radiotherapy with concomitant temozolomide chemotherapy, and six cycles of adjuvant temozolomide [30]. Hospital characteristics that we explored for association with outcome were the total number of patients with glioblastoma treated in 4 years in a hospital (i.e. volume), academic setting, and the percentage of biopsy procedures.

Statistical analysis

Survival was analyzed in days with censoring at the last date of follow-up or the lookup date of alive status, and analyses were based on complete cases regarding information on covariates.
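To make the role of censoring concrete, the following is a minimal sketch of a Kaplan–Meier estimate, the kind of curve shown per hospital in Fig. 1. It is a plain-Python illustration with a hypothetical function name and made-up data, not the code used for the analyses.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times  : follow-up in days (to death or to censoring)
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same day t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Four fictitious patients; the second is censored at day 10.
print(kaplan_meier([5, 10, 10, 20], [1, 0, 1, 1]))
```

Censored patients contribute to the number at risk up to their last follow-up date and then leave the risk set without counting as a death.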

A multivariable hierarchical Cox proportional hazards model was used for risk-standardization in assessing outcome variation between hospitals. For statistical modeling and inferences, outcomes were assessed at patient-level using age, performance and year of treatment as covariates for risk-standardization. A random effect per hospital was included in the model in recognition of imperfect risk-standardization [31, 32]. Thus, for each patient, a predicted survival function was obtained based on the patient-related covariates and a hospital random effect of 0, representing treatment in a fictitious hospital with average performance. Using the patient-specific expected survival function, the standardized risk for having died within 30 days and for being alive by 730 days was calculated for each patient. In calculating risk-standardized ratios per hospital, we obtained the predicted probability of the events for each patient in a hospital and summed these probabilities to get the expected number of events for that hospital. Risk-standardized ratios per hospital were calculated as the observed number of events divided by the predicted number of events, i.e. the observed-to-expected ratio [32, 33]. For example, the risk-standardized early mortality is lower than expected with a ratio below one, and risk-standardized late survival is higher than expected with a ratio above one.
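The observed-to-expected calculation described above reduces to a sum and a division once the patient-level predicted probabilities are available. The sketch below assumes those probabilities have already been produced by a risk model; the function name and the numbers are hypothetical.

```python
def observed_to_expected(observed_events, predicted_probs):
    """Risk-standardized ratio for one hospital.

    observed_events : number of events actually observed in the hospital
    predicted_probs : per-patient predicted probability of the event,
                      evaluated for a hospital random effect of 0
    """
    expected = sum(predicted_probs)  # expected number of events
    if expected <= 0:
        raise ValueError("expected number of events must be positive")
    return observed_events / expected


# Fictitious hospital: four patients with predicted 2-year survival
# probabilities, of whom one is observed alive at 730 days.
probs_alive_2y = [0.10, 0.20, 0.05, 0.15]
ratio = observed_to_expected(1, probs_alive_2y)
print(ratio)  # above one here means more late survivors than expected
```

The same function applies to both indicators; only the direction of a favorable ratio differs (below one for early mortality, above one for late survival).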

We applied a Bayesian predictive model with random effects using the Stan language [34, 35]. The counting process of events over time was assumed to follow a log-Poisson density; the regression coefficients, the random effects, and the hazards had unknown means and unknown precisions, for which we used vague priors. Vague priors were chosen to primarily reflect inference from the presented data without substantive prior knowledge. Model details are provided as online code [36] and the predictive patient risk model as a web application [37]. The posterior predictive model was verified using simulated data. No evidence against convergence was identified. The median values of posterior distributions were used as estimates with 95% credibility intervals.

Funnel plots were generated with expected number of events as precision and risk-standardized observed-to-expected event ratios per hospital as indicator as previously described [31]. The funnel control limits to identify potential outliers were obtained as 95% and 99% prediction limits from the Poisson distribution.
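As an illustration of how such control limits can be derived, the sketch below computes Poisson prediction limits for the observed-to-expected ratio at a given expected number of events. It uses exact Poisson quantiles without interpolation, which is an assumption and may differ in detail from the procedure in [31]; the function names are hypothetical.

```python
import math


def poisson_quantile(p, mean):
    """Smallest integer k with P(X <= k) >= p for X ~ Poisson(mean).

    Intended for the small expected counts of a single hospital;
    exp(-mean) underflows for very large means.
    """
    k = 0
    pmf = math.exp(-mean)
    cdf = pmf
    while cdf < p:
        k += 1
        pmf *= mean / k
        cdf += pmf
    return k


def funnel_limits(expected, coverage=0.95):
    """Lower and upper control limits for the observed-to-expected ratio."""
    tail = (1.0 - coverage) / 2.0
    lower = poisson_quantile(tail, expected) / expected
    upper = poisson_quantile(1.0 - tail, expected) / expected
    return lower, upper


# Limits at an expected count of 10 events, for both control levels.
print(funnel_limits(10, 0.95), funnel_limits(10, 0.99))
```

Because the limits are divided by the expected count, they narrow toward a ratio of one as the expected number of events grows, which gives the plot its funnel shape.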

As a first exploration to plot regression lines between the hospital-related characteristics and observed early mortality and late survival, we used univariate logistic regression with death status at 30 days and alive status at 2 years as response measure [31]. For volume, log-transformed number of patients was modelled. Second, to estimate the effect sizes of hospital and patient-related characteristics on overall survival, we used the multivariable hierarchical Cox proportional hazards model. The hazard ratios for death were determined with 95% confidence intervals for hospital characteristics, i.e. log volume, academic setting, and biopsy percentage, and for patient-related characteristics, i.e. age, performance status, and year of treatment, as risk-standardization without hospital random effects [38].

Results

Of the 2,409 patients, 2,308 were available for complete case analysis (Table 1). At last follow-up, 462 patients were alive.

Table 1 Characteristics of patients and hospitals with survival outcome per hospital and overall

The observed overall survival over time per hospital and the risk-standardized survival function for all patients are plotted in Fig. 1. The observed and expected survival over time per hospital is shown in Supplemental Fig. 1.

Fig. 1

Survival outcome over months as Kaplan–Meier curves per hospital in colors and overall survival function in black based on the Cox regression model with risk-standardization for age, performance, and treatment year. The hospital identifications correspond with Table 1

The overall 30-day mortality was 5.2% and the overall 2-year survival was 13.5%. The observed early mortality and late survival for each hospital are listed in Table 1. Median overall survival was 10.2 months, and varied between 4.8 and 14.9 months among hospitals.

The hospital characteristics of case volume, academic setting and biopsy percentage are plotted in relation to observed early mortality and late survival in Fig. 2. Case volume varied between 73 and 358 patients in 4 years. In univariable logistic regression analysis, a higher volume was associated with lower early mortality (P = 0.031), but not with late survival. The estimated log OR of log volume on early mortality was −0.39, so that a 10% increase in volume was estimated to be associated with an approximately 3.9% relative decrease in early mortality. The estimated boundary from higher-than-average to lower-than-average early mortality is located at a volume of about 180 patients in 4 years, i.e. 45 patients per year. In this first univariable exploration, an academic setting of hospitals was not significantly associated with early mortality or late survival. The biopsy percentage varied between 16 and 73% among hospitals, indicating considerable treatment variation. The biopsy percentage was not significantly associated with early mortality or late survival in this first exploration.
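The interpretation of a coefficient on log volume can be checked with a short calculation. The sketch below assumes that volume was transformed with the natural logarithm and that the reported −0.39 is the change in log odds per unit of log volume; the function name is hypothetical.

```python
import math


def odds_multiplier(log_or_per_log_unit, relative_increase):
    """Factor by which the odds change when the predictor grows by
    `relative_increase` (0.10 means +10%), for a log-transformed predictor."""
    return math.exp(log_or_per_log_unit * math.log(1.0 + relative_increase))


beta = -0.39  # estimated log OR of log volume on early mortality (from the text)

exact = odds_multiplier(beta, 0.10)  # equals 1.1 ** -0.39
linear = 1.0 + beta * 0.10           # first-order approximation
print(round(exact, 4), round(linear, 4))
```

Under these assumptions the exact factor is about 0.964, a decrease in the odds of roughly 3.6%, and the first-order approximation β × 10% gives the 3.9% quoted above. Because early mortality is low, a relative change in the odds is close to a relative change in the mortality percentage.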

Fig. 2

Hospital characteristics versus survival outcome. Plots of a case volume in 4 years versus observed early mortality percentage, b volume versus observed late survival percentage, c biopsy percentage versus observed early mortality percentage, and d biopsy percentage versus observed late survival percentage. Black circles indicate hospitals with an academic setting. The overall outcome percentages are represented by dotted lines. Logistic regression lines are drawn, significant association estimates in black, non-significant estimates in grey

In our effort to retrieve the causes of early death, information was obtained for all 119 patients who had died within 30 days. Early death was related to glioblastoma progression in 36 (30%) patients, and the cause remained unknown in 36 (30%). Death was directly related to surgery in 31 (26%), i.e. hemorrhage in 16, postresection edema and/or ischemia in seven, postoperative functional deterioration in seven, and intracranial infectious complications in three. Early death could be indirectly related to surgery in 13 (11%), i.e. seizures in five, pulmonary embolus in five, and extracranial infectious complications in three. Early death was unrelated to either the disease or surgery in three (3%), i.e. cardiac arrest in two and trauma in one.

From the funnel plots in Fig. 3, one hospital (b) had lower early mortality than expected within 95% control, and four hospitals (a, i, k, and n) had lower late survival than expected within 95% control. The combination plot of early mortality and late survival indicates that the two outcomes are not related. In other words, this indicates that more extensive surgery for longer tumor control is not offset by an increase in postoperative mortality.

Fig. 3

Multipanel plot of expected numbers of early deaths versus risk-standardized mortality ratios (a), expected numbers of late survivors versus risk-standardized late survival ratios (b), and combination plot of risk-standardized early mortality versus late survival ratios on log scales (c). The solid funnels are 95% control limits, the dotted funnels 99% control limits. Black dots indicate hospitals with ratios outside the 95% control limits. Better than expected early mortality and late survival is shown in green, worse than expected is shown in red. The institutional identifications are printed in the circles; sizes correspond with the case volumes according to the legend. Note that hospital i has 0 observed late survivors and 0.89 expected late survivors, which therefore is outside the plot below the lower 99% control limit and in the red region

In the multivariable Cox regression analysis, the known prognostic factors of older age (HR 1.54, 1.46–1.62, P < 0.00001) and worse performance (HR 0.77, 0.74–0.81, P < 0.00001) were strongly associated with shorter survival, as expected. More recent years of treatment were also associated with longer survival with 2011 as reference (2012 HR 0.94, 0.83–1.07, NS; 2013 HR 0.80, 0.71–0.92, P = 0.001; 2014 HR 0.78, 0.68–0.89, P = 0.0003). Therefore, risk-standardization with these patient-related characteristics seems justified. Of the hospital characteristics, a lower biopsy percentage was associated with longer overall survival (HR 2.09, 1.34–3.26, P = 0.001). Log case volume (HR 0.954, 0.866–1.05) and academic setting (HR 0.951, 0.858–1.05) were not associated with overall survival.

Discussion

This comprehensive, nation-wide, four-year prospective quality registry study on survival outcomes after glioblastoma surgery shows (1) between-hospital variation in 2-year survival and 30-day mortality; (2) surgical treatment variation, suggested by widely varying biopsy percentages between hospitals; (3) that 30-day mortality is not a suitable measure for glioblastoma surgery-related complications only, as many of these patients die early from progressive disease; (4) that a larger case volume is associated with lower early mortality, but not with overall survival; and (5) that in addition to patient-related factors, i.e. younger age and better performance, a lower biopsy percentage in a hospital is an important indicator of overall survival outcome, whereas case volume and academic setting are not.

Differences in late survival after glioblastoma surgery between hospitals have not been published. The 2-year survival of glioblastoma in our overall data is comparable to other registries, i.e. 14.8% in the Central Brain Tumor Registry of the United States [39] and 23.8% in the Surveillance, Epidemiology and End Results Program of the National Cancer Institute [40], and comparable to other communities, ranging from 12.0 to 25.3% [9, 15,16,17]. Registry-based populations include patients typically excluded from clinical trials and surgical resection series, such as elderly patients with treatment concessions, resulting in shorter survival in real-life data than reported in clinical trials.

Early mortality differences between hospitals have been reported earlier, indicating that high-volume surgeons in high-volume hospitals have lower in-hospital mortality and complication rates after brain tumor surgery based on the Nationwide Inpatient Sample data of 62,514 admissions [41]. Similarly, high-volume hospitals had lower mortality after brain tumor resections based on a state-based study of 4,723 patients in 33 hospitals [42]. Postoperative mortality estimates for brain tumor surgery have varied considerably: 0.26% in 8,091 patients based on a meta-analysis of 90 publications [43], 1.0% in 306 patients from one hospital [4], 1.5% in 408 patients from 52 hospitals [44], 1.7% in 400 patients from one hospital [45], 2.5% in 322 patients in a multicenter randomized trial [46], 3.5% in 4,723 patients from 33 hospitals [42], and 7.9% in 834 patients from 19 hospitals [9]. Variability in these estimates may be explained by different timings of mortality, inclusion of tumors other than glioblastoma, patient selection bias, and publication bias. Our results indicate that mortality within 30 days is not a useful quality indicator for glioblastoma surgery-related complications without information on causes of death. A more precise measure would be the percentage of surgery-related mortality.

Earlier reports have identified an association between larger case volumes and more favorable outcome after glioblastoma surgery [41, 42, 47] and after other cancer-related surgery [48, 49]. In our findings, larger case volume is not associated with overall survival when adjusted for patient-related risk factors. Patient-related factors clearly outweigh hospital-related effects. A prominent hospital-related predictor of overall survival in our data is the percentage of biopsies. One explanation is that higher extent of resection has been shown to prolong patient survival [1,2,3]. In addition, we speculate that the percentage of biopsies may be a surrogate marker for a more conservative general approach with possibly earlier cessation of therapies to rescue or to prolong life. The biopsy percentages among hospitals varied considerably, whereas the patient risk profiles of hospitals were quite similar. The causes for the biopsy percentage variation and MRI-based glioblastoma removal measurements need to be explored as quality indicators in further studies. This should enhance exchange of team expertise and surgical skills. Other quality indicators for glioblastoma surgery may include functional outcome, measured as patient-reported outcome measures, cognitive performance, or health-related quality of life. Furthermore, the National Surgical Quality Improvement Program recently reported on hospital process measures, such as length of hospital stay [50], readmission rate [51], and unplanned reoperation [52].

Our results highlight the importance of risk-standardization in comparing hospital outcomes [31, 32]. For example, hospital ‘d’ has the highest observed percentage of 2-year survival without risk-adjustment (23% in Table 1), whereas the 2-year survival ratio adjusted for patient-related risk factors is less than expected within control limits (0.88 in Fig. 3b). This can be explained by a patient population that is on average younger and has better performance than populations of other hospitals (Table 1). This may indicate a deviating patient referral or selection in hospital ‘d’.

Survival time is an objective measure. The interpretation, however, of early mortality or late survival as success or failure of treatment decisions is not necessarily straightforward. For example, a patient not alive at 30 days may have died from rapidly progressive disease despite optimal treatment decisions, whereas a patient alive at 2 years may be in a poor condition with minimal quality of life for a prolonged period of time from overtreatment. Therefore, a spectrum of quality indicators is required to capture the nuances of quality of care.

The yearly feedback of our registry and the results of this analysis have resulted in discussions on quality-monitoring within our workgroup and in self-assessment by neurosurgeons within departments. This collaboration has been perceived as constructive and encouraging. The observation that more recent treatment years were associated with longer overall survival may indicate a quality improvement as a result of this collaboration.

The strengths of our study are the comprehensive population-based nation-wide cohort, the data quality checks from two data sources (QRNS and NCR), the near-complete follow-up of patients, and the modern methods for prediction modeling. The limitations of our study are the unavoidable imperfect patient-level risk-standardization, the unavailability of information on applied surgical techniques other than biopsy or resection, and the unavailability of other hospital characteristics, such as treatment guideline adherence or clinical trial participation. Furthermore, the few patients who refrained from any treatment or who had radiation or chemotherapy without histopathological diagnosis were not included in this pathology-based registry. Similarly, the few patients who may have crossed over between hospitals for treatment remain unidentified. In addition, treatment variation other than surgical decisions may contribute to the observed outcome variation. For this, we have recently started a national multidisciplinary collaboration involving radiation and medical oncologists, neurologists, radiologists, and pathologists for a joint quality registry, i.e. the Dutch Brain Tumor Registry.

There are several possible implications for clinical practice. First, a quality program is required to ensure that hospitals with worse-than-expected outcomes improve, for instance by identifying differences in care programs that are amenable to exchange of expertise, and to ultimately devise a systematic quality evaluation and improvement cycle. At the same time, regionalization of brain tumor care in networks may improve overall quality of care. Second, further investigation is necessary into the relation between hospital biopsy percentage, volume, and survival outcome. It remains undetermined whether a threshold for minimum case load per year is a robust criterion for ‘in control’ survival outcome. Third, early mortality should be reported with causes of death in quality comparisons. Fourth, patient counseling and surgical decision making should be based on personalized predictions from real practice data rather than on clinical trial results and tertiary referral center publications [53]. Therefore, our predictive patient risk model may be useful in clinical decision making [37].

Conclusion

Hospitals vary more in late survival than early mortality after glioblastoma surgery. Widely varying biopsy percentages indicate treatment variation. Patient-related factors have a stronger association with overall survival than hospital-related factors.