Introduction

Endovascular therapy (EVT) is the standard treatment for patients with acute ischemic stroke (AIS) due to large vessel occlusion up to 24 h after onset [1, 2], with recent evidence suggesting a potential treatment efficacy even for patients with low Alberta Stroke Program Early CT score (ASPECTS) at admission [3,4,5].

Several factors are known to influence the potential recovery of patients after treatment, and a successful and timely recanalization is often not self-sufficient to explain the final functional status, despite having a large influence on the clinical outcome [6]. Besides various clinical and treatment parameters, the pre-morbid status of patients is known to be particularly influential on the final clinical outcome and must always be considered when building predictive models [7,8,9]. In practice, the pre-stroke status is generally evaluated by recording the patient’s functional independence (i.e., through the modified Rankin Scale—mRS), age, and the presence or absence of other pre-stroke comorbidities (for example, diabetes or hypertension).

Alongside all these factors, previous studies have also shown the importance of chronic brain deterioration and reduced brain reserve on the final patient’s recovery after AIS, and demonstrated that the presence of chronic vascular insults and small vessel disease (SVD) [10,11,12] is associated with worse outcome and long term recovery [12]. Cortical atrophy has also been demonstrated to negatively influence clinical outcome after EVT [13,14,15,16,17] as well as to increase the likelihood of post-stroke cognitive decline [18, 19]. These factors are particularly relevant considering the increasing evidence suggesting and/or supporting the use of EVT over medical management also in elderly patients [20, 21].

In this context, brain atrophy has been mostly assessed using MRI data and automated measurements, both of which may not be readily available or cost-effective, especially in smaller centers. Moreover, while previous studies were able to prove the value of brain atrophy as an independent predictor, they did not investigate its overall impact on outcome prediction models, especially when combined alongside all other available information through machine learning algorithms.

Here, we independently validated the impact of pre-existing SVD and atrophy on clinical outcome after EVT for AIS in a large, retrospective single-center dataset. We aimed to assess these factors through simple CT-based visual rating scales of age-related white matter changes (ARWMC) and cortical atrophy, which could be broadly applicable in clinical practice, and to investigate their overall relevance for machine learning-based predictive modeling when combined with all available information at baseline. Furthermore, we investigated their impact on post-discharge improvements in functional status.

Methods

The study was approved by the local ethics committee, and the requirement for informed consent was waived (S-784/2018). n = 1103 consecutive patients with confirmed AIS from large vessel occlusion (LVO) affecting the middle cerebral artery (MCA) territory who underwent imaging and EVT at the Department of Neuroradiology of the Heidelberg University Hospital between 01/2013 and 11/2019 were retrospectively enrolled for analysis.

Imaging, endovascular treatment, and clinical assessment

Patients underwent imaging with a 64-slice SOMATOM Definition AS CT scanner (Siemens Healthcare GmbH) as described previously [7]. The complete diagnostic and interventional methods are included in the Data Supplement. Briefly, native cranial computed tomography (NCCT) was acquired with standard initial parameters of 120 kV and 20 mAs, which were then automatically adapted slice-wise using the CARE Dose 4D automatic exposure control system (Siemens Healthineers) and iteratively reconstructed with a J40s kernel (low pass filter—smooth, soft tissue kernel) at 1 mm (n = 485, 43%) and 4 mm (n = 1103 patients, 100%) slice thickness. NCCT was acquired at admission and the decision for endovascular treatment as well as the administration and dosing of recombinant human-tissue plasminogen activator (rtPA) was individually made for each patient based on a consensus between the treating neurologist and neurointerventionalist, following national and international guidelines [1, 2]. The following EVT was performed with a biplane angiographic system (Artis Zee Biplane and Artis Q, Siemens Healthineers). Routine follow-up NCCT was performed on the same scanner within 18 to 36 h or earlier in case of clinical deterioration for all patients.

Radiological assessment of pre-existing brain deterioration

Baseline NCCT was visually reviewed by AE (radiology resident, 3 years of experience) and UN (board-certified neuroradiologist, 8 years of experience) to determine (i) age-related white matter changes (ARWMC) of periventricular white matter and basal ganglia [22], (ii) presence/absence of previous strokes, (iii) Evans’ index[23], and (iv) maximal diameter of the third ventricle. Additionally, cortical atrophy was assessed using a four-point scale by both readers to subsequently assess the inter-observer variability of our method, as depicted in Fig. 1, also based on a previous publication on the topic [24]. Briefly, pathological grades were defined as follows: Grade 1—minor atrophy to the insular and/or the frontoparietal region, cerebral sulci slightly more pronounced. Grade 2— moderate atrophy, further widening of the cerebral sulci. Grade 3—global atrophy, severely widened sulci, thinning of the cortical gyri.

Fig. 1
figure 1

Visual examples for atrophy grading on CT images. Grade 1—minor atrophy to the insular and/or the frontoparietal region, cerebral sulci slightly more pronounced. Grade 2—moderate atrophy, further widening of the cerebral sulci. Grade 3—global atrophy, severely widened sulci, thinning of the cortical gyri. Windowing was set at L40/W60 for all images

Further clinical, imaging, and angiographic parameters

Clinical data were collected by a certified stroke neurologist. Baseline epidemiological and clinical characteristics were included in the study (sex, age, pre-morbid mRS score, NIHSS score on admission, concomitant treatment with intravenous rtPA, comorbidities [hypertension, dyslipidemia, coronary heart disease, atrial fibrillation, diabetes mellitus including serum blood levels for HbA1c and glucose]) as well as interventional angiographic characteristics (interval from symptom onset to groin puncture and from groin puncture to recanalization, extent of recanalization as assessed by the treating neurointerventionalist with the modified treatment in cerebral ischemia (mTICI score) and post-interventional clinical characteristics (NIHSS score after 24 h). The Alberta Stroke Program Early CT Score (ASPECTS) was evaluated automatically through e-ASPECTS (Brainomix) on 1 mm slices and visually reviewed by AE (radiology resident with 2 years of experience) and UN (board-certified neuroradiologist with 8 years of clinical experience) [25]. Follow-up NCCTs were visually assessed for follow-up ASPECTS as well as the presence of intracranial hemorrhages (ICH) (scored according to the Heidelberg Bleeding Classification—HBC) [26]. The correlation of neurological deterioration with imaging findings was assessed by reviewing clinical data (e.g., medical records) to further classify ICH as symptomatic ICH or asymptomatic ICH, in compliance with operational guidelines of HBC.

The functional outcome was assessed with the mRS score at 90 days (mRS90) and was classified as “favorable” outcome for patients with a mRS90 ≤ 2 or an mRS at 90 days equal to the pre-stroke mRS. A NIHSS score of 42 at discharge was assigned to patients who deceased during their initial hospitalization.

Statistical analysis

Statistical analyses were performed in R version 4.0.3. Metrics were tested for normality using the Shapiro–Wilk test, and Wilcoxon, t-test, or chi-squared test were then used accordingly depending on the nature of the metrics. The intraclass correlation coefficient (ICC) was calculated for the visual assessment of cortical atrophy between the two expert raters. Spearman’s rho was used to assess the correlation between variables.

Logistic regression as well as ordinal logistic regression were performed to assess the effects of the various measures of pre-existing brain deterioration on outcome while adjusting for other covariates, as listed below. The metrics were further tested in machine learning models. Briefly, these were built using the caret package [27] and based on gradient-boosting machine classifiers with the 0.632 bootstrapping procedure for cross-validation [28], as further described in the Data Supplement. Synthetic minority over-sampling (SMOTE) was applied to correct for sample imbalance. Model performance was assessed through ROC curves, and significant differences between the performance of the developed models were then assessed using DeLong’s test. The variable importance of parameters within the machine learning models was then assessed through random forest variable importance.

Within predictive modelling performed with either logistic regression, ordinal regression, and gradient boosting classifiers, three sets of metrics were present in the dataset in addition to ARWMC and cortical atrophy.

  1. 1.

    Baseline metrics: age, sex, stroke laterality (right vs. left), diabetes, coronary heart disease (CHD), hypertension, atrial fibrillation, dyslipidemia, i.v. rtPA lysis administration, baseline NIHSS, pre-morbid mRS, baseline ASPECTS, baseline glucose, baseline HbA1c, time from stroke onset to first imaging exam

  2. 2.

    Post-interventional metrics: successful recanalization (yes/no), complete recanalization (TICI ≥ 2b [yes/no]), NIHSS at 24 h post-intervention, presence of bleeding at follow-up imaging, time from groin puncture to final TICI score, number of thrombectomy maneuvers, follow-up ASPECTS

  3. 3.

    Post-discharge metrics: NIHSS at discharge, mRS at discharge

Results

Across the study cohort, n = 414 patients (38%) presented a favorable clinical outcome, and n = 689 patients with an unfavorable outcome at 90 days (mRS ≤ 2 or unchanged as compared to pre-morbid). For our secondary analysis, n = 284 patients (26%) presented an improvement in mRS at 90 days as compared to the mRS collected at hospital discharge. The different clinical and epidemiological characteristics between patients presenting favorable vs. unfavorable clinical outcome at 90 days are listed in Table 1. Briefly, the two groups presented significant differences in all listed metrics (p < 0.01) aside from the presence/absence of hypertension (p = 1.00), dyslipidemia (p = 0.894), and the distribution of patients’ sex (p = 0.079).

Table 1 Epidemiological and clinical statistics for patients with favorable (mRS ≤ 2) vs. unfavorable (mRS > 2) clinical outcome at 90 days. Metrics are listed as median (IQR) for continuous variables, and number (%) for ordinal variables. In univariate tests, the two cohorts presented significant differences in all listed metrics aside from the presence/absence of hypertension and dyslipidemia

An overview of the distribution of all measured metrics of pre-existing brain deterioration is provided in Table 2. Here, patients with an unfavorable clinical outcome at 90 days presented a higher rate of previous strokes (p = 0.010) as well as significant differences in all measures of pre-existing brain deterioration, with higher basal ganglia and periventricular ARWMC (p = 0.014 and p < 0.001, respectively), higher cortical atrophy scores (p < 0.001), Evans’ Index (p < 0.001), and third ventricle diameter (p < 0.001).

Table 2 Measures of pre-existing brain deterioration for patients with favorable (mRS ≤ 2) vs. unfavorable (mRS > 2) clinical outcome at 90 days. Metrics are listed as median (IQR) for continuous variables, and number (%) for ordinal variables. Wilcoxon Rank-sum test was used for continuous variables, and the chi-squared test was used for categorical variables

The related inter-rater analysis for the cortical atrophy measurements revealed a moderately high ICC of 0.706 (95% CI 0.673–0.736), suggesting a good reproducibility. Correlation analyses for both ARWMC and cortical atrophy revealed only weak to average correlations with the patient’s age, with rho = 0.385 (p < 0.001) for periventricular ARWMC, rho = 0.285 (p < 0.001) for basal ganglia ARWMC, and rho = 0.434 (p < 0.001) for cortical atrophy. Cortical atrophy presented a weak to average correlation with mRS at 90 days (rho = 0.360, p < 0.001).

Patients with hypertension (p < 0.001), coronary heart disease (p < 0.05), and atrial fibrillation (p < 0.001) presented significantly higher score distributions for both ARWMC and cortical atrophy, as further listed in Table 3, whereas no significant differences were noted for the presence of diabetes and dyslipidemia (p > 0.05).

Table 3 Multivariable logistic regression model for prediction of favorable vs. unfavorable outcome at 90 days after EVT. ARWMC, cortical atrophy, hypertension, baseline NIHSS, pre-morbid mRS, baseline ASPECTS, and time from onset to first imaging exam were all shown to be independent predictors for clinical outcome, as listed above. LR = Likelihood Ratio Test

Regression analysis

After performing an exploratory multivariable logistic regression with the inclusion of all brain deterioration metrics (but no confounders), ARWMCs and cortical atrophy emerged as the most relevant predictors (p < 0.001) and were included in subsequent predictive models, while other metrics were discarded.

Multivariable logistic regression with the inclusion of all clinical and imaging baseline parameters revealed the independent predictive significance of both cortical atrophy (adj. OR for grade 3–0.27 [0.13–0.54], p < 0.001) and periventricular ARWMC (adj. OR for grade 3–0.43 [0.22–0.85], p < 0.05) for predicting a favorable clinical outcome at 90 days, with adjusted OR for each class as well as the other independent predictors of outcome further listed in Table 3.

Ordinal logistic regression was then performed with the goal of predicting the ordinal, non-dichotomous mRS score at 90 days, and demonstrated again the relevance of cortical atrophy to independently predict outcome as measured by increases in mRS (e.g. adj. OR for grade 3 atrophy—3.13 [1.89–5.37], p < 0.001), while ARWMC did not reach significance (p > 0.05), as further listed in Supplemental Table S1. Figure 2 depicts the cumulative probabilities for each mRS class based on the patient’s atrophy scale, showing a linear increase of likelihood for higher mRS scores and death (mRS = 6) parallel to the stepwise increases in the atrophy scale.

Fig. 2
figure 2

Cumulative probability for each mRS score class at 90 days (mRS 0–6) sorted by cortical atrophy score as calculated through ordinal logistic regression, demonstrating a higher likelihood for a worse clinical outcome for patients with higher atrophy scores, independently of all other included variables

Machine learning models

The addition of both cortical atrophy and periventricular ARWMC to the other available baseline clinical and imaging metrics led to a minor but significant improvement in the prediction performance when using machine learning models on the cross-validation sample (p < 0.001), with an AUC of 0.775 (95% CI, 0.772–0.779), as compared to 0.763 (0.760–0.766) without ARWMC and atrophy scale—see Supplemental Figure S1 for a depiction of the ROC curves. Both ARWMC and atrophy were among the top-10 predictors used by the model after inclusion, as depicted in Supplemental Figure S2.

We then developed three multivariable logistic regression models for predicting mRS improvement after hospital discharge, with (i) baseline, (ii) baseline + post-EVT, and (iii) baseline + post-EVT + post-discharge metrics. As listed in Supplemental Table S2, cortical atrophy maintained its relevance as an independent predictor of clinical improvement post-discharge across all tested models (p < 0.001) despite the addition of post-interventional and even post-discharge metrics.

Discussion

Multiple variables beyond the extent of recanalization can impact the clinical outcome after AIS due to LVO. Here, we could confirm that both ARWMC and cortical atrophy were independent predictors of clinical outcome at 90 days also when controlled for all other baseline confounders, and the probability of a worse outcome or death increased linearly with the severity of atrophy. ARWMC and atrophy were only moderately correlated with patient’s age but presented significantly higher scores in patients with cardiovascular risk factors (namely hypertension, CHD, and atrial fibrillation). Scoring of cortical atrophy was performed using a simple and novel 4-point NCCT-based scale which demonstrated a good inter-observer agreement between our two raters, suggesting good reproducibility of our method. Overall, the inclusion of cortical atrophy and ARWMC into machine learning models for outcome prediction led to a minor but significant performance improvement in prediction accuracy, and both ARWMC and atrophy were considered among the top-10 predictors by the algorithm. Finally, atrophy had a predictive effect on a potential improvement of mRS after hospital discharge, even in the presence of all other post-interventional metrics.

Our results are consistent with other findings in the literature which previously demonstrated the independent predictive significance of atrophy [13,14,15,16,17] as well as ARWMC and SVD [9,10,11,12, 29] onto the final clinical outcome of patients undergoing EVT, and support a relevant underlying contribution of pre-stroke brain deterioration beyond all other examined parameters. This further corroborates the longstanding concept that patients with lower brain reserve would present an implicit disadvantage in compensating for the damage sustained through AIS [9, 30].

Moreover, the predictive effect of atrophy for clinical improvements after hospital discharge hints at the fact that the true role of cerebral atrophy might be played out on a longer time scale and/or in the rehabilitation phase, also considering the well-known post-stroke cognitive decline which inevitably affects some patients at later stages [18, 19].

As compared to previous works in the literature which relied on MRI-based and/or automated volumetric methods to assess atrophy [13, 14], we decided to use a simple 4-point grading scale performed visually on NCCT to grade cortical atrophy. While our method cannot implicitly be as reproducible or accurate as automated volumetric methods performed on MRI, it compensates for this issue through its simplicity, which makes it broadly applicable in clinical practice with minimal training, ultimately increasing the amount of information available to the treating physician even in smaller institutions where MRI or research IT infrastructures with specialized image analysis tools might not be available or cost-effective. Moreover, our data demonstrated a good inter-rater agreement in the atrophy scores, suggesting that the method could be implemented by other institutions.

Nonetheless, our data also showed that the addition of ARWMC and atrophy only led to a minor improvement in prediction performance with machine learning-based models that encompassed all available baseline data, despite their independent predictive effect. This aspect was never tested in the available literature, to the best of our knowledge. Overall, large leaps in performance are not to be expected with the addition of new parameters in the context of multimodal predictive modeling of AIS due to the complex multifactorial interactions that determine the final outcome, and existing models struggle to increase baseline performance due to the large number of confounders which are present throughout the subsequent treatment, hospital stay and rehabilitation [7, 8]. Taking these factors into consideration, atrophy and ARWMC achieved a significant result after the inclusion in our model, as they allowed to push the performance even further than what could be achieved with the previous inclusion of cardiovascular risk factors and other indicators of pre-morbid patient status, to which they are inevitably related. Due to the simplicity of our NCCT-based scoring system and the implicit availability of imaging for AIS patients, measures of ARWMC and atrophy could then be easily integrated into the evaluation of the patient status at baseline on a broad scale.

Our study presents some limitations. First, although we included a large patient sample, we used a retrospective, single-center study design. Further investigations and validations with a multi-centric dataset or prospective data might confirm our results and provide more definite evidence. Second, while we tried to assess the reproducibility of our NCCT atrophy score, we only had two raters at our disposal; further large-scale testing with a higher number of raters is warranted in order to precisely assess the variability of our method. Third, we did not use a test set or perform external validation with our machine learning models; although this is to be considered a major drawback in machine learning studies, our aim was not that of producing a high-performing and reproducible tool that could be applied in other centers, but rather to perform an indirect investigation on the potential added benefits of ARWMC and atrophy. Testing samples were therefore not implicitly required for the purpose of this study. Fourth, we did not validate our results against MRI volumetry as we lack access to a patient cohort that received both examinations. Further targeted studies with this type of design might aid in enhancing our NCCT-based score and its reproducibility. Lastly, our models did not include data derived from CT perfusion imaging, as this is performed only in selected cases at our institution; moreover, previous works have demonstrated the limited utility of CT perfusion for outcome prediction in the presence of all other available baseline information [7] and the correlation between ischemic core and ASPECTS [31,32,33].

In conclusion, cortical atrophy and ARWMC were strong independent predictors of clinical outcome after EVT in our retrospective dataset, and their inclusion in a prediction model using established parameters led to a significant improvement in prediction accuracy. Moreover, cortical atrophy could independently predict an improvement in mRS after hospital discharge. Both scores could easily be integrated into existing clinical routines and prediction models given the simple NCCT-based visual rating method in order to improve the selection of eligible patients and resource allocation.