Cost-effectiveness of stereotactic body radiation therapy versus video assisted thoracic surgery in medically operable stage I non-small cell lung cancer: A modeling study.

OBJECTIVES
Stage I non-small cell lung cancer (NSCLC) can be treated with either Stereotactic Body Radiotherapy (SBRT) or Video Assisted Thoracic Surgery (VATS) resection. To support decision making, not only the impact on survival needs to be taken into account, but also on quality of life, costs and cost-effectiveness. Therefore, we performed a cost-effectiveness analysis comparing SBRT to VATS resection with respect to quality adjusted life years (QALY) lived and costs in operable stage I NSCLC.


MATERIALS AND METHODS
Patient level and aggregate data from eight Dutch databases were used to estimate costs, health utilities, recurrence free and overall survival. Propensity score matching was used to minimize selection bias in these studies. A microsimulation model predicting lifetime outcomes after treatment in stage I NSCLC patients was used for the cost-effectiveness analysis. Model outcomes for the two treatments were overall survival, QALYs, and total costs. We used a Dutch health care perspective with 1.5 % discounting for health effects, and 4 % discounting for costs, using 2018 cost data. The impact of model parameter uncertainty was assessed with deterministic and probabilistic sensitivity analyses.


RESULTS
Patients receiving either VATS resection or SBRT were estimated to live 5.81 and 5.86 discounted QALYs, respectively. Average discounted lifetime costs in the VATS group were €29,269 versus €21,175 for SBRT. Difference in 90-day excess mortality between SBRT and VATS resection was the main driver for the difference in QALYs. SBRT was dominant in at least 74 % of the probabilistic simulations.


CONCLUSION
Using a microsimulation model to combine available evidence on survival, costs, and health utilities in a cost-effectiveness analysis for stage I NSCLC led to the conclusion that SBRT dominates VATS resection in the majority of simulations.


Introduction
Stage I Non-Small Cell Lung Cancer (NSCLC) has a relatively good prognosis of 81 % (stage IA) or 73 % (stage IB) 5-year survival following curative treatments [1]. The most commonly used procedures follow current guidelines, which suggest operable patients should preferably be treated with the appropriate thoracoscopic resection technique with Video Assisted Thoracic Surgery (VATS), while Stereotactic Body Radiation Therapy (SBRT) is preferred in inoperable patients [2][3][4]. These guidelines are based on current evidence, although there is disagreement whether there is equipoise for operable patients between SBRT and VATS resection.
Randomized controlled trials (RCTs) are seen as the highest quality of evidence, because the randomization process can prevent bias [5]. Some RCTs that compared SBRT to VATS resection in operable patients have been discontinued due to low patient accrual [6][7][8]. A pooled analysis of two discontinued RCTs suggested equal effectiveness, although small sample size and short follow-up prohibited definitive conclusions [9]. The POSTLIV and VALOR RCTs on clearly operable NSCLC patients are expected to publish their results in 2026 [10].
As a result, most of the available data on VATS resection and SBRT are observational, which inherently carries a high risk of selection bias. Healthier patients who are considered operable often receive surgery, resulting in differences in the case-mix between VATS and SBRT patients. Selection bias has been addressed with propensity score matching, but this often leads to small sample sizes. A pooled meta-analysis of propensity score matched data showed no significant differences in cancer specific survival between the two treatment options [11].
When the differences in recurrence-free survival (RFS) and overall survival (OS) between the treatments are small, factors such as quality of life and costs become important to determine the optimal treatment choice. In a cost-effectiveness analysis (CEA), relevant health outcomes and associated costs are integrated to allow rational deliberation between treatments. This type of analysis is increasingly used to formulate guidelines and make reimbursement decisions.
Two CEAs that have thus far been performed to compare SBRT to surgery presented contradictory results. Shah et al. concluded that lobectomy was dominant in clearly operable patients, while Paix et al. found SBRT to be dominant [12,13]. Both studies have their limitations, and no clear conclusions can currently be drawn.
In this paper, we present a Dutch cost-effectiveness analysis comparing VATS resection and SBRT in stage I NSCLC, bringing together evidence on effectiveness, quality of life and costs in a microsimulation model.

Concept of the microsimulation model
A microsimulation model was developed for the purpose of this CEA. The model simulates underlying tumor growth to determine RFS and OS for each patient [14,15]. The most important assumptions made in the development of the microsimulation model are:
1. After curative treatment of the primary tumor, a proportion P_metastatic of patients has a number of undetectable micro-metastases.
2. Micro-metastases grow exponentially, with a tumor volume doubling time, which can be reduced by systemic treatment after detection of recurrent disease.
3. Metastases below the minimum detectable size (5 mm diameter) cannot be detected by a surveillance scan and cannot become symptomatic.
4. Death of Disease (DOD) occurs when the total metastatic volume reaches the lethal threshold. DOD is considered to be independent of age, sex and other patient and tumor characteristics.
5. The model uses competing risks to determine the time of detection of metastases and the time of death.
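As a rough illustration of the growth assumptions above, the following sketch computes how long an exponentially growing metastasis takes to reach a detectable size and a lethal volume. Only the 5 mm detection threshold comes from the text; the initial diameter, doubling time, lethal volume, and all function names are hypothetical placeholders, not the calibrated parameters of the study model.

```python
import math


def sphere_volume_mm3(diameter_mm: float) -> float:
    """Volume (mm^3) of a sphere with the given diameter (mm)."""
    return math.pi / 6.0 * diameter_mm ** 3


def time_to_volume(v0: float, v_target: float, doubling_time_days: float) -> float:
    """Days until V(t) = v0 * 2**(t / doubling_time_days) reaches v_target."""
    if v_target <= v0:
        return 0.0
    return doubling_time_days * math.log2(v_target / v0)


# Hypothetical illustration values (assumptions, not study parameters).
v_start = sphere_volume_mm3(1.0)    # assumed 1 mm micro-metastasis
v_detect = sphere_volume_mm3(5.0)   # 5 mm minimum detectable diameter (from the text)
v_lethal = 1.0e6                    # assumed lethal metastatic volume in mm^3
doubling_days = 100.0               # assumed tumor volume doubling time

t_detect = time_to_volume(v_start, v_detect, doubling_days)
t_death = time_to_volume(v_start, v_lethal, doubling_days)
print(f"detectable after {t_detect:.0f} days, lethal after {t_death:.0f} days")
```

Because growth is exponential in volume, the time between two thresholds depends only on their volume ratio and the doubling time, which is why a longer doubling time postpones both detection and death proportionally.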
The most important parameters for this cost-effectiveness evaluation are found in Table 1. The model was programmed in C++, and its output was analyzed with Microsoft Excel Professional Plus 2016 and IBM SPSS Statistics version 22. Fig. 1 depicts a flowchart of the model, consisting of two parts that closely interact. The disease course determines RFS and OS, and the clinical pathway keeps track of the costs and Quality-adjusted Life Years (QALYs) of each treatment plan.
Simulations start with generating a hypothetical RCT population of 100,000 operable stage I NSCLC patients. For this purpose, a life table for the simulated stage I NSCLC population is constructed, containing statistics on age, sex, and remaining life-years until death due to other causes (DOC) than cancer, adjusted for smoking (Appendix). A proportion of the stage I NSCLC population (Table 1) has undetected metastases after treatment of their primary tumor. The remaining patients cannot get recurrences or die of NSCLC. Subsequently, patients are randomly assigned to VATS resection or SBRT and the transition times for transitions 1-4 (Fig. 1A) are drawn. Recurrences can either be detected symptomatically or with a surveillance scan at 3, 6, 12, 18, 24, 36, 48 or 60 months. At the same time, the clinical pathway model keeps track of the costs of additional treatments and scans for each patient, and their QALYs.
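The interplay between scheduled scans and symptomatic detection can be sketched as follows. This is a simplified illustration, assuming that a recurrence is found at the first scan at or after the moment it becomes detectable, unless symptoms appear earlier; the function name and the input times are hypothetical, and only the scan schedule is taken from the text.

```python
# Surveillance scan schedule in months after treatment (from the text).
SCAN_MONTHS = (3, 6, 12, 18, 24, 36, 48, 60)


def detection_time(t_detectable: float, t_symptomatic: float,
                   scans=SCAN_MONTHS) -> tuple:
    """Return (month, mode) of first detection of a recurrence.

    t_detectable:  month at which the metastasis reaches detectable size.
    t_symptomatic: month at which it would cause symptoms
                   (assumed to be >= t_detectable).
    """
    # First scheduled scan at or after the metastasis became detectable.
    next_scan = next((s for s in scans if s >= t_detectable), None)
    if next_scan is not None and next_scan <= t_symptomatic:
        return float(next_scan), "scan"
    return float(t_symptomatic), "symptomatic"


print(detection_time(7.5, 20.0))   # found at the 12-month scan
print(detection_time(7.5, 9.0))    # symptoms appear before the next scan
print(detection_time(61.0, 70.0))  # after the last scan: symptoms only
```

In a full competing-risks model, death from other causes would be a third candidate event, and whichever event time is smallest would determine the patient's next state.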
Further model details and the mathematical functions that determine the model can be found in the appendix.

Patient level data and calibration
The RFS and OS transition rates (2 and 3 of Fig. 1) were calibrated to patient level RFS and OS data. For this purpose, data of stage I NSCLC patients curatively treated between 2003 and 2013 with SBRT or between 2007 and 2013 with VATS resection were obtained from two Dutch studies [16,17]. Diagnosis was made on the basis of PET-CT scans, with or without histological confirmation. Patients were excluded if they had cTNM stage ≥ II, ECOG performance score ≥ 2, a second primary tumor or history of previous cancer. Different types of VATS were included: 90 % were lobectomies, and the other 10 % were bilobectomies, sleeve resections, or sublobar resections. Patients from both studies were pooled and 1:1 propensity score matched. Before matching, investigators were blinded by temporarily removing the pathological stage as well as clinical outcome variables from the dataset. The propensity score was calculated using a logistic regression model that included FEV% and tumor diameter (see Appendix). The resulting cohort included 242 patients.
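To make the 1:1 matching step concrete, the sketch below pairs patients on previously estimated propensity scores. The exact matching procedure of the study is described in its Appendix and is not reproduced here; greedy nearest-neighbour matching with a caliper is used only as a common illustrative choice, and the patient identifiers, scores, and caliper width are hypothetical.

```python
def match_one_to_one(treated: dict, control: dict, caliper: float = 0.05) -> list:
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, control: mapping of patient id -> propensity score.
    Returns a list of (treated_id, control_id) pairs whose scores
    differ by at most `caliper`.
    """
    available = dict(control)  # controls not yet used in a pair
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control on the propensity score.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical scores: probability of receiving VATS given the covariates.
vats = {"V1": 0.30, "V2": 0.55, "V3": 0.90}
sbrt = {"S1": 0.32, "S2": 0.50, "S3": 0.58, "S4": 0.10}
pairs = match_one_to_one(vats, sbrt)
print(pairs)
```

Patients without a sufficiently similar counterpart (here V3) are dropped, which is why matching reduces confounding at the price of a smaller cohort.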
A log-rank test showed no difference in RFS and OS between VATS and SBRT (p = 0.68 and p = 0.76, respectively). Underlying tumor growth was therefore assumed to be the same for VATS and SBRT, and the patients were pooled before calibration (Fig. 2 and Appendix Fig. 2).
Calibration was performed in the order in which the outcomes influence each other. First, RFS was used to calibrate the rate at which metastases become detectable (λ_detectable). Secondly, the symptomatic detection rate (λ_symptom) was calibrated to the proportion of patients with symptomatic detection of recurrences [19,20]. Thirdly, the lethal tumor burden (β_lethal) was calibrated to the OS of patients without systemic therapy for recurrent disease. Finally, the effect of systemic therapy on survival (β_systemic) was calibrated to the OS of patients with systemic therapy for recurrent disease. Additional information on these procedures can be found in the appendix.

Systemic therapy
After calibration of β_lethal and β_systemic, age-dependent probabilities to receive adjuvant or systemic therapy for the primary tumor or recurrences were added to the model. These probabilities were specified for the age groups 40-69, 70-79, and 80-99 years, because treatment probability was similar across the age range within each group. The Dutch cancer registry data were used to estimate the probabilities [22].

Excess mortality
The excess mortality of VATS resection over SBRT as shown in transition 4 of Fig. 1 was simulated using the observed difference in the 90-day post-treatment mortality between VATS resection and SBRT, based on statistics reported by Stokes et al. [23]. For systemic therapy, excess mortality was assumed to be the difference in the 30-day death rate for systemic treatment and the DOC rate, using the weighted average over age and gender groups [24].

Health utilities
The health utilities were taken from a previous study [29]. The baseline treatment is assumed to affect health utilities for up to 3 months. After that, a treatment-independent health utility was used to weight the time before a recurrence, and the time from recurrent disease until death.
Health utility decrements for adjuvant and post-recurrence systemic therapy are assumed to last 3 months from the start of treatment. These were calculated by multiplying the frequencies of adverse events reported by Rittmeyer et al. with the corresponding health utility decrements reported by Nafees et al. [30,33]. Default health utility in the post-recurrence period was assumed to be equal to the reported health utility in stable disease without side effects [33], although health utility was assumed to be lowered in the last 3 months before death. The latter health utility was assumed to be equal to the health utility reported for progressive disease with high levels of symptoms [34,35].
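The frequency-weighted utility decrement described above amounts to a simple expected value. The following sketch shows the arithmetic; the adverse event names, frequencies, and decrements are hypothetical placeholders and are not the values reported by Rittmeyer et al. or Nafees et al.

```python
def expected_decrement(event_freq: dict, event_disutility: dict) -> float:
    """Expected utility decrement: sum over events of frequency * disutility."""
    return sum(freq * event_disutility[event] for event, freq in event_freq.items())


def qaly_loss(decrement: float, duration_months: float = 3.0) -> float:
    """QALYs lost when a utility decrement applies for the given duration."""
    return decrement * duration_months / 12.0


# Hypothetical inputs for illustration only.
freq = {"nausea": 0.20, "fatigue": 0.10}        # proportion of patients affected
disutility = {"nausea": 0.05, "fatigue": 0.07}  # utility loss while affected

decrement = expected_decrement(freq, disutility)
loss = qaly_loss(decrement)  # decrement applied for 3 months, as in the text
print(f"average decrement {decrement:.3f}, QALY loss {loss:.5f}")
```

Applying the decrement for only 3 months is what keeps its effect on lifetime QALYs small relative to differences in survival.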

Costs
Costs were derived from a Dutch 2012 costing study in NSCLC by selecting stage I SBRT and VATS patients from the costing study database [32]. All costs were inflated to 2018 according to the Dutch health economic guidelines [36].
Several types of costs were analyzed separately for patients receiving VATS resection or SBRT and pooled into a single parameter describing the baseline treatment costs of VATS and SBRT, respectively (Appendix).
During RFS, €26,617 was added for each additional surveillance scan according to the Dutch surveillance schedule [3]. The average costs after detection of a recurrence were assumed to be equal to the average total costs in stage IV [32].

Base case simulation
For the simulated cohort, the clinical pathway of the model (Fig. 1) keeps track of the time of recurrence, time of death, cause of death, and both discounted and undiscounted life-years (LYs), QALYs, and total costs for each patient. Subsequently, the population average is calculated. The Dutch discount rates of 4 % for costs and 1.5 % for effects were used for the base case, and the 3 % WHO rates were used for comparison of the model outcomes to other studies [36,37].
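Discounting can be illustrated with a short sketch. It assumes annual time steps with the first year left undiscounted, which is one common convention; the timing convention actually used in the model is not stated in this section, and the yearly values below are hypothetical.

```python
def discounted_total(yearly_values, rate: float) -> float:
    """Present value of a stream of yearly values.

    The value in year t (t = 0, 1, 2, ...) is divided by (1 + rate) ** t,
    so the first year is not discounted (an assumed convention).
    """
    return sum(value / (1.0 + rate) ** t for t, value in enumerate(yearly_values))


# Hypothetical patient: utility 0.8 per year for 5 years, EUR 1,000 costs per year.
qalys = [0.8] * 5
costs = [1000.0] * 5

print(round(discounted_total(qalys, 0.015), 3))  # discounted at 1.5 %
print(round(discounted_total(costs, 0.04), 2))   # discounted at 4 %
print(round(discounted_total(costs, 0.03), 2))   # discounted at 3 %
```

A higher rate shrinks outcomes that occur late, so strategies whose costs or benefits accrue far in the future are affected most by the choice of rate.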

Additional analyses
Univariate sensitivity analyses were performed by repeating the base-case simulation with each parameter shown in Table 1 set, in turn, to the lower and upper limit of its 95 % CI.
A probabilistic sensitivity analysis was performed with all parameters used in the univariate sensitivity analysis using distributions shown in Table 1. 10,000 simulations were run using a Latin hypercube algorithm [38].
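The idea behind Latin hypercube sampling can be shown in a few lines. This is a generic sketch of the technique using only the Python standard library, not the implementation from [38]; the function name, sample size, and seed are illustrative assumptions. Each parameter's unit interval is split into as many equal strata as there are samples, one point is drawn per stratum, and the strata are shuffled independently per parameter.

```python
import random


def latin_hypercube(n_samples: int, n_params: int, rng: random.Random) -> list:
    """Return n_samples points in [0, 1)^n_params.

    Every parameter has exactly one point in each of the n_samples
    equal-width strata of the unit interval.
    """
    columns = []
    for _ in range(n_params):
        # One uniform draw inside each stratum [i/n, (i+1)/n).
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)  # decouple the strata across parameters
        columns.append(column)
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


rng = random.Random(42)
samples = latin_hypercube(10, 3, rng)
for point in samples[:3]:
    print(tuple(round(u, 3) for u in point))
```

In a probabilistic sensitivity analysis, each uniform value would then be mapped through the inverse cumulative distribution function of the corresponding parameter distribution, so that every parameter's range is covered evenly even with a limited number of simulations.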
The appendix contains the following additional analyses: a validation experiment using real world data, the effect of age on the outcomes, and the effect of the metastatic prevalence.

Health outcomes
Average undiscounted LYs lived are 8.51 for VATS resection and 8.55 for SBRT, and the average undiscounted QALYs are 6.70 for VATS resection and 6.75 for SBRT. Discounted outcomes with a 1.5 % discount rate are 5.81 discounted QALYs for VATS resection and 5.86 QALYs for SBRT. When using WHO 3 % discount rates, the resulting QALYs are 5.26 for VATS resection and 5.31 for SBRT.

Costs
In the base case scenario, the average undiscounted costs of VATS resection are €29,269 versus €21,175 for SBRT. After treatment of the primary tumor, costs in the VATS resection and SBRT groups are the same. Further analysis reveals that hospital costs are decisive for the difference in baseline costs (Appendix). Hospital costs comprise 45 % of the total baseline costs for VATS resection versus 17 % for SBRT, and are also the largest contributor to the variation in costs between patients. Discounted costs of VATS resection are €28,805 versus €20,710 for SBRT when a 4 % discount rate was used, and €28,877 for VATS resection and €20,782 for SBRT when a 3 % discount rate was used.

Cost-effectiveness
SBRT dominates VATS resection in the base-case scenario, although the average difference in QALYs of 0.05 between VATS resection and SBRT is small. Average difference in discounted costs is −€8,095. The incremental cost-effectiveness ratio (ICER) is thus −162,334 €/QALY. Note, however, that negative ICERs should be interpreted carefully, as they are in principle ambiguous: they can refer to a less effective and more expensive treatment, or to a more effective and less expensive treatment, which is the case for SBRT here.
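The ambiguity of a negative ICER can be made explicit with a small sketch. It uses the rounded base-case differences reported above (−€8,095 and 0.05 QALYs), so the ratio differs slightly from the −162,334 €/QALY obtained from unrounded model output; the function names and the €50,000 threshold in the example are illustrative assumptions.

```python
def icer(delta_cost: float, delta_qaly: float) -> float:
    """Incremental cost-effectiveness ratio in EUR per QALY."""
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly


def classify(delta_cost: float, delta_qaly: float) -> str:
    """Position of the new strategy on the cost-effectiveness plane."""
    if delta_cost < 0 and delta_qaly > 0:
        return "dominant"   # cheaper and more effective
    if delta_cost > 0 and delta_qaly < 0:
        return "dominated"  # more expensive and less effective
    return "trade-off"      # decision depends on the willingness to pay


def net_monetary_benefit(delta_cost: float, delta_qaly: float, wtp: float) -> float:
    """Incremental net monetary benefit; positive favors the new strategy."""
    return wtp * delta_qaly - delta_cost


# SBRT versus VATS, rounded base-case differences from the text.
print(icer(-8095.0, 0.05), classify(-8095.0, 0.05))
# Mirror image: same negative ICER, opposite conclusion.
print(icer(8095.0, -0.05), classify(8095.0, -0.05))
print(net_monetary_benefit(-8095.0, 0.05, 50000.0))
```

Because both calls return the same ratio, the sign of the cost and QALY differences, or equivalently the net monetary benefit, has to be reported alongside a negative ICER.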

Sensitivity analyses
Univariate sensitivity analyses showed that the ICER becomes positive when SBRT is more expensive than VATS resection (Fig. 3 and Appendix Fig. 6). The model-predicted cost difference between VATS resection and SBRT is most sensitive to the baseline costs of VATS resection and SBRT.
The difference in QALYs is determined by the VATS excess mortality, and the health utility parameters for SBRT, VATS and systemic therapy. No scenarios exist where VATS resection is more effective when only a single parameter is changed within its 95 % confidence interval (Appendix Fig. 6). Fig. 4 depicts the difference in discounted QALYs and costs between VATS resection and SBRT for 10,000 probabilistic draws, each simulating 100,000 patients. The figure shows that SBRT is more effective and less expensive than VATS resection in 68.1 % of the simulations [39]. Likewise, VATS resection is more effective and less expensive in 6.2 % of the simulations. The cost-effective strategy in the other two trade-off quadrants is determined by the willingness to pay threshold. A €50,000 threshold was used according to the Dutch guidelines and the proportional shortfall method [36]. Using this threshold, SBRT is cost-effective in 82 % of the simulations. Depending on the willingness to pay threshold, SBRT is the optimal strategy in at least 74 % and up to 94 % of the simulations, as shown by the cost-effectiveness acceptability curve (Fig. 5).
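A cost-effectiveness acceptability curve is obtained by counting, for each willingness to pay threshold, the share of probabilistic draws in which a strategy has the higher net monetary benefit. The sketch below shows this for a handful of hypothetical draws; the numbers are invented for illustration and are unrelated to the simulation results of this study.

```python
def prob_cost_effective(draws, wtp: float) -> float:
    """Share of draws in which the new strategy has a positive net benefit.

    draws: iterable of (delta_cost, delta_qaly) pairs, new minus comparator.
    wtp:   willingness to pay per QALY.
    """
    draws = list(draws)
    favorable = sum(1 for d_cost, d_qaly in draws if wtp * d_qaly - d_cost > 0)
    return favorable / len(draws)


# Hypothetical draws, one for each quadrant of the cost-effectiveness plane.
draws = [
    (-8000.0, 0.05),   # cheaper, more effective
    (-5000.0, -0.02),  # cheaper, less effective
    (2000.0, 0.10),    # more expensive, more effective
    (3000.0, -0.01),   # more expensive, less effective
]

for wtp in (0.0, 20000.0, 50000.0):
    print(wtp, prob_cost_effective(draws, wtp))
```

At a threshold of zero only costs matter, so the curve starts at the share of draws in which the new strategy is cheaper; as the threshold rises, draws with a QALY gain at higher costs are progressively counted as favorable.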

Conclusion and discussion
This cost-effectiveness analysis of SBRT versus VATS resection in stage I NSCLC patients found that SBRT is slightly more effective (0.05 QALYs), and less expensive (-€8,095) than VATS. Probabilistic sensitivity analyses (PSA) showed that SBRT is the most cost-effective option in at least 74 % of the simulations depending on the willingness to pay threshold used. The most important factors that determine the difference in outcomes between SBRT and VATS resection are related to the baseline treatment. Excess mortality and the negative effects of treatment on health utility are important factors that cause the difference in QALYs. Hospital and intensive care use are the main drivers for cost differences between SBRT and VATS.
These findings are relevant for both clinicians and policy makers. There has been an ongoing debate on the use of SBRT in stage I NSCLC [40]. When SBRT was introduced for treatment of tumors, surgery was widely considered the best curative treatment [41]. SBRT was first tested on inoperable patients, and has become the recommended treatment in this group, leading to survival improvements. However, the survival of inoperable patients was lower than that of operable patients, which made it difficult to enroll operable patients into RCTs to test equivalence of both treatments [9,42].
Without RCT data available, lower levels of evidence such as propensity-score matched comparisons are still accumulating, and the general opinion on potential equivalence of both options is shifting. At the same time, an increase in the usage of SBRT is observable [43]. Cost-effectiveness analyses can contribute to guideline adaptations and to reimbursement decisions. Against that background, we performed this cost-effectiveness study, essentially demonstrating that both treatment options result in similar patient outcomes, with a small QALY benefit for SBRT, and at a small decrease in costs.
The cost-effectiveness of SBRT versus surgery has been analyzed in the past by Paix et al., and Shah et al. [12,13]. Paix et al. found 16.35 and 15.80 discounted QALYs for SBRT and surgery respectively, using a pooled dataset of two RCTs containing 58 medically operable patients [9]. This is considerably higher than the QALYs estimated in this study or found by Shah et al. These QALYs are also much higher than we would expect from clinical practice.
Shah et al. found a 0.68 QALY difference between SBRT and lobectomy in favor of lobectomy in clearly operable patients. The difference with our study outcomes can be explained by two things. Firstly, Shah et al. used pathological stage to calculate recurrence rates, introducing selection bias in the model. When the choice between VATS resection and SBRT is made, the pathological stage is still unknown, and a cost-effectiveness comparison should therefore be based on the clinical stage.
Fig. 3. Tornado diagram of univariate sensitivity analyses of incremental discounted ICERs. The bars represent the range of ICERs found between the lower and upper value of the 95 % CI of each parameter. SBRT is more effective than VATS in all simulated univariate sensitivity scenarios (Appendix Figure 6). A positive ICER in this figure therefore means that SBRT is both more effective and more expensive than VATS. This is only the case when the pooled baseline costs of VATS are lower than those of SBRT.
Secondly, no disutilities were calculated for the most common complications of lobectomy (postoperative pain and dyspnea), while disutilities of −0.249 and −0.268 were calculated for chest wall pain and radiation pneumonitis after SBRT. However, literature suggests that the average disutility caused by complications should be higher for lobectomy than for SBRT [44,45].
With respect to costs, Shah et al. presented a total lifetime healthcare cost estimate of $48,713 for VATS resection and $40,107 for SBRT in clearly operable patients. Paix et al. report total lifetime healthcare costs of €10,727 for surgery and €9,234 for SBRT, while we found €26,877 for VATS resection and €19,444 for SBRT. These differences can be explained by differences between countries, and the sources of costs taken into account. Only our study included costs for intensive care and in-patient and outpatient hospital day costs at baseline. This had a large impact on the difference in baseline costs between VATS resection and SBRT in our analyses.
There are a number of limitations to this study that need to be addressed.
Firstly, we used non-randomized and retrospective data to develop the cost-effectiveness model. Ideally, cost-effectiveness analyses are performed using RCT data with long-term follow-up for adequate assessment of treatment effects, possibly supplemented with real-world data (RWD) on recurrence and survival patterns. RWD are often affected by selection bias, which makes it difficult to correctly estimate treatment effects. Internationally available RCT data on SBRT and surgery, however, consist of only 58 patients with 3 years of follow-up [9]. This sample size is unfortunately too limited to provide an accurate assessment of long-term recurrence-free and overall survival in this patient group.
Therefore, we performed a model-based simulation of an RCT in the Dutch population based on observational studies. Although these studies do not provide RCT data, propensity score methods were applied to minimize selection bias. This reduces confounding, but also decreases the number of patients used to estimate health utilities (N = 82), excess mortality (N = 27,200), RFS and OS (N = 242).
To guarantee objective matching, the investigators were blinded to the pathological stage and outcomes of the patients during the matching procedure. Matching on clinical stage may increase the local recurrence rate and the presence of positive lymph nodes or benign disease in the included patients. This will lead to increased comparability between the VATS and SBRT groups. 29 VATS and 63 SBRT patients did not have histological confirmation before propensity score matching of the survival data. Four patients in the VATS group had benign disease, which is higher than the expected value for SBRT (see Appendix). This may also explain why RFS and OS rates and the underlying disease are very similar between SBRT and VATS groups.
The remaining parameter uncertainty was addressed in the PSA and does not change our main study conclusions. However, potential systematic biases that may still be present due to residual confounding cannot be addressed in a PSA.
Secondly, it should be noted that the model was built using Dutch databases assuming care according to the Dutch guidelines. For example, the Dutch follow-up protocol was assumed, which prescribes multiple CT scans, which is not a universal model across Europe or North America. As the nature of follow-up determines observed transition times to either relapse or detection of metastatic disease, tumor growth rates in the model were calibrated to observed RFS, taking this Dutch follow-up schedule into account in the modelling. In theory, the calibrated underlying tumor growth could therefore be considered 'universal', but this should be validated by simulating observational studies in countries with a different follow-up schedule and comparing model-predicted RFS to observed RFS.
Fig. 4. [...] Table 1, using a Latin Hypercube algorithm. The X-axis represents the incremental discounted quality adjusted life-years, the Y-axis represents the incremental discounted costs. The grey diamonds represent all simulations with random parameters, and the base case is depicted by the black square. The dashed line is the Dutch willingness to pay threshold of €50,000 as threshold value for one QALY [36].
Fig. 5. Cost-effectiveness acceptability curve, depicting the percentage of PSA model simulations in which VATS or SBRT, respectively, is the optimal treatment choice as a function of the willingness to pay threshold. If the willingness to pay is zero, SBRT becomes optimal in 74.3 % of the PSA simulations. This percentage increases gradually when the willingness to pay increases, and eventually reaches a plateau around 94 %. SBRT is optimal in 81 % of the simulations, if we use proportional shortfall to determine the willingness to pay threshold [36].
Apart from this aspect of validation, note that the precise follow-up schedule in the model can be expected to have limited impact on the comparison between VATS resection and SBRT with respect to total costs and QALYs, as the same schedule is followed in both arms. Nevertheless, it remains important to realize that this economic evaluation is carried out in a Dutch context.
Thirdly, cost data were not propensity score matched (N = 185). Propensity score matching selects similar patients from both groups, which in the case of VATS resection and SBRT would most likely mean the unhealthier VATS patients and the healthier SBRT patients [46]. If propensity score matching had been feasible, this would most likely have increased the differences in costs between VATS resection and SBRT in the cost dataset.
Finally, more modern procedures such as single port VATS and RATS, and MRI guided SBRT are likely to affect both the costs and effects of the procedures. On these topics there are still not enough long-term data available to draw any conclusions, although differences in QALYs and costs may become even smaller. It is therefore important to update the current economic evaluation when additional data become available.
To conclude, we constructed a microsimulation model to combine available Dutch evidence on survival, costs, and health utilities in stage I NSCLC. We found that SBRT dominates VATS resection in the majority of probabilistic model simulations.

Funding
HBW and VMHC received unrestricted grants from Novartis Pharma BV.
NL is employed by AstraZeneca BV as Market Access & Health Economics manager.
SS reports grants and personal fees from Varian Medical Systems, ViewRay Inc., AstraZeneca, Celgene, and Eli Lilly, outside the submitted work.
All funding agreements, employments and personal fees ensured the authors' independence in designing the study, interpreting the data, writing, and publishing the report.