Heterogeneity of PD-L1 expression in non-small cell lung cancer; implications for specimen sampling in predicting treatment response

OBJECTIVES
PD-L1 expression on tumour cells can guide the use of anti-PD-1/PD-L1 immune modulators to treat patients with non-small cell lung cancer (NSCLC). Heterogeneity of PD-L1 expression both within and between tumour sites is a well-documented phenomenon that compromises its predictive power. Our aim was to better characterise the pattern and extent of PD-L1 heterogeneity with a view to optimising tumour sampling and improving its accuracy as a biomarker.


MATERIALS AND METHODS
Expression of PD-L1 was assessed by immunohistochemistry using the SP263 clone in 107 resected primary NSCLCs and their nodal metastases. Intra-tumoural heterogeneity, defined as 'small-scale' (mm²), 'medium-scale' (cm²) and 'large-scale' (between tumour blocks), was assessed by digital imaging using a novel 'squares method'. Inter-tumoural heterogeneity between the primary tumours and their nodal metastases and between N1 and N2 nodal stages was also assessed.


RESULTS
The majority of tumours demonstrated intra-tumoural heterogeneity (small-scale 78%, medium-scale 50%, large-scale 46%). Inter-tumoural heterogeneity between the primary and nodal metastases was present in 53% of cases and, in 17%, between N1 and N2 disease. These differences were occasionally sufficient to lead to discrepancy across the ≥1%, ≥25% and ≥50% cut-offs used to guide therapy.


CONCLUSION
Heterogeneity of PD-L1 expression is common, variable in scale and extent, and carries significant implications for its accuracy as a predictive biomarker. Extensive sampling reduces, but cannot eliminate, this inaccuracy.


Introduction
The treatment of patients with non-small cell lung cancer (NSCLC) has been revolutionised by the emergence of immune-checkpoint inhibitors or 'immune modulators' (IMs), particularly those targeted against tumours exploiting the PD-1/PD-L1 (programmed death-1; programmed death ligand-1) checkpoint as a mechanism of immune escape. [1][2][3][4] Currently, the level of expression of PD-L1 as detected by immunohistochemistry (IHC) is the only accepted biomarker for guiding the use of IMs to treat NSCLC, numerous clinical trials having shown that expression of PD-L1 by the tumour or tumour-associated immune cells is related to response to the drug. [1][2][3][4][5][6] Despite its rapid implementation in the routine profiling of NSCLC, PD-L1 expression as a predictor of response has several weaknesses compromising its predictive power. Amongst these are the multiplicity of assays, differing expression level percentage cut-offs for assigning 'positive' status and guiding therapy, and the biological fact that PD-L1 expression is heterogeneous. 7,8 These drawbacks have resulted in a confusing, mixed status of PD-L1 IHC as both a companion and complementary diagnostic and have raised justifiable doubts about its efficacy. [7][8][9][10][11][12][13] Despite these doubts, reliance on PD-L1 IHC for predicting response of NSCLC to IMs means it is imperative that, in the absence of alternative proven biomarkers, every effort should be made to maximise its utility in guiding clinical decision-making.
Abbreviations: PD-L1, programmed-death-ligand-1; IHC, immunohistochemistry; IM, immuno-modulators; TPS, tumour proportion score; COV, co-efficient of variation; IOD, index of dispersion.
Crucial to addressing the problem of heterogeneity in the context of assessing PD-L1 expression is knowing how best to sample the tumour. Many clinical specimens used for the diagnosis, classification and profiling of NSCLC, including endoscopic bronchial ultrasound (EBUS)-guided aspirates, endobronchial and transthoracic needle biopsies, are very small, and sampling error limits the accuracy that can be achieved. 8,[14][15][16] Understanding the pattern and extent of heterogeneity of PD-L1 expression is a prerequisite for developing and adapting approaches to tumour sampling and ultimately increasing the predictive power of the test. To help address this challenge, we set out to assess the pattern and extent of intra-tumoural and inter-tumoural heterogeneity of PD-L1 expression and thereby develop practical guidance for those obtaining these crucial specimens.

Specimens studied
We studied 107 resected NSCLCs consecutively collected and archived by the […]. Accompanying clinical data were available within the LLP database, from casenote review. Details of these 107 tumours are given in Table 1. Ethical approval was granted by the Liverpool Research Ethics Committee (reference number 97/141).

Detection and assessment of PD-L1 expression
Serial sections 4μm thick were stained with haematoxylin and eosin (H&E) for assessment of general morphology and immuno-stained for PD-L1 using the Ventana SP263 antibody clone with a validated kit and protocol. 19 Slides were scanned at ×20 magnification to create digital images using the Aperio CS2 ScanScope slide scanner and Aperio ScanScope console software. 20 Images were viewed using either Aperio ImageScope or the open-source QuPath software package. 20,21 Expression of PD-L1 was assessed according to the Roche Ventana SP263 interpretation guide 22 by two pathologists trained and experienced in its interpretation, and a concordant score was agreed in all cases. The number of PD-L1+ve tumour cells as a proportion of the total number of tumour cells (the tumour proportion score, TPS) was expressed as a percentage.

Assessment of heterogeneity
Intra-tumoural heterogeneity was quantified by comparing (a) different samples from the same tumour and (b) different samples from its nodal metastases.
Inter-tumoural heterogeneity was assessed by comparing samples from the primary tumour with samples from its nodal metastases, and samples from separate nodal metastases.

Intra-tumoural heterogeneity
First, small scale heterogeneity, defined as heterogeneity within an approximately 1cm² area of tumour, was assessed using a grid split into 1mm squares that was overlaid on to the section (Figure 1). Only sections containing a continuous area of viable tumour were assessed; zones of confluent necrosis or fibrosis were avoided and sections in which these were extensive were not used. The PD-L1 TPS was assessed for every 1mm square to give 100 readings for each area of 1cm². Between one and three 1cm squares were assessed in every section studied by this 'squares method' for primary tumours. Second, medium scale heterogeneity, defined as heterogeneity between 1cm squares, was examined for primary tumours to give a broader assessment of intra-

Inter-tumoural heterogeneity
With the above data collected for primary and metastatic tumours individually, inter-tumoural heterogeneity, that is variability between primary tumours and their nodal metastases as well as between different nodal metastases, could then be assessed. For both primary and secondary tumours, PD-L1 TPS for inter-tumoural comparison was calculated from all available PD-L1 scored tissue.

Statistical analysis
Statistical analysis was performed using IBM SPSS statistics software, version 25 (IBM Corp). Variation of data was described using index of dispersion (IOD) and compared using co-efficient of variation (COV). Comparison of COV was performed according to Forkman. 23 Statistical significance was taken as p<0.05.
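As an illustration of the two descriptive measures named above, the following minimal Python sketch computes them for the 100 per-square TPS readings of a single 1cm² area. It uses the standard definitions (IOD as variance divided by mean, COV as standard deviation divided by mean, expressed as a percentage). The function name `dispersion_summary`, the choice of the sample (n-1) estimators and the example readings are assumptions made for illustration; they are not taken from the study, which used SPSS.

```python
from statistics import mean, stdev, variance


def dispersion_summary(tps_readings):
    """Return (mean, IOD, COV in %) for a list of per-square TPS readings (0-100)."""
    m = mean(tps_readings)
    if m == 0:
        # No PD-L1 expression in any square: both ratios are undefined.
        return m, None, None
    iod = variance(tps_readings) / m      # index of dispersion = variance / mean
    cov = 100 * stdev(tps_readings) / m   # coefficient of variation, in percent
    return m, iod, cov


# Hypothetical 1 cm² area (not study data): expression confined to 10 of 100 squares.
readings = [0.0] * 90 + [80.0] * 10
m, iod, cov = dispersion_summary(readings)
print(f"mean TPS = {m:.1f}%, IOD = {iod:.1f}, COV = {cov:.0f}%")
```

In this hypothetical area the IOD is well above 1 and the standard deviation exceeds the mean (COV above 100%), which are the two conditions reported for small scale heterogeneity in the Results.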

Study population
Basic demographic, clinical and pathological details of the 107 subjects and tumours studied are given in Table 1. No patient from whom these tumours were resected had received neoadjuvant chemotherapy or radiotherapy treatment.

Small and Medium Scale Heterogeneity
There was sufficient quantity and quality (>1cm² of continuous viable tumour cells) for assessment by the 'squares method' in 50 of the primary tumours and in 19 of these there was sufficient tissue for 2 blocks to be studied. In 16 tumours, there was sufficient tissue (>2cm² and ≥200 viable tumour cells) for assessment of multiple, non-overlapping 1cm² squares in a single section (two squares in 14 and three squares in 2) such that 87 1cm² squares were assessed. Data on small scale heterogeneity, within an area of 1cm², are summarised in Table 2. In 68 primary tumours (78%) the IOD was >1. In the 66 primary tumours scoring a TPS of ≥1%, 32 (48%) had a standard deviation (SD) greater than their mean.

Large scale heterogeneity
There was sufficient tissue in 61 primary tumours to permit assessment of large scale heterogeneity, that is variability between two tissue blocks. In 33 of these (54%), there was no difference in TPS between the two blocks. Twenty-eight cases (46%) had a TPS change of ≥1% and 17 cases (28%) had a TPS change of ≥10%.

Intratumoural heterogeneity within nodal metastases
In the nodal metastases from 26 cases there was sufficient assessable tumour tissue (≥100 viable tumour cells) for assessment of heterogeneity by the 'squares method'. In 19 metastases (73%), the IOD was >1. In the 23 nodal metastases scoring a PD-L1 TPS of ≥1%, 6 (23%) had a SD greater than their mean. These results are summarised in Table 2.
Intra-tumoural heterogeneity within primary tumours as assessed by the 'squares method' had a greater COV than it did in their nodal metastases, but the difference was not statistically significant (146 vs 98; p=0.3706).

Primary versus matched nodal metastases
PD-L1 expression by the primary tumour and its nodal metastases was compared in all 107 tumours studied. In 50 tumours, there was no difference.
In the remaining 57 (53%) there was a difference of ≥1%, with 30 displaying higher expression by the primary than by their nodal metastases and 27 the converse. The median difference in TPS between the primaries and their nodal metastases was 10% (range 1-94). In 25 cases (23%), this difference was sufficient to move the TPS across a clinical guidance cut-off. In 13 cases (12%), the PD-L1 TPS was ≥1% in the primary but 0% in its metastases. In 3 cases (3%), the PD-L1 TPS was 0% in the primary, but ≥1% in its metastases. These data are summarised in Table 3 and examples are shown in Figures 3a and 3b.
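To make concrete what it means for a difference to move the TPS across a cut-off, the minimal Python sketch below bins a TPS by the ≥1%, ≥25% and ≥50% thresholds and flags a pair of readings that fall in different bins. The function names `tps_category` and `crosses_cutoff` and the example pairs are hypothetical and chosen for illustration only; this is not the procedure used to produce Table 3.

```python
CUTOFFS = (1, 25, 50)  # TPS thresholds (%) used to guide therapy


def tps_category(tps):
    """Number of cut-offs a TPS (0-100 %) meets or exceeds: 0, 1, 2 or 3."""
    return sum(tps >= c for c in CUTOFFS)


def crosses_cutoff(tps_a, tps_b):
    """True if the two TPS readings fall on different sides of any cut-off."""
    return tps_category(tps_a) != tps_category(tps_b)


# Hypothetical (primary, nodal metastasis) TPS pairs, not study data.
pairs = [(30, 20), (60, 55), (0, 5)]
for primary, nodal in pairs:
    diff = abs(primary - nodal)
    print(primary, nodal, diff, crosses_cutoff(primary, nodal))
```

The example pairs show that the size of the difference alone is not decisive: a 10-point difference either side of 25% crosses a cut-off, whereas a 5-point difference entirely above 50% does not, and a 5-point difference starting from 0% does.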

Variation between nodal metastases
In 35 of the tumours studied, there was sufficient tissue from nodal metastases for variation in PD-L1 expression between them to be studied; N1 vs N1 in four cases and N1 vs N2 in 31. In 29 cases (83%), there was no difference between stations, including N1 vs N1. In the remaining 6 cases (17%), the difference between N1 and N2 stations was ≥10%. In all of these, it was sufficient to move the TPS across a cut-off. These results are summarised in Table 3 and examples are shown in Figures 3c and 3d.

Discussion
The extent of expression of PD-L1 as detected by IHC is currently the only clinically-validated means of determining the likely response of NSCLC to IMs. [1][2][3][4] Characterising and understanding the strengths and limitations of PD-L1 expression in this context are crucial to improving its predictive power.
Several studies have attempted to quantify how many biopsy specimens of a NSCLC are required to provide accurate coverage of PD-L1 expression within a tumour [24][25][26], many concluding rather obviously that, for example, multiple core biopsies are likely to provide greater accuracy than one or two and that tumours displaying marked heterogeneity still present significant difficulty.
The present study concurs with this; increasing quantities of tissue for assessment will clearly improve its accuracy, but even a whole tissue section might still not be representative of the entire tumour. Even the detailed and extensive study of a large series of tumours that we describe here fails to reveal any particular pattern to this heterogeneity, which seems highly variable in extent and scale. This observation holds not only for the primary tumour, but also for its nodal metastases. Intra-tumoural heterogeneity is unlikely to be random, but reflects ill-understood aspects of the interaction between the tumour and the immune environment and underlying clonal variation within the tumour. More sophisticated analytical approaches are required to untangle these relationships.
Inter-tumoural heterogeneity of PD-L1 expression is a no less significant challenge in terms of achieving high accuracy and predictive power. Several studies have examined PD-L1 expression between a primary NSCLC and its metastases [27][28][29] and, though approaches and methodologies differ, the general consensus of these is that expression of PD-L1 varies between tumour sites in the majority of cases. Our investigation supports this, revealing a fairly equal divide between tumours in which expression of PD-L1 'increases' or 'decreases' as they metastasise into regional lymph nodes, with complete loss of PD-L1 expression during metastasis occurring with more frequency than its apparent de novo expression in the environment of the node. An important observation is that this variation between the primary and its metastases was often sufficient to cross one of the cut-off thresholds used for guiding management.
This raises the important question of which score should be acted upon. It would seem reasonable to assume that a tumour deposit expressing high levels of PD-L1 would be likely to respond to an IM, whereas a different deposit expressing low levels would not; this might be one cause for variable response of different lesions of a disseminated tumour. On the grounds that any response would be beneficial, whenever such variability is apparent, it would seem appropriate to act on the highest score.
Ultimately, in the context of NSCLC, expression of PD-L1 is being determined in an already heterogeneous population of tumour cells further affected by their interaction with the tumour micro-environment (TME). 30 Immune escape of NSCLC is thought to require, in addition to PD-L1 expression, specific conditions within the TME, such as the proximity of CD8+ cytotoxic T lymphocytes and a non-suppressive immune environment. [31][32][33][34] With this in mind, it is not surprising that PD-L1 expression varies between a primary NSCLC and its nodal metastases; the environment in the lung, especially the immune environment, is very different from that in a lymph node.
There is a high risk that a single diagnostic sample of a NSCLC, primary or metastatic, whether bronchoscopic, transthoracic needle or EBUS-guided, will be inadequately representative for determining something as heterogeneous as PD-L1 expression. Notwithstanding the obvious conclusion that greater accuracy is more likely with a larger specimen and, ideally, multiple biopsies or aspirates from multiple points within a tumour, it is difficult to see how this challenge can be easily overcome. Not surprisingly, therefore, efforts are being made to find alternative or, more likely, complementary biomarkers to use in conjunction with PD-L1 expression and improve predictive capabilities, with much current interest focussed on tumour mutational burden (TMB) or assessment of the immune environment of the tumour. [35][36][37][38] In the interim, however, with PD-L1 expression still the only validated biomarker for predicting response of NSCLC to anti-PD-1/PD-L1 IMs, an optimal approach to improved tumour sampling may be guided by the intended therapeutic target. Neoadjuvant treatment of NSCLC by IMs is being assessed in current clinical trials 39 and extensive sampling of primary tumour in this setting would seem prudent. Metastasis, however, is a reflection of evolution of the tumour, a manifestation of its inherent drive to survive, and it would seem reasonable to assume that the most advanced and potentially successful component of a disseminated tumour would be the most informative in terms of targeting for biopsy. 30,40,41 When metastases are present, therefore, sampling and testing of these in preference to the primary growth, whenever possible, would seem the most scientifically sound approach and the most likely to provide useful information.