2.5. Validation in the patient samples
Patients diagnosed with squamous cell carcinoma of the oral cavity between 2010 and 2015 were selected retrospectively based on the following inclusion criteria: (1) treatment-naïve patients, (2) availability of tissue samples, (3) treatment with curative intent and (4) a minimum of two years of follow-up. Patients previously treated with chemotherapy or radiation were excluded. Tissue samples stored in RNAlater (Ambion, Thermo Fisher Scientific, MA, USA) were collected from the tissue repository at the center for quantitative profiling.
S. Mohanta et al.
2.5.2. Gene expression of selected markers
Total RNA was extracted from the tissues using the Nucleospin® RNA/Protein extraction kit (Cat# 740933.50, Macherey-Nagel, Germany) and converted into complementary DNA (High Capacity cDNA Conversion kit; Cat# 4374966, Applied Biosystems, CA, USA) as per the manufacturer’s instructions. The primers for the selected markers were evaluated for specificity using Basic Local Alignment Search Tool analysis (National Center for Biotechnology Information, NLM, US). The efficiency of each primer in real-time PCR was assessed as per standard protocols and calculated using the formula $\text{Efficiency} = 10^{-1/\text{slope}}$. All reactions were performed in triplicate on the Roche LightCycler® 480 Real-Time PCR system (Roche Diagnostics, Germany) using the Kapa SYBR Green PCR Master Mix (Cat# KK4602, Kapa Biosystems, MA, USA). A set of reference genes (N = 4) was assessed using RefFinder (Reddy et al., 2016; Xie, Xiao, Chen, Xu, & Zhang, 2012), and the two best-ranked reference genes were selected for further analysis. The relative change in expression was evaluated using the geometric mean of the selected reference genes (Pfaffl, 2001). The fold level of each marker in recurrent and non-recurrent tumor samples was calibrated against the median expression in non-recurrent samples.
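The efficiency formula and the efficiency-corrected relative expression described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: the function names, the argument layout and the example Ct values are hypothetical, and the ratio follows the general form of the Pfaffl (2001) model, with the reference term taken as the geometric mean over the selected reference genes.

```python
import math
from statistics import geometric_mean


def amplification_efficiency(slope):
    """Primer efficiency from the slope of a standard curve (Ct vs. log10 input)."""
    return 10 ** (-1.0 / slope)


def relative_expression(e_target, ct_target_cal, ct_target_sample, reference_genes):
    """Efficiency-corrected expression ratio of a sample versus a calibrator.

    reference_genes is a list of (efficiency, ct_calibrator, ct_sample) tuples,
    one per reference gene (hypothetical layout chosen for this sketch).
    """
    # Target term: E_target ** (Ct_calibrator - Ct_sample)
    target_term = e_target ** (ct_target_cal - ct_target_sample)
    # Reference term: geometric mean of the per-gene efficiency-corrected terms
    ref_terms = [e ** (ct_cal - ct_s) for e, ct_cal, ct_s in reference_genes]
    return target_term / geometric_mean(ref_terms)


if __name__ == "__main__":
    # A slope of about -3.32 corresponds to an efficiency close to 2 (doubling per cycle).
    print(amplification_efficiency(-3.32))
    # Illustrative Ct values only; the calibrator here would be the median
    # non-recurrent sample, as described in the text.
    refs = [(2.0, 20.0, 20.0), (2.0, 18.0, 18.0)]
    print(relative_expression(2.0, 25.0, 23.0, refs))
```

With an efficiency of 2 and unchanged reference genes, a target that crosses threshold two cycles earlier than the calibrator gives a four-fold relative expression.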
2.5.3. Statistical analysis
Statistical analysis was performed using STATA 11.2 (College Station, USA). Receiver operating characteristic (ROC) curve analysis was used to evaluate the predictive power of each biomarker, and the optimal cut point that yielded the maximum sensitivity and specificity was determined for each biomarker. ROC curves were then plotted on the basis of the set of optimal sensitivity and specificity values. Kaplan-Meier survival curves were used to estimate the association of the survival function with marker expression. Cox regression was used to estimate the risk of recurrence and survival associated with marker expression. Comparisons between two groups were evaluated by Student’s t-test for independent (unpaired) samples. p < 0.05 was considered statistically significant.
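The cut-point selection can be illustrated with a short Python sketch. The text does not state which criterion was used to balance sensitivity and specificity; the sketch below assumes Youden's J (sensitivity + specificity - 1), which is one common choice, and it assumes that higher marker expression predicts recurrence. The function name and the toy values are hypothetical.

```python
def optimal_cutpoint(values, labels):
    """Return (cut, J, sensitivity, specificity) maximising Youden's J.

    values: marker expression per sample.
    labels: 1 = recurrent, 0 = non-recurrent.
    A sample is called positive when its value is >= the candidate cut.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for cut in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if v >= cut and y == 1)
        true_neg = sum(1 for v, y in zip(values, labels) if v < cut and y == 0)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        # Keep the first cut that reaches the highest J seen so far
        if best is None or j > best[1]:
            best = (cut, j, sensitivity, specificity)
    return best


if __name__ == "__main__":
    # Toy fold-change values, not data from the study
    expression = [0.5, 0.8, 1.1, 2.0, 2.5, 3.0]
    recurrence = [0, 0, 0, 1, 1, 1]
    print(optimal_cutpoint(expression, recurrence))
```

Each distinct expression value is tried as a candidate threshold, so the search is exhaustive for small cohorts; ties are resolved in favour of the lowest threshold.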
3.1. Selection of markers
The significant gene entities (p < 0.05; fold change > 2) were identified in a microarray-based meta-analysis of head and neck cancer studies carried out previously in our lab (Reddy et al., 2016). Analysis of the series carried out on the Affymetrix platform (U133 Plus 2) with 626 samples (normal = 207, tumor = 419) identified a total of 12,079 genes (Analysis I). On the Agilent platform (4 × 44k G4112F), 44 samples (normal = 18, tumor = 26) were included and 8493 genes were identified (Analysis II). This head and neck cancer-specific list of significant genes was used for downstream analysis.
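As an illustration of the significance filter, the Python sketch below applies the two thresholds to a single gene. It assumes that "fold change > 2" is applied in both directions (up- and down-regulation), which the text does not state explicitly; the function name and default arguments are hypothetical.

```python
import math


def is_significant(p_value, fold_change, p_cutoff=0.05, fc_cutoff=2.0):
    """True when a gene passes both the p-value and the fold-change threshold.

    fold_change is the tumor/normal expression ratio (must be > 0).
    Assumption: a two-fold decrease (ratio < 0.5) counts as well as a
    two-fold increase, so the test is made on the absolute log2 ratio.
    """
    return p_value < p_cutoff and abs(math.log2(fold_change)) > math.log2(fc_cutoff)


if __name__ == "__main__":
    print(is_significant(0.01, 4.0))   # strongly up-regulated
    print(is_significant(0.01, 0.25))  # strongly down-regulated
    print(is_significant(0.20, 4.0))   # fails the p-value threshold
```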
To obtain a comprehensive gene list specific to head and neck cancer, three marker profiles from the CSC database, namely cancer stem cell markers (n = 57), related genes (n = 1769) and functionally related genes (n = 9475), were combined to generate a list of 1614 non-redundant markers (Supplementary Table-1). A comparison of this marker list with the genes identified in Analysis I (12,079) and Analysis II (8493) identified common lists of 809 and 703 genes, respectively (Fig. 1 A, B, C) (Supplementary Table-2, 3). As a next step, these genes were compared to the CSC database and a list of 364 concordant genes was found. The final list of head and neck cancer-associated cancer stem cell markers (n = 221) was arrived at after comparing the regulatory trend of the genes across the Agilent and Affymetrix platforms (Supplementary Table-4).
Archives of Oral Biology 99 (2019) 92–106
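The list comparisons described above amount to set intersections followed by a check on the direction of regulation. The Python sketch below shows one way this could be expressed; it is an assumption about the logic, not the authors' pipeline, and the gene names, the function name and the use of signed log2 fold changes to represent the regulatory trend are all hypothetical.

```python
def concordant_markers(csc_markers, analysis_1, analysis_2):
    """Intersect a marker set with two platform results and keep concordant genes.

    csc_markers: set of marker gene names.
    analysis_1, analysis_2: dicts mapping gene name -> signed log2 fold change
    for the significant genes of each platform.
    """
    common_1 = csc_markers & set(analysis_1)  # markers significant in Analysis I
    common_2 = csc_markers & set(analysis_2)  # markers significant in Analysis II
    shared = common_1 & common_2              # markers found on both platforms
    # Keep genes regulated in the same direction on both platforms
    same_trend = {g for g in shared if (analysis_1[g] > 0) == (analysis_2[g] > 0)}
    return common_1, common_2, shared, same_trend


if __name__ == "__main__":
    # Placeholder gene names and fold changes, for illustration only
    markers = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
    affymetrix = {"GENE_A": 1.8, "GENE_B": -2.1, "GENE_C": 1.2, "GENE_X": 3.0}
    agilent = {"GENE_A": 2.4, "GENE_B": 1.5, "GENE_D": -1.1}
    for result in concordant_markers(markers, affymetrix, agilent):
        print(sorted(result))
```

In this toy input, GENE_A and GENE_B are shared by both platforms, but only GENE_A is retained because GENE_B is regulated in opposite directions.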
3.2. Functional annotation
The final gene list (n = 221) was analysed by gene ontology, pathway and interaction analysis, and Gene Set Enrichment Analysis. Gene ontology analysis carried out in the ToppFun database indicated that, among molecular functions, these genes were most represented in the binding categories: enzyme binding (n = 60, GO:0019899), protein-containing complex binding (n = 57, GO:0044877) and signalling receptor binding (n = 55, GO:0005102). Similarly, among biological processes, the regulation of cell proliferation (n = 98, GO:0042127), programmed cell death (n = 85, GO:0012501) and locomotion (n = 83, GO:0040011) showed the highest representation of these genes (Fig. 1 D, E). Pathway analysis in ToppFun showed that the highest percentage of genes fell in cancer-related pathways (bladder cancer: 24%; endometrial cancer: 18%), followed by epidermal growth factor receptor tyrosine kinase inhibitor resistance (17%) (Table 1). Gene Set Enrichment Analysis of these 221 concordant genes identified the major gene sets as epithelial-mesenchymal transition signalling (n = 30), p53 gene family (n = 19), Tumor Necrosis Factor Alpha signalling (n = 19), hypoxia gene family (n = 17) and apoptosis (n = 16) (Table 2).