Given the varied mechanisms that specify the drug-resistant and metastatic properties of cancer stem cells, establishing a marker panel that defines these properties will help to develop a predictive biomarker panel of clinical significance. Subject to large-scale patient validation, the candidate prognosticators identified will enable triaging of patients based on the cancer stem cell pattern in the tumor post-surgery/biopsy. This in turn will identify patients susceptible to poor disease-free and overall survival and thereby facilitate appropriate surveillance or management. While previous studies have correlated cancer stem cell markers with poor prognosis in patients with head and neck cancers (Fan et al., 2017; Major, Pitty, & Farah, 2013), this study attempts to catalogue the repertoire of oral cancer stem cell-specific and associated markers that correlate with clinico-pathological parameters, recurrence and survival, in an effort to evaluate their utility as predictors of disease outcome.
2. Materials and methods
2.1. Selection of markers
Microarray data obtained from the publicly available databases Gene Expression Omnibus (National Centre for Biotechnology Information; http://www.ncbi.nlm.nih.gov/geo/) and ArrayExpress [http://www.ebi.ac.uk/arrayexpress] were analyzed in GeneSpring, as previously published by our laboratory (Reddy et al., 2016). The series selected for the initial analysis included data from i) studies carried out in head and neck squamous cell carcinoma patients, ii) studies including treatment-naïve patients and iii) studies carrying out global transcriptomic profiling using high-density arrays. Studies/samples including thyroid, oropharynx and nasopharynx were excluded due to their varied aetiologies. The two platforms included in the analyses were Affymetrix [Affymetrix Inc., California, USA] and Agilent [Agilent Technologies, California, USA]. The raw data series profiled on both platforms were grouped by individual technology, and each technology (of either platform) was included in the analysis if at least 2 series were available in the database. The p-value computation (asymptotic) and multiple testing correction (Benjamini-Hochberg; false discovery rate) were then performed to obtain gene entities with p-value < 0.05 and fold change > 2.0 (Reddy et al., 2016). The processed data from this analysis were used to identify the cancer stem cell-specific markers in oral cancer.
Archives of Oral Biology 99 (2019) 92–106
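The filtering step in this section can be illustrated with a minimal sketch. It assumes that the 0.05 cut-off is applied to the Benjamini-Hochberg-adjusted p-value and that fold change is signed, so that its absolute value is compared with 2.0; neither detail is stated here. The function names and the toy records are hypothetical, and this is not the GeneSpring implementation.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the rank decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_genes(records, alpha=0.05, min_fold=2.0):
    """Keep genes passing both thresholds.

    records: list of (gene, raw_p_value, signed_fold_change) tuples.
    Assumption: thresholds apply to the adjusted p-value and to |fold change|.
    """
    adjusted = benjamini_hochberg([p for _, p, _ in records])
    return [
        gene
        for (gene, _, fold), q in zip(records, adjusted)
        if q < alpha and abs(fold) > min_fold
    ]


# Hypothetical toy input, not data from the study.
toy_records = [
    ("GENE_A", 0.001, 3.2),
    ("GENE_B", 0.020, -2.5),
    ("GENE_C", 0.030, 1.4),
    ("GENE_D", 0.600, 4.0),
]
selected = select_genes(toy_records)
print(selected)
```

In this toy input, GENE_C fails the fold-change threshold and GENE_D fails the significance threshold, so only GENE_A and GENE_B are retained.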
2.2. Selection of cancer stem cell markers
The publicly available cancer stem cell database (CSC database) (http://bioinformatics.ustc.edu.cn/cscdb) was used to select cancer stem cell-specific markers and their associated markers. The CSC database included markers specific to cancer stem cells (n = 57) and markers related to them (n = 1769). The database also provided functional annotations (n = 9475) of molecules with cancer stem cell-related functions. These lists were downloaded and combined to obtain a comprehensive list of cancer stem cell markers for further analysis.
To evaluate their relevance in head and neck cancer, the cancer stem cell markers were compared individually with the head and neck cancer-specific gene list obtained from the meta-analysis of each platform. The two concordant gene lists, Affymetrix vs CSC database and Agilent vs CSC database, were then cross-compared to identify cancer stem cell genes in head and neck/oral cancer.
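The list combination and comparisons described in this section can be sketched as set operations. This sketch assumes that "combined" means a union of the downloaded lists and that "compared" and "cross compared" mean intersections of gene symbols after case normalisation; the function names and the placeholder symbols are hypothetical.

```python
def combine_marker_lists(*marker_lists):
    """Union of several marker lists, with symbols normalised to upper case."""
    combined = set()
    for markers in marker_lists:
        combined.update(symbol.strip().upper() for symbol in markers)
    return combined


def concordant_csc_genes(csc_markers, affymetrix_genes, agilent_genes):
    """Intersect the CSC list with each platform list, then with each other."""
    csc = combine_marker_lists(csc_markers)
    affy_vs_csc = csc & combine_marker_lists(affymetrix_genes)
    agilent_vs_csc = csc & combine_marker_lists(agilent_genes)
    return {
        "affymetrix_vs_csc": affy_vs_csc,
        "agilent_vs_csc": agilent_vs_csc,
        "concordant": affy_vs_csc & agilent_vs_csc,
    }


# Placeholder symbols only; these are not genes reported by the study.
csc_list = combine_marker_lists(
    ["GENE_A", "GENE_B"],   # markers specific to cancer stem cells
    ["gene_c"],             # related markers
    ["GENE_D", "GENE_A"],   # functionally annotated molecules
)
result = concordant_csc_genes(
    csc_list,
    affymetrix_genes=["GENE_A", "GENE_C", "GENE_X"],
    agilent_genes=["gene_a", "GENE_D", "GENE_Y"],
)
print(sorted(result["concordant"]))
```

Normalising symbols before intersecting avoids missing a match only because two sources differ in letter case; alias resolution between databases is not handled in this sketch.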
2.3. Functional annotation using Toppfun enrichment analysis
The significant cancer stem cell markers identified were used to identify pathways and gene ontology terms using Toppfun (https://toppgene.cchmc.org/enrichment.jsp). The pathways were identified based on significance (p < 0.05) and percentage representation in the data set. Gene Set Enrichment Analysis was performed in the Molecular Signatures Database (http://software.broadinstitute.org/gsea/msigdb/index.jsp), while the Search Tool for the Retrieval of Interacting Genes/Proteins database (STRING v10.5) (http://string-db.org) was used to predict and catalogue the protein-protein interactions between the concordant genes. GeneMANIA (GeneMANIA.org) was used to identify the interactions among the selected genes. Cytoscape (v3.6.1) was used to visualize and represent these GeneMANIA interactions.
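As an illustration of how pathway over-representation is commonly scored, the sketch below computes a one-sided hypergeometric tail probability together with a percentage representation. It is a generic example and not necessarily the calculation that Toppfun performs. Reading "percentage representation" as the share of the input gene list that falls in the pathway is an assumption, and all names and numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe, pathway_size, list_size, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(universe, pathway_size, list_size).

    universe:     number of genes in the background
    pathway_size: number of background genes annotated to the pathway
    list_size:    number of genes in the input list
    overlap:      number of input genes annotated to the pathway
    """
    max_k = min(pathway_size, list_size)
    total = comb(universe, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, list_size - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


def percent_representation(list_size, overlap):
    """Share of the input list found in the pathway (assumed definition)."""
    return 100.0 * overlap / list_size


# Hypothetical numbers chosen only to exercise the functions.
p_value = hypergeom_enrichment_p(
    universe=20000, pathway_size=150, list_size=200, overlap=10
)
share = percent_representation(list_size=200, overlap=10)
print(f"p = {p_value:.3g}, representation = {share:.1f}%")
```

When many pathways are tested, the resulting p-values would normally be corrected for multiple testing before applying the p < 0.05 threshold.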
2.4. Validation in The Cancer Genome Atlas (TCGA) database
The final concordant marker list was cross-compared with The Cancer Genome Atlas database (n = 313 oral cancer cohort) (http://www.cbioportal.org) for mutation, copy number variation and expression status. The validation was carried out at three levels. At the first level, the significant gene panel was assessed for mutual exclusivity, co-occurrence and co-expression. At the next level, oral cancer patients were classified based on stage (early and late stage), metastasis, risk habits, pathological parameters (perineural invasion, angiolymphatic invasion and extra-capsular spread) and surgical margin, along with recurrence, disease-free survival and overall survival, and the markers identified were correlated with these parameters. At the final level, the markers were correlated with disease outcome (recurrence and survival) of the patients at the expression (z-score) level and in terms of survival predictability to identify accurate, cancer stem cell-specific prognosticators.
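The expression-level and survival comparisons in this section can be illustrated with a small sketch that standardises expression values to z-scores and computes a Kaplan-Meier survival estimate for a patient group. The reference population for the z-score (here, the whole toy cohort) and the use of Kaplan-Meier estimation are assumptions, and cBioPortal may compute these quantities differently. All names and values are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise values against the mean and sample SD of the same cohort."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival_probability)].

    times:  follow-up time per patient
    events: 1 if the event (e.g. death or recurrence) occurred, 0 if censored
    Only time points with at least one event are reported.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients with an event or censoring at t leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical expression values and follow-up data, not study results.
expression = [1.0, 2.0, 3.0]
standardised = z_scores(expression)

follow_up_months = [5, 8, 12, 12, 20]
event_observed = [1, 0, 1, 1, 0]
curve = kaplan_meier(follow_up_months, event_observed)
for time_point, probability in curve:
    print(time_point, round(probability, 3))
```

In practice, patients would be split into groups (for example by a z-score threshold for a given marker), a curve estimated per group, and the groups compared with a log-rank test.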