Novomics

Software Information

1. The nDx 1 software is available only to partners who have registered separately with Novomics.

2. You must pass the certification procedure on the next screen before you can use the software.

3. Click the "Certification" button to install the software automatically on your connected PC.

S/W Install

Please enter your partner number and certification number.

* Contact us : +82-2-2068-3700

Scientific Background

Development of Algorithm for Prognosis Classification (Single Patient Classifier, SPC)

A gene expression-based Single Patient Classifier (SPC), which aims to explore treatment options by stratifying patients according to risk parameters, was developed and validated through a multi-step strategy. To identify candidate classifier genes, the gene expression profiles of stage I-IV gastric cancer (GC) patients who had undergone gastrectomy were analyzed across five cohorts (n=1,259 from GSE13861, GSE84437, GSE15459, TCGA, and GSE62254), and five molecular subtypes were identified.

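Cohort     N     Measuring Platform        Specimen Type
GSE13861   64    Microarray (Illumina)     Fresh-frozen Tissue
GSE84437   433   Microarray (Illumina)     Fresh-frozen Tissue
GSE15459   200   Microarray (Affymetrix)   Fresh-frozen Tissue
GSE62254   300   Microarray (Affymetrix)   Fresh-frozen Tissue
TCGA       262   Sequencing (Illumina)     Fresh-frozen Tissue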

To discover hidden patterns in the mRNA expression profiles of the 1,259 GC patients, machine learning and representation learning methods were applied.

NMF identified five molecular subtypes of GC, which were annotated after characterization by gene set enrichment analysis (GSEA). To further structure the GC transcriptome, WGCNA with Dynamic Tree Cut was performed, which detected 34 gene-modules; these were characterized by gene ontology enrichment analysis (GOEA). Next, using gene-module eigengene values, 21 gene-modules were found to be highly associated with the NMF subtypes (Pearson’s correlation coefficient, |ρ| > 0.3). Because several gene-modules were correlated with one another, correlated gene-modules were grouped mathematically and ontologically. Finally, the subtype-associated gene-modules were reduced to six GC molecular features.
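The |ρ| > 0.3 selection step can be illustrated with a minimal sketch. It assumes each subtype is encoded as a one-vs-rest 0/1 indicator and that a module is kept when its eigengene correlates with at least one indicator above the threshold. The source does not specify how subtype membership was encoded, so the encoding, the function names, and the toy values below are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def subtype_associated_modules(eigengenes, subtype_labels, threshold=0.3):
    """Keep modules whose eigengene has |r| > threshold with any subtype.

    Assumption: each subtype is represented by a one-vs-rest 0/1 indicator.
    Returns {module: strongest correlation found}.
    """
    subtypes = sorted(set(subtype_labels))
    selected = {}
    for module, values in eigengenes.items():
        correlations = [
            pearson(values, [1.0 if s == k else 0.0 for s in subtype_labels])
            for k in subtypes
        ]
        strongest = max(correlations, key=abs)
        if abs(strongest) > threshold:
            selected[module] = strongest
    return selected


# Hypothetical eigengene values for six samples in two made-up subtypes.
eigengenes = {
    "M_a": [0.9, 0.8, 0.7, -0.6, -0.8, -0.9],  # tracks subtype membership
    "M_b": [0.1, -0.1, 0.0, 0.0, 0.1, -0.1],   # unrelated to subtype
}
subtype_labels = ["S1", "S1", "S1", "S2", "S2", "S2"]
selected = subtype_associated_modules(eigengenes, subtype_labels)
print(sorted(selected))
```

In this toy input only the module that tracks subtype membership passes the filter.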

  • NMF: 5 GC Molecular Features

    NMF (Non-negative Matrix Factorization), a machine learning algorithm, was applied to microarray data obtained from fresh-frozen gastric cancer samples to identify molecular subtypes of gastric cancer.
    NMF identified five molecular subtypes of GC, which were annotated after characterization by gene set enrichment analysis (GSEA): (i) inflammatory subtype with immune characteristics, (ii) intestinal subtype with high expression of intestinal epithelial differentiation and proliferative genes, (iii) gastric subtype with high expression of gastric mucosa-specific genes, (iv) mixed-stromal subtype with heterogeneous transit-amplifying features, and (v) mesenchymal subtype with EMT and mesenchymal hallmarks. A pooled data set from three independent cohorts (GSE15459, TCGA, GSE62254) was used to validate the classified molecular subtypes of GC.



    ※ NMF (Non-negative Matrix Factorization): an approach to analyzing gene expression data that models the data as additive combinations of non-negative basis vectors.

    Brunet JP, Tamayo P, Golub TR, Mesirov JP. Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci U S A. 2004 Mar 23;101(12):4164-9.
    Lei Z, et al., Identification of molecular subtypes of gastric cancer with different responses to PI3-kinase inhibitors and 5-fluorouracil. Gastroenterology. 2013 Sep;145(3):554-65.
    Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. Nature. 2014 Sep 11;513(7517):202-9.
    Cristescu R, et al., Molecular analysis of gastric cancer identifies subtypes associated with distinct clinical outcomes. Nat Med. 2015 May;21(5):449-56.
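    To make "additive combinations of non-negative basis vectors" concrete, the sketch below factorizes a small matrix V (genes x samples) into W (genes x metagenes) and H (metagenes x samples), then assigns each sample to the metagene with the largest coefficient. It uses the standard multiplicative updates for squared error, which is an assumption: the source does not state which NMF objective, rank-selection procedure, or software was used. The toy matrix and all function names are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorize non-negative V (genes x samples) as W (genes x rank) @ H (rank x samples)."""
    rng = random.Random(seed)
    n_genes, n_samples = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_genes)]
    h = [[rng.random() + eps for _ in range(n_samples)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_samples)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_genes)]
    # Rescale so each column of W sums to 1; this leaves W @ H unchanged
    # and puts the rows of H on a comparable scale.
    for k in range(rank):
        scale = sum(w[i][k] for i in range(n_genes)) + eps
        for i in range(n_genes):
            w[i][k] /= scale
        h[k] = [value * scale for value in h[k]]
    return w, h


def assign_subtypes(h):
    """Assign each sample to the metagene (row of H) with the largest coefficient."""
    return [max(range(len(h)), key=lambda k: h[k][j]) for j in range(len(h[0]))]


# Hypothetical 4 genes x 6 samples matrix with two obvious sample groups.
v = [
    [5.0, 4.0, 6.0, 0.1, 0.2, 0.1],
    [4.0, 5.0, 5.0, 0.2, 0.1, 0.1],
    [0.1, 0.2, 0.1, 6.0, 5.0, 4.0],
    [0.2, 0.1, 0.2, 5.0, 4.0, 6.0],
]
w, h = nmf(v, rank=2)
labels = assign_subtypes(h)
print(labels)
```

    Because both updates only multiply by non-negative ratios, W and H stay non-negative throughout, so each sample is described purely by adding metagenes together.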

  • WGCNA: Gene Module Analysis

    WGCNA (Weighted Gene Co-expression Network Analysis) was used to define sets of highly correlated genes. It detected 34 gene-modules (M01~M34) of highly interconnected genes, of which 21 were selected because they showed high association with the 5 subtypes.



    ※ WGCNA (Weighted Gene Co-expression Network Analysis): a clustering method for understanding correlations among genes and grouping genes into modules.

    Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008 Dec 29;9:559.

  • Correlation Analysis of Gene Module and 5 GC Molecular Features

    Spearman correlation was used to evaluate the relationship between the 21 gene-modules and the 5 subtypes. The intensities of red and blue indicate the strength of positive and negative correlations, respectively.
    Because several gene-modules were correlated with one another, correlated gene-modules were grouped mathematically and ontologically. Finally, the subtype-associated gene-modules were reduced to six GC molecular features:



    (i) immune (M06, M07, M15, M26, and M28), (ii) intestinal epithelial (M22), (iii) gastric epithelial (M18), (iv) proliferative (M04, M05, M12, M13, M16, M23, M27, M32, and M33), (v) stem-like (M08), and (vi) stromal (M01, M03, M17, and M25).
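    The Spearman correlation mentioned in this section is the Pearson correlation computed on ranks, so it measures monotonic rather than strictly linear association. The sketch below is a minimal illustration with tied values given averaged ranks. The variable names and the toy values are hypothetical, and the source does not describe how subtype membership was scored for this analysis.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Hypothetical values: a module eigengene and a subtype score per sample.
module_eigengene = [0.2, 0.5, 0.9, 1.4, 3.0]
subtype_score = [1.0, 2.0, 4.0, 8.0, 100.0]
rho = spearman(module_eigengene, subtype_score)
print(round(rho, 3))
```

    The toy relationship is monotonic but far from linear, which is the case where a rank-based coefficient differs most from Pearson's.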

  • 3 Core Biological Features

    A study was designed to select prognosis-related classifier genes from a series of transcriptomic analyses and to develop an algorithm in a supervised manner. Based on this study, three core biological features of gastric cancer were chosen for their biological relevance: Immune (IM), Stem-like (ST), and Intestinal epithelial (EP).

  • Core Classifier Gene Selection (4 Classifier Genes)

    The core gene-modules M26 (48 genes), M08 (131 genes), and M22 (60 genes) were selected as the sources of candidate genes for the IM, ST, and EP classifiers, respectively.
    Because the study aimed to develop an RT-qPCR-based platform for use with FFPE samples, the candidate genes were checked for compatibility between platforms (microarray vs. RT-qPCR) and between sample preparation methods (fresh-frozen vs. FFPE). We analyzed the correlation between the gene expression profiles obtained with both platforms and with both sample preparation methods. The passing criterion for both analyses was a Pearson’s correlation coefficient ρ > 0.5, which left 12 candidate classifier genes.
    Based on the width of their expression ranges, GZMB, SFRP4, and CDX1 were selected as the IM, ST, and EP classifier genes, respectively. Additionally, the infection-related gene WARS was included as an auxiliary IM classifier for more stringent selection of immune-high patients.
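    The two selection steps described above (pass both compatibility checks at ρ > 0.5, then prefer the widest expression range) can be sketched as follows. The gene names, correlation values, and expression values are made up for illustration, and the field names are hypothetical. The source does not say how the expression range was measured, so a simple maximum-minus-minimum is assumed here.

```python
def select_classifier_gene(candidates, min_r=0.5):
    """Return the compatible candidate with the widest expression range.

    A candidate is compatible when both its cross-platform and its
    cross-sample-preparation correlations exceed min_r.
    Returns None when no candidate passes.
    """
    passing = {
        gene: info
        for gene, info in candidates.items()
        if info["r_platform"] > min_r and info["r_sample_prep"] > min_r
    }
    if not passing:
        return None

    def expression_range(gene):
        values = passing[gene]["expression"]
        return max(values) - min(values)  # assumed definition of range width

    return max(passing, key=expression_range)


# Hypothetical candidates for one biological feature.
candidates = {
    "GENE_A": {"r_platform": 0.82, "r_sample_prep": 0.71,
               "expression": [2.1, 3.0, 7.9, 5.5]},
    "GENE_B": {"r_platform": 0.64, "r_sample_prep": 0.43,  # fails one check
               "expression": [1.0, 9.5, 4.2, 6.3]},
    "GENE_C": {"r_platform": 0.77, "r_sample_prep": 0.58,
               "expression": [4.0, 4.6, 5.1, 4.3]},
}
chosen = select_classifier_gene(candidates)
print(chosen)
```

    In this toy input, GENE_B has the widest range but is excluded because it fails the sample-preparation check, so the filter order matters.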

  • Reference Gene Selection

    Reference genes, which are used to normalize the expression levels of the classifier genes, were reviewed from the published literature and from the reference genes of commercial assay kits. Based on this review, eight candidate genes (ACTB, ATP5E, HPRT1, PGK1, GPX1, RPL29, UBB, and VDAC2) were considered, and the best combination of five reference genes (ACTB, ATP5E, HPRT1, GPX1, and UBB) was finally selected.
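    As an illustration of reference-gene normalization, the sketch below subtracts each classifier gene's quantification cycle (Cq) from the mean Cq of the five reference genes, so that a larger value means higher relative expression. This is one common RT-qPCR convention and is an assumption here: the source does not give the normalization formula actually used. The gene lists come from the text above; the Cq values and the function name are hypothetical.

```python
from statistics import mean

REFERENCE_GENES = ("ACTB", "ATP5E", "HPRT1", "GPX1", "UBB")
CLASSIFIER_GENES = ("GZMB", "WARS", "SFRP4", "CDX1")


def normalize_expression(cq):
    """Return reference-normalized expression for each classifier gene.

    Assumed convention: mean Cq of the reference genes minus the Cq of the
    classifier gene (lower Cq means more template, so higher is more expressed).
    """
    missing = [g for g in REFERENCE_GENES + CLASSIFIER_GENES if g not in cq]
    if missing:
        raise KeyError(f"missing Cq values for: {', '.join(missing)}")
    reference_level = mean(cq[g] for g in REFERENCE_GENES)
    return {g: reference_level - cq[g] for g in CLASSIFIER_GENES}


# Made-up Cq values for a single sample.
sample_cq = {
    "ACTB": 20.0, "ATP5E": 24.0, "HPRT1": 28.0, "GPX1": 23.0, "UBB": 25.0,
    "GZMB": 30.0, "WARS": 26.0, "SFRP4": 27.0, "CDX1": 31.0,
}
normalized = normalize_expression(sample_cq)
print(normalized)
```

    Averaging several reference genes rather than relying on one reduces the effect of any single reference gene varying between samples.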