Robust diagnosis of non-Hodgkin lymphoma phenotypes validated on gene expression data from different laboratories.

Gyan Bhanot, Gabriela Alexe, Arnold J. Levine, Gustavo Stolovitzky

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


A major challenge in cancer diagnosis from microarray data is the need for robust, accurate classification models that are independent of the analysis techniques used and can combine data from different laboratories. We propose such a classification scheme, originally developed for phenotype identification from mass spectrometry data. The method uses a robust multivariate gene selection procedure and combines the results of several machine learning tools trained on raw and pattern data to produce an accurate meta-classifier. We illustrate and validate our method by applying it to gene expression datasets: the oligonucleotide HuGeneFL microarray dataset of Shipp et al. and the Hu95Av2 Affymetrix dataset (Dalla-Favera's laboratory, Columbia University). Our pattern-based meta-classification technique achieves higher predictive accuracies than each of the individual classifiers, is robust against data perturbations and provides subsets of related predictive genes. Our techniques predict that combinations of some genes in the p53 pathway are highly predictive of phenotype. In particular, we find that in 80% of DLBCL cases the mRNA level of at least one of the three genes p53, PLK1 and CDK2 is elevated, while in 80% of FL cases, the mRNA level of at most one of them is elevated.
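
The three-gene pattern reported in the abstract can be illustrated with a minimal sketch. The abstract does not say how "elevated" is defined, so the per-gene cutoffs, the sample values, and the helper names (`count_elevated`, `matches_dlbcl_pattern`, `matches_fl_pattern`) are all hypothetical and serve only to make the counting logic concrete. This is not the authors' implementation.

```python
from typing import Dict

# The three p53-pathway genes named in the abstract.
GENES = ("p53", "PLK1", "CDK2")


def count_elevated(expression: Dict[str, float],
                   thresholds: Dict[str, float]) -> int:
    """Count how many of the three genes exceed their cutoff.

    The cutoffs are an assumption for illustration; the abstract
    does not specify how an elevated mRNA level is determined.
    """
    return sum(expression[g] > thresholds[g] for g in GENES)


def matches_dlbcl_pattern(n_elevated: int) -> bool:
    # Pattern reported for 80% of DLBCL cases: at least one gene elevated.
    return n_elevated >= 1


def matches_fl_pattern(n_elevated: int) -> bool:
    # Pattern reported for 80% of FL cases: at most one gene elevated.
    return n_elevated <= 1


# Made-up values, not data from either microarray dataset.
thresholds = {"p53": 1.0, "PLK1": 1.0, "CDK2": 1.0}
sample = {"p53": 1.4, "PLK1": 0.6, "CDK2": 2.1}

n = count_elevated(sample, thresholds)
print(n, matches_dlbcl_pattern(n), matches_fl_pattern(n))
```

A sample with exactly one elevated gene satisfies both patterns, so the two statements as worded describe overlapping tendencies and do not by themselves form a decision rule.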

Original language: English
Pages (from-to): 233-244
Number of pages: 12
Journal: Genome informatics. International Conference on Genome Informatics
Issue number: 1
State: Published - 2005
Externally published: Yes

