Comparison of location-scale and matrix factorization batch effect removal methods on gene expression datasets

Emilie Renard; P.-A. Absil

doi:10.1109/BIBM.2017.8217888

2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Year: 2017, Pages: 1530-1537

Authors

Emilie Renard, ICTEAM Institute, Université catholique de Louvain, Louvain-la-Neuve, 1348, Belgium
P.-A. Absil, ICTEAM Institute, Université catholique de Louvain, Louvain-la-Neuve, 1348, Belgium

Abstract

Merging gene expression datasets is a simple way to increase the number of samples in an analysis. However, experimental and data processing conditions, which are specific to each dataset or batch, generally influence the expression values and can hide the biological effect of interest. It is therefore important to normalize the merged dataset, as failing to adjust for those batch effects may adversely impact statistical inference. Batch effect removal methods are generally based on a location-scale approach; less widespread methods based on matrix factorization have also been proposed. Using breast cancer data, we investigate how those batch effect removal methods improve (or possibly degrade) the performance of simple classifiers. Our results indicate that the matrix factorization approach deserves greater attention, as it gives results at least as good as common location-scale methods, and even significantly better results in specific cases.
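
To make the location-scale idea concrete, here is a minimal sketch, assuming the simplest possible variant: for one gene, the values of each batch are centered on the batch mean and divided by the batch standard deviation. The function name `location_scale_adjust` and the toy data are hypothetical and are not taken from the paper, which compares more elaborate methods than this plain standardization.

```python
from statistics import mean, pstdev


def location_scale_adjust(values, batches):
    """Center and scale one gene's expression values within each batch.

    values  -- expression values of a single gene, one per sample
    batches -- batch label of each sample, same length as values
    """
    adjusted = [0.0] * len(values)
    for batch in set(batches):
        idx = [i for i, b in enumerate(batches) if b == batch]
        batch_values = [values[i] for i in idx]
        loc = mean(batch_values)               # per-batch location
        scale = pstdev(batch_values) or 1.0    # per-batch scale; constant batches are left unscaled
        for i in idx:
            adjusted[i] = (values[i] - loc) / scale
    return adjusted


# Toy data (made up): one gene measured in two batches that differ
# in both offset and spread.
values = [1.0, 2.0, 3.0, 11.0, 13.0, 15.0]
batches = ["A", "A", "A", "B", "B", "B"]
adjusted = location_scale_adjust(values, batches)
```

After adjustment, the two batches in this toy example share the same mean and spread, so the batch-specific offset no longer dominates the values. Note that this also removes any genuine biological difference that happens to coincide with the batch split, which is one reason such adjustments can degrade as well as improve downstream classification.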
