A comparative study of biomarker gene selection methods in presence of outliers

Authors

  • M Shahjaman Department of Statistics, Begum Rokeya University, Rangpur
  • N Kumar Department of Statistics, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj
  • AA Begum Bioinformatics Lab., Department of Statistics, University of Rajshahi, Rajshahi
  • SMS Islam Institute of Biological Sciences, University of Rajshahi, Rajshahi
  • MNH Mollah Bioinformatics Lab., Department of Statistics, University of Rajshahi, Rajshahi

DOI:

https://doi.org/10.3329/jbs.v25i0.37493

Keywords:

Biomarker genes, DE genes, FCROS, outliers, robustness

Abstract

The main purpose of gene expression data analysis is to identify biomarker genes by comparing gene expression levels between two different groups or conditions. Several methods exist for selecting biomarker genes, and many comparative studies have been performed to choose an appropriate method. However, those studies did not consider the problem of outliers in their data sets, although it is essential to select a method from the robustness point of view because outliers may occur at different steps of the gene expression data generating process. This paper evaluates the performance of five popular statistical biomarker gene selection methods, viz. T-test, SAM, LIMMA, KW and FCROS, using both simulated and real gene expression data sets in the absence and presence of outliers. In the simulated data analysis, the methods were compared using the performance measures TPR, TNR, FPR, FNR and AUC. Based on these measures, in the absence of outliers all the methods performed almost similarly for both small- and large-sample cases, whereas in the presence of outliers, for the small-sample case, the FCROS method performed better than the other methods. In a real colon cancer data analysis, the FCROS method identified 59 additional genes that were not detected by the other methods, most of which belong to different cancer-related pathways.

J. bio-sci. 25: 9-16, 2017
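
The performance measures named in the abstract (TPR, TNR, FPR, FNR and AUC) can be illustrated with a minimal sketch. It assumes each gene has a known true status (differentially expressed or not, as in a simulation), a score from some selection method where a higher score means stronger evidence, and a selection threshold. The function names, the toy data and the threshold of 0.5 are hypothetical and are not taken from the paper.

```python
def confusion_rates(truth, selected):
    """Return TPR, TNR, FPR and FNR for a binary gene selection.

    truth    : 1 if the gene is truly differentially expressed, else 0
    selected : 1 if the method selected the gene, else 0
    """
    pairs = list(zip(truth, selected))
    tp = sum(1 for t, s in pairs if t and s)
    tn = sum(1 for t, s in pairs if not t and not s)
    fp = sum(1 for t, s in pairs if not t and s)
    fn = sum(1 for t, s in pairs if t and not s)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true DE and one non-DE gene")
    return {
        "TPR": tp / (tp + fn),  # true positive rate (sensitivity)
        "TNR": tn / (tn + fp),  # true negative rate (specificity)
        "FPR": fp / (fp + tn),  # false positive rate = 1 - TNR
        "FNR": fn / (fn + tp),  # false negative rate = 1 - TPR
    }


def auc(truth, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen DE gene scores higher than a randomly chosen
    non-DE gene (ties count as one half)."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    if not pos or not neg:
        raise ValueError("need at least one true DE and one non-DE gene")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 3 true DE genes among 8 genes.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
selected = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold 0.5

rates = confusion_rates(truth, selected)
area = auc(truth, scores)
```

The four rates depend on the chosen threshold, while AUC summarizes the ranking over all thresholds, which is why comparative studies often report both.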

Published

2018-07-18

How to Cite

Shahjaman, M., Kumar, N., Begum, A., Islam, S., & Mollah, M. (2018). A comparative study of biomarker gene selection methods in presence of outliers. Journal of Bio-Science, 25, 9–16. https://doi.org/10.3329/jbs.v25i0.37493

Section

Articles