Clustering gene expression time series data embedded in a nonparametric setup

Mukti Khetan; Savita Pareek; Siuli Mukhopadhyay; Kalyan Das

Authors

Mukti Khetan Department of Mathematics, Indian Institute of Technology Bombay, Mumbai 400 076, India
Savita Pareek Department of Mathematics, Indian Institute of Technology Bombay, Mumbai 400 076, India
Siuli Mukhopadhyay Department of Mathematics, Indian Institute of Technology Bombay, Mumbai 400 076, India
Kalyan Das Department of Mathematics, Indian Institute of Technology Bombay, Mumbai 400 076, India

Keywords:

Dirichlet Process; Monte Carlo EM algorithm; Mixed effect model; Autoregressive process

Abstract

A clustering methodology for time series data is proposed. The idea arose when a subset of a gene expression dataset was used to build a system model by compressing the information through clustering and then tracing out inherent patterns in the data. A linear mixed model that accommodates time-dependent components is considered. The temporal effects are modelled through an autoregressive process that arises in the dispersion of the random component. The joint distribution of the coefficients of the time-dependent quadratic function and the random effects is embedded within a nonparametric prior (a Dirichlet process prior). Such a nonparametric prior induces clustering in the data. A Monte Carlo EM (MCEM) based technique is used to estimate the parameters. The best clustering is selected using heterogeneity measures. A rigorous simulation study is carried out before the analysis of a gene expression time series dataset.

Journal of Statistical Research 2021, Vol. 55, No. 1, pp. 207-224
