Clustering gene expression time series data embedded in a nonparametric setup
Keywords: Dirichlet process; Monte Carlo EM algorithm; Mixed effect model; Autoregressive process
A clustering methodology for time series data is proposed. The idea arose when a subset of a gene expression dataset was used to build a system model by compressing the information through clustering and then tracing out inherent patterns in the data. A linear mixed model is considered that accommodates time-dependent components. The temporal effects are modelled through an autoregressive process that arises in the dispersion of the random component. The joint distribution of the coefficients of the time-dependent quadratic function and the random effects is embedded within a nonparametric prior (a Dirichlet process prior). Such a nonparametric prior induces clustering in the data. A Monte Carlo EM (MCEM) technique is used to estimate the parameters. The best cluster is selected through heterogeneity measures. A rigorous simulation study was carried out before analysing a gene expression time series dataset.
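As a rough illustration of the kind of data-generating structure the abstract describes, the sketch below simulates gene profiles whose mean is a quadratic function of time, with cluster labels drawn from a Chinese restaurant process (the partition induced by a Dirichlet process prior) and temporally correlated errors. It is not the paper's model or its MCEM estimator. The AR(1) order, the normal base distribution for the coefficients, the parameter values, and all function names (`crp_assignments`, `ar1_covariance`, `simulate`) are assumptions made for illustration, and the autoregressive dependence is folded into a single error term for simplicity.

```python
import numpy as np


def crp_assignments(n, alpha, rng):
    """Draw cluster labels for n items from a Chinese restaurant process."""
    labels = np.zeros(n, dtype=int)
    counts = [1]  # the first item opens the first cluster
    for i in range(1, n):
        # Existing clusters are weighted by size, a new cluster by alpha.
        weights = np.array(counts + [alpha], dtype=float)
        k = rng.choice(len(weights), p=weights / weights.sum())
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels[i] = k
    return labels


def ar1_covariance(n_times, sigma2, rho):
    """Covariance matrix with entries sigma2 * rho**|s - t| (assumed AR(1))."""
    idx = np.arange(n_times)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])


def simulate(n_genes=30, n_times=8, alpha=1.0, sigma2=0.25, rho=0.6, seed=0):
    """Simulate quadratic-in-time profiles with CRP clusters and AR(1) errors."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_times)
    design = np.column_stack([np.ones_like(t), t, t ** 2])  # 1, t, t^2
    labels = crp_assignments(n_genes, alpha, rng)
    n_clusters = labels.max() + 1
    # One (intercept, slope, curvature) triple per cluster; the normal
    # base distribution here is an illustrative assumption.
    coefs = rng.normal(0.0, 2.0, size=(n_clusters, 3))
    cov = ar1_covariance(n_times, sigma2, rho)
    noise = rng.multivariate_normal(np.zeros(n_times), cov, size=n_genes)
    y = coefs[labels] @ design.T + noise
    return y, labels, coefs


y, labels, coefs = simulate()
```

Here `y` has one row per gene and one column per time point. Larger values of `alpha` tend to produce more clusters, which is how the Dirichlet process prior lets the number of clusters be driven by the data rather than fixed in advance.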
Journal of Statistical Research 2021, Vol. 55, No. 1, pp. 207-224
Copyright (c) 2021 Journal of Statistical Research
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.