Identifying Risk Factors of Early Marriage among Women in Bangladesh Using Machine Learning Algorithms

Authors

  • Md Fahim, Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh
  • Papia Sultana, Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh
  • Md Rezaul Karim, Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh
  • Dulal Chandra Roy, Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh
  • Md Mahfuz Uddin, Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh

DOI:

https://doi.org/10.3329/ijss.v25i2.85769

Keywords:

Early marriage, Feature selection, Machine learning algorithms, Socio-economic and household decision making factors, Bangladesh

Abstract

Early marriage, defined as marriage before age 18, is a human rights violation with serious consequences for women’s health and well-being, and it remains a major public health issue, particularly in South Asia and Bangladesh. This study aims to identify the key socio-demographic and household decision-making risk factors associated with early marriage among women in Bangladesh using machine learning algorithms, and to evaluate the predictive performance of these models to inform policy formulation. The data come from the nationally representative Bangladesh Demographic and Health Survey (BDHS) 2022. Chi-square tests were used to assess associations between respondent characteristics and early marriage, and three feature selection methods (Boruta, LASSO, and Information Gain) were used to select relevant features. Eight machine learning algorithms (Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, XGBoost, AdaBoost, LogitBoost, and Neural Network) were evaluated using 5-fold cross-validation. Model performance was assessed by sensitivity, specificity, precision, accuracy, false discovery rate (FDR), and area under the ROC curve (AUC). The prevalence of early marriage was 67.02%. Across the significance tests and feature selection methods, Division, Wealth Index, Reading Newspaper, Religion, Residence, Household Purchases, and Age consistently emerged as the most influential predictors. Among all models, Decision Tree provided the best balanced performance on the testing set (sensitivity: 0.248, specificity: 0.896, precision: 0.540, accuracy: 0.682, AUC: 0.690), indicating its suitability for generalizable early marriage prediction. Feature importance analysis highlighted Wealth Index and Division as the primary drivers. These findings can help policymakers target interventions at the high-risk regions and socioeconomic groups that drive early marriage. Strengthening girls’ education, economic support, and community awareness can effectively reduce its prevalence in Bangladesh.
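
The threshold-based metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal Python sketch under the standard textbook definitions, not the authors' code; the function name `classification_metrics` and the counts are hypothetical and chosen only for illustration. AUC is omitted because it requires predicted scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs here).
    """
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "fdr": fp / (tp + fp),           # false discovery rate = 1 - precision
    }


# Hypothetical counts for illustration only; not taken from the study.
metrics = classification_metrics(tp=50, fp=25, tn=175, fn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions FDR is the complement of precision, so a precision of 0.540 corresponds to an FDR of 0.460.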

International Journal of Statistical Sciences, Vol. 25(2), November, 2025, pp 81-99

Published

2025-12-17

How to Cite

Fahim, M., Sultana, P., Karim, M. R., Roy, D. C., & Uddin, M. M. (2025). Identifying Risk Factors of Early Marriage among Women in Bangladesh Using Machine Learning Algorithms. International Journal of Statistical Sciences, 25(2), 81–99. https://doi.org/10.3329/ijss.v25i2.85769

Section

Original Articles