Sequential and robust data selection in active learning for classification

Authors

  • Xiaojian Xu, Department of Mathematics, Brock University, 1812 Sir Isaac Brock Way, St. Catharines, ON, Canada L2S 3A1
  • Charlie Shay, Sprout Studio, 110 James St, St. Catharines, ON, Canada L2R 7E8

Keywords:

active learning; passive learning; robust design; logistic regression; Fisher’s linear discriminant; heteroscedasticity.

Abstract

Active learning has become a popular learning process for classification. By selecting the most beneficial training data, an active classifier achieves better classification accuracy than a passive classifier. In this paper, we first investigate methods of robustifying optimal active learning processes, either through a sequential approach or by accounting for the possibility that the classifier is developed from a misspecified model. A comparison study of the classifiers obtained by two-stage learning and by the proposed sequential learning indicates that the sequential method generally outperforms its competitor. We then analyze the sensitivities of three different classifiers (the linear discriminant classifier, the quadratic discriminant classifier, and the logistic regression classifier) in active learning for classification. Our analysis reveals that the logistic regression classifier is sensitive to misspecification of the assumed logistic model, whereas the linear discriminant classifier is relatively robust to moderate violations of the assumed homoscedasticity.

Journal of Statistical Research 2021, Vol. 55, No. 1, pp. 249-266

Published

2021-12-09

How to Cite

Xu, X., & Shay, C. (2021). Sequential and robust data selection in active learning for classification. Journal of Statistical Research, 55(1), 249–266. Retrieved from https://banglajol.info/index.php/JStR/article/view/56591
