Debias random forest regression predictors

Authors

  • Lihua Chen, Department of Mathematics and Statistics, James Madison University, Harrisonburg, VA, USA
  • Prabhashi Withana Gamage, Department of Mathematics and Statistics, James Madison University, Harrisonburg, VA, USA
  • John Ryan, Department of Economics, University of Wisconsin-Madison, Madison, WI, USA

DOI:

https://doi.org/10.3329/jsr.v56i2.67466

Keywords:

Random forest, Regression predictor, Bias correction, Univariate smoothing, Boosted forest

Abstract

The random forest can reduce the variance of regression predictors through bagging while leaving the bias mostly unchanged. In general, the bias is not negligible, and consequently bias correction is necessary. The default bias correction method implemented in the R package randomForest often works poorly. Several approaches have been developed that generally outperform the R default. However, little work has been done to comprehensively evaluate the performance of these methods and thus guide users in selecting an appropriate method for bias correction. This paper fills this gap by providing an informative ranking of these bias correction methods based on an extensive numerical study. We further offer practical suggestions on applying the best-performing method and suggest a visualization technique to help users decide when bias correction is needed.

Journal of Statistical Research 2022, Vol. 56, No. 2, pp. 115-131


Published

2023-07-09

How to Cite

Chen, L., Gamage, P. W., & Ryan, J. (2023). Debias random forest regression predictors. Journal of Statistical Research, 56(2), 115–131. https://doi.org/10.3329/jsr.v56i2.67466

Section

Articles