I believe there is a bug in Poisson regression in xgboost, and I have also checked other regression objectives such as gamma. My understanding is that, with objective = "count:poisson", the results should be the same whether eval_metric is "logloss" or "poisson-nloglik". However, the reported values are quite different. An example follows:
library("xgboost")
set.seed(15)
x <- matrix(rnorm(100*2),100,2)
g2 <- sample(c(0,1),100,replace=TRUE)
fit1 <- xgboost(data=x, label=g2, objective = "count:poisson", eval_metric="logloss", nrounds=10)
[1] train-logloss:0.604225
[2] train-logloss:0.533964
[3] train-logloss:0.477919
[4] train-logloss:0.430387
[5] train-logloss:0.390779
[6] train-logloss:0.367838
[7] train-logloss:0.348413
[8] train-logloss:0.330266
[9] train-logloss:0.315782
[10] train-logloss:0.302425
fit2 <- xgboost(data=x, label=g2, objective = "count:poisson", eval_metric="poisson-nloglik", nrounds=10)
[1] train-poisson-nloglik:0.781148
[2] train-poisson-nloglik:0.745825
[3] train-poisson-nloglik:0.717435
[4] train-poisson-nloglik:0.693129
[5] train-poisson-nloglik:0.672651
[6] train-poisson-nloglik:0.660460
[7] train-poisson-nloglik:0.650108
[8] train-poisson-nloglik:0.640582
[9] train-poisson-nloglik:0.633043
[10] train-poisson-nloglik:0.625980
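For reference, here is a sketch of the textbook definitions of the two metrics. This assumes xgboost implements them in this standard form, which I have not verified against the source code. The symbols are my own notation: y_i is the label (0 or 1 here), \hat{y}_i is the model prediction, and n is the number of rows.

```latex
\text{logloss} = -\frac{1}{n}\sum_{i=1}^{n}\Bigl[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \Bigr]

\text{poisson-nloglik} = \frac{1}{n}\sum_{i=1}^{n}\Bigl[\, \hat{y}_i - y_i \log \hat{y}_i + \log(y_i!) \Bigr]
```

With labels restricted to 0 and 1, the \log(y_i!) term is zero, but the remaining terms are still different functions of \hat{y}_i. If these definitions apply, the two columns of numbers would not be expected to match even for identical predictions, so comparing the predictions of fit1 and fit2 directly is one thing worth ruling out before attributing the gap to a bug.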
sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] xgboost_1.5.0.2
loaded via a namespace (and not attached):
[1] compiler_4.1.2 Matrix_1.4-0 grid_4.1.2 data.table_1.14.0
[5] jsonlite_1.7.2 lattice_0.20-41