I’m having a similar problem, which is not readily fixed by setting the base score to zero for the model with the custom loss function. I create the same loss function, generate some dummy data and train on it. The results from the built-in objectives “binary:logistic” and “reg:logistic” are materially different from those obtained with the custom objective, no matter how I set the base score. Did I misunderstand something?
The script below reproduces the problem. I am aware that RMSE is not really the right metric here, but it shows the difference in behaviour very neatly. The difference is also noticeable in other metrics (e.g. AUC).
# Attempt to reproduce the log-loss objective
library(data.table)
library(xgboost)

# Custom objective function
logloss <- function(preds, dtrain){
  # Get the labels
  labels <- getinfo(dtrain, "label")
  # Apply the logistic transform to the raw predictions
  preds <- 1 / (1 + exp(-preds))
  # Gradient and Hessian of the log loss w.r.t. the raw score
  grad <- preds - labels
  hess <- preds * (1 - preds)
  return(list("grad" = grad, "hess" = hess))
}
# Generate test data
generate_test_data <- function(n_rows = 1e5, feature_count = 5){
  # Make targets (-1 or +1)
  test_data <- data.table(
    target = sign(runif(n = n_rows, min = -1, max = 1))
  )
  # Add feature columns. These are normally distributed and shifted by the
  # target in order to create a noisy signal
  for(feature in 1:feature_count){
    # Randomly draw the noise parameters
    mu <- runif(1, min = -1, max = 1)
    sdev <- runif(1, min = 5, max = 10)
    # Create the noisy signal
    test_data[, paste0("feature_", feature) := rnorm(
      n = n_rows, mean = mu, sd = sdev) * target + target]
  }
  # Make a vector of feature names
  feature_names <- paste0("feature_", 1:feature_count)
  # Wrap everything needed for training in a list
  split_data <- list(train = test_data)
  # Make the training matrix and labels
  split_data[["train_trix"]] <- as.matrix(split_data$train[, feature_names, with = FALSE])
  # Map the {-1, +1} targets to {FALSE, TRUE} labels
  split_data[["train_labels"]] <- as.logical(split_data$train$target + 1)
  return(split_data)
}
# Build the model
build_model <- function(split_data, objective, params = list()){
  # Make the training matrix
  train_dtrix <- xgb.DMatrix(
    data = split_data$train_trix, label = split_data$train_labels)
  # Train the model, evaluating on the training set
  model <- xgb.train(
    data = train_dtrix,
    watchlist = list(train = train_dtrix),
    nrounds = 5,
    objective = objective,
    eval_metric = "rmse",
    params = params
  )
  return(model)
}
split_data <- generate_test_data()
cat("\nUsing built-in binary:logistic objective.\n")
test_1 <- build_model(split_data, "binary:logistic")
cat("\nUsing built-in reg:logistic objective.\n")
test_2 <- build_model(split_data, "reg:logistic")
cat("\n\nUsing custom objective\n")
test_3 <- build_model(split_data, logloss, params = list(base_score = 0.0))
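To rule out a mistake in the derivatives themselves, here is a quick finite-difference check of the gradient and Hessian used in `logloss`. This is a standalone sketch: it works on plain numeric vectors rather than an `xgb.DMatrix`, with the logistic transform and the loss written out directly.

```r
# Finite-difference sanity check of the log-loss gradient and Hessian
sigmoid <- function(x) 1 / (1 + exp(-x))
# Negative log likelihood of a raw score x against a 0/1 label y
nll <- function(x, y) -(y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x)))

set.seed(1)
x <- rnorm(5)            # raw scores (margins)
y <- rbinom(5, 1, 0.5)   # 0/1 labels
eps <- 1e-5

# Analytic derivatives, as in the custom objective
grad <- sigmoid(x) - y
hess <- sigmoid(x) * (1 - sigmoid(x))

# Central finite differences
grad_fd <- (nll(x + eps, y) - nll(x - eps, y)) / (2 * eps)
hess_fd <- (nll(x + eps, y) - 2 * nll(x, y) + nll(x - eps, y)) / eps^2

max(abs(grad - grad_fd))  # tiny
max(abs(hess - hess_fd))  # tiny
```

Both differences come out at numerical-noise level for me, so the gradient and Hessian formulas themselves seem fine.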
This produces the following output:
Using built-in binary:logistic objective.
[1] train-rmse:0.476833
[2] train-rmse:0.463433
[3] train-rmse:0.455049
[4] train-rmse:0.449588
[5] train-rmse:0.446047
Using built-in reg:logistic objective.
[1] train-rmse:0.476833
[2] train-rmse:0.463433
[3] train-rmse:0.455049
[4] train-rmse:0.449588
[5] train-rmse:0.446047
Using custom objective
[1] train-rmse:0.481920
[2] train-rmse:0.554571
[3] train-rmse:0.641242
[4] train-rmse:0.719437
[5] train-rmse:0.784012
I would have assumed that the custom objective produces an output pretty close to that observed for reg:logistic and binary:logistic.
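For what it’s worth, my current suspicion (an assumption on my part, not something I have confirmed in the xgboost source) is that two things differ for the custom objective: `base_score` is interpreted as a raw margin rather than being passed through the link function, and the predictions handed to `eval_metric` are raw margins rather than probabilities. A self-contained sketch of why RMSE computed on raw margins would grow during training:

```r
sigmoid <- function(x) 1 / (1 + exp(-x))
logit <- function(p) log(p / (1 - p))

# For the built-in objectives, base_score is a probability that maps to an
# initial raw margin through the link function, so the default 0.5 starts
# the booster at margin 0.
logit(0.5)  # 0

# As the model grows more confident in a positive example, the raw margin
# increases without bound, while the transformed probability stays in (0, 1).
label <- 1
margins <- c(1, 2, 3, 4)          # confidence growing over boosting rounds
abs(margins - label)              # error on raw margins keeps growing
abs(sigmoid(margins) - label)     # error on probabilities keeps shrinking
```

That would explain the pattern above: the custom model may be training correctly while the reported train-rmse rises, because the metric is being fed untransformed scores.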