Understanding training error and validation error

I have a toy binary classification problem with xgboost, and I would like to understand how the training error and validation error are calculated. I would have thought I could use the model to predict on the training set, and then the training error would be the mean of y*log(p) + (1-y)*log(1-p), while the validation error would be mean((y-p)^2). Is that wrong? In my toy example, none of these numbers match. The training output ends with train-error:0.022000 val-error:0.373000, but:

mean(y*log(pred.train) + (1-y)*log(1-pred.train))
[1] -1.258822

[1] 0.3731826
[1] 0.2698876
mean(y*log(pred.val) + (1-y)*log(1-pred.val))
[1] -0.8609891

The code is below. Many thanks.


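library(xgboost)

N <- 1000

# Simulate N rows: three uniform features and a binary label whose
# success probability is the rescaled sum of the squared features.
getData <- function(N) {
  x1 <- runif(N)
  x2 <- runif(N)
  x3 <- runif(N)
  z <- x1^2 + x2^2 + x3^2
  z <- (z - min(z)) / (max(z) - min(z))
  y <- rbinom(N, size = 1, prob = z)
  # Column 1 is the label, columns 2 to 4 are the features
  as.matrix(cbind(y, x1, x2, x3))
}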

X <- getData(N)
X.val <- getData(N)

xgtrain <- xgb.DMatrix(X[, -c(1)],
label = X[, 1])
xgval <- xgb.DMatrix(X.val[, -c(1)],
label = X.val[, 1])

watchlist = list(train = xgtrain,val=xgval)
param <- list(max_depth = 2, eta = 0.3, nthread = 2, gamma = 0, min_child_weight = 1,
objective = "binary:logistic", eval_metric = "error", subsample = 1, colsample_bytree = 1)
m <- xgb.train(param, xgtrain, nrounds = 1000, watchlist = watchlist, verbose = TRUE)

pred.train <- predict(m, X[,-1])
pred.val <- predict(m, X.val[,-1])

y <- X[,1]
mean(y*log(pred.train) + (1-y)*log(1-pred.train))

y <- X.val[,1]
mean(y*log(pred.val) + (1-y)*log(1-pred.val))

The error is calculated as follows:

sum(y == (p >= 0.5))

i.e. the fraction of the data points whose class prediction matches the true label.

Thank you, that is useful to know. So this is not the value of the loss function?

For the record, that would be the accuracy, so the error printed would be:

(1/N) * sum(y == (p >= 0.5))
Many thanks for your help, also for your answer to my other question.

Sorry, I made a mistake in my earlier post. The error is 1 - accuracy, so the error is

(1/N) * sum(y != (p >= 0.5))

Yes indeed, I had meant to write:

(1/N) * sum(y == (p < 0.5))
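
To check that the two ways of writing the error above agree, here is a minimal sketch in base R. The labels and probabilities are made-up values for illustration only, not output from the model in this thread, and the variable names are hypothetical.

# Made-up labels and predicted probabilities (illustration only)
y <- c(1, 0, 1, 1, 0, 0)
p <- c(0.9, 0.2, 0.4, 0.7, 0.6, 0.1)

# Class prediction: 1 when the probability is at least 0.5
pred_class <- as.numeric(p >= 0.5)

accuracy <- mean(y == pred_class)   # fraction of correct predictions
error_a  <- mean(y != pred_class)   # fraction of wrong predictions
error_b  <- mean(y == (p < 0.5))    # same quantity, written the other way

print(c(accuracy = accuracy, error_a = error_a, error_b = error_b))

Both error_a and error_b count the rows where the thresholded prediction disagrees with the label, so they are equal and sum to 1 with the accuracy.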

So the error printed is determined by eval_metric. With eval_metric = "error" it is computed as written here, but with eval_metric = "auc" it prints the AUC instead. Is the following true or false? Whatever I set eval_metric to, the loss function minimised (with objective = "binary:logistic") is the binary cross-entropy, but eval_metric determines the early stopping, and it also determines what is printed as train-error, which may not be what is actually minimised.

Yes, that’s right. The evaluation metric is not necessarily the same as the loss function that’s being minimized.
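
To illustrate why the two can disagree, here is a small base R sketch with made-up predictions (the helper names logloss and class_error are hypothetical, not xgboost functions). One set of predictions has the lower classification error, while the other has the lower cross-entropy.

# Binary cross-entropy (log loss); note the leading minus sign
logloss <- function(y, p) -mean(y * log(p) + (1 - y) * log(1 - p))

# Classification error with a 0.5 threshold
class_error <- function(y, p) mean(y != (p >= 0.5))

y   <- c(1, 1, 0, 0)
p_a <- c(0.51, 0.51, 0.49, 0.49)  # every row classified correctly, but barely
p_b <- c(0.99, 0.99, 0.01, 0.55)  # three confident hits, one narrow miss

print(c(error_a = class_error(y, p_a), logloss_a = logloss(y, p_a)))
print(c(error_b = class_error(y, p_b), logloss_b = logloss(y, p_b)))

Here p_a wins on error and p_b wins on log loss, so a model that minimises cross-entropy need not minimise the printed error. If the goal is to see the loss itself in the training output, setting eval_metric = "logloss" should print the cross-entropy instead of the error.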