The script has two losses: the squared loss `L_a = (y - F(x))^2`, and the same loss with a 0.5 factor, `L_b = 0.5*(y - F(x))^2`. Using `L_a` gives me trees with one split (even though `max_depth` is set to > 1), while using `L_b` results in trees with no splits at all. There should be no difference between the two, since they differ only by a constant factor.
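For reference, here is a quick standalone check (plain NumPy, toy values of my own, no XGBoost involved) confirming that the gradient and hessian of `L_b` are exactly half of those of `L_a`, i.e. the two losses really do differ only by a constant factor:

```python
import numpy as np

# Toy labels and predictions, purely for illustration.
y = np.array([1.0, 2.0, 3.0])
pred = np.array([0.5, 2.5, 2.0])

grad_a = -2 * (y - pred)        # d/dF of L_a = (y - F)^2
hess_a = np.full_like(y, 2.0)   # d^2/dF^2 of L_a

grad_b = -(y - pred)            # d/dF of L_b = 0.5*(y - F)^2
hess_b = np.full_like(y, 1.0)   # d^2/dF^2 of L_b

print(np.allclose(grad_b, grad_a / 2))  # True
print(np.allclose(hess_b, hess_a / 2))  # True
```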
I made sure to set `gamma = 0` (the minimum loss reduction required for a split) and `lambda = 0` (the L2 penalty), but I still get this behavior.
The `eval-rmse` and the `eval-error` reported in lines 202 and 203 don't match, even after taking the square root of `eval-error`. I tried this in both R and Python; both return the same result. Is there a parameter I'm omitting? I'm not sure how to approach this issue. Any help would be appreciated!
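For context, this is the comparison I'm making (toy numbers of my own, independent of the script below): with the custom metric defined as mean squared error, `eval-rmse` should equal the square root of `eval-error`:

```python
import numpy as np

# Hypothetical toy labels/predictions, just to illustrate the comparison.
labels = np.array([1.0, 0.0, 2.0, 1.5])
preds = np.array([0.8, 0.1, 1.7, 1.4])

mse = np.sum((labels - preds) ** 2) / len(labels)  # custom 'error' metric
rmse = np.sqrt(np.mean((labels - preds) ** 2))     # what rmse reports

print(np.isclose(np.sqrt(mse), rmse))  # True
```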
import numpy as np
import pandas as pd
import xgboost as xgb

x1 = np.random.uniform(0, 1, 10000)
x2 = np.random.uniform(0, 1, 10000)
x3 = np.random.uniform(0, 1, 10000)
y = 10 * x1 * x2 + np.random.normal(size=10000) * x2
x1_train = x1[:8000]
x2_train = x2[:8000]
y_train = y[:8000]
x1_test = x1[8000:]
x2_test = x2[8000:]
y_test = y[8000:]
# Stack the two features into (n, 2) arrays. Note that
# pd.DataFrame(x1_train, x2_train) treats the second argument as the
# index, and the test matrix was accidentally built from training features.
data_train = np.column_stack([x1_train, x2_train])
data_test = np.column_stack([x1_test, x2_test])
def logregobj(preds, dtrain):
    # Gradient and hessian of L_a = (y - F(x))^2, averaged over the batch.
    labels = dtrain.get_label()
    grad = -2 * (labels - preds) / len(labels)
    hess = np.full_like(labels, 2 / len(labels))
    return grad, hess
def evalerror(preds, dtrain):
    # Mean squared error; the inner parentheses matter:
    # sum(labels - preds)**2 would square the sum of residuals
    # instead of summing the squared residuals.
    labels = dtrain.get_label()
    return 'error', float(np.sum((labels - preds) ** 2) / len(labels))
def logregobj2(preds, dtrain):
    # Same as logregobj but for L_b = 0.5*(y - F(x))^2: everything is halved.
    labels = dtrain.get_label()
    grad = -(labels - preds) / len(labels)
    hess = np.full_like(labels, 1 / len(labels))
    return grad, hess
def evalerror2(preds, dtrain):
    # Half the mean squared error, matching L_b.
    labels = dtrain.get_label()
    return 'error', float(np.sum((labels - preds) ** 2) / len(labels)) / 2
dtrain = xgb.DMatrix(data_train, label=y_train)
dtest = xgb.DMatrix(data_test, label=y_test)
param = {'max_depth': 3, 'eta': 1, 'nthread': 3, 'verbosity': 2, 'lambda': 0, 'gamma': 0}
watchlist = [(dtest, 'eval'), (dtrain, 'train')]
num_round = 5
bst_a = xgb.train(param, dtrain, num_round, watchlist, logregobj, evalerror)
bst_b = xgb.train(param, dtrain, num_round, watchlist, logregobj2, evalerror2)