[jvm-packages] xgboost4j doesn't learn on linux for eta < 1

AtR1an · November 20, 2018, 12:08pm

Hi,

I encountered odd behaviour in xgboost4j on Linux (Ubuntu 17.10).
If I set eta to a value smaller than 1.0, e.g. 0.3 (the default listed in the documentation), the resulting model does not seem to have learned anything: with the objective multi:softprob it outputs the same probabilities for all inputs.
This happens with both versions 0.72 and 0.81.
Has anyone else encountered this issue, or can anyone tell me how to avoid it?

Thank you very much.

Best,

AtR1an

AtR1an · November 21, 2018, 2:29pm

Here is a minimal example with which I can reproduce the issue:

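import java.util.HashMap;
import java.util.Map;

import ml.dmlc.xgboost4j.java.Booster;
import ml.dmlc.xgboost4j.java.DMatrix;
import ml.dmlc.xgboost4j.java.XGBoost;
import ml.dmlc.xgboost4j.java.XGBoostError;

try {
	// libsvmFile is the path to a training file in LIBSVM format
	final DMatrix dmat = new DMatrix(libsvmFile);
	final Map<String, Object> params = new HashMap<>();
	params.put("objective", "multi:softprob");
	params.put("num_class", 2);
	// any eta below 1.0 triggers the issue
	params.put("eta", 0.3);
	final Map<String, DMatrix> watches = new HashMap<>();
	watches.put("train", dmat);
	final int nround = 100;
	final Booster booster = XGBoost.train(dmat, params, nround, watches, null, null);
} catch (final XGBoostError e) {
	e.printStackTrace();
}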

Note that the same issue appears if we use a float or string for eta.
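
To check the symptom programmatically rather than by eye, something like the following could be run on the output of booster.predict(dmat), which returns a float[][] with one row of class probabilities per input. This is only a sketch: the class PredictionCheck and its method are made-up names for illustration and not part of xgboost4j, and the tolerance is an arbitrary choice.

```java
/** Hypothetical helper for illustration; not part of xgboost4j. */
final class PredictionCheck {

    private PredictionCheck() {
    }

    /**
     * Returns true if every row of the prediction matrix equals the first
     * row within the given absolute tolerance, i.e. the model outputs the
     * same probabilities for all inputs.
     */
    static boolean allRowsIdentical(final float[][] predictions, final float tolerance) {
        if (predictions.length == 0) {
            return true;
        }
        final float[] first = predictions[0];
        for (final float[] row : predictions) {
            if (row.length != first.length) {
                return false;
            }
            for (int j = 0; j < row.length; j++) {
                if (Math.abs(row[j] - first[j]) > tolerance) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

With the example above, PredictionCheck.allRowsIdentical(booster.predict(dmat), 1e-6f) would return true whenever the issue occurs.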

Best,

AtR1an

hcho3 · November 21, 2018, 7:33pm

Can you post the data too?

AtR1an · November 23, 2018, 10:38am

The file I used is a bit too large to share, but this happens with any data I use.
Here is a small snippet that creates a dummy matrix for which I also see the issue:

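import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import ml.dmlc.xgboost4j.LabeledPoint;
import ml.dmlc.xgboost4j.java.DMatrix;
import ml.dmlc.xgboost4j.java.XGBoostError;

// Builds 100 rows with a single Gaussian feature: class 0 is shifted by +0.5, class 1 by -0.5.
private static DMatrix createLabeledPointMatrix() throws XGBoostError {
	final Random random = new Random();
	final List<LabeledPoint> train = new ArrayList<>(100);
	for (int i = 0; i < 100; i++) {
		final float label = i < 50 ? 0 : 1;
		float feature = (float) random.nextGaussian();
		if (i < 50) {
			feature += 0.5f;
		} else {
			feature -= 0.5f;
		}
		train.add(new LabeledPoint(label, new int[] { 0 }, new float[] { feature }));
	}
	return new DMatrix(train.iterator(), null);
}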
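
Since the snippet above uses an unseeded Random, every run trains on different values. For comparing runs across machines, a seeded variant that produces the same labels and features as plain arrays might be easier to work with. This is only a sketch: the class DummyData and the seed parameter are assumptions for illustration, and the arrays would still have to be wrapped in LabeledPoint objects as in the snippet above.

```java
import java.util.Random;

/** Hypothetical, xgboost-free generator for the same dummy data with a fixed seed. */
final class DummyData {

    static final int NUM_ROWS = 100;

    private DummyData() {
    }

    /** First half of the rows gets label 0, second half label 1. */
    static float[] labels() {
        final float[] labels = new float[NUM_ROWS];
        for (int i = 0; i < NUM_ROWS; i++) {
            labels[i] = i < NUM_ROWS / 2 ? 0f : 1f;
        }
        return labels;
    }

    /** One Gaussian feature per row, shifted by +0.5 for class 0 and -0.5 for class 1. */
    static float[] features(final long seed) {
        final Random random = new Random(seed);
        final float[] features = new float[NUM_ROWS];
        for (int i = 0; i < NUM_ROWS; i++) {
            final float shift = i < NUM_ROWS / 2 ? 0.5f : -0.5f;
            features[i] = (float) random.nextGaussian() + shift;
        }
        return features;
    }
}
```

Calling DummyData.features with the same seed twice yields identical arrays, so two runs can be compared on exactly the same input.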

Best,

AtR1an

dietzc · November 27, 2018, 5:35pm

@hcho3 Do you have any idea what is going on here? This is kind of a blocker for us at the moment.