Forecasting: same learner on two copies of a data set

Hi,

I work mainly in time series and forecasting, and I’m currently thinking about using xgboost for a very specific problem:

I have a problem that requires a custom loss function. The loss depends not only on the prediction at the current timestamp but also on the prediction at the previous one, with a small penalty whenever the prediction changes from one step to the next.

This is equivalent to passing in two copies of the training (and other) data, one of them backshifted by one timestamp. We’d predict with the same learner on both data sets and apply the loss function, gradient and hessian asymmetrically. That last part is easy (given the previous ones); what’s not obvious to me is how to apply the learner to the current time step and the previous one separately.
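To make it concrete, here’s roughly the kind of thing I have in mind, written against xgboost’s custom-objective hook (a rough, untested sketch: the squared-error base loss, the squared change penalty and the weight LAM are all placeholders I made up, and I’m assuming rows are sorted in time order):

```python
import numpy as np
import xgboost as xgb

# Sketch, untested. Per the idea above (squared-error base is an assumption):
#   L = sum_t 0.5 * (f_t - y_t)^2  +  LAM * sum_{t>=1} (f_t - f_{t-1})^2
# Rows must be sorted in time order; LAM is a made-up penalty weight.
LAM = 0.1

def smooth_obj(preds, dtrain):
    y = dtrain.get_label()
    grad = preds - y                   # d/df_t of the base loss
    hess = np.ones_like(preds)
    # The penalty couples f_t to its neighbours; the "backshifted copy"
    # is just a shifted view of the same prediction vector.
    diff = np.diff(preds)              # f_t - f_{t-1}, length n-1
    grad[1:] += 2.0 * LAM * diff       # from the (f_t - f_{t-1})^2 term
    grad[:-1] -= 2.0 * LAM * diff      # from the (f_{t+1} - f_t)^2 term
    hess[1:] += 2.0 * LAM
    hess[:-1] += 2.0 * LAM
    # NB: xgboost only accepts a per-row (diagonal) Hessian, so the
    # cross-terms between neighbouring rows are dropped here.
    return grad, hess

# booster = xgb.train(params, dtrain, num_boost_round=100, obj=smooth_obj)
```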

I’ve only used xgboost a few times, but I’ve been coding for a long time. Rather than dig too deep into the documentation and code, though, I thought I’d ask whether anyone knows a way to do something like the above, as my time is rather constrained right now. Even a pointer on where to start would be a great help.

Thanks!

JQ

Edit: I’m guessing I need to extend xgboost.Booster / xgboost.XGBModel? Also, my bad, I should have made it clear: I’m working in Python, but I’m moderately comfortable with C/C++ if that’s necessary. I doubt it is, though.

Further edit: I got bored with what I was doing and started digging around in the code anyway. It looks like I might really need to get my hands dirty with C++ for this. That sucks. If anyone can point me to a better way, that would be awesome.

You will need to modify the C++ code, since XGBoost currently has no notion of temporal ordering between rows.

Have you looked into using neural networks to model sequences? Unlike XGBoost, neural networks let you explicitly model the relationship between the output at time t and the output at time (t+1). In addition, neural network frameworks provide automatic differentiation, so you won’t have to derive the derivatives of your custom loss function by hand. Given your limited time, I’d explore neural networks before trying to extend XGBoost in C++. You can find a tutorial at http://d2l.ai/chapter_recurrent-neural-networks/sequence.html
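For example, with autodiff the change penalty is just another term in the loss (a rough illustration, assuming PyTorch; the GRU architecture and the weight lam are placeholders, not recommendations):

```python
import torch
import torch.nn as nn

# Placeholder sequence model; any architecture that emits one
# prediction per time step would do.
class SeqModel(nn.Module):
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                # x: (batch, time, features)
        h, _ = self.rnn(x)
        return self.head(h).squeeze(-1)  # (batch, time)

def loss_fn(pred, target, lam=0.1):
    base = ((pred - target) ** 2).mean()
    # The change penalty couples t and t-1; autograd differentiates
    # it for you, no hand-derived gradient or hessian needed.
    change = ((pred[:, 1:] - pred[:, :-1]) ** 2).mean()
    return base + lam * change
```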

Yeah, that’s what I had thought after looking at the code yesterday.

Yes, I use neural networks regularly. Certain aspects of the problem made me want to try tree-based models and boosting as well. Whether that’s still worth it is an open question right now.

Thanks!