Which training data combination is better when training a regression model, and why?


I collect daily data for every month and try to predict the results for a specific month.

But I found that different combinations of training data affect the predictions.

First, (1) I trained incrementally on non-overlapping 3-month blocks, e.g. Jan~Mar, Apr~Jun, etc. But I found that catastrophic forgetting happened.

When I trained up to Oct~Dec and then predicted Jan, none of Jan's patterns appeared in the prediction.

So (2) I tried another combination: training incrementally on 3-month windows that overlap by two months, e.g. Jan~Mar, Feb~Apr, …, Oct~Dec.

This time, the patterns of Jan did show up when I trained up to Oct~Dec and then predicted Jan.
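
To make the two schedules concrete, here is a minimal sketch of how the training windows could be generated. The helper name `month_windows` and its `size`/`step` parameters are my own illustration, not part of any library, and the sketch assumes that "overlapped two months" means a 3-month window that slides forward one month at a time.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def month_windows(size=3, step=3):
    """Return consecutive training windows of `size` months, advancing by `step` months.

    Hypothetical helper: step == size gives disjoint blocks,
    step < size gives overlapping windows.
    """
    last_start = len(MONTHS) - size
    return [MONTHS[i:i + size] for i in range(0, last_start + 1, step)]

# Scheme (1): disjoint blocks (Jan~Mar, Apr~Jun, Jul~Sep, Oct~Dec)
disjoint = month_windows(size=3, step=3)

# Scheme (2): windows overlapping by two months (Jan~Mar, Feb~Apr, ..., Oct~Dec)
overlapping = month_windows(size=3, step=1)

for name, windows in [("disjoint", disjoint), ("overlapping", overlapping)]:
    print(name, len(windows), "windows:",
          ", ".join(f"{w[0]}~{w[-1]}" for w in windows))
```

Under this assumption, scheme (1) has 4 training stages and scheme (2) has 10, and in both schemes Jan appears only in the first window.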

I thought the forgetting was caused by training for many iterations, so I expected (2) to forget as well.

But it didn't forget: it still remembers the patterns of the earlier data (Jan), even though it was trained for more iterations than (1).