Hi again.
I have another SSCCE related to XGBoost. Below are two ways of obtaining it:
- Just browse to https://pastebin.com/fsKctYtb and follow the link there.
- Browse to https://pastebin.com/gkL2eKXF and extract it as follows:
  - Download the data at the pastebin URL
  - echo >> 5526giJW.txt
  - uudecode 5526giJW.txt
  - tar tvJf RM-454-SSCCE-5.tar.xz
  - tar xvJf RM-454-SSCCE-5.tar.xz
BTW, I tried to post a link to my website with this stuff, but the forum told me I was a spammer and to go away.
Unfortunately, this SSCCE is not entirely predictable. The majority of the time it works fine, but in some smallish percentage of runs it produces NaNs that prevent it from working correctly - despite being given the same inputs each time at the CPython level.
Consequently the SSCCE runs its replicate() function a number of times, and keeps track of what percentage of the runs completed successfully. Spoiler: I don’t think it’s ever 100% unless you request a pretty small number of runs. 30 or 100 are generally enough to get some bad runs - the script is currently hardcoded to just use 30.
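In rough outline, the bookkeeping looks like the sketch below. This is a simplified illustration, not the actual SSCCE code: `success_rate` is a made-up name, and it assumes `replicate()` returns a flat sequence of floats in which any NaN marks a bad run.

```python
import math


def success_rate(replicate, runs=30):
    """Call replicate() `runs` times and return the percentage of clean runs.

    Assumption: replicate() returns an iterable of floats, and a run counts
    as failed if any of those values is NaN.
    """
    ok = 0
    for _ in range(runs):
        result = replicate()
        if not any(math.isnan(value) for value in result):
            ok += 1
    return 100.0 * ok / runs
```

Calling `success_rate(replicate)` with the default of 30 runs matches the hardcoded count mentioned above.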
Any suggestions? Are we feeding it something bogus? And is there a way of holding the seeds of the random number generators constant at the numpy and/or C++ level, like I’ve already done at the CPython level?
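For concreteness, here is a minimal sketch of what pinning the seeds at each level might look like. It is not taken from the SSCCE: `pin_seeds` is a hypothetical helper, numpy is treated as optional, and the returned dict only carries XGBoost's `seed` training parameter. Whether any of this actually removes the NaNs is untested.

```python
import random


def pin_seeds(seed):
    """Seed the generators we can reach from Python; return XGBoost params.

    Hypothetical helper for illustration only.
    """
    # CPython-level generator.
    random.seed(seed)

    # numpy's legacy global generator, if numpy is installed.
    try:
        import numpy as np
    except ImportError:
        np = None
    if np is not None:
        np.random.seed(seed)

    # XGBoost takes its own seed as a training parameter; merge this
    # dict into the params passed to training.
    return {"seed": seed}
```

The returned dict would be merged into whatever parameters the script already passes to XGBoost.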
Thanks!
PS: It has problems about 17% of the time.