Forgive me if this is a basic question. I have not, however, found a solid explanation or answer in this or any other LETOR project.
Question: In every example I've found in this project, the relevance judgement for a given query:doc pair is a whole integer. That seems unlikely to me in a real-world scenario.
That is, if your relevance scores are computed from user behavior in an online scenario, wouldn't the final relevance score for a given query:doc pair be the average of all user behavior over some time period, and therefore usually a fraction rather than a whole number?
Example: If 50 users searched the same term and each clicked various results, wouldn’t the relevance score be the average of their click activity over a given time period grouped by the query:doc pair?
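To make the example concrete, here is a rough sketch of the averaging I have in mind. The click log, its column layout, and the function name `average_click_relevance` are all made up for illustration; this is not taken from any published data set or from this project.

```python
from collections import defaultdict

# Hypothetical click log: one row per (user, query, doc) impression.
# The last field is 1 if the user clicked the result, 0 otherwise.
click_log = [
    ("u1", "laptop", "doc_a", 1),
    ("u2", "laptop", "doc_a", 0),
    ("u3", "laptop", "doc_a", 1),
    ("u1", "laptop", "doc_b", 0),
    ("u2", "laptop", "doc_b", 0),
    ("u3", "laptop", "doc_b", 1),
]


def average_click_relevance(log):
    """Average the click indicator over all users, grouped by query:doc pair."""
    totals = defaultdict(lambda: [0, 0])  # (query, doc) -> [clicks, impressions]
    for _user, query, doc, clicked in log:
        totals[(query, doc)][0] += clicked
        totals[(query, doc)][1] += 1
    return {pair: clicks / shown for pair, (clicks, shown) in totals.items()}


relevance = average_click_relevance(click_log)
for (query, doc), score in sorted(relevance.items()):
    print(f"{query}:{doc} -> {score:.3f}")
```

The resulting scores are fractions between 0 and 1, which is exactly what I cannot reconcile with the whole-integer labels in the examples.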
Or is there a well-known practice in information retrieval that I am missing? Or would you instead treat the user as a feature and include one row for every user who searched or clicked in that period? I have never seen a user ID as a feature in the published data sets I've found.
Apologies again if this is painfully basic. Finding real-world explanations of the process for building such a system on XGBoost has proven difficult for me.
Thanks in advance for any guidance!