The collection comprises the relevance judgments used in the 2023 LongEval Information Retrieval Lab (https://clef-longeval.github.io/), organized at CLEF. It consists of three sets of relevance judgments:
1) Relevance judgments for the heldout queries from the LongEval Train Collection (http://hdl.handle.net/11234/1-5010).
2) Relevance judgments for the short-term persistence (sub-task A) queries from the LongEval Test Collection (http://hdl.handle.net/11234/1-5139).
3) Relevance judgments for the long-term persistence (sub-task B) queries from the LongEval Test Collection (http://hdl.handle.net/11234/1-5139).
These judgments were provided by the Qwant search engine (https://www.qwant.com) and were generated using a click model. The click model's output is based on the clicks of Qwant's users, but compared with raw clicks it reduces the noise caused by positional bias and better safeguards users' privacy. Consequently, it can serve as a reliable soft relevance estimate for evaluating and training models.
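To illustrate what correcting for positional bias can mean, the following is a minimal sketch of a position-based estimate, in which each click is weighted against how likely the result was to be examined at its rank. It is not the click model used by Qwant, which this description does not specify. The examination probabilities, the function name `estimate_attractiveness`, and the toy log are all hypothetical.

```python
from collections import defaultdict

# Hypothetical probabilities that a user examines a result at each rank.
# These values are assumed for illustration and do not come from Qwant.
EXAMINATION = {1: 1.0, 2: 0.6, 3: 0.4}


def estimate_attractiveness(impressions):
    """Estimate per-document attractiveness from (doc_id, rank, clicked) tuples.

    Under a position-based assumption, P(click) = P(examined at rank) * P(attractive),
    so clicks are divided by the summed examination probability rather than by
    the raw number of impressions.
    """
    clicks = defaultdict(float)
    exposure = defaultdict(float)
    for doc_id, rank, clicked in impressions:
        exposure[doc_id] += EXAMINATION[rank]
        if clicked:
            clicks[doc_id] += 1.0
    # Cap at 1.0 so the estimate stays a valid probability.
    return {doc: min(1.0, clicks[doc] / exposure[doc]) for doc in exposure}


# Toy log: both documents have the same raw click-through rate (1 click in 4
# impressions), but document "b" was always shown at rank 3.
toy_log = (
    [("a", 1, True)] + [("a", 1, False)] * 3
    + [("b", 3, True)] + [("b", 3, False)] * 3
)
estimates = estimate_attractiveness(toy_log)
print(estimates)
```

In this toy log the document shown lower in the ranking receives a higher estimate than the one shown at the top, even though their raw click-through rates are identical.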
The collection includes a total of 1,420 judgments for the heldout queries, with 74 considered highly relevant and 326 deemed relevant. For the short-term sub-task queries, there are 12,217 judgments, including 762 highly relevant and 2,608 relevant ones. As for the long-term sub-task queries, there are 13,467 judgments, with 936 being highly relevant and 2,899 relevant.
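The figures above can be cross-checked with a short calculation. The sketch below uses only the counts stated in this description and derives, for each set, the number of judgments that are neither highly relevant nor relevant, and the share that are relevant or better. How those remaining judgments are labeled is not stated here; treating them as a single "remaining" group is an assumption, as are the dictionary keys and the function name `summarize`.

```python
# Counts as stated in the description: (total, highly relevant, relevant).
COUNTS = {
    "heldout": (1420, 74, 326),
    "short-term": (12217, 762, 2608),
    "long-term": (13467, 936, 2899),
}


def summarize(counts):
    """Derive the remaining judgments and the relevant-or-better share per set."""
    summary = {}
    for name, (total, high, relevant) in counts.items():
        summary[name] = {
            # Judgments that are neither highly relevant nor relevant.
            "remaining": total - high - relevant,
            # Fraction of judgments that are relevant or highly relevant.
            "positive_share": (high + relevant) / total,
        }
    return summary


summary = summarize(COUNTS)
total_judgments = sum(total for total, _, _ in COUNTS.values())

for name, row in summary.items():
    print(f"{name}: {row['remaining']} remaining, "
          f"{row['positive_share']:.1%} relevant or better")
print(f"all sets: {total_judgments} judgments")
```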