Dear All,

As far as the production of a huge amount of relevance assessments is
concerned, you could have a look at the TREC Million Query Track.

As far as the production of a test collection in an interactive way is
concerned, you could look at:

Cormack et al., "Efficient construction of large test
collections", SIGIR 1998,

Sanderson & Joho, "Forming test collections with no system pooling",
SIGIR 2004,

With regard to the creation of pools (and sampling of collections)
targeted towards a specific metric, you could have a look at:

Aslam et al., "A statistical method for system evaluation using
incomplete judgments", SIGIR 2006,

Finally, a system that may be of interest to you is DIRECT (Distributed
Information Retrieval Evaluation Campaign Tool), which we have built
for managing the CLEF evaluation campaigns. Among other things, it
allows for interactive topic creation by searching in document
collections (by the way, we use Lucene to do this) and interactive
relevance assessments. You can find some information about DIRECT at:

All the best,
Nicola Ferro

Nicola Ferro - Ph.D. in Computer Science
Assistant Professor

Department of Information Engineering (DEI)
University of Padua
Via Gradenigo, 6/A - 35131 Padova - Italy
Tel +39 049 827 7939 Fax: +39 049 827 7799

skype: nicola.ferro
home page:


Discussion Overview
group: openrelevance-dev @
posted: Aug 12, '09 at 5:04p
active: Aug 12, '09 at 5:04p
