Hello,

I would like to know if there is interest in trying some experiments on Mechanical Turk for the OpenRelevance project. I've run TREC and INEX experiments on MTurk, and it is a good platform for relevance experiments.

Regards,

Omar

  • Grant Ingersoll at Oct 16, 2009 at 10:38 pm
    Hi Omar,

    It sounds interesting. Can you elaborate on what you had in mind?

    A few questions come to mind:

    1. Cost associated w/ Turk.
    2. What dataset would you use?

    -Grant

  • Omar Alonso at Oct 17, 2009 at 12:30 am
    Sure.

    1- We can start by paying between 2 and 5 cents per document/query pair (or document/topic) on a short data set (say 200 docs). That should be on the order of $25 (assuming 2 cents per judgment, 5 turkers per assignment, plus the Amazon fee).
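
    For concreteness, the arithmetic behind that estimate as a small
    Python sketch (the 10% Amazon commission is an assumption, not a
    figure from this thread):

        # Back-of-the-envelope cost for the pilot described above.
        docs = 200        # document/query pairs in the pilot set
        reward = 0.02     # dollars paid per judgment
        workers = 5       # turkers per assignment (redundant judgments)
        amzn_fee = 0.10   # Amazon's commission, assumed to be 10%

        total = docs * reward * workers * (1 + amzn_fee)
        print("$%.2f" % total)  # -> $22.00, i.e. on the order of $25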

    It also depends on how many experiments one would like to run. My suggestion would be to run 2 or 3 experiments with some small data sets for, say, $100 to see what kind of response we get back, and then think about something at a larger scale.

    I have some tips on how to run crowdsourcing for relevance evaluation here: http://wwwcsif.cs.ucdavis.edu/~alonsoom/ExperimentDesign.pdf

    2- If the goal is to have everything open source (gold set + relevance judgments), we need to produce a new data set from scratch. Also, what is the goal here? What is the domain? Enterprise search? Ad-hoc retrieval?

    In summary, I would start with something small (English only, Creative Commons or Wikipedia). Build a few experiments and see the results. Then expand the data sets and also go multilingual.
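
    Since each assignment above is judged by five turkers, the raw
    responses have to be collapsed into one label per document/query
    pair. Majority voting is the simplest common choice; a minimal
    sketch (the tie-breaking rule here is an assumption):

        from collections import Counter

        def majority_label(judgments):
            """Aggregate redundant worker judgments for one doc/query
            pair by majority vote; ties fall to the lower (more
            conservative) relevance label."""
            counts = Counter(judgments)
            top = max(counts.values())
            return min(label for label, n in counts.items() if n == top)

        # Five turkers judged the same pair on a binary scale (1 = relevant).
        print(majority_label([1, 1, 0, 1, 0]))  # -> 1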

    o.

  • Grant Ingersoll at Oct 21, 2009 at 1:16 pm

    On Oct 16, 2009, at 8:30 PM, Omar Alonso wrote:

    > Sure.
    >
    > 1- We can start by paying between 2 and 5 cents per document/query
    > pair (or document/topic) on a short data set (say 200 docs). That
    > should be on the order of $25 (assuming 2 cents per judgment, 5
    > turkers per assignment, plus the Amazon fee).
    >
    > It also depends on how many experiments one would like to run. My
    > suggestion would be to run 2 or 3 experiments with some small data
    > sets for, say, $100 to see what kind of response we get back, and
    > then think about something at a larger scale.

    While I realize $100 isn't a lot, we simply don't have a budget for
    such experiments, and the point of ORP is to be able to do this in
    the community. I suppose we could ask the ASF board for the money,
    but I don't think we are ready for that anyway. I very much have an
    "If you build it, they will come" mentality, so I know that if we
    can just get bootstrapped with some data, some queries, and a way to
    collect judgments, we can get people interested.
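
    (The de facto interchange format for collected judgments is TREC's
    qrels file, one judged topic/document pair per line; the topic and
    document IDs below are made up for illustration:)

        # <topic-id> <iteration> <doc-id> <relevance>
        # "iteration" is historical and is almost always 0
        101 0 WIKI-4432019 1
        101 0 WIKI-0091822 0
        102 0 WIKI-7718203 2
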
    > I have some tips on how to run crowdsourcing for relevance
    > evaluation here: http://wwwcsif.cs.ucdavis.edu/~alonsoom/ExperimentDesign.pdf

    Thanks!

    > 2- If the goal is to have everything open source (gold set +
    > relevance judgments), we need to produce a new data set from
    > scratch. Also, what is the goal here? What is the domain?
    > Enterprise search? Ad-hoc retrieval?

    Yes. I think the primary goal of ORP is to give people within Lucene
    a way to judge relevance that doesn't require us to purchase
    datasets, just like the contrib/benchmarker gives us a way to talk
    about performance. So, while it may evolve into more, I'd be happy
    with a simple, fixed collection at this point. Wikipedia is OK, but
    in my experience there are often only a few good answers for a query
    to begin with, so it's harder to judge recall; that doesn't mean it
    isn't useful, though.

    I know there are a lot of issues around curating a good collection,
    but I'd like to be pragmatic and ask: what can we arrive at in a
    reasonable amount of time that best approximates what someone doing,
    say, genetic/biopharma research might do? Just getting a raw dataset
    like PubMed on a given day seems like a good first step; then we can
    work to clean it up and generate queries for it.
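
    (PubMed's baseline is distributed as gzipped MEDLINE XML; a minimal
    Python sketch of that first step, pulling an ID, title, and abstract
    per citation. The input file name is illustrative:)

        import gzip
        import xml.etree.ElementTree as ET

        def iter_medline(path):
            """Yield (pmid, title, abstract) for each citation in one
            gzipped MEDLINE XML baseline file."""
            with gzip.open(path) as f:
                for _, elem in ET.iterparse(f):
                    if elem.tag == "PubmedArticle":
                        pmid = elem.findtext(".//PMID")
                        title = elem.findtext(".//ArticleTitle") or ""
                        abstract = " ".join(t.text or ""
                                            for t in elem.iter("AbstractText"))
                        yield pmid, title, abstract
                        elem.clear()  # keep memory flat on large files

        for pmid, title, _ in iter_medline("medline_sample.xml.gz"):
            print(pmid, title[:60])
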
    > In summary, I would start with something small (English only,
    > Creative Commons or Wikipedia). Build a few experiments and see
    > the results. Then expand the data sets and also go multilingual.

    Agreed. I'm not too worried about multilingual just yet, but it is a
    fun problem.

    > o.

    --------------------------
    Grant Ingersoll
    http://www.lucidimagination.com/

    Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
    using Solr/Lucene:
    http://www.lucidimagination.com/search
  • Omar Alonso at Oct 21, 2009 at 2:04 pm

    > While I realize $100 isn't a lot, we simply don't have a budget
    > for such experiments, and the point of ORP is to be able to do
    > this in the community. I suppose we could ask the ASF board for
    > the money, but I don't think we are ready for that anyway. I very
    > much have an "If you build it, they will come" mentality, so I
    > know that if we can just get bootstrapped with some data, some
    > queries, and a way to collect judgments, we can get people
    > interested.
    I'm not defending MTurk, but it gives you a "world view" in terms of assessments, versus a specific community. You can run the test within the community, but you may also introduce bias into the experiment. There is a paper by Ellen Voorhees at SIGIR where she shows different agreement levels between NIST and University of Waterloo assessors.
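
    (The usual way to quantify such agreement is a chance-corrected
    statistic like Cohen's kappa; a minimal sketch, not taken from the
    Voorhees paper:)

        from collections import Counter

        def cohens_kappa(a, b):
            """Chance-corrected agreement between two assessors'
            judgments over the same doc/query pairs."""
            n = len(a)
            observed = sum(x == y for x, y in zip(a, b)) / n
            ca, cb = Counter(a), Counter(b)
            expected = sum(ca[k] * cb[k] for k in ca) / (n * n)
            return (observed - expected) / (1 - expected)

        # Two assessors, ten pairs: raw agreement 0.8, kappa ~ 0.58.
        print(cohens_kappa([1, 1, 0, 0, 1, 0, 1, 1, 0, 1],
                           [1, 0, 0, 0, 1, 0, 1, 1, 1, 1]))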

    You can still do a closed HIT (Human Intelligence Task) that pays $0 and is by invitation only. You probably need to pay Amazon something for hosting the experiment, but that would reduce the cost dramatically. Of course, only the community would have access to it, not all workers on MTurk.
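
    (The invitation can be implemented as a private qualification that
    only granted workers hold. The period API was SOAP/REST, so the
    sketch below uses today's boto3 client purely to illustrate the
    idea; the question XML, worker ID, and nominal one-cent reward are
    placeholders, not ORP decisions:)

        import boto3

        QUESTION_XML = "<HTMLQuestion>...</HTMLQuestion>"  # placeholder payload

        mturk = boto3.client("mturk", region_name="us-east-1")

        # A private qualification acts as the invitation: only workers
        # we explicitly grant it can discover or accept the HIT.
        qual_id = mturk.create_qualification_type(
            Name="ORP invited assessor",
            Description="Granted by hand to community volunteers",
            QualificationTypeStatus="Active",
        )["QualificationType"]["QualificationTypeId"]

        mturk.associate_qualification_with_worker(
            QualificationTypeId=qual_id,
            WorkerId="AEXAMPLEWORKERID",   # placeholder worker ID
            IntegerValue=1,
            SendNotification=True,
        )

        mturk.create_hit(
            Title="Judge document relevance",
            Description="Graded relevance judgment for one query/document pair",
            Reward="0.01",                 # nominal reward
            MaxAssignments=5,
            LifetimeInSeconds=7 * 24 * 3600,
            AssignmentDurationInSeconds=600,
            Question=QUESTION_XML,
            QualificationRequirements=[{
                "QualificationTypeId": qual_id,
                "Comparator": "Exists",
                "ActionsGuarded": "DiscoverPreviewAndAccept",
            }],
        )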

    If you want to build everything yourselves, that is possible too. You can have a website that collects judgments for a set of query/document pairs.
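
    (A minimal sketch of that idea: one endpoint that appends a
    judgment row to a file. The framework choice and field names are
    assumptions, not an ORP design:)

        import csv
        from flask import Flask, request

        app = Flask(__name__)

        @app.route("/judge", methods=["POST"])
        def judge():
            # Record one (query, document, worker, grade) judgment.
            row = [request.form[k]
                   for k in ("query_id", "doc_id", "worker_id", "grade")]
            with open("judgments.csv", "a", newline="") as f:
                csv.writer(f).writerow(row)
            return "recorded", 200

        if __name__ == "__main__":
            app.run()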

    The INEX folks do the assessments on a volunteer basis, but it takes quite a bit of time.

    In any case, MTurk or not MTurk, I have some spare cycles in case people are interested in trying ideas.

    Regards,

    o.
  • Omar Alonso at Oct 17, 2009 at 12:38 am
    Forgot to add this extra info.

    Here is an example of a graded relevance evaluation experiment that I'm currently running:

    https://www.mturk.com/mturk/preview?groupId=5WPZ72HM8TVZZV1XGYG0

    You can log in to MTurk with your Amazon account and do a few assignments just to get an idea of the kind of work involved.
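
    (In case the preview is unavailable: a graded relevance question
    typically looks something like the following. The wording here is
    hypothetical, not Omar's actual HIT:)

        Query: <topic title>
        Document: <title and snippet>

        How relevant is this document to the query?
          ( ) Highly relevant
          ( ) Relevant
          ( ) Somewhat relevant
          ( ) Not relevant
        Optionally, justify your answer in a sentence.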

    o.
