FAQ
I have a index from which I have a number of documents from authors,
but would like to drop the relevance/score for documents from one
particular author using the query. That is for documents returned by
querying: (content:"miracle cure"), I would like to reduce the
relevancy of authorid:3024

However I tried several combinations and they didnt work the way I
expected, e.g.

+(content:"miracle cure") (authorid:3024^0.5)
=> increases authorid:3024 score instead

+(content:"miracle cure") +(authorid:3024^0.5 ((-authorid:3024))^10.0)
=> The boosted optional prohibited term seems to have no effect

+(content:"miracle cure") (-authorid:3024^0.5)
=> Again, the boosted optional prohibited term seems to have no effect

Also this approach appears to add a small modifier to the overall
relevancy returned by +(content:"miracle cure"). Is there a better way
to do a multiplicative modification - so that for example the score as
returned by +(content:"miracle cure") is reduced to 0.6 of its value
if the authorid is 3024.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Search Discussions

  • Chris Hostetter at Jul 11, 2006 at 3:41 am
    : particular author using the query. That is for documents returned by
    : querying: (content:"miracle cure"), I would like to reduce the
    : relevancy of authorid:3024

    : +(content:"miracle cure") +(authorid:3024^0.5 ((-authorid:3024))^10.0)
    : => The boosted optional prohibited term seems to have no effect
    :
    : +(content:"miracle cure") (-authorid:3024^0.5)
    : => Again, the boosted optional prohibited term seems to have no effect

    that's because you can't have a boolean query containing nothing but
    negative clauses ... to "penalize" documents that match a query, you must
    reward documents which match the inverse of that query. From a Set
    perspective the inverse of a query like "authorid:3024" is...

    +(+matchalldocs:true -authorid:3024)

    ...but for your purposes, you could just as easily use...

    +content:"miracle cure" +(+content:"miracle cure" -authorid:3024)

    : Also this approach appears to add a small modifier to the overall
    : relevancy returned by +(content:"miracle cure"). Is there a better way
    : to do a multiplicative modification - so that for example the score as
    : returned by +(content:"miracle cure") is reduced to 0.6 of its value
    : if the authorid is 3024.

    First off: don't place to much stock in the specific numeric score value
    returned by a search, Scores aren't absolute, so trying to do absolute
    things to them isn't really meaningful.

    Second: take a look at the Explanation class (and the Searcher.explain
    method) ... it will help you understand why you get the scores you are
    getting, and how methods in the Similarity class (which you can change)
    affect things. With the query structure i outlined above, you can
    probably get pretty close to what you want just by tewaking the boosts.

    Third: if you really want to go all out, you could write a HitCollector to
    penalize documents by performing some arbitrary match on the
    scores of documents matching some specific criteria (like the authorid)



    -Hoss


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedJul 11, '06 at 2:33a
activeJul 11, '06 at 3:41a
posts2
users2
websitelucene.apache.org

2 users in discussion

Chun Wei Ho: 1 post Chris Hostetter: 1 post

People

Translate

site design / logo © 2022 Grokbase