FAQ
in javadoc http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html#formula_norm
norm(t,d) = doc.getBoost() ・ lengthNorm(field) ・ ∏ f.getBoost()
field
f in d named as t
whre is field come from in lengthNorm(field)?
In my option, term t may appear in a doc d many times with different fields. So
∏ f.getBoost()
field f in d named as t
makes sense.
But Why there is only one lengthNorm(field) ?
does it mean that norm(t,d)=norm(t.text, t.field, d)? That's is--
every doc,every field,every term , there is a norm value ?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Search Discussions

  • Rebecca Watson at Jun 2, 2010 at 8:27 am
    There are index time boosts ie calculated at index time and search
    time boosts. The field f always relates to the field(s) that the term
    t appears in.

    My understanding is that--
    Norm(t,d) includes the index time boosts for each field but I think t
    is only included in this calc in terms of field.getboost() -- because
    you effectively get the product over each field in which the specific
    term occurs. But you don't have s norm value for e every term- field -
    doc combo.

    Similarity (defaultsimilarity by default!) method computeNorm is used
    to compute this factor at indexing time so have a look at that code to
    see what I mean!

    IndexReader.norms(fieldname) gives a norm array the size of the number
    of docs so norm[docid] is actually used to get the index time boost
    for the given field/doc in which the term occurs.

    t.getboost() is the search time boost for the particular term --
    practically this is the query boost for the term (or I think product
    of query boosts if the term is also a sub- query (where the
    surrounding query has a query boost too).

    I may be wrong but when I dug into the docs + code these were my
    conclusions!

    bec :)

    Sent from my iPhone
    On 02/06/2010, at 3:19 PM, Li Li wrote:

    in javadoc http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html#formula_norm
    norm(t,d) = doc.getBoost() · lengthNorm(field) · ∏
    f.getBoost()
    field
    f in d named as t
    whre is field come from in lengthNorm(field)?
    In my option, term t may appear in a doc d many times with different
    fields. So
    ∏ f.getBoost()
    field f in d named as t
    makes sense.
    But Why there is only one lengthNorm(field) ?
    does it mean that norm(t,d)=norm(t.text, t.field, d)? That's is--
    every doc,every field,every term , there is a norm value ?

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedJun 2, '10 at 7:19a
activeJun 2, '10 at 8:27a
posts2
users2
websitelucene.apache.org

2 users in discussion

Rebecca Watson: 1 post Li Li: 1 post

People

Translate

site design / logo © 2022 Grokbase