Hi,

I've got a problem concerning the encoding of norms.
I want to use int values (0-255) instead of bytes interpreted as floats.

In my own Similarity class, which I use for indexing and searching, I
implemented the static methods encodeNorm, decodeNorm and
getNormDecoder.
But because they are static, and the encoding of norms happens in
NormsWriterPerField.finish() with the following lines of code:

    final float norm =
        docState.similarity.computeNorm(fieldInfo.name, fieldState);
    norms[upto] = Similarity.encodeNorm(norm);
    docIDs[upto] = docState.docID;

my implementation is only used for computing the norm values, not
for encoding them.
Is there a reason why norm encoding and decoding are hardwired to the
implementation in Similarity?

And is there an elegant way to bypass this behaviour, other than
implementing a mapper that maps every int between 0 and 255 to a
float value out of Similarity.NORM_TABLE before encoding?


Benjamin
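
The mapper workaround mentioned in the question can be sketched as follows. This is a minimal, self-contained sketch, not Lucene code: byteToFloat and floatToByte are stand-ins written on the assumption that they mirror the 3-mantissa-bit byte encoding behind Similarity.encodeNorm in Lucene 2.9, and NormMapperSketch, intToNormFloat and storedByteFor are hypothetical names. The idea is that a custom computeNorm returns the table entry for the desired int, so the hardwired static encoder ends up storing exactly that int as the norm byte.

```java
public class NormMapperSketch {

    // Stand-in for the byte -> float decoding behind the norm table
    // (assumption: 3 mantissa bits, zero exponent point 15, as in Lucene 2.9).
    static float byteToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Stand-in for the static float -> byte encoder (same assumption as above).
    static byte floatToByte(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);
        if (smallfloat <= ((63 - 15) << 3)) {
            return (bits <= 0) ? (byte) 0 : (byte) 1;   // underflow
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1;                                   // overflow, i.e. 255
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    // Decode table: one float per possible norm byte.
    private static final float[] TABLE = new float[256];
    static {
        for (int i = 0; i < 256; i++) {
            TABLE[i] = byteToFloat((byte) i);
        }
    }

    // The mapper: the float a custom computeNorm would return so that
    // the static encoder stores 'value' unchanged as the norm byte.
    static float intToNormFloat(int value) {
        if (value < 0 || value > 255) {
            throw new IllegalArgumentException("value must be in 0..255: " + value);
        }
        return TABLE[value];
    }

    // What ends up in the norms array for a given int (as an unsigned value).
    static int storedByteFor(int value) {
        return floatToByte(intToNormFloat(value)) & 0xff;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 256; i++) {
            if (storedByteFor(i) != i) {
                throw new AssertionError("round trip failed for " + i);
            }
        }
        System.out.println("all 256 values survive the float detour");
    }
}
```

This only covers the write side. At search time the stored byte is still decoded through the static table, so a custom Similarity would have to map the decoded float back to its int on its own.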

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Michael McCandless at Nov 9, 2009 at 5:03 pm

    On Mon, Nov 9, 2009 at 11:04 AM, Benjamin Heilbrunn wrote:

    > I've got a problem concerning the encoding of norms.
    > I want to use int values (0-255) instead of bytes interpreted as floats.
    >
    > In my own Similarity class, which I use for indexing and searching, I
    > implemented the static methods encodeNorm, decodeNorm and
    > getNormDecoder.
    > But because they are static, and the encoding of norms happens in
    > NormsWriterPerField.finish() with the following lines of code:
    >
    >     final float norm =
    >         docState.similarity.computeNorm(fieldInfo.name, fieldState);
    >     norms[upto] = Similarity.encodeNorm(norm);
    >     docIDs[upto] = docState.docID;
    >
    > my implementation is only used for computing the norm values, not
    > for encoding them.
    > Is there a reason why norm encoding and decoding are hardwired to the
    > implementation in Similarity?

    I don't think there's a particular reason... this is just how it has
    always been. I think making it more extensible would be good!

    > And is there an elegant way to bypass this behaviour, other than
    > implementing a mapper that maps every int between 0 and 255 to a
    > float value out of Similarity.NORM_TABLE before encoding?

    I think a patch is needed, to allow the Similarity instance (not the
    static class) to provide the mapping, and decode table? Various
    queries call the decode, so you'd need to fix those too... wanna cough
    up a patch?

    Mike
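
Mike's suggestion, letting the Similarity instance rather than the static class provide the encoding, might look roughly like the sketch below. Everything here is hypothetical and for illustration only: BaseSimilarity, IntNormSimilarity, encodeNormValue, decodeNormValue and writeNorm are invented names, the default encoding is a placeholder rather than Lucene's, and none of this is taken from Lucene or from any patch.

```java
public class InstanceNormSketch {

    // Hypothetical base class: encoding and decoding are instance methods,
    // so a subclass can replace them. The default scheme is a placeholder.
    static class BaseSimilarity {
        float computeNorm(int numTerms) {
            return (float) (1.0 / Math.sqrt(numTerms));
        }

        byte encodeNormValue(float norm) {
            int scaled = Math.round(norm * 255f);
            return (byte) Math.max(0, Math.min(255, scaled));
        }

        float decodeNormValue(byte b) {
            return (b & 0xff) / 255f;
        }
    }

    // Hypothetical subclass that stores plain int values (0-255) as norms.
    static class IntNormSimilarity extends BaseSimilarity {
        @Override
        float computeNorm(int numTerms) {
            return Math.min(numTerms, 255);
        }

        @Override
        byte encodeNormValue(float norm) {
            return (byte) Math.max(0, Math.min(255, Math.round(norm)));
        }

        @Override
        float decodeNormValue(byte b) {
            return b & 0xff;
        }
    }

    // Stand-in for the indexing code quoted in the question. The encode call
    // goes through the instance, so the subclass override takes effect.
    static byte writeNorm(BaseSimilarity similarity, int numTerms) {
        final float norm = similarity.computeNorm(numTerms);
        return similarity.encodeNormValue(norm);
    }

    public static void main(String[] args) {
        BaseSimilarity similarity = new IntNormSimilarity();
        byte stored = writeNorm(similarity, 200);
        System.out.println(similarity.decodeNormValue(stored));
    }
}
```

The same instance would also have to be consulted wherever norms are decoded at search time, which is the part Mike flags as needing changes in the queries.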

  • Benjamin Heilbrunn at Nov 9, 2009 at 5:20 pm
    Hi Mike,

    Thanks for your reply.
    After making my post I found this (without taking a deeper look):

    http://issues.apache.org/jira/browse/LUCENE-1260

    Looks like a solution for that problem.
    Why wasn't it applied to Lucene?

    Benjamin

  • Michael McCandless at Nov 9, 2009 at 6:11 pm

    On Mon, Nov 9, 2009 at 12:19 PM, Benjamin Heilbrunn wrote:

    > After making my post I found this (without taking a deeper look):
    >
    > http://issues.apache.org/jira/browse/LUCENE-1260
    >
    > Looks like a solution for that problem.

    Indeed, the most recent patch there looks almost exactly like what
    you're proposing. I guess the earlier versions of the patch were a
    bigger change (but I haven't looked that closely).

    > Why wasn't it applied to Lucene?

    I guess it sort of fizzled out from lack of attention? Sometimes that
    happens! And then something, like your interest here, comes along and
    revives it :)

    Mike

  • Benjamin Heilbrunn at Nov 10, 2009 at 9:33 am
    Hi,

    I applied http://issues.apache.org/jira/secure/attachment/12411342/Lucene-1260.patch
    That's exactly what I was looking for.

    The problem is that from now on I'm on a patched version, and I'm not
    very happy about breaking compatibility with the "original" jars...
    So is there a chance that this patch becomes part of Lucene's upcoming changes?


    Benjamin

  • Michael McCandless at Nov 10, 2009 at 9:46 am
    Well, assuming there are no objections to the current approach, and
    performance checks out, I'll try to get this into 3.1...

    Mike

Discussion Overview
group: java-user
categories: lucene
posted: Nov 9, '09 at 4:04p
active: Nov 10, '09 at 9:46a
posts: 6
users: 2
website: lucene.apache.org