The JavaDocs for NumericField (Lucene 2.9.4) state:

You may add the same field name as a NumericField to the same document
more than once. Range querying and filtering will be the logical OR of
all values; so a range query will hit all documents that have at least
one value in the range

Furthermore, the precisionStep is defined only in terms of performance
and disk space, not in terms of affecting the results.

However, the unit test below shows that the precisionStep directly
affects the results when using multi-value fields:

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_29);
RAMDirectory directory = new RAMDirectory();
IndexWriter writer = new IndexWriter(directory, analyzer, true,
    IndexWriter.MaxFieldLength.UNLIMITED);
Document document = new Document();
document.add(new NumericField("number", 4, Field.Store.YES,
    true).setLongValue(480));
document.add(new NumericField("number", 4, Field.Store.YES,
    true).setLongValue(180));
writer.addDocument(document);
IndexSearcher searcher = new IndexSearcher(directory, true);
Query q = new FilteredQuery(new MatchAllDocsQuery(),
    NumericRangeFilter.newLongRange("number", 444L, 10000L, true, true));
TopDocs docs = searcher.search(q, null, 10);
Assert.assertEquals(1, docs.totalHits);

analyzer = new StandardAnalyzer(Version.LUCENE_29);
directory = new RAMDirectory();
writer = new IndexWriter(directory, analyzer, true,
    IndexWriter.MaxFieldLength.UNLIMITED);
document = new Document();
document.add(new NumericField("number", 6, Field.Store.YES,
    true).setLongValue(480));
document.add(new NumericField("number", 6, Field.Store.YES,
    true).setLongValue(180));
writer.addDocument(document);
searcher = new IndexSearcher(directory, true);
q = new FilteredQuery(new MatchAllDocsQuery(),
    NumericRangeFilter.newLongRange("number", 444L, 10000L, true, true));
docs = searcher.search(q, null, 10);
Assert.assertEquals(1, docs.totalHits); // fails, due to increased precisionStep

Any help would be greatly appreciated

Greg

Please consider the environment before printing this email.

This message should be regarded as confidential. If you have received this email in error please notify the sender and destroy it immediately.

Statements of intent shall only become binding when confirmed in hard copy by an authorised signatory. The contents of this email may relate to dealings with other companies under the control of Detica Limited, details of which can be found at http://www.detica.com/statutory-information.

Detica Limited is registered in England under No: 1337451.
Registered offices: Surrey Research Park, Guildford, Surrey, GU2 7YP, England.


  • Uwe Schindler at Jun 9, 2011 at 9:32 am
    Hi,

    if you pass in a precisionStep other than the default (which is 4), you also
    have to pass the same precisionStep into the NumericRangeQuery /
    NumericRangeFilter. If you do not, you may get too few or otherwise incorrect
    results. This is explained in the JavaDocs of NumericRangeQuery. Your example
    does not do this.

    Uwe

    -----
    Uwe Schindler
    H.-H.-Meier-Allee 63, D-28213 Bremen
    http://www.thetaphi.de
    eMail: uwe@thetaphi.de

    -----Original Message-----
    From: Tarr, Gregory
    Sent: Thursday, June 09, 2011 11:26 AM
    To: java-user@lucene.apache.org
    Subject: Multi value NumericFields - major issue

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Uwe Schindler at Jun 9, 2011 at 10:21 am
    I opened an issue to *maybe* store the precision step in the index metadata,
    but this problem is similar to using a different analyzer on the query side
    than on the index side:
    https://issues.apache.org/jira/browse/LUCENE-3187

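The mismatch described in the replies can be illustrated without Lucene on the classpath. The sketch below is a simplified model, not Lucene's actual code: it assumes non-negative values, ignores Lucene's term encoding and sign handling, and uses hypothetical names (`PrecisionStepDemo`, `indexTerms`, `queryRanges`, `matches`). Indexing writes one term per value at shifts 0, step, 2*step and so on, while a range query only looks at terms whose shifts are multiples of the query's own step, so a query with step 4 never sees the shift-6 term that an index built with step 6 wrote for 480.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of trie-style numeric range matching (illustrative only). */
class PrecisionStepDemo {

    /** Terms written for one non-negative value: {shift, value >>> shift}. */
    static List<long[]> indexTerms(long value, int step) {
        List<long[]> terms = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += step) {
            terms.add(new long[] {shift, value >>> shift});
        }
        return terms;
    }

    /** Sub-ranges {shift, lo, hi} visited by a range query; bounds are already shifted. */
    static List<long[]> queryRanges(long min, long max, int step) {
        List<long[]> ranges = new ArrayList<>();
        if (min > max) {
            return ranges;
        }
        for (int shift = 0; ; shift += step) {
            long diff = 1L << (shift + step);
            long mask = ((1L << step) - 1L) << shift;
            boolean hasLower = (min & mask) != 0L;
            boolean hasUpper = (max & mask) != mask;
            long nextMin = (hasLower ? min + diff : min) & ~mask;
            long nextMax = (hasUpper ? max - diff : max) & ~mask;
            // Stop when no coarser level can cover the remaining interval.
            if (shift + step >= 64 || nextMin > nextMax || nextMin < min || nextMax > max) {
                ranges.add(new long[] {shift, min >>> shift, max >>> shift});
                break;
            }
            // Edges that do not align with the next level are matched at this shift.
            if (hasLower) {
                ranges.add(new long[] {shift, min >>> shift, (min | mask) >>> shift});
            }
            if (hasUpper) {
                ranges.add(new long[] {shift, (max & ~mask) >>> shift, max >>> shift});
            }
            min = nextMin;
            max = nextMax;
        }
        return ranges;
    }

    /** True if any indexed term of the value falls into a query sub-range of the same shift. */
    static boolean matches(long value, int indexStep, int queryStep, long min, long max) {
        for (long[] r : queryRanges(min, max, queryStep)) {
            for (long[] t : indexTerms(value, indexStep)) {
                if (t[0] == r[0] && t[1] >= r[1] && t[1] <= r[2]) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches(480L, 4, 4, 444L, 10000L)); // true: steps agree
        System.out.println(matches(480L, 6, 4, 444L, 10000L)); // false: indexed with 6, queried with 4
        System.out.println(matches(480L, 6, 6, 444L, 10000L)); // true: steps agree again
    }
}
```

In the posted test, the corresponding change would be to use the factory overload that takes the precision step as its second argument, for example `NumericRangeFilter.newLongRange("number", 6, 444L, 10000L, true, true)` for the documents indexed with step 6.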

Discussion Overview
group: java-user
categories: lucene
posted: Jun 9, '11 at 9:26a
active: Jun 9, '11 at 10:21a
posts: 3
users: 2
website: lucene.apache.org
