BooleanQuery can not find all matches in special condition
----------------------------------------------------------

Key: LUCENE-1974
URL: https://issues.apache.org/jira/browse/LUCENE-1974
Project: Lucene - Java
Issue Type: Bug
Components: Query/Scoring
Affects Versions: 2.9
Reporter: tangfulin


query: (name:tang*)
doc=5137 score=1.0 doc:Document<stored,indexed<name:tangfulin>>
doc=11377 score=1.0 doc:Document<stored,indexed<name:tangfulin>>
query: name:tang* name:notexistnames
doc=5137 score=0.048133932 doc:Document<stored,indexed<name:tangfulin>>

These are two queries on the same index: one is just a prefix query in a
boolean query, and the other is a prefix query plus a term query in a
boolean query, all with Occur.SHOULD.

What I wonder is why the latter query cannot find doc=11377.

The problem can be reproduced with the code in the attachment.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


  • tangfulin (JIRA) at Oct 12, 2009 at 2:21 am
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    tangfulin updated LUCENE-1974:
    ------------------------------

    Attachment: BooleanQueryTest.java
  • Hoss Man (JIRA) at Oct 13, 2009 at 10:55 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hoss Man updated LUCENE-1974:
    -----------------------------

    Attachment: LUCENE-1974.test.patch

    This is the same as the previously attached test, but I've simplified it (to me) and revamped it to be a patch that can be applied to 2.9.0.

    I can confirm that it fails for me (against 2.9.0) and seems to suggest a weird hit collection bug somewhere in the BooleanScorer or Prefix scoring code

    (a prefix query works, a boolean query containing term queries works, but a boolean query containing a prefix query fails to find all the expected matches).

    Unless I'm missing something really silly, this suggests a pretty heinous bug somewhere in the core scoring code.
  • Hoss Man (JIRA) at Oct 13, 2009 at 11:03 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hoss Man updated LUCENE-1974:
    -----------------------------

    Attachment: LUCENE-1974.test.patch

    tweaked test so that it can be applied to 2.4.1 (by removing readOnly param from IndexSearcher constructor)

    verified this test passes against 2.4.1 ... it's a new bug in 2.9.0
  • Michael McCandless (JIRA) at Oct 13, 2009 at 11:07 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael McCandless reassigned LUCENE-1974:
    ------------------------------------------

    Assignee: Michael McCandless
  • Michael McCandless (JIRA) at Oct 13, 2009 at 11:09 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765299#action_12765299 ]

    Michael McCandless commented on LUCENE-1974:
    --------------------------------------------

    Hmm... seems to be a bug in BooleanScorer... if you call static BooleanQuery.setAllowDocsOutOfOrder(false) the test passes (so that's a viable workaround it seems).
  • Robert Muir (JIRA) at Oct 13, 2009 at 11:17 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765303#action_12765303 ]

    Robert Muir commented on LUCENE-1974:
    -------------------------------------

    Hoss Man, I played with this a little; maybe this is all obvious, though:
    * the test passes if you set BooleanQuery.setAllowDocsOutOfOrder(false) [it's BooleanScorer, not BooleanScorer2]
    * to simplify things, you can use a ConstantScoreQuery of a single term instead of a PrefixQuery to trigger it

    I agree with the comment in the original test: if you trace the execution, the problem is that it doesn't actually refill the queue with the second doc (which is docid 11,000 or something). This is because .score() is being called on the subscorer with an end limit of 8192 or so.

    {code}
    // refill the queue
    more = false;
    ...
    if (subScorerDocID != NO_MORE_DOCS) {
    more |= sub.scorer.score(sub.collector, end, subScorerDocID);
    ...
    } while (current != null || more);
    {code}


  • Michael McCandless (JIRA) at Oct 13, 2009 at 11:51 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765310#action_12765310 ]

    Michael McCandless commented on LUCENE-1974:
    --------------------------------------------

    Ugh, this is the bug:

    {code}
    Index: src/java/org/apache/lucene/search/Scorer.java
    ===================================================================
    --- src/java/org/apache/lucene/search/Scorer.java (revision 824846)
    +++ src/java/org/apache/lucene/search/Scorer.java (working copy)
    @@ -87,7 +87,7 @@
           collector.collect(doc);
           doc = nextDoc();
         }
    -    return doc == NO_MORE_DOCS;
    +    return doc != NO_MORE_DOCS;
       }

       /** Returns the score of the current document matching the query.

    {code}

    I'll commit shortly, to trunk & 2.9 branch.
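
The effect of the inverted return value can be illustrated with a small, self-contained toy model. This is only a sketch, not Lucene's code: `ToyScorer`, `collectAll`, and the `buggy` flag are hypothetical names, the caller loop is a simplification of the refill loop quoted earlier in the thread (it ignores the bucket queue), and the window of 8192 is borrowed from the "end limit of 8192 or so" mentioned above. The doc IDs are the two from the original report.

```java
import java.util.ArrayList;
import java.util.List;

public class ScoreWindowSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for a sub-scorer that iterates over sorted doc IDs. */
    static final class ToyScorer {
        private final int[] docs;
        private final boolean buggy;
        private int pos = 0;

        ToyScorer(int[] docs, boolean buggy) {
            this.docs = docs;
            this.buggy = buggy;
        }

        int docID() {
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        int nextDoc() {
            pos++;
            return docID();
        }

        /** Collects docs in [firstDocID, max); should return true if more docs remain. */
        boolean score(List<Integer> collector, int max, int firstDocID) {
            int doc = firstDocID;
            while (doc < max) {
                collector.add(doc);
                doc = nextDoc();
            }
            // The buggy variant mirrors the line removed by the patch,
            // the other variant mirrors the line added by it.
            return buggy ? doc == NO_MORE_DOCS : doc != NO_MORE_DOCS;
        }
    }

    /** Simplified caller: widens the window while the scorer reports more docs. */
    public static List<Integer> collectAll(int[] docs, int window, boolean buggy) {
        ToyScorer scorer = new ToyScorer(docs, buggy);
        List<Integer> hits = new ArrayList<>();
        int end = 0;
        boolean more;
        do {
            end += window;
            more = false;
            int doc = scorer.docID();
            if (doc != NO_MORE_DOCS) {
                more |= scorer.score(hits, end, doc);
            }
        } while (more);
        return hits;
    }

    public static void main(String[] args) {
        int[] docs = {5137, 11377};
        System.out.println(collectAll(docs, 8192, true));   // prints [5137]
        System.out.println(collectAll(docs, 8192, false));  // prints [5137, 11377]
    }
}
```

With the buggy return value, the first call collects doc 5137, stops at 11377 because it lies past the window, and then reports "no more docs", so the caller never opens the next window. With the corrected return value the caller continues and picks up 11377.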
  • Michael Busch (JIRA) at Oct 13, 2009 at 11:57 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765315#action_12765315 ]

    Michael Busch commented on LUCENE-1974:
    ---------------------------------------

    It's also concerning that no unit test catches this...
  • Michael McCandless (JIRA) at Oct 14, 2009 at 12:17 am
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765327#action_12765327 ]

    Michael McCandless commented on LUCENE-1974:
    --------------------------------------------

    bq. It's also concerning that no unit test catches this...

    I agree.... I'll commit tangfulin & Hoss's test case.

    I think the other tests do not catch it because the error only happens if the docID is over 8192 (the chunk size that BooleanScorer uses). Most of our tests work on smaller sets of docs.
  • Chris Hostetter at Oct 14, 2009 at 2:11 am
    : I think the other tests do not catch it because the error only happens
    : if the docID is over 8192 (the chunk size that BooleanScorer uses).
    : Most of our tests work on smaller sets of docs.

    I don't have time to try this out right now, but i wonder if just
    modifying the QueryUtils wrap* functions to create bigger "empty" indexes
    (with thousands of deleted docs instead of just a handful) would have
    triggered this bug ... might be worth testing against 2.9.0 to make sure
    there aren't any other weird edge cases before cutting 2.9.1.


    -Hoss


  • Michael McCandless at Oct 14, 2009 at 10:30 am
    I just tried this (I increased the numDeletedDocs by adding 10000 to
    the original counts), but it doesn't hit this bug, I believe because
    all the deleted docs are created after the real docs.

    It does provoke new failures, but they all seem to be false failures
    [floating point precision issues], eg:

    [junit] expected:<1379.988> but was:<1379.9883>

    Also, the bug only happens when a BooleanQuery has a clause whose
    Query falls back to Scorer.score(Collector, int, int). E.g., TermQuery
    has its own [correct] impl for that, but PrefixQuery does not, so a
    BQ with a PrefixQuery clause will hit it. Plus the BQ must have only
    SHOULD and up to 32 MUST_NOT clauses (so that it uses BooleanScorer,
    not BooleanScorer2).

    I'm trying to modify TestBoolean2 so that it takes the tiny (4 doc)
    index it's using, and multiplies it up to 16K docs, and then verifies
    that each query finds 4K * the number of matches. But so far I can't
    get that test to fail either... still digging.

    Mike

  • Michael McCandless (JIRA) at Oct 14, 2009 at 12:23 am
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael McCandless resolved LUCENE-1974.
    ----------------------------------------

    Resolution: Fixed
    Fix Version/s: 3.0, 2.9.1

    Thanks tangfulin and Hoss! I think we need to spin 2.9.1 for this.
  • Yonik Seeley (JIRA) at Oct 14, 2009 at 12:25 am
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765332#action_12765332 ]

    Yonik Seeley commented on LUCENE-1974:
    --------------------------------------

    bq. It's also concerning that no unit test catches this...

    I've said it before, I'll say it again... anything of sufficient complexity really benefits from random tests to hit boundary cases that one would not have thought to code for. We have quite a few in Solr, but not enough. We obviously don't have enough in Lucene either.

    One other simple tactic I've used in Solr to increase the chance of hitting boundary conditions is to make sure many segments are created by default (bad for performance, good for testing), and that cache sizes, window sizes, etc are small so that they are crossed more often by more tests.
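
As a rough illustration of the kind of randomized boundary test described above, here is a self-contained sketch. It does not use Lucene or Solr: `collectInWindows` is a hypothetical piece of code under test that scans sorted doc IDs in fixed-size windows, and `runTrials` compares it against the full list for random doc sets and deliberately small random window sizes, so that window boundaries are crossed in almost every trial. The seed, ranges, and trial count are arbitrary assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;

public class RandomWindowTest {

    /** Hypothetical code under test: collects all docs by scanning fixed-size windows. */
    public static List<Integer> collectInWindows(int[] sortedDocs, int window) {
        List<Integer> hits = new ArrayList<>();
        int pos = 0;
        long end = 0;
        boolean more = sortedDocs.length > 0;
        while (more) {
            end += window;
            // Collect every doc that falls before the end of the current window.
            while (pos < sortedDocs.length && sortedDocs[pos] < end) {
                hits.add(sortedDocs[pos++]);
            }
            more = pos < sortedDocs.length;
        }
        return hits;
    }

    /** Runs random trials with a fixed seed; returns how many trials mismatched. */
    public static int runTrials(long seed, int trials) {
        Random random = new Random(seed);
        int failures = 0;
        for (int t = 0; t < trials; t++) {
            int window = 1 + random.nextInt(64);   // small windows: boundaries hit often
            int numDocs = random.nextInt(50);      // may be zero, an edge case in itself
            TreeSet<Integer> docs = new TreeSet<>();
            for (int i = 0; i < numDocs; i++) {
                docs.add(random.nextInt(20000));
            }
            int[] sorted = docs.stream().mapToInt(Integer::intValue).toArray();
            List<Integer> expected = new ArrayList<>(docs);
            if (!collectInWindows(sorted, window).equals(expected)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + runTrials(42L, 1000));
    }
}
```

Using a fixed seed keeps a failing run reproducible; varying the seed between runs widens coverage over time.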


  • Michael McCandless (JIRA) at Oct 14, 2009 at 12:37 am
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765341#action_12765341 ]

    Michael McCandless commented on LUCENE-1974:
    --------------------------------------------

    +1... we need many more tests that do this in Lucene.
  • Michael Busch (JIRA) at Oct 14, 2009 at 12:41 am
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765342#action_12765342 ]

    Michael Busch commented on LUCENE-1974:
    ---------------------------------------

    {quote}
    I think we need to spin 2.9.1 for this.
    {quote}

    +1

    {quote}
    +1... we need many more tests that do this in Lucene.
    {quote}

    +1
  • tangfulin (JIRA) at Oct 14, 2009 at 1:59 am
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765354#action_12765354 ]

    tangfulin commented on LUCENE-1974:
    -----------------------------------

    Good job, thank you all for this!

    Though we spent about a day changing our project back to Lucene 2.4 to avoid the bug, now I think it is time to change it back.
  • Michael McCandless (JIRA) at Oct 14, 2009 at 12:46 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765544#action_12765544 ]

    Michael McCandless commented on LUCENE-1974:
    --------------------------------------------

    As a test, to tease out more corner cases, I temporarily dropped BooleanScorer's chunk size from 2048 to 16, and ran all tests. Everything passed.

  • Michael McCandless (JIRA) at Oct 14, 2009 at 12:46 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765545#action_12765545 ]

    Michael McCandless commented on LUCENE-1974:
    --------------------------------------------

    bq. Though we have spent about a day to change our project back to Lucene 2.4 to avoid the bug, now I think it is time to change it back

    Thank you for finding the bug, narrowing it down, and opening the issue! Sorry for all the hassle :(
  • Michael McCandless (JIRA) at Oct 14, 2009 at 12:58 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael McCandless updated LUCENE-1974:
    ---------------------------------------

    Attachment: LUCENE-1974.patch

    I've modified TestBoolean2 to show the bug (attached patch), by
    building up a larger index from the small test index it normally uses.
    I'll commit shortly.

    Here are the conditions that tickle the bug:

    * The query must be a BooleanQuery that contains only SHOULD clauses
    and up to 32 MUST_NOT clauses (so that BooleanScorer, not
    BooleanScorer2, is used).

    * At least one of the clauses must not be a TermQuery.

    * The segment must have more than 4096 docs, and the clause(s) that
    are not TermQuery must all have no matches in a 2048-doc chunk (and
    must have valid matches after that chunk). When such a chunk is
    hit, BooleanScorer stops prematurely.
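
The chunk arithmetic behind the last condition can be checked against the doc IDs from the original report. The following is only a rough sketch: it does not call Lucene or reproduce BooleanScorer's code, the class and method names (ChunkGapCheck, chunkOf, emptyChunksBetweenMatches) are hypothetical, and it assumes the empty chunk has to fall between two matches, which is how the reported example behaves but is not spelled out in the comment above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy helper, not part of Lucene: finds empty doc-ID chunks that sit between two matches. */
final class ChunkGapCheck {

    /** Chunk size mentioned in the issue comments. */
    static final int CHUNK_SIZE = 2048;

    /** Index of the chunk that holds docId. */
    static int chunkOf(int docId, int chunkSize) {
        return docId / chunkSize;
    }

    /**
     * Returns the indices of chunks that contain no match and lie between
     * two consecutive matches. sortedDocIds must be in ascending order.
     * Assumption: only gaps between matches are of interest here.
     */
    static List<Integer> emptyChunksBetweenMatches(int[] sortedDocIds, int chunkSize) {
        List<Integer> gaps = new ArrayList<>();
        for (int i = 1; i < sortedDocIds.length; i++) {
            int prev = chunkOf(sortedDocIds[i - 1], chunkSize);
            int next = chunkOf(sortedDocIds[i], chunkSize);
            // Every chunk strictly between the two matches is empty.
            for (int c = prev + 1; c < next; c++) {
                gaps.add(c);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Doc IDs matched by name:tang* in the original report.
        int[] prefixMatches = {5137, 11377};
        System.out.println("chunk of 5137:  " + chunkOf(5137, CHUNK_SIZE));
        System.out.println("chunk of 11377: " + chunkOf(11377, CHUNK_SIZE));
        System.out.println("empty chunks in between: "
                + emptyChunksBetweenMatches(prefixMatches, CHUNK_SIZE));
    }
}
```

For the reported doc IDs, 5137 falls in chunk 2 and 11377 in chunk 5, so chunks 3 and 4 contain no match for the prefix clause while a valid match still follows them. That is consistent with the second query returning doc=5137 but not doc=11377.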

    BooleanQuery can not find all matches in special condition
    ----------------------------------------------------------

    Key: LUCENE-1974
    URL: https://issues.apache.org/jira/browse/LUCENE-1974
    Project: Lucene - Java
    Issue Type: Bug
    Components: Query/Scoring
    Affects Versions: 2.9
    Reporter: tangfulin
    Assignee: Michael McCandless
    Fix For: 2.9.1, 3.0

    Attachments: BooleanQueryTest.java, LUCENE-1974.patch, LUCENE-1974.test.patch, LUCENE-1974.test.patch



Discussion Overview
group: dev @
categories: lucene
posted: Oct 12, '09 at 2:19a
active: Oct 14, '09 at 12:58p
posts: 20
users: 3
website: lucene.apache.org
