[ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Busch reopened LUCENE-730:
----------------------------------

Assignee: Michael Busch
Lucene Fields: (was: [Patch Available, New])

As discussed on java-dev, the default behavior of BooleanScorer should be to return documents in order, because some people rely on that in their applications. Documents out of order should only be allowed if BooleanQuery.setUseScorer14(true) is set explicitly.

Restore top level disjunction performance
-----------------------------------------

Key: LUCENE-730
URL: https://issues.apache.org/jira/browse/LUCENE-730
Project: Lucene - Java
Issue Type: Improvement
Components: Search
Reporter: Paul Elschot
Assigned To: Michael Busch
Priority: Minor
Fix For: 2.2

Attachments: TopLevelDisjunction20061127.patch


This patch restores the performance of top level disjunctions.
The introduction of BooleanScorer2 had impacted this as reported
on java-user on 21 Nov 2006 by Stanislav Jordanov.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

  • Michael Busch (JIRA) at May 24, 2007 at 8:40 pm
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael Busch updated LUCENE-730:
    ---------------------------------

    Fix Version/s: 2.2
  • Michael Busch (JIRA) at May 25, 2007 at 12:11 am
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael Busch updated LUCENE-730:
    ---------------------------------

    Attachment: lucene-730.patch

    With this patch the old BooleanScorer is only used if BooleanQuery.setUseScorer14(true) is set. It also enables the tests in QueryUtils again that check if the docs are returned in order.

    All tests pass.
  • Paul Elschot (JIRA) at May 26, 2007 at 7:46 am
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499293 ]

    Paul Elschot commented on LUCENE-730:
    -------------------------------------

    The patch applies cleanly here, all core tests pass.
    And I like the allowDocsOutOfOrder approach.

  • Michael Busch (JIRA) at May 26, 2007 at 4:16 pm
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499320 ]

    Michael Busch commented on LUCENE-730:
    --------------------------------------

    Thanks for reviewing, Paul!

    I will commit this soon if nobody objects...
  • Paul Elschot (JIRA) at May 26, 2007 at 7:18 pm
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499344 ]

    Paul Elschot commented on LUCENE-730:
    -------------------------------------

    No objection, only some remarks.

    One bigger issue:

    The latest patch defaults to docs in order above performance,
    but my personal taste is to have performance by default.

    And some smaller ones:

    One could still adapt QueryUtils to take the possibility
    of docs out of order into account.

    Some performance tests with prohibited scorers could still
    be needed to find out which of the boolean scorers does better
    on them.

  • Doron Cohen (JIRA) at May 26, 2007 at 10:40 pm
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499362 ]

    Doron Cohen commented on LUCENE-730:
    ------------------------------------

    Two comments:

    With this patch the class BooleanWeight is not
    in (direct) use anymore - it is extended by
    BooleanWeight2 and then only the latter is used,
    and creates either Scorer2 or Scorer. We could
    get rid of BooleanWeight2, and have a single
    class BooleanWeight.

    Javadocs for useScorer14 methods:
    /**
    * Indicates that 1.4 BooleanScorer should be used.
    * Being static, this setting is system wide.
    * Scoring in 1.4 mode may be faster.
    * But note that unlike the default behavior, it does
    * not guarantee that docs are collected in docid
    * order. In other words, with this setting,
    * {@link HitCollector#collect(int,float)} might be
    * invoked first for docid N and only later for docid N-1.
    */
    public static void setUseScorer14(boolean use14) {

    /**
    * Whether 1.4 BooleanScorer should be used.
    * @see #setUseScorer14(boolean)
    */
    public static boolean getUseScorer14() {
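
The ordering caveat in the javadoc above matters mostly for collectors that carry state from one hit to the next. As a rough sketch (the `Collector` interface below is a hypothetical stand-in for `HitCollector#collect(int,float)`, not Lucene's class, and both collector classes are invented for illustration), a collector that delta-encodes doc ids breaks when docid N-1 arrives after docid N, while one that only sets bits does not care:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Illustration only: none of these types are part of Lucene.
final class CollectOrderSketch {

    /** Hypothetical stand-in for HitCollector#collect(int,float). */
    interface Collector {
        void collect(int doc, float score);
    }

    /** Order-sensitive: stores the gaps between consecutive doc ids. */
    static final class DeltaCollector implements Collector {
        final List<Integer> deltas = new ArrayList<>();
        private int lastDoc = -1;

        @Override
        public void collect(int doc, float score) {
            if (doc <= lastDoc) {
                throw new IllegalStateException(
                        "doc " + doc + " collected after doc " + lastDoc);
            }
            deltas.add(doc - lastDoc);
            lastDoc = doc;
        }
    }

    /** Order-insensitive: only records which doc ids were hit. */
    static final class BitSetCollector implements Collector {
        final BitSet hits = new BitSet();

        @Override
        public void collect(int doc, float score) {
            hits.set(doc);
        }
    }

    /** Feeds the given doc ids to the collector in array order. */
    static void replay(int[] docs, Collector collector) {
        for (int doc : docs) {
            collector.collect(doc, 1.0f);
        }
    }

    public static void main(String[] args) {
        int[] outOfOrder = {7, 3, 5};

        BitSetCollector bits = new BitSetCollector();
        replay(outOfOrder, bits);
        System.out.println("bit set hits: " + bits.hits);

        try {
            replay(outOfOrder, new DeltaCollector());
        } catch (IllegalStateException e) {
            System.out.println("delta collector failed: " + e.getMessage());
        }
    }
}
```

An application whose collectors all look like the second one could probably enable the faster mode safely; one with collectors like the first would need docs in order.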

  • Hoss Man (JIRA) at May 26, 2007 at 11:44 pm
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499365 ]

    Hoss Man commented on LUCENE-730:
    ---------------------------------

    > The latest patch defaults to docs in order above performance,
    > but my personal taste is to have performance by default.

    I think it makes more sense to "default" to the most consistent, rigidly defined behavior (docs in order), since that behavior will work (by definition) for any caller regardless of whether the caller expects the docs in order or not.

    People who find performance lacking can then assess their needs and make a conscious choice to change the setting, and see if it actually improves performance in their use cases.

    (i.e., "avoid premature optimization" and all that)
  • Michael Busch (JIRA) at May 27, 2007 at 2:00 am
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499371 ]

    Michael Busch commented on LUCENE-730:
    --------------------------------------

    > The latest patch defaults to docs in order above performance,
    > but my personal taste is to have performance by default.

    I agree with Hoss here. IMO allowing docs out of order is a big
    API change. I think if people switch to 2.2 they just want it
    to work as before without having to add special settings. If
    they need better performance for certain types of queries and
    they know that their application can deal with docs out of order
    they can enable the faster scoring.
    So my vote is +1 for docs in order by default.

    > Some performance tests with prohibited scorers could still
    > be needed to find out which of the boolean scorers does better
    > on them.

    That'd be helpful. However, I'm currently working on some other
    issues. Maybe you or others would have some time to run those
    tests?
  • Michael Busch (JIRA) at May 27, 2007 at 2:02 am
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499372 ]

    Michael Busch commented on LUCENE-730:
    --------------------------------------

    > With this patch the class BooleanWeight is not
    > in (direct) use anymore - it is extended by
    > BooleanWeight2 and then only the latter is used,
    > and creates either Scorer2 or Scorer. We could
    > get rid of BooleanWeight2, and have a single
    > class BooleanWeight.

    Agree. Will do.

    > Javadocs for useScorer14 methods:

    This is good! Thanks Doron, I will add the javadocs
    to my patch.
  • Michael Busch (JIRA) at May 27, 2007 at 2:10 am
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael Busch updated LUCENE-730:
    ---------------------------------

    Attachment: lucene-730.patch

    New patch with the following changes:

    - Removes BooleanWeight2
    - Javadocs for useScorer14 methods provided by Doron
  • Paul Elschot (JIRA) at May 27, 2007 at 8:51 am
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499400 ]

    Paul Elschot commented on LUCENE-730:
    -------------------------------------

    (Is the patch reversed? It did not apply at the first attempt,
    probably because my working copy is not the same as the trunk.)
    After ant clean, the boolean tests still pass here:
    ant -Dtestcase='TestBool*' test-core

    A slight improvement for the javadocs of BooleanQuery.java.
    In the javadocs of the scorer() method it is indicated that a BooleanScorer2
    will always be used, so it is better to mention here that BooleanScorer2
    delegates to a 1.4 scorer in some cases:

    /**
    * Indicates that BooleanScorer2 will delegate
    * the scoring to a 1.4 BooleanScorer
    * for most queries without required clauses.
    * Being static, this setting is system wide.
    * Scoring in 1.4 mode may be faster.
    * But note that unlike the default behavior, it does
    * not guarantee that docs are collected in docid
    * order. In other words, with this setting,
    * {@link HitCollector#collect(int,float)} might be
    * invoked first for docid N and only later for docid N-1.
    */
    public static void setUseScorer14(boolean use14) {
    useScorer14 = use14;
    }
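
For readers wondering how a disjunction scorer can hand out docs out of docid order at all, here is a toy model. It is a simplified illustration under stated assumptions, not the BooleanScorer source: the class name, the tiny window size, and the single-window restriction are all invented for the sketch. Each clause drops its matching docs into a table of buckets, a bucket is linked to the front of a "valid" list the first time it is touched, and hits are then collected by walking that list, so the collection order follows first-touch order rather than docid order.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a bucket-based disjunction; not Lucene's implementation.
final class BucketDisjunctionSketch {

    static final int SIZE = 16;        // toy window size (a power of two)
    static final int MASK = SIZE - 1;

    static final class Bucket {
        int doc = -1;      // doc id currently stored in this bucket
        float score;       // sum of the clause scores for that doc
        Bucket next;       // next bucket in the list of valid buckets
    }

    /**
     * Returns doc ids in the order a collector would see them.
     * Assumes every doc id lies in [0, SIZE), i.e. a single window.
     */
    static List<Integer> collectOrder(int[][] postingsPerClause) {
        Bucket[] table = new Bucket[SIZE];
        for (int i = 0; i < SIZE; i++) {
            table[i] = new Bucket();
        }
        Bucket first = null;

        // Each clause is exhausted in turn; every match scores 1.0 here.
        for (int[] postings : postingsPerClause) {
            for (int doc : postings) {
                Bucket bucket = table[doc & MASK];
                if (bucket.doc != doc) {
                    // First touch: initialise and push onto the valid list.
                    bucket.doc = doc;
                    bucket.score = 1.0f;
                    bucket.next = first;
                    first = bucket;
                } else {
                    // Seen before: only accumulate the score.
                    bucket.score += 1.0f;
                }
            }
        }

        List<Integer> order = new ArrayList<>();
        for (Bucket b = first; b != null; b = b.next) {
            order.add(b.doc);
        }
        return order;
    }

    public static void main(String[] args) {
        int[][] postings = {{3, 10}, {5, 10}};
        System.out.println(collectOrder(postings)); // prints [5, 10, 3]
    }
}
```

In this model the work per hit is a table lookup and an addition, with no merging of the clauses by docid, which is one plausible reason such a scorer can be faster for top level disjunctions.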

  • Michael Busch (JIRA) at May 27, 2007 at 5:09 pm
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499415 ]

    Michael Busch commented on LUCENE-730:
    --------------------------------------

    > A slight improvement for the javadocs of BooleanQuery.java.
    > In the javadocs of the scorer() method it is indicated that a BooleanScorer2
    > will always be used, so it is better to mention here that BooleanScorer2
    > delegates to a 1.4 scorer in some cases:

    Maybe we should just deprecate the useScorer14 methods and add new methods
    allowDocsOutOfOrder. That should be easier to understand for the users.
    And probably most users don't know (or don't care about) the differences
    between BooleanScorer and BooleanScorer2 anyway.
  • Michael Busch (JIRA) at May 27, 2007 at 7:21 pm
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael Busch updated LUCENE-730:
    ---------------------------------

    Attachment: lucene-730.patch

    New patch that deprecates the useScorer14 methods and adds new
    methods:

    /**
    * Indicates whether hit docs may be collected out of docid
    * order. In other words, with this setting,
    * {@link HitCollector#collect(int,float)} might be
    * invoked first for docid N and only later for docid N-1.
    * Being static, this setting is system wide.
    * If docs out of order are allowed scoring might be faster
    * for certain queries (disjunction queries with less than
    * 32 prohibited terms). This setting has no effect for
    * other queries.
    */
    public static void setAllowDocsOutOfOrder(boolean allow);

    /**
    * Whether hit docs may be collected out of docid order.
    * @see #setAllowDocsOutOfOrder(boolean)
    */
    public static boolean getAllowDocsOutOfOrder();


    I think this is easier to understand for the users because it
    tells them what they need to know (docs in or out of order)
    and hides technical details (BooleanScorer vs. BooleanScorer2).

    All tests pass.
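
To make the described behavior concrete, here is a minimal sketch of a static, system-wide flag with a deprecated alias, plus the kind of check that could decide which scoring path applies. Everything in it is an assumption made for illustration: the class `ScorerChoiceSketch`, the `Kind` enum, the `choose` method, and the way the deprecated setter delegates are not taken from the patch. Only the method names `setAllowDocsOutOfOrder`, `getAllowDocsOutOfOrder` and `setUseScorer14`, the in-order default, and the "no required clauses, fewer than 32 prohibited" condition come from this thread.

```java
// Illustration only; not the code from lucene-730.patch.
final class ScorerChoiceSketch {

    /** Which scoring path a query would take in this sketch. */
    enum Kind { IN_ORDER, OUT_OF_ORDER_ALLOWED }

    // Static, hence system wide. Docs in order is the default.
    private static boolean allowDocsOutOfOrder = false;

    static void setAllowDocsOutOfOrder(boolean allow) {
        allowDocsOutOfOrder = allow;
    }

    static boolean getAllowDocsOutOfOrder() {
        return allowDocsOutOfOrder;
    }

    /**
     * Old name, kept for compatibility in this sketch.
     * @deprecated use {@link #setAllowDocsOutOfOrder(boolean)}
     */
    @Deprecated
    static void setUseScorer14(boolean use14) {
        // Assumption: the old setter simply forwards to the new one.
        setAllowDocsOutOfOrder(use14);
    }

    /**
     * Assumed selection rule: the faster path is only eligible when the
     * flag is set, the query has no required clauses, and it has fewer
     * than 32 prohibited clauses. All other queries stay in order.
     */
    static Kind choose(int requiredClauses, int prohibitedClauses) {
        if (allowDocsOutOfOrder
                && requiredClauses == 0
                && prohibitedClauses < 32) {
            return Kind.OUT_OF_ORDER_ALLOWED;
        }
        return Kind.IN_ORDER;
    }

    public static void main(String[] args) {
        System.out.println(choose(0, 0));   // flag off by default
        boolean previous = getAllowDocsOutOfOrder();
        setAllowDocsOutOfOrder(true);
        try {
            System.out.println(choose(0, 0));
            System.out.println(choose(1, 0));
        } finally {
            // Restore the system-wide setting for other callers.
            setAllowDocsOutOfOrder(previous);
        }
    }
}
```

Because the flag is static, a caller that flips it for a single search would probably want to restore the previous value afterwards, as the `finally` block in `main` does.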

  • Michael Busch (JIRA) at May 28, 2007 at 7:35 pm
    [ https://issues.apache.org/jira/browse/LUCENE-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael Busch resolved LUCENE-730.
    ----------------------------------

    Resolution: Fixed

    I just committed the latest patch. Thanks everyone!
