BlockJoinQuery: Allow to add a custom child collector, and customize the parent bitset extraction
-------------------------------------------------------------------------------------------------

Key: LUCENE-3282
URL: https://issues.apache.org/jira/browse/LUCENE-3282
Project: Lucene - Java
Issue Type: Improvement
Components: core/search
Affects Versions: 3.4, 4.0
Reporter: Shay Banon


It would be nice to be able to add a custom child collector to BlockJoinQuery that is called on every matching child doc (so we can do things with it, like counts and such). It would also be nice to be able to extend BlockJoinQuery with custom code that converts the filter bitset to an OpenBitSet.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

  • Shay Banon (JIRA) at Jul 7, 2011 at 12:59 am
    [ https://issues.apache.org/jira/browse/LUCENE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shay Banon updated LUCENE-3282:
    -------------------------------

    Attachment: LUCENE-3282.patch
  • Michael McCandless (JIRA) at Jul 8, 2011 at 8:13 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062151#comment-13062151 ]

    Michael McCandless commented on LUCENE-3282:
    --------------------------------------------

    This looks great Shay!

    What was the use case for subclassing to translate the filter into OBS? Is it a custom filter cache? Makes me nervous because the app really should create & reuse this OBS filter, usually...

    On the Collector: we try to keep our Querys IR-state-free... so it makes me nervous to stick a Collector right on the Query. Can we add a CollectorProvider that the Query invokes when it makes the Weight/Scorer?

    Instead of NoOpCollector can we just check for null?
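
To make the CollectorProvider suggestion above concrete, here is a minimal sketch. It does not use the Lucene API: `ChildCollector`, `ChildCollectorProvider`, `SketchJoinQuery` and `SketchScorer` are hypothetical stand-ins invented for illustration, and the actual patch may look quite different. The point is only that the query object stays immutable and a fresh collector is created each time a scorer is built, with a null check in place of a no-op collector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Lucene Collector: only the per-doc callback is modelled.
interface ChildCollector {
    void collect(int doc);
}

// Hypothetical factory. The query keeps this, never a live collector.
interface ChildCollectorProvider {
    ChildCollector newCollector();
}

// Simple collector that counts the child docs it is shown.
final class CountingCollector implements ChildCollector {
    int count;

    @Override
    public void collect(int doc) {
        count++;
    }
}

// Hypothetical stand-in for the query: immutable, no per-search state.
final class SketchJoinQuery {
    private final int[] matchingChildDocs;         // pretend output of the child query
    private final ChildCollectorProvider provider; // null means "no child collection"

    SketchJoinQuery(int[] matchingChildDocs, ChildCollectorProvider provider) {
        this.matchingChildDocs = matchingChildDocs.clone();
        this.provider = provider;
    }

    // Plays the role of Weight/Scorer creation: per-search state is created here.
    SketchScorer createScorer() {
        ChildCollector collector = (provider == null) ? null : provider.newCollector();
        return new SketchScorer(matchingChildDocs, collector);
    }
}

final class SketchScorer {
    private final int[] docs;
    private final ChildCollector collector;

    SketchScorer(int[] docs, ChildCollector collector) {
        this.docs = docs;
        this.collector = collector;
    }

    // Visits every matching child doc; a null check replaces a no-op collector.
    int scoreAll() {
        for (int doc : docs) {
            if (collector != null) {
                collector.collect(doc);
            }
        }
        return docs.length;
    }
}

final class CollectorProviderDemo {
    // Runs the same query twice and reports {collectors created, count 1, count 2}.
    static int[] run() {
        List<CountingCollector> created = new ArrayList<>();
        ChildCollectorProvider provider = () -> {
            CountingCollector c = new CountingCollector();
            created.add(c);
            return c;
        };
        SketchJoinQuery query = new SketchJoinQuery(new int[] {1, 2, 4}, provider);
        query.createScorer().scoreAll();
        query.createScorer().scoreAll();
        return new int[] {created.size(), created.get(0).count, created.get(1).count};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

Because each search gets its own collector, running the same query object twice does not mix counts between searches.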
  • Shay Banon (JIRA) at Jul 12, 2011 at 12:20 am
    [ https://issues.apache.org/jira/browse/LUCENE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063619#comment-13063619 ]

    Shay Banon commented on LUCENE-3282:
    ------------------------------------

    Heya,

    In my app, I have a wrapper around OBS with a common interface that allows accessing bits by index (similar to Bits in trunk), so I need to extract the OBS from it.

    Regarding the Collector, I will work on a CollectorProvider interface. I liked the NoOpCollector option since then you don't have to check for nulls each time...
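
The wrapper-around-OBS case described above could be handled with an overridable extraction hook. The following is only a sketch under stated assumptions: `java.util.BitSet` stands in for OpenBitSet, and `IndexedBits`, `WrappedBits`, `ParentBitsResolver` and `UnwrappingResolver` are hypothetical names, not classes from Lucene or from the patch. It relies on the block-join convention that a parent doc is indexed after its children, so the next set parent bit closes the block.

```java
import java.util.BitSet;

// Hypothetical index-based view, similar in spirit to the Bits interface mentioned above.
interface IndexedBits {
    boolean get(int index);
    int length();
}

// Hypothetical application-side wrapper around a concrete bitset.
final class WrappedBits implements IndexedBits {
    private final BitSet delegate;
    private final int length;

    WrappedBits(BitSet delegate, int length) {
        this.delegate = delegate;
        this.length = length;
    }

    @Override
    public boolean get(int index) {
        return delegate.get(index);
    }

    @Override
    public int length() {
        return length;
    }

    BitSet unwrap() {
        return delegate;
    }
}

// Hypothetical join helper with an extraction hook that subclasses may override.
class ParentBitsResolver {
    // Default behaviour: only a raw BitSet is understood.
    protected BitSet toBitSet(Object filterResult) {
        if (filterResult instanceof BitSet) {
            return (BitSet) filterResult;
        }
        throw new IllegalArgumentException("unsupported filter result: " + filterResult.getClass());
    }

    // Returns the parent doc that closes the block containing childDoc, or -1 if none.
    final int parentOf(Object filterResult, int childDoc) {
        return toBitSet(filterResult).nextSetBit(childDoc + 1);
    }
}

// Application subclass that knows how to unwrap its own wrapper type.
final class UnwrappingResolver extends ParentBitsResolver {
    @Override
    protected BitSet toBitSet(Object filterResult) {
        if (filterResult instanceof WrappedBits) {
            return ((WrappedBits) filterResult).unwrap();
        }
        return super.toBitSet(filterResult);
    }
}

final class BitsExtractionDemo {
    // Parents sit at docs 3 and 7; children are 0-2 and 4-6.
    static int parentOfChild(int childDoc) {
        BitSet parents = new BitSet(8);
        parents.set(3);
        parents.set(7);
        return new UnwrappingResolver().parentOf(new WrappedBits(parents, 8), childDoc);
    }

    public static void main(String[] args) {
        System.out.println(parentOfChild(4));
    }
}
```

The concern raised earlier in the thread still applies: the hook only converts the representation, so the application remains responsible for creating and reusing the underlying bitset.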
  • Shay Banon (JIRA) at Jul 12, 2011 at 12:28 am
    [ https://issues.apache.org/jira/browse/LUCENE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shay Banon updated LUCENE-3282:
    -------------------------------

    Attachment: LUCENE-3282.patch

    New version, with CollectorProvider.
  • Michael McCandless (JIRA) at Jul 12, 2011 at 5:45 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064034#comment-13064034 ]

    Michael McCandless commented on LUCENE-3282:
    --------------------------------------------

    Thanks Shay... looking close, but:

    Mulling on this more... I think having the BlockJoinQuery invoke
    the child collector is not the right place, because docs collected
    here don't necessarily match the overall query being executed. And
    also we can miss some docs, e.g. we don't collect in advance().

    Should we instead move the child collector into BlockJoinCollector?
    It has access to all the scorers for BlockJoinQuery involved in the
    parent query, and to all child docs for each parent doc collected.
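
The two problems described above (collected docs that do not match the overall query, and docs missed because of advance) can be reproduced with a small simulation. This is a sketch only: `NotifyingIterator` is a hypothetical stand-in for a scorer with a collector attached, the doc IDs are made up, and the leapfrog loop is a simplified version of how a conjunction might drive its sub-scorers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a scorer that reports every doc it lands on.
final class NotifyingIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs;         // sorted matching doc IDs
    private final List<Integer> seen; // docs reported to the "collector"; null disables reporting
    private int pos = -1;

    NotifyingIterator(int[] docs, List<Integer> seen) {
        this.docs = docs;
        this.seen = seen;
    }

    int nextDoc() {
        pos++;
        return land();
    }

    // Jumps to the first doc >= target; docs skipped on the way are never reported.
    int advance(int target) {
        do {
            pos++;
        } while (pos < docs.length && docs[pos] < target);
        return land();
    }

    private int land() {
        if (pos >= docs.length) {
            pos = docs.length;
            return NO_MORE_DOCS;
        }
        if (seen != null) {
            seen.add(docs[pos]);
        }
        return docs[pos];
    }
}

final class AdvanceMissDemo {
    // Leapfrog intersection, driving both iterators with nextDoc/advance.
    static List<Integer> intersect(NotifyingIterator a, NotifyingIterator b) {
        List<Integer> hits = new ArrayList<>();
        int da = a.nextDoc();
        int db = b.nextDoc();
        while (da != NotifyingIterator.NO_MORE_DOCS && db != NotifyingIterator.NO_MORE_DOCS) {
            if (da == db) {
                hits.add(da);
                da = a.nextDoc();
                db = b.nextDoc();
            } else if (da < db) {
                da = a.advance(db);
            } else {
                db = b.advance(da);
            }
        }
        return hits;
    }

    // Returns [docs seen by the collector, docs matching the whole conjunction].
    static List<List<Integer>> run() {
        List<Integer> seen = new ArrayList<>();
        NotifyingIterator joinSide = new NotifyingIterator(new int[] {1, 2, 3, 8, 9}, seen);
        NotifyingIterator otherClause = new NotifyingIterator(new int[] {3, 9, 12}, null);
        List<Integer> hits = intersect(joinSide, otherClause);
        return Arrays.asList(seen, hits);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With these inputs the collector sees docs 1, 3, 8 and 9, while the conjunction matches only 3 and 9. Docs 1 and 8 are reported although they do not match the overall query, and doc 2 matches the join side but is skipped by advance and never reported.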

  • Shay Banon (JIRA) at Jul 16, 2011 at 9:22 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066536#comment-13066536 ]

    Shay Banon commented on LUCENE-3282:
    ------------------------------------

    The idea of this is to collect matching child docs regardless of what matches parent-wise, and yea, we might miss some depending on the type of query that is actually "wrapping" it, but I think it's still useful.
  • Michael McCandless (JIRA) at Jul 18, 2011 at 10:55 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067369#comment-13067369 ]

    Michael McCandless commented on LUCENE-3282:
    --------------------------------------------

    I really don't like that this child collector would be buggy (lose results when involved in a parent query that uses advance); this will cause problems for users, asking why some hits are missing.

    Maybe, instead, we could make a generic wrapper class, taking any Query and a CollectorProvider, producing a Query, so that all hits "visited" by the sub-query are sent to the collector, with clear warnings that this collector will hit false positives (it'll see hits that don't match the top-level Query) and false negatives (it'll miss hits that did match the wrapped Query)?

    How are you using this child collector such that the false positives/negatives aren't a problem? EG do you know the parent query will never use advance?
  • Shay Banon (JIRA) at Jul 29, 2011 at 7:00 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072979#comment-13072979 ]

    Shay Banon commented on LUCENE-3282:
    ------------------------------------

    Hi, sorry for the late response, I missed the comment.

    Yea, I agree that there will be false positives, but that's the idea of it (sometimes you want to run facets, for example, on "sub queries"). Btw, I got your point on advance. Do you think that, if a collector exists, advance should be implemented by iterating over all docs up to the target doc?

    Regarding the wrapper, interesting! I need to have a look at how to generalize it, but it should be simple, I think. I'll try and work on it.
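
The advance idea raised above (iterating over all docs up to the target when a collector exists) could look roughly like this. It is a sketch, not the patch: `ScanningIterator` is a hypothetical stand-in for a scorer, and the doc IDs are invented. The trade-off is that advance becomes a linear scan, so the skipping benefit is lost whenever a collector is attached.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a scorer whose advance() never skips silently.
final class ScanningIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs;         // sorted matching doc IDs
    private final List<Integer> seen; // docs reported to the "collector"
    private int pos = -1;

    ScanningIterator(int[] docs, List<Integer> seen) {
        this.docs = docs;
        this.seen = seen;
    }

    int nextDoc() {
        if (pos < docs.length) {
            pos++;
        }
        if (pos >= docs.length) {
            return NO_MORE_DOCS;
        }
        seen.add(docs[pos]);
        return docs[pos];
    }

    // Implemented as repeated nextDoc() calls, so every doc on the way is reported.
    int advance(int target) {
        int doc;
        do {
            doc = nextDoc();
        } while (doc < target);
        return doc;
    }
}

final class ScanningAdvanceDemo {
    // Drives the iterator with a mix of nextDoc and advance and returns what was reported.
    static List<Integer> run() {
        List<Integer> seen = new ArrayList<>();
        ScanningIterator it = new ScanningIterator(new int[] {1, 2, 3, 8, 9}, seen);
        it.nextDoc();  // lands on 1
        it.advance(3); // passes 2, lands on 3
        it.nextDoc();  // lands on 8
        it.advance(9); // lands on 9
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Here every doc matching the wrapped side (1, 2, 3, 8, 9) is reported, so nothing is missed. Docs that do not match the top-level query are still reported, which is the false-positive behaviour discussed earlier in the thread.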
