FAQ
(Thanks for the many answers to my initial Lucene question, "Best practices for multiple languages?")

We are facing the following problem:
because the access rules on our content are very dynamic, we cannot express them as Filter(s).
Hence we need to search first and then apply the access rules (i.e. a security filter).
What is the best approach to implementing paging in this situation? Keep in mind that the overall context is a web app ;-)

Thanks for your advice
- Clemens

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

  • Ian Lea at Jan 21, 2011 at 10:10 am
    The standard recommendation for paging is to re-execute the search
    for second and subsequent pages and return the second or subsequent
    chunk of hits. Would that not work in your case?

    An alternative is to read and cache hits from the initial search but
    that is generally more complex.


    --
    Ian.
  • Clemens Wyss at Jan 21, 2011 at 1:32 pm
    The problem is that, because we filter AFTER searching the index, we don't know how many TopDocs to read in order to have "enough" for page x.

    Does Lucene's search allow injecting a kind of "voter"/"vetoer" that is called for every hit (ScoreDoc) Lucene encounters? This voter should be able to reject the given hit from TopDocs (by returning false). Can this be done with a custom Filter?
    This of course comes with a performance penalty, but it would allow us to search for n ScoreDocs, even in our case...
  • Ian Lea at Jan 21, 2011 at 1:57 pm
    > The problem is that, because we filter AFTER searching the index, we don't know how many TopDocs to read in order to have "enough" for page x.

    Think of a number and double it? Unless the number gets really high,
    Lucene is generally plenty fast enough. Or read n and, if you don't have
    enough after filtering, loop and read n + something.

    > Does Lucene's search allow injecting a kind of "voter"/"vetoer" that is called for every hit (ScoreDoc) Lucene encounters? This voter should be able to reject the given hit from TopDocs (by returning false). Can this be done with a custom Filter?
    > This of course comes with a performance penalty, but it would allow us to search for n ScoreDocs, even in our case...

    You can use a custom Collector. From the javadocs:

    "Collectors are primarily meant to be used to gather raw results from
    a search, and implement sorting or custom result filtering, collation,
    etc.".

    Custom result filtering sounds familiar ...
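
    The "read n and, if that is not enough after filtering, read more" loop suggested above could be sketched roughly as follows. This is only an illustration in plain Java, not Lucene code: FilteredPager, searchTopN and allowed are hypothetical names, where searchTopN stands in for re-running the query for the top n hits and allowed stands in for the dynamic access check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

final class FilteredPager {

    /**
     * Returns the doc ids of one page of accessible hits.
     * searchTopN.apply(n) is assumed to re-run the query and return the
     * top n doc ids in rank order (fewer if the index has fewer matches).
     */
    static List<Integer> page(IntFunction<int[]> searchTopN, IntPredicate allowed,
                              int pageIndex, int pageSize) {
        int needed = (pageIndex + 1) * pageSize; // accessible hits needed up to the end of this page
        int fetch = needed * 2;                  // "think of a number and double it"
        while (true) {
            int[] hits = searchTopN.apply(fetch);
            List<Integer> visible = new ArrayList<>();
            for (int doc : hits) {
                if (allowed.test(doc)) {
                    visible.add(doc);            // post-search security filter
                }
            }
            boolean exhausted = hits.length < fetch; // no more matches in the index
            if (visible.size() >= needed || exhausted) {
                int from = Math.min(pageIndex * pageSize, visible.size());
                int to = Math.min(from + pageSize, visible.size());
                return new ArrayList<>(visible.subList(from, to));
            }
            fetch *= 2;                          // not enough yet: loop and read more
        }
    }

    public static void main(String[] args) {
        // Fake index: docs 0..99 ranked by id; only even docs are accessible.
        IntFunction<int[]> search = n -> IntStream.range(0, Math.min(n, 100)).toArray();
        System.out.println(page(search, doc -> doc % 2 == 0, 1, 5)); // prints [10, 12, 14, 16, 18]
    }
}
```

    The cost is that deep pages re-run the search with ever larger n, which fits the "re-execute the search for each page" recommendation as long as the numbers stay moderate.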


    --
    Ian.

  • Uwe Schindler at Jan 21, 2011 at 1:58 pm
    You can write a custom Collector that does this (by simply not delegating
    the collect(int) call for rejected hits) and wrap TopDocsCollector with it.

    Alternative: plug in a Filter that filters your documents during the query.
    As doing this while iterating over hits is often costly, the ideal solution
    would be to create a cached Lucene Filter that stores the documents the user
    is allowed to access in an OpenBitSet and returns that from getDocIdSet().

    When implementing such a filter (and also the collector above), remember
    that Lucene works on index segments so please use the IndexReader and
    docBase passed to Collector.setNextReader() and Filter.getDocIdSet().
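
    The wrapping idea, including the per-segment docBase, might look roughly like the sketch below. It deliberately avoids the Lucene classes: HitCollector, ListCollector and VetoingCollector are hypothetical stand-ins that model only the two callbacks mentioned above (the real Collector class has further methods, and its setNextReader() also receives the segment's IndexReader), and the access check is assumed to work on global doc ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical stand-in for the two Collector callbacks discussed in the thread. */
interface HitCollector {
    void setNextReader(int docBase); // called once per index segment
    void collect(int segmentDoc);    // doc id relative to the current segment
}

/** Plays the role of the wrapped TopDocsCollector: records global doc ids. */
final class ListCollector implements HitCollector {
    final List<Integer> docs = new ArrayList<>();
    private int docBase;

    public void setNextReader(int docBase) { this.docBase = docBase; }

    public void collect(int segmentDoc) { docs.add(docBase + segmentDoc); }
}

/** Vetoes hits by not delegating collect() for documents the user may not see. */
final class VetoingCollector implements HitCollector {
    private final HitCollector delegate;
    private final IntPredicate allowed; // assumption: tested against the global doc id
    private int docBase;

    VetoingCollector(HitCollector delegate, IntPredicate allowed) {
        this.delegate = delegate;
        this.allowed = allowed;
    }

    public void setNextReader(int docBase) {
        this.docBase = docBase;          // remember the offset of the current segment
        delegate.setNextReader(docBase); // the delegate needs it as well
    }

    public void collect(int segmentDoc) {
        if (allowed.test(docBase + segmentDoc)) {
            delegate.collect(segmentDoc);
        }
        // Rejected hits are simply dropped, so they never reach the delegate.
    }

    public static void main(String[] args) {
        ListCollector top = new ListCollector();
        HitCollector veto = new VetoingCollector(top, doc -> doc != 1 && doc != 3);
        veto.setNextReader(0);           // first segment holds global docs 0..2
        veto.collect(0); veto.collect(1); veto.collect(2);
        veto.setNextReader(3);           // second segment starts at global doc 3
        veto.collect(0); veto.collect(1);
        System.out.println(top.docs);    // prints [0, 2, 4]
    }
}
```

    Because the wrapped collector only ever sees accepted hits, asking it for n results yields n accessible documents, which is what makes paging predictable again.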

    Uwe

    -----
    Uwe Schindler
    H.-H.-Meier-Allee 63, D-28213 Bremen
    http://www.thetaphi.de
    eMail: uwe@thetaphi.de

Discussion Overview
group: java-user
categories: lucene
posted: Jan 20, 2011 at 7:36 am
active: Jan 21, 2011 at 1:58 pm
posts: 5
users: 3
website: lucene.apache.org
