Hi Folks,

Before I run off and reinvent the wheel here - has anyone done any form
of result grouping with Lucene?

My use case looks something like this:
Newspaper pages are stored as documents in the Lucene index.
I need to list the newspapers that match my criteria in date order, so
that I can then in a subsequent search enumerate the first n pages from
each. n is derived from the UI - in this case the screen width the user
has available.
Ideally I'd want to pull all papers for a given date - so a way to pull
a result set that identifies a set of dates that have pages stored
against them would be ideal. It seems to me that the only way to do this
at present would be to define a custom collector and aggregate such a
result set on the fly?
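
As a rough illustration of the aggregation described above, the sketch below groups matching pages into date -> paper -> page numbers in plain Java. It is an in-memory stand-in for what a custom collector might accumulate, not Lucene API code; the class `PaperGrouping`, the `PageHit` type, and its field names are all hypothetical, and dates are assumed to be stored as ISO `yyyy-MM-dd` strings so that string order matches date order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

/** In-memory stand-in for what a custom collector might accumulate (hypothetical). */
class PaperGrouping {

    /** One matching page; all field names are assumptions for illustration. */
    static final class PageHit {
        final String date;   // ISO yyyy-MM-dd, so string order equals date order
        final String paper;  // newspaper title
        final int pageNo;    // page number within that issue

        PageHit(String date, String paper, int pageNo) {
            this.date = date;
            this.paper = paper;
            this.pageNo = pageNo;
        }
    }

    /** Groups hits as date -> paper -> sorted page numbers. */
    static TreeMap<String, TreeMap<String, TreeSet<Integer>>> group(List<PageHit> hits) {
        TreeMap<String, TreeMap<String, TreeSet<Integer>>> byDate = new TreeMap<>();
        for (PageHit hit : hits) {
            byDate.computeIfAbsent(hit.date, d -> new TreeMap<>())
                  .computeIfAbsent(hit.paper, p -> new TreeSet<>())
                  .add(hit.pageNo);
        }
        return byDate;
    }

    public static void main(String[] args) {
        // Made-up sample hits, deliberately out of order.
        List<PageHit> hits = Arrays.asList(
            new PageHit("2011-03-22", "Daily Example", 3),
            new PageHit("2011-03-21", "Daily Example", 1),
            new PageHit("2011-03-22", "Daily Example", 1),
            new PageHit("2011-03-22", "Evening Sample", 2));
        System.out.println(group(hits));
    }
}
```

The key set of the outer map is the set of dates that have pages stored against them, already in date order.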

My reason for wanting to group is so that I can easily compute the
next/previous start indexes as the user browses through the timeline. If
I have to include the (variable) page count each time it gets
convoluted. More so since some pages may be missing from each paper.
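
To make the paging argument concrete: once results are grouped, the start index can refer to a position in the ordered list of groups (dates) rather than to individual pages, so the variable page count drops out of the arithmetic. The sketch below is one possible way to do this under that assumption; `TimelinePager`, its method names, and the convention of returning -1 when there is no further window are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Paging over groups (dates) instead of over individual pages (hypothetical helper). */
class TimelinePager {

    /** Start index of the next window of groups, or -1 if already at the end. */
    static int nextStart(int start, int window, int groupCount) {
        int next = start + window;
        return next < groupCount ? next : -1;
    }

    /** Start index of the previous window of groups, or -1 if already at the beginning. */
    static int previousStart(int start, int window) {
        return start > 0 ? Math.max(0, start - window) : -1;
    }

    /** First n pages actually present; gaps in the page numbering are simply skipped. */
    static List<Integer> firstPages(List<Integer> sortedPagesPresent, int n) {
        int end = Math.max(0, Math.min(n, sortedPagesPresent.size()));
        return new ArrayList<>(sortedPagesPresent.subList(0, end));
    }
}
```

Here `window` is the number of groups shown per screen and `n` is the number of pages shown per paper; both would come from the UI.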

Any thoughts appreciated.

--

Rgds.
*Dawn Raison*
Technical Director, Digitorial Ltd.



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Grant Ingersoll at Mar 23, 2011 at 7:23 pm

    Have you looked at Solr and its date faceting capabilities? It also has result grouping, but I think what you are describing is just faceting/filtering.
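
For readers unfamiliar with the terms, the sketch below shows in plain Java what date faceting and filtering compute conceptually: a count of matching pages per date value, and a restriction of the matches to a date range. It does not use Solr or Lucene APIs; `DateFacets` and its methods are hypothetical names, and dates are assumed to be ISO `yyyy-MM-dd` strings.

```java
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Conceptual illustration of faceting and filtering on a date field (hypothetical). */
class DateFacets {

    /** Facet: number of matching pages per date value, in date order. */
    static TreeMap<String, Long> countByDate(List<String> pageDates) {
        return pageDates.stream()
            .collect(Collectors.groupingBy(d -> d, TreeMap::new, Collectors.counting()));
    }

    /** Filter: keep only dates within [from, to], inclusive. */
    static List<String> filterRange(List<String> pageDates, String from, String to) {
        return pageDates.stream()
            .filter(d -> d.compareTo(from) >= 0 && d.compareTo(to) <= 0)
            .collect(Collectors.toList());
    }
}
```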

    --------------------------
    Grant Ingersoll
    http://www.lucidimagination.com/

    Search the Lucene ecosystem docs using Solr/Lucene:
    http://www.lucidimagination.com/search


  • Dawn Zoë Raison at Mar 25, 2011 at 10:15 am

    On 23/03/2011 17:55, Grant Ingersoll wrote:
    Have you looked at Solr and date faceting capabilities? Also, it has result grouping, but I think you are just describing faceting/filtering.
    Solr is not an option; we already have the index (>2 million pages,
    some with 100,000 terms).
    What I'm looking to do is to create some new ways to view the data.

    Is there a good FAQ on faceting/filtering I can peruse?

    Ta.
    --

    Rgds.
    *Dawn Raison*
    Technical Director, Digitorial Ltd.
  • Ian Lea at Mar 25, 2011 at 10:31 am
    I'm not aware of a particular FAQ on this.

    There is something called bobo-browse - "Faceted search library based
    on Lucene - Google ...
    Bobo Browse is an information retrieval technology that provides
    navigational browsing into a semi-structured dataset. Beyond the
    result set from queries ...".

    http://sbdevel.wordpress.com/2010/09/24/sorting-faceting-index-lookup/
    and https://issues.apache.org/jira/browse/LUCENE-2369 sound
    interesting.


    --
    Ian.


Discussion Overview
group: java-user @
categories: lucene
posted: Mar 22, '11 at 10:44a
active: Mar 25, '11 at 10:31a
posts: 4
users: 3
website: lucene.apache.org
