Grokbase Groups Lucene dev March 2013
[ https://issues.apache.org/jira/browse/LUCENE-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597853#comment-13597853 ]

Bill Bell commented on LUCENE-3079:
-----------------------------------

Has this been integrated into SOLR? Even just adding it to SOLR as an option with hooks would be good. Or perhaps as an override in the lower levels?

Faceting module
---------------

Key: LUCENE-3079
URL: https://issues.apache.org/jira/browse/LUCENE-3079
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Reporter: Michael McCandless
Assignee: Shai Erera
Fix For: 3.4, 4.0-ALPHA

Attachments: facet-userguide.pdf, LUCENE-3079_4x_broken.patch, LUCENE-3079_4x.patch, LUCENE-3079-dev-tools.patch, LUCENE-3079.patch, LUCENE-3079.patch, LUCENE-3079.patch, LUCENE-3079.patch, TestPerformanceHack.java


Faceting is a hugely important feature, available in Solr today but
not [easily] usable by Lucene-only apps.
We should fix this, by creating a shared faceting module.
Ideally, we factor out Solr's faceting impl, and maybe poach/merge
from other impls (eg Bobo browse).
Hoss describes some important challenges we'll face in doing this
(http://markmail.org/message/5w35c2fr4zkiwsz6), copied here:
{noformat}
To look at "faceting" as a concrete example, there are big reasons
faceting works so well in Solr: Solr has total control over the
index, knows exactly when the index has changed to rebuild caches, has a
strict schema so it can make sense of field types and
pick faceting algos accordingly, has multi-phase distributed search
approach to get exact counts efficiently across multiple shards, etc...
(and there are still a lot of additional enhancements and improvements
that can be made to take even more advantage of knowledge solr has because
it "owns" the index that no one has had time to tackle)
{noformat}
This is a great list of the things we face in refactoring. It's also
important because, if Solr needed to be so deeply intertwined with
caching, schema, etc., other apps that want to facet will have the
same "needs" and so we really have to address them in creating the
shared module.
I think we should get a basic faceting module started, but should not
cut Solr over at first. We should iterate on the module, fold in
improvements, etc., and then, once we can fully verify that cutting
over doesn't hurt Solr (i.e. lose functionality or performance), we can
cut over later.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


  • Michael McCandless (JIRA) at Mar 9, 2013 at 8:51 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598066#comment-13598066 ]

    Michael McCandless commented on LUCENE-3079:
    --------------------------------------------

    bq. Has this been integrated into SOLR?

    Hi Bill,

    No, not yet ... the separate taxonomy index makes it tricky. However, the new facet method based on SortedSetDocValues is already being used in Solr, and there is a patch (LUCENE-4795) to add it to the faceting module, so there's a small overlap there...

Discussion Overview
group: dev @
categories: lucene
posted: Mar 9, '13 at 5:53a
active: Mar 9, '13 at 8:51p
posts: 2
users: 1
website: lucene.apache.org
