FAQ
One final Lucene question for the day...

Is it possible for me to retrieve all the values of a particular field
that exists within an index, across all documents? I don't really need
(or want) to retrieve all documents and then collate them myself into
unique values, and it seems this information should be readily
accessible somehow. If so, could you share how?

For example, I'm indexing documents that have a "category" associated
with them. Several documents will share the same category. I'd like
to be able to retrieve all categories.

And if you are curious what my project actually is - its my custom
blogger, BlogScene: http://www.ehatchersolutions.com/servlets/blogscene
- its a Lucene-based blogger (hence part of the reason for its name).
All documents are indexed and stored in a Lucene index, and retrieved
dynamically for every request. Thanks to Lucene its working very very
nicely! I just need to add some polish to it, such as a category list
and such.

Erik


--
To unsubscribe, e-mail:
For additional commands, e-mail:

Search Discussions

  • Terry Steichen at Dec 30, 2002 at 5:38 pm
    Erik,

    The attached class does what you want.
    Regards,

    Terry

    PS: I've discovered that this code may not work if the index isn't optimized
    (though I've not a clue why that's so).

    ----- Original Message -----
    From: "Erik Hatcher" <lists@ehatchersolutions.com>
    To: <lucene-user@jakarta.apache.org>
    Sent: Monday, December 30, 2002 12:24 PM
    Subject: How to obtain unique field values

    One final Lucene question for the day...

    Is it possible for me to retrieve all the values of a particular field
    that exists within an index, across all documents? I don't really need
    (or want) to retrieve all documents and then collate them myself into
    unique values, and it seems this information should be readily
    accessible somehow. If so, could you share how?

    For example, I'm indexing documents that have a "category" associated
    with them. Several documents will share the same category. I'd like
    to be able to retrieve all categories.

    And if you are curious what my project actually is - its my custom
    blogger, BlogScene: http://www.ehatchersolutions.com/servlets/blogscene
    - its a Lucene-based blogger (hence part of the reason for its name).
    All documents are indexed and stored in a Lucene index, and retrieved
    dynamically for every request. Thanks to Lucene its working very very
    nicely! I just need to add some polish to it, such as a category list
    and such.

    Erik


    --
    To unsubscribe, e-mail:
    For additional commands, e-mail:

    >
  • Erik Hatcher at Dec 30, 2002 at 5:47 pm
    Terry,

    Thanks for that quick reply and code. I love open source!

    But it seems there should be a way to do this without walking every
    document, since these fields are indexed after all :)

    I realize that Lucene is very sophisticated under the covers, and there
    may be some technical reason why this isn't feasible though. But
    certainly when I do a query "category:SomeCategory" Lucene is not
    crawling all documents. Hopefully if its possible to obtain this
    information without walking every
    document in the index someone will share how - but if not, its not the
    end of the world for my project.

    Thanks,
    Erik


    On Monday, December 30, 2002, at 12:31 PM, Terry Steichen wrote:
    Erik,

    The attached class does what you want.
    Regards,

    Terry

    PS: I've discovered that this code may not work if the index isn't
    optimized
    (though I've not a clue why that's so).

    ----- Original Message -----
    From: "Erik Hatcher" <lists@ehatchersolutions.com>
    To: <lucene-user@jakarta.apache.org>
    Sent: Monday, December 30, 2002 12:24 PM
    Subject: How to obtain unique field values

    One final Lucene question for the day...

    Is it possible for me to retrieve all the values of a particular field
    that exists within an index, across all documents? I don't really
    need
    (or want) to retrieve all documents and then collate them myself into
    unique values, and it seems this information should be readily
    accessible somehow. If so, could you share how?

    For example, I'm indexing documents that have a "category" associated
    with them. Several documents will share the same category. I'd like
    to be able to retrieve all categories.

    And if you are curious what my project actually is - its my custom
    blogger, BlogScene:
    http://www.ehatchersolutions.com/servlets/blogscene
    - its a Lucene-based blogger (hence part of the reason for its name).
    All documents are indexed and stored in a Lucene index, and retrieved
    dynamically for every request. Thanks to Lucene its working very very
    nicely! I just need to add some polish to it, such as a category list
    and such.

    Erik


    --
    To unsubscribe, e-mail:
    For additional commands, e-mail:
    <Categories.java>--
    To unsubscribe, e-mail:
    For additional commands, e-mail:

    --
    To unsubscribe, e-mail:
    For additional commands, e-mail:
  • Doug Cutting at Dec 30, 2002 at 5:53 pm

    Erik Hatcher wrote:
    Is it possible for me to retrieve all the values of a particular field
    that exists within an index, across all documents?

    For example, I'm indexing documents that have a "category" associated
    with them. Several documents will share the same category. I'd like to
    be able to retrieve all categories.
    The trick is to enumerate terms with that field. Terms are sorted first
    by field, then by text, so all terms with a given field are adjacent in
    enumerations. Term enumeration is also efficient.

    try {
    TermEnum terms = indexReader.terms(new Term("category", ""));
    while ("category".equals(enum.term().field())) {

    ... collect enum.term().text() ...

    if (!terms.next())
    break;
    }
    } finally {
    terms.close();
    }

    Doug


    --
    To unsubscribe, e-mail:
    For additional commands, e-mail:
  • Erik Hatcher at Dec 30, 2002 at 7:16 pm
    Perfect! Many thanks.

    Erik

    On Monday, December 30, 2002, at 12:54 PM, Doug Cutting wrote:

    Erik Hatcher wrote:
    Is it possible for me to retrieve all the values of a particular
    field that exists within an index, across all documents?
    For example, I'm indexing documents that have a "category" associated
    with them. Several documents will share the same category. I'd like
    to be able to retrieve all categories.
    The trick is to enumerate terms with that field. Terms are sorted
    first by field, then by text, so all terms with a given field are
    adjacent in enumerations. Term enumeration is also efficient.

    try {
    TermEnum terms = indexReader.terms(new Term("category", ""));
    while ("category".equals(enum.term().field())) {

    ... collect enum.term().text() ...

    if (!terms.next())
    break;
    }
    } finally {
    terms.close();
    }

    Doug


    --
    To unsubscribe, e-mail:
    For additional commands, e-mail:

    --
    To unsubscribe, e-mail:
    For additional commands, e-mail:

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedDec 30, '02 at 5:24p
activeDec 30, '02 at 7:16p
posts5
users3
websitelucene.apache.org

People

Translate

site design / logo © 2022 Grokbase