FAQ
I have created an index and each document has a contents field and a
language field.

contents has the flags: Indexed Tokenized Stored Vector
language has the flags: Indexed Stored

In luke I can search contents fine, but when I try to search the field
language, I never ever get results.

Every document has a language field populated which I can see in Luke's
document browser.

I have tried language:english and even made language the default field.

I gotta be missing something simple....

Search Discussions

  • Erick Erickson at Sep 5, 2009 at 1:03 am
    Hmmmmm. Let's see the queries, and query.toString() may give
    you some clues. I *suspect* that you really didn't index language.
    Did you, perhaps, not re-index all your docs? Or use an analyzer
    that didn't fold case when searching but did when searching (or
    vice-versa)?

    It's *possible* that you've got some kind of case problem. I've also
    done really silly things like stared at the field for hours and never
    noticed that it was "langauge".

    Two things would help:
    1> show us the index code you actually use.
    2> make an index with, say, two documents and attach it to a post. Maybe
    someone with eyes less tired than yours will spot the issue.

    Because you're right. This *should* work just fine, so there's probably
    something that'll make you slap your forehead when you figure it out.
    I speak from experience here, I've given myself lumps on my forehead
    when someone pointed out *perfectly obvious* problems that I couldn't
    find for hours<G>.

    Best
    Erick
    On Fri, Sep 4, 2009 at 7:55 PM, Ian Vink wrote:

    I have created an index and each document has a contents field and a
    language field.

    contents has the flags: Indexed Tokenized Stored Vector
    language has the flags: Indexed Stored

    In luke I can search contents fine, but when I try to search the field
    language, I never ever get results.

    Every document has a language field populated which I can see in Luke's
    document browser.

    I have tried language:english and even made language the default field.

    I gotta be missing something simple....
  • Ian Vink at Sep 5, 2009 at 2:08 am
    Case?

    Hmm. I thought it was case intensive.

    I will re-index and revert all to lower case and see. The language is stored
    as "English" not "english"

    ian

    P.S. Here's what I built with a basic understanding of Lucene.
    http://BahaiResearch.com it's open source, ad-free. It allows people in 20
    languages search the texts of any religion. Promotes understanding among the
    different peoples of the world. Chinese Buddhists can better understand say
    Spanish Baha'is or Dutch Jews.

    On Fri, Sep 4, 2009 at 10:02 PM, Erick Erickson wrote:

    Hmmmmm. Let's see the queries, and query.toString() may give
    you some clues. I *suspect* that you really didn't index language.
    Did you, perhaps, not re-index all your docs? Or use an analyzer
    that didn't fold case when searching but did when searching (or
    vice-versa)?

    It's *possible* that you've got some kind of case problem. I've also
    done really silly things like stared at the field for hours and never
    noticed that it was "langauge".

    Two things would help:
    1> show us the index code you actually use.
    2> make an index with, say, two documents and attach it to a post. Maybe
    someone with eyes less tired than yours will spot the issue.

    Because you're right. This *should* work just fine, so there's probably
    something that'll make you slap your forehead when you figure it out.
    I speak from experience here, I've given myself lumps on my forehead
    when someone pointed out *perfectly obvious* problems that I couldn't
    find for hours<G>.

    Best
    Erick
    On Fri, Sep 4, 2009 at 7:55 PM, Ian Vink wrote:

    I have created an index and each document has a contents field and a
    language field.

    contents has the flags: Indexed Tokenized Stored Vector
    language has the flags: Indexed Stored

    In luke I can search contents fine, but when I try to search the field
    language, I never ever get results.

    Every document has a language field populated which I can see in Luke's
    document browser.

    I have tried language:english and even made language the default field.

    I gotta be missing something simple....
  • Erick Erickson at Sep 9, 2009 at 1:28 pm
    It's all in the analyzers. Depending upon which analyzer you use manythings
    happen to the input stream. Casing is one example, but that's just
    the simplest. Which is why it's so important to use the same analyzer
    when indexing and querying unless you have a *very* good reason not to.

    I'd really advise looking at the Analyzer docs to see what transformations
    they make. You'll see they're composed of Filters that perform various
    transformations..

    Best
    Erick
    On Fri, Sep 4, 2009 at 10:07 PM, Ian Vink wrote:

    Case?

    Hmm. I thought it was case intensive.

    I will re-index and revert all to lower case and see. The language is
    stored
    as "English" not "english"

    ian

    P.S. Here's what I built with a basic understanding of Lucene.
    http://BahaiResearch.com it's open source, ad-free. It allows people in 20
    languages search the texts of any religion. Promotes understanding among
    the
    different peoples of the world. Chinese Buddhists can better understand say
    Spanish Baha'is or Dutch Jews.


    On Fri, Sep 4, 2009 at 10:02 PM, Erick Erickson <erickerickson@gmail.com
    wrote:
    Hmmmmm. Let's see the queries, and query.toString() may give
    you some clues. I *suspect* that you really didn't index language.
    Did you, perhaps, not re-index all your docs? Or use an analyzer
    that didn't fold case when searching but did when searching (or
    vice-versa)?

    It's *possible* that you've got some kind of case problem. I've also
    done really silly things like stared at the field for hours and never
    noticed that it was "langauge".

    Two things would help:
    1> show us the index code you actually use.
    2> make an index with, say, two documents and attach it to a post. Maybe
    someone with eyes less tired than yours will spot the issue.

    Because you're right. This *should* work just fine, so there's probably
    something that'll make you slap your forehead when you figure it out.
    I speak from experience here, I've given myself lumps on my forehead
    when someone pointed out *perfectly obvious* problems that I couldn't
    find for hours<G>.

    Best
    Erick
    On Fri, Sep 4, 2009 at 7:55 PM, Ian Vink wrote:

    I have created an index and each document has a contents field and a
    language field.

    contents has the flags: Indexed Tokenized Stored Vector
    language has the flags: Indexed Stored

    In luke I can search contents fine, but when I try to search the field
    language, I never ever get results.

    Every document has a language field populated which I can see in Luke's
    document browser.

    I have tried language:english and even made language the default field.

    I gotta be missing something simple....

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedSep 4, '09 at 11:56p
activeSep 9, '09 at 1:28p
posts4
users2
websitelucene.apache.org

2 users in discussion

Ian Vink: 2 posts Erick Erickson: 2 posts

People

Translate

site design / logo © 2022 Grokbase