FAQ
Hello all,
I have a how-to question. I have a field with these tokens in it (a b c f b g a) and I am searching on it with these tokens (a f e g a). So far this is easy I just set up a BooleanQuery with a bunch of optional TermQueries and get hits on (a f g a) but not (e) which is close to what I want. Now the part I am having difficulties with, I want to only find document that have those search tokens in order. Search (a f e g a) gets a hit, search (a f g e a) would not. A PhraseQuery expects all the search terms to be there but is close to what I want. Currently I do the set of TermQueries then filter out any document that does not have the terms in the right order as I am displaying them. I do not cycle through the entire Hits object, just enough to get something to display This works but then I never know the real count of found documents without going through all of them, which is slow. The option I am pondering is writing my own Query object but I'd rather find an easier
solution. Any thoughts?

Thanks,
Richard K.


---------------------------------
8:00? 8:25? 8:40? Find a flick in no time
with theYahoo! Search movie showtime shortcut.

Search Discussions

  • Erick Erickson at Mar 21, 2007 at 1:23 am
    I'm betting you can make SpanNearQuery work for you. In the simple case it's
    a bunch of SpanQuerys (which in its simplest form is just a Span version of
    TermQuery). The two other parameters are slop (See Lucene In Action for an
    explanation of this) and whether the terms must appear in the order they are
    in the array of SpanQuery you use in the constructor to the SpanNearQuery.

    Once you construct the SpanNearQuery, you can use it just like any other
    query.

    So, if you wanted to find all the documents with your terms in order but
    didn't care how close together they were, you could specify a very large
    slop and inOrder = true. If you only wanted proximate terms, select a
    reasonable slop. Defining "reasonable" is an exercise left to the reader...

    Best
    Erick
    On 3/20/07, Santa Clause wrote:

    Hello all,
    I have a how-to question. I have a field with these tokens in it (a b c
    f b g a) and I am searching on it with these tokens (a f e g a). So far this
    is easy I just set up a BooleanQuery with a bunch of optional TermQueries
    and get hits on (a f g a) but not (e) which is close to what I want. Now the
    part I am having difficulties with, I want to only find document that have
    those search tokens in order. Search (a f e g a) gets a hit, search (a f g e
    a) would not. A PhraseQuery expects all the search terms to be there but is
    close to what I want. Currently I do the set of TermQueries then filter out
    any document that does not have the terms in the right order as I am
    displaying them. I do not cycle through the entire Hits object, just enough
    to get something to display This works but then I never know the real count
    of found documents without going through all of them, which is slow. The
    option I am pondering is writing my own Query object but I'd rather find an
    easier
    solution. Any thoughts?

    Thanks,
    Richard K.


    ---------------------------------
    8:00? 8:25? 8:40? Find a flick in no time
    with theYahoo! Search movie showtime shortcut.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedMar 21, '07 at 1:07a
activeMar 21, '07 at 1:23a
posts2
users2
websitelucene.apache.org

2 users in discussion

Erick Erickson: 1 post Santa Clause: 1 post

People

Translate

site design / logo © 2022 Grokbase