FAQ
I needed to return my hits list in date/time order (instead of
relevancy). So, I implemented a class that converted dates to an int
and stored the integer as a field in my index. I passed a Sort object
to the IndexSearcher (indicating that the sort field was convertible to
int) to get things back in date/time order. It works great.



Now I need to do a search within a date/time range (e.g., all the
documents between 1/1/2005 00:00:00 to 1/5/2005 23:59:59). It appears
that there is no way to use the integer date to do the range search
since Lucene is really looking at the string representation of the int.
Even prefixing enough zeros to make all of the integers the same length
doesn't help (darn negative numbers).



Have I missed something? Or, will I need to put some kind of date/time
string that looks like "20050308172533" (for 3/8/2005 17:25:33) in the
index?



My second question is whether range searches are efficient? I seem to
recall reading somewhere that range searches get converted to a list of
the individual items. Since a day has 86,400 seconds, a few days worth
of searching would require a very large list. Please tell me that I'm
out-in-the-weeds on my recall and that range searches are efficient
(i.e., it's effectively just doing a string compare with the start and
end terms).



Scott

Search Discussions

  • Otis Gospodnetic at Mar 9, 2005 at 2:53 am
    Your memory is serving you well.
    http://www.lucenebook.com/search?query=%22range+query%22+performance

    Note the hit in section 6.5.1 - the fact that we used range queries in
    the performance section is an indicator that one can really mess things
    up if using range queries injudiciously. :) In particular, the typical
    advice is to try to round the time information, to avoid costly range
    queries. Seconds sound like trouble.

    Otis


    --- Scott Smith wrote:
    I needed to return my hits list in date/time order (instead of
    relevancy). So, I implemented a class that converted dates to an int
    and stored the integer as a field in my index. I passed a Sort
    object
    to the IndexSearcher (indicating that the sort field was convertible
    to
    int) to get things back in date/time order. It works great.



    Now I need to do a search within a date/time range (e.g., all the
    documents between 1/1/2005 00:00:00 to 1/5/2005 23:59:59). It
    appears
    that there is no way to use the integer date to do the range search
    since Lucene is really looking at the string representation of the
    int.
    Even prefixing enough zeros to make all of the integers the same
    length
    doesn't help (darn negative numbers).



    Have I missed something? Or, will I need to put some kind of
    date/time
    string that looks like "20050308172533" (for 3/8/2005 17:25:33) in
    the
    index?



    My second question is whether range searches are efficient? I seem
    to
    recall reading somewhere that range searches get converted to a list
    of
    the individual items. Since a day has 86,400 seconds, a few days
    worth
    of searching would require a very large list. Please tell me that
    I'm
    out-in-the-weeds on my recall and that range searches are efficient
    (i.e., it's effectively just doing a string compare with the start
    and
    end terms).



    Scott




    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Chris Hostetter at Mar 14, 2005 at 7:06 am
    : Note the hit in section 6.5.1 - the fact that we used range queries in
    : the performance section is an indicator that one can really mess things
    : up if using range queries injudiciously. :) In particular, the typical
    : advice is to try to round the time information, to avoid costly range
    : queries. Seconds sound like trouble.

    I hate to sound like a broken record, but in my opinion, there is no
    reason to ever do a "RangeQuery" on dates. Using a "DateFilter" (or a
    "RangeFilter") on the other hand, might make a lot of sense -- and doesn't
    suffer from any of RangeQuery's downsides.

    See "Using a Filter Instead" in...
    http://wiki.apache.org/jakarta-lucene/DateRangeQueries


    : > I needed to return my hits list in date/time order (instead of
    : > relevancy). So, I implemented a class that converted dates to an int
    : > and stored the integer as a field in my index. I passed a Sort
    : > object
    : > to the IndexSearcher (indicating that the sort field was convertible
    : > to
    : > int) to get things back in date/time order. It works great.
    : >
    : >
    : >
    : > Now I need to do a search within a date/time range (e.g., all the
    : > documents between 1/1/2005 00:00:00 to 1/5/2005 23:59:59). It
    : > appears
    : > that there is no way to use the integer date to do the range search
    : > since Lucene is really looking at the string representation of the
    : > int.
    : > Even prefixing enough zeros to make all of the integers the same
    : > length
    : > doesn't help (darn negative numbers).
    : >
    : >
    : >
    : > Have I missed something? Or, will I need to put some kind of
    : > date/time
    : > string that looks like "20050308172533" (for 3/8/2005 17:25:33) in
    : > the
    : > index?
    : >
    : >
    : >
    : > My second question is whether range searches are efficient? I seem
    : > to
    : > recall reading somewhere that range searches get converted to a list
    : > of
    : > the individual items. Since a day has 86,400 seconds, a few days
    : > worth
    : > of searching would require a very large list. Please tell me that
    : > I'm
    : > out-in-the-weeds on my recall and that range searches are efficient
    : > (i.e., it's effectively just doing a string compare with the start
    : > and
    : > end terms).
    : >
    : >
    : >
    : > Scott
    : >
    : >
    : >
    : >
    : >
    : >
    :
    : ---------------------------------------------------------------------
    : To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    : For additional commands, e-mail: java-user-help@lucene.apache.org
    :



    -Hoss


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedMar 9, '05 at 1:54a
activeMar 14, '05 at 7:06a
posts3
users3
websitelucene.apache.org

People

Translate

site design / logo © 2022 Grokbase