FAQ
I have a large index that needs to yield very fast query times. I am
sorting by date as default since I am interested in the most recent
documents. I was wondering if I boosted the score of my documents in
proportion to the date and not sorting would this increase search
performance. Thoughts?



Thanks,

Michael

Search Discussions

  • Erik Hatcher at Mar 15, 2005 at 1:43 pm
    I've been effectively off-line for a few days, so I'm not sure if
    anyone has replied on this thread yet.

    Using boosts will definitely use less resources than sorting. If you
    do use sorting for dates, be sure you're doing it numerically rather
    than lexicographically.

    Erik
    On Mar 10, 2005, at 8:45 AM, Michael Celona wrote:

    I have a large index that needs to yield very fast query times. I am
    sorting by date as default since I am interested in the most recent
    documents. I was wondering if I boosted the score of my documents in
    proportion to the date and not sorting would this increase search
    performance. Thoughts?



    Thanks,

    Michael

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Michael Celona at Mar 17, 2005 at 2:00 pm
    I am sorting against an epoch time stored in my index. By using:

    contactDocument.add( Field.Keyword( "epoch_time", epoch );

    Then I sort by this field. My search time is in the order of 3sec on an
    index of about 6G using simple searches against a text field. By using
    boosts I was hoping to increase performance. Do you think this will make a
    big difference?

    Michael

    -----Original Message-----
    From: Erik Hatcher
    Sent: Tuesday, March 15, 2005 8:43 AM
    To: java-user@lucene.apache.org
    Subject: Re: search performace

    I've been effectively off-line for a few days, so I'm not sure if
    anyone has replied on this thread yet.

    Using boosts will definitely use less resources than sorting. If you
    do use sorting for dates, be sure you're doing it numerically rather
    than lexicographically.

    Erik
    On Mar 10, 2005, at 8:45 AM, Michael Celona wrote:

    I have a large index that needs to yield very fast query times. I am
    sorting by date as default since I am interested in the most recent
    documents. I was wondering if I boosted the score of my documents in
    proportion to the date and not sorting would this increase search
    performance. Thoughts?



    Thanks,

    Michael

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Erik Hatcher at Mar 17, 2005 at 2:54 pm
    Is epoch a Date? or a String? If a String, what format is it?

    Sorting by a Date keyword field will be sorting as a String value,
    which is a fair bit more resource intensive than if it was numeric.

    Try using a purely numeric field (though as a String) that can be
    represented as an int be sure to specify the sort type as an int and
    see if that improves performance. I'm pretty certain you'd still get
    better performance by using a boost than a sort though.

    Erik
    On Mar 17, 2005, at 8:59 AM, Michael Celona wrote:

    I am sorting against an epoch time stored in my index. By using:

    contactDocument.add( Field.Keyword( "epoch_time", epoch );

    Then I sort by this field. My search time is in the order of 3sec on
    an
    index of about 6G using simple searches against a text field. By using
    boosts I was hoping to increase performance. Do you think this will
    make a
    big difference?

    Michael

    -----Original Message-----
    From: Erik Hatcher
    Sent: Tuesday, March 15, 2005 8:43 AM
    To: java-user@lucene.apache.org
    Subject: Re: search performace

    I've been effectively off-line for a few days, so I'm not sure if
    anyone has replied on this thread yet.

    Using boosts will definitely use less resources than sorting. If you
    do use sorting for dates, be sure you're doing it numerically rather
    than lexicographically.

    Erik
    On Mar 10, 2005, at 8:45 AM, Michael Celona wrote:

    I have a large index that needs to yield very fast query times. I am
    sorting by date as default since I am interested in the most recent
    documents. I was wondering if I boosted the score of my documents in
    proportion to the date and not sorting would this increase search
    performance. Thoughts?



    Thanks,

    Michael

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Michael Celona at Mar 17, 2005 at 4:14 pm
    Epoch is in seconds... I am also forced to used a date filter on most of
    searches... how bad is the performance hit of that.

    -----Original Message-----
    From: Erik Hatcher
    Sent: Thursday, March 17, 2005 9:54 AM
    To: java-user@lucene.apache.org
    Subject: Re: search performace

    Is epoch a Date? or a String? If a String, what format is it?

    Sorting by a Date keyword field will be sorting as a String value,
    which is a fair bit more resource intensive than if it was numeric.

    Try using a purely numeric field (though as a String) that can be
    represented as an int be sure to specify the sort type as an int and
    see if that improves performance. I'm pretty certain you'd still get
    better performance by using a boost than a sort though.

    Erik
    On Mar 17, 2005, at 8:59 AM, Michael Celona wrote:

    I am sorting against an epoch time stored in my index. By using:

    contactDocument.add( Field.Keyword( "epoch_time", epoch );

    Then I sort by this field. My search time is in the order of 3sec on
    an
    index of about 6G using simple searches against a text field. By using
    boosts I was hoping to increase performance. Do you think this will
    make a
    big difference?

    Michael

    -----Original Message-----
    From: Erik Hatcher
    Sent: Tuesday, March 15, 2005 8:43 AM
    To: java-user@lucene.apache.org
    Subject: Re: search performace

    I've been effectively off-line for a few days, so I'm not sure if
    anyone has replied on this thread yet.

    Using boosts will definitely use less resources than sorting. If you
    do use sorting for dates, be sure you're doing it numerically rather
    than lexicographically.

    Erik
    On Mar 10, 2005, at 8:45 AM, Michael Celona wrote:

    I have a large index that needs to yield very fast query times. I am
    sorting by date as default since I am interested in the most recent
    documents. I was wondering if I boosted the score of my documents in
    proportion to the date and not sorting would this increase search
    performance. Thoughts?



    Thanks,

    Michael

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Erik Hatcher at Mar 17, 2005 at 4:41 pm

    On Mar 17, 2005, at 11:13 AM, Michael Celona wrote:
    Epoch is in seconds...
    But you still haven't provided the *type* of epoch. It's a Date? a
    String? What do the string values look like?
    I am also forced to used a date filter on most of
    searches... how bad is the performance hit of that.
    Only testing will tell. The hit of a filter comes the first time (as
    long as you cache and use the same IndexReader), so its not likely to
    be a factor over many queries.

    Erik
    -----Original Message-----
    From: Erik Hatcher
    Sent: Thursday, March 17, 2005 9:54 AM
    To: java-user@lucene.apache.org
    Subject: Re: search performace

    Is epoch a Date? or a String? If a String, what format is it?

    Sorting by a Date keyword field will be sorting as a String value,
    which is a fair bit more resource intensive than if it was numeric.

    Try using a purely numeric field (though as a String) that can be
    represented as an int be sure to specify the sort type as an int and
    see if that improves performance. I'm pretty certain you'd still get
    better performance by using a boost than a sort though.

    Erik
    On Mar 17, 2005, at 8:59 AM, Michael Celona wrote:

    I am sorting against an epoch time stored in my index. By using:

    contactDocument.add( Field.Keyword( "epoch_time", epoch );

    Then I sort by this field. My search time is in the order of 3sec on
    an
    index of about 6G using simple searches against a text field. By
    using
    boosts I was hoping to increase performance. Do you think this will
    make a
    big difference?

    Michael

    -----Original Message-----
    From: Erik Hatcher
    Sent: Tuesday, March 15, 2005 8:43 AM
    To: java-user@lucene.apache.org
    Subject: Re: search performace

    I've been effectively off-line for a few days, so I'm not sure if
    anyone has replied on this thread yet.

    Using boosts will definitely use less resources than sorting. If you
    do use sorting for dates, be sure you're doing it numerically rather
    than lexicographically.

    Erik
    On Mar 10, 2005, at 8:45 AM, Michael Celona wrote:

    I have a large index that needs to yield very fast query times. I am
    sorting by date as default since I am interested in the most recent
    documents. I was wondering if I boosted the score of my documents in
    proportion to the date and not sorting would this increase search
    performance. Thoughts?



    Thanks,

    Michael

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Michael Celona at Mar 17, 2005 at 4:55 pm
    My epoch looks like 1110816121 but is represented by a string.

    -----Original Message-----
    From: Erik Hatcher
    Sent: Thursday, March 17, 2005 11:41 AM
    To: java-user@lucene.apache.org
    Subject: Re: search performace

    On Mar 17, 2005, at 11:13 AM, Michael Celona wrote:
    Epoch is in seconds...
    But you still haven't provided the *type* of epoch. It's a Date? a
    String? What do the string values look like?
    I am also forced to used a date filter on most of
    searches... how bad is the performance hit of that.
    Only testing will tell. The hit of a filter comes the first time (as
    long as you cache and use the same IndexReader), so its not likely to
    be a factor over many queries.

    Erik
    -----Original Message-----
    From: Erik Hatcher
    Sent: Thursday, March 17, 2005 9:54 AM
    To: java-user@lucene.apache.org
    Subject: Re: search performace

    Is epoch a Date? or a String? If a String, what format is it?

    Sorting by a Date keyword field will be sorting as a String value,
    which is a fair bit more resource intensive than if it was numeric.

    Try using a purely numeric field (though as a String) that can be
    represented as an int be sure to specify the sort type as an int and
    see if that improves performance. I'm pretty certain you'd still get
    better performance by using a boost than a sort though.

    Erik
    On Mar 17, 2005, at 8:59 AM, Michael Celona wrote:

    I am sorting against an epoch time stored in my index. By using:

    contactDocument.add( Field.Keyword( "epoch_time", epoch );

    Then I sort by this field. My search time is in the order of 3sec on
    an
    index of about 6G using simple searches against a text field. By
    using
    boosts I was hoping to increase performance. Do you think this will
    make a
    big difference?

    Michael

    -----Original Message-----
    From: Erik Hatcher
    Sent: Tuesday, March 15, 2005 8:43 AM
    To: java-user@lucene.apache.org
    Subject: Re: search performace

    I've been effectively off-line for a few days, so I'm not sure if
    anyone has replied on this thread yet.

    Using boosts will definitely use less resources than sorting. If you
    do use sorting for dates, be sure you're doing it numerically rather
    than lexicographically.

    Erik
    On Mar 10, 2005, at 8:45 AM, Michael Celona wrote:

    I have a large index that needs to yield very fast query times. I am
    sorting by date as default since I am interested in the most recent
    documents. I was wondering if I boosted the score of my documents in
    proportion to the date and not sorting would this increase search
    performance. Thoughts?



    Thanks,

    Michael

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedMar 10, '05 at 1:45p
activeMar 17, '05 at 4:55p
posts7
users2
websitelucene.apache.org

2 users in discussion

Michael Celona: 4 posts Erik Hatcher: 3 posts

People

Translate

site design / logo © 2022 Grokbase