Hello all,

During a concurrency test, i.e. indexing and searching simultaneously,
the Searcher stumbled on the following error:

java.lang.RuntimeException: no terms in field modified - cannot determine sort type
    at org.apache.lucene.search.FieldSortedHitQueue.determineComparator(FieldSortedHitQueue.java:187)
    at org.apache.lucene.search.FieldSortedHitQueue.getCachedComparator(FieldSortedHitQueue.java:125)
    at org.apache.lucene.search.MultiFieldSortedHitQueue.<init>(MultiFieldSortedHitQueue.java:54)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:118)
    at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:141)
    at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:64)
    at org.apache.lucene.search.Hits.<init>(Hits.java:51)
    at org.apache.lucene.search.Searcher.search(Searcher.java:41)
    at novartis.lucene.LuceneItems.getItems(LuceneItems.java:304)
    at novartis.lucene.LuceneItems.doAllItems(LuceneItems.java:246)
    at novartis.lucene.LuceneItems.go(LuceneItems.java:368)
    at novartis.lucene.LuceneItems.main(LuceneItems.java:574)

The Indexer was optimizing and closing the index after every 300 entries.

The Searcher ran a query every second: hits = ms.search(query, new Sort("modified", true));
where "modified" is in DateField.timeToString(modified) format and the query
looks like "+contents:novartis".

The values for the field "modified" definitely exist.

No exceptions occurred on the Indexer side.

Both processes used the same lockDir.

The Searcher works fine on an already created index.

Please help.

Have a nice day
J.


  • Otis Gospodnetic at Jun 16, 2004 at 8:41 am
    Hello,

    I can't comment on the sort exception (Erik or Tim may be able to
    help), but I can comment on optimizing your index after every 300
    entries:
    unless your search is getting too slow, or you are running out of file
    handles, there is no need to optimize the index while you are building
    it, and that actually prolongs indexing. Just optimize it at the end.

    Otis


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
    For additional commands, e-mail: lucene-user-help@jakarta.apache.org
  • Iouli Golovatyi at Jun 16, 2004 at 8:54 am
    As a matter of fact there is no end. I am just trying to simulate real
    life, i.e. information comes in from different providers at a rate of 3-5
    messages per second. I cache it (about 300 entries per batch) and process
    the whole batch. After that I optimize, otherwise search is too slow.
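
The batching described above could be sketched roughly as follows. This is only an illustration under assumptions: the class name BatchBuffer and the onBatch callback are hypothetical, and the actual indexing (and any optimize call) is left to whatever the caller passes in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: collects incoming entries and hands them over in batches. */
class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> onBatch; // assumed callback, e.g. "index these entries"
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> onBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.onBatch = onBatch;
    }

    /** Buffers one entry and flushes automatically once the batch is full. */
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Hands the buffered entries to the callback; does nothing if the buffer is empty. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        onBatch.accept(new ArrayList<>(pending)); // pass a copy so the callback may keep it
        pending.clear();
    }
}
```

With a batch size of 300, the callback would run once per 300 entries; how often to optimize inside or outside that callback is the trade-off discussed in this thread.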





  • Erik Hatcher at Jun 16, 2004 at 9:14 am
    Are you sure every document has a single "modified" indexed term? How
    are you indexing it?

    Erik

  • Iouli Golovatyi at Jun 16, 2004 at 9:33 am
    > Are you sure every document has a single "modified" indexed term?

    What do you call single? It's just one field, defined as a keyword, but its
    content can be the same across documents, because it's a timestamp. Every
    doc has it, this I guarantee.

    > How are you indexing it?

    I have a bulk file with entries like:

    FT¬20040219174432¬¬20040219/17/44/AUT_33957308¬Watch out for relative
    valuations performance¬FT¬11111111¬D:¬yyyyMM
    ...
    where 20040219174432 is the "modified" field content
    and 20040219/17/44/AUT_33957308 is the relative pathname of the document
    to be indexed.

    I use 1.4-rc3.
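
For readers who want to see how such a bulk entry breaks down, here is a minimal sketch that splits one line on the "¬" delimiter. The class name BulkRecord is hypothetical, and only the two positions named in the post (the "modified" value and the relative pathname) are interpreted; the meaning of the other fields is not stated in the thread, so they are left alone.

```java
/** Hypothetical holder for the two fields of a bulk entry that the post identifies. */
class BulkRecord {
    final String modified; // second field, e.g. 20040219174432
    final String path;     // fourth field, relative pathname of the document

    private BulkRecord(String modified, String path) {
        this.modified = modified;
        this.path = path;
    }

    /** Splits a line on the NOT SIGN delimiter (U+00AC) and picks out the known fields. */
    static BulkRecord parse(String line) {
        // limit -1 keeps empty fields, such as the empty third field in the sample entry
        String[] parts = line.split("\u00AC", -1);
        if (parts.length < 4) {
            throw new IllegalArgumentException("expected at least 4 fields, got " + parts.length);
        }
        return new BulkRecord(parts[1], parts[3]);
    }

    public static void main(String[] args) {
        String line = "FT\u00AC20040219174432\u00AC\u00AC20040219/17/44/AUT_33957308"
                + "\u00ACWatch out for relative valuations performance"
                + "\u00ACFT\u00AC11111111\u00ACD:\u00ACyyyyMM";
        BulkRecord record = BulkRecord.parse(line);
        System.out.println(record.modified); // prints 20040219174432
        System.out.println(record.path);     // prints 20040219/17/44/AUT_33957308
    }
}
```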





  • Erik Hatcher at Jun 16, 2004 at 10:50 am

    On Jun 16, 2004, at 5:33 AM, iouli.golovatyi@group.novartis.com wrote:
    >> Are you sure every document has a single "modified" indexed term?
    >
    > What do you call single? It's just one field, defined as a keyword, but
    > its content can be the same across documents, because it's a timestamp.
    > Every doc has it, this I guarantee.

    Single means one term for the entire document, i.e. a document cannot
    possibly have two "modified" terms.

    >> How are you indexing it?
    >
    > I have a bulk file with entries like:
    >
    > FT¬20040219174432¬¬20040219/17/44/AUT_33957308¬Watch out for relative
    > valuations performance¬FT¬11111111¬D:¬yyyyMM
    > ...
    > where 20040219174432 is the "modified" field content
    > and 20040219/17/44/AUT_33957308 is the relative pathname of the document
    > to be indexed.
    >
    > I use 1.4-rc3.

    But how about some code? Folks, please help us volunteers that love to
    field questions by posting *code*. Field.Keyword? Or Field.Text?
    Or...? A full line of code too... not just some partial snippet of a
    line. Your "modified" value there doesn't look like a java.util.Date.

    Erik


  • Iouli Golovatyi at Jun 16, 2004 at 11:27 am
  • Erik Hatcher at Jun 16, 2004 at 12:20 pm

    On Jun 16, 2004, at 7:25 AM, iouli.golovatyi@group.novartis.com wrote:
    > Well, I just didn't want to overload people with too much code.

    There is an art to providing just enough detail :)

    > The doc is created like this ("modified" gets formatted with
    > SimpleDateFormat tformat = new SimpleDateFormat ("yyyyMMddhhmmss")
    > by the cashToIndex method, where the IndexWriter is created):
    >
    > doc.add(Field.Keyword("modified",DateField.timeToString(modified)));

    This looks fine.

    > formated_query=query.toString();
    > if (sort_byscore)hits = ms.search(query);
    > else hits = ms.search(query,new Sort("modified",true));
    > // here the "cannot determine.." exception generated!!!

    How about using:

    ms.search(query,
        new Sort(new SortField("modified", SortField.STRING, true)));

    Does that fix it?

    If "modified" is only there for sorting and not for querying, perhaps
    index it as an Integer.toString or Float.toString instead - this will
    most likely give you better resource usage and performance - and change
    the type to SortField.INT or SortField.FLOAT accordingly. The
    sorting infrastructure can detect a type, but it may have issues doing
    so if the strings look like a number in the first document but later
    look like a String. DateField.timeToString makes them Strings, so
    forcing it to sort on the String type should work.

    Erik
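
To illustrate the type-detection pitfall described above, here is a minimal sketch of guessing a sort type from the first term of a field. It is a simplified illustration and not Lucene's actual source: the class SortTypeGuess, its Type enum and the guess method are hypothetical names. Only the exception message is taken from the stack trace in this thread.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified stand-in for automatic sort-type detection. */
class SortTypeGuess {

    enum Type { INT, FLOAT, STRING }

    /** Guesses a type by looking only at the first term found in the field. */
    static Type guess(List<String> termsInField, String fieldName) {
        if (termsInField.isEmpty()) {
            // Nothing to inspect, so no type can be inferred.
            throw new RuntimeException(
                    "no terms in field " + fieldName + " - cannot determine sort type");
        }
        String first = termsInField.get(0);
        try {
            Integer.parseInt(first);
            return Type.INT;
        } catch (NumberFormatException notAnInt) {
            // not an int, try float next
        }
        try {
            Float.parseFloat(first);
            return Type.FLOAT;
        } catch (NumberFormatException notAFloat) {
            return Type.STRING;
        }
    }

    public static void main(String[] args) {
        System.out.println(guess(Arrays.asList("42", "abc"), "modified"));     // prints INT
        System.out.println(guess(Arrays.asList("20040219174432"), "modified")); // prints FLOAT
        System.out.println(guess(Arrays.asList("0abc12def"), "modified"));      // prints STRING
    }
}
```

Under this kind of heuristic the outcome depends entirely on which term happens to come first, and an empty term list cannot be typed at all. Passing SortField.STRING explicitly, as suggested above, avoids the guessing step.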


  • Iouli Golovatyi at Jun 16, 2004 at 4:09 pm
    Erik,
    thank You very much, I tried it and it looked like it really fixed the
    problem.





    This field is used mostly for sorting, but I'm going to use it for queries
    as well, just to give the end user the possibility to see the incoming
    data in real time, i.e. modified:[timefrom TO timeto].
    I used this format because I saw the docs doing it like this. Should I
    really change to numbers?

    Regards,
    J.

  • Erik Hatcher at Jun 16, 2004 at 4:15 pm

    On Jun 16, 2004, at 12:09 PM, iouli.golovatyi@group.novartis.com wrote:
    > Thank you very much, I tried it and it looked like it really fixed the
    > problem.

    *whew*

    > This field is used mostly for sorting, but I'm going to use it for
    > queries as well, just to give the end user the possibility to see the
    > incoming data in real time, i.e. modified:[timefrom TO timeto].
    > I used this format because I saw the docs doing it like this. Should I
    > really change to numbers?

    If you are going to query on it, you cannot change it to a numeric...
    it has to remain a lexicographically orderable String. See the wiki
    for other info on date fields, like using a YYYYMMDD String instead of
    the DateField methods.

    Erik
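
As a small illustration of what "lexicographically orderable" means here, the sketch below formats timestamps as fixed-width yyyyMMddHHmmss strings and compares them as plain strings. It uses only the JDK; the class SortableTimestamp and its encode and inRange methods are hypothetical names, and inRange merely mimics an inclusive range check on strings, not Lucene's range query implementation.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper showing that fixed-width date strings sort chronologically. */
class SortableTimestamp {

    // HH is the 24-hour clock; with the 12-hour "hh", 05:00 and 17:00 would both render as "05".
    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    /** Encodes a timestamp as a 14-character string, most significant unit first. */
    static String encode(LocalDateTime time) {
        return time.format(FORMAT);
    }

    /** Inclusive range check done purely with string comparison. */
    static boolean inRange(String value, String from, String to) {
        return value.compareTo(from) >= 0 && value.compareTo(to) <= 0;
    }

    public static void main(String[] args) {
        String morning = encode(LocalDateTime.of(2004, 2, 19, 5, 44, 32));
        String evening = encode(LocalDateTime.of(2004, 2, 19, 17, 44, 32));
        System.out.println(evening);                                       // prints 20040219174432
        System.out.println(morning.compareTo(evening) < 0);                // prints true
        System.out.println(inRange(evening, "20040219000000", "20040219235959")); // prints true
    }
}
```

Because every field is zero-padded and ordered from year down to second, string order and time order agree, which is what a range such as modified:[timefrom TO timeto] relies on.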


Discussion Overview
group: java-user
categories: lucene
posted: Jun 16, '04 at 7:51a
active: Jun 16, '04 at 4:15p
posts: 10
users: 3
website: lucene.apache.org
