In a Scorer, each time skipTo() or next() returns true for the second or
a later time, doc() returns a larger document number than it did before.
If a Scorer does not return its documents in increasing order, skipTo()
will lose documents, which means that not all matching documents will be
found by the search.
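
As a rough illustration of how hits get lost, here is a minimal sketch that
does not use Lucene itself. ListScorer and intersect() are hypothetical
stand-ins that only mimic the next()/skipTo()/doc() contract described above;
the doc id list 10, 80, 40, 60, 50 is the example from the question, and the
second list (40, 50, 60) is an assumed set of matches for some other clause.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipToOrderDemo {

    /** Hypothetical stand-in for a Scorer over a fixed doc id list; not a Lucene class. */
    static final class ListScorer {
        private final int[] docs;
        private int pos = -1;

        ListScorer(int... docs) { this.docs = docs; }

        /** Moves to the next entry; returns false when the list is exhausted. */
        boolean next() { return ++pos < docs.length; }

        /** Moves forward at least once, stopping at the first entry with doc >= target. */
        boolean skipTo(int target) {
            while (next()) {
                if (docs[pos] >= target) return true;
            }
            return false;
        }

        int doc() { return docs[pos]; }
    }

    /** Leapfrog intersection: the scorer that is behind is asked to skipTo() the other's doc. */
    static List<Integer> intersect(ListScorer a, ListScorer b) {
        List<Integer> hits = new ArrayList<>();
        if (!a.next() || !b.next()) return hits;
        while (true) {
            if (a.doc() == b.doc()) {
                hits.add(a.doc());
                if (!a.next() || !b.next()) return hits;
            } else if (a.doc() < b.doc()) {
                if (!a.skipTo(b.doc())) return hits;
            } else {
                if (!b.skipTo(a.doc())) return hits;
            }
        }
    }

    public static void main(String[] args) {
        // Sorted custom list: all three common documents are found.
        System.out.println(intersect(new ListScorer(10, 40, 50, 60, 80),
                                     new ListScorer(40, 50, 60)));  // prints [40, 50, 60]
        // Unsorted custom list from the question: skipTo(40) lands on 80,
        // the other scorer then runs past its end, and every hit is lost.
        System.out.println(intersect(new ListScorer(10, 80, 40, 60, 50),
                                     new ListScorer(40, 50, 60)));  // prints []
    }
}
```

With the unsorted list, 40, 50 and 60 are all present in both lists, yet
none of them is reported.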

For disjunctions (OR), the documents of two Scorers have to be merged,
using next() to iterate over the documents of each one.
The merging is normally done on the fly in DisjunctionSumScorer, using a
specialized priority queue ordered by the doc() values.
No complete document lists are sorted at search time; that sorting is
done at indexing time. And since TermScorer reads the index directly,
it always returns documents in order.
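
The on-the-fly merge can be pictured with the following sketch. It is not
the DisjunctionSumScorer code: Cursor and union() are hypothetical names,
java.util.PriorityQueue stands in for the specialized queue, and each input
list is assumed to be in ascending order already.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class DisjunctionMergeDemo {

    /** Hypothetical cursor over one ascending doc id list; stands in for a sub-scorer. */
    static final class Cursor {
        private final int[] docs;
        private int pos = -1;

        Cursor(int... docs) { this.docs = docs; }

        boolean next() { return ++pos < docs.length; }

        int doc() { return docs[pos]; }
    }

    /** Merges the cursors into one ascending, duplicate-free list of doc ids. */
    static List<Integer> union(Cursor... subs) {
        Comparator<Cursor> byDoc = Comparator.comparingInt(Cursor::doc);
        PriorityQueue<Cursor> queue = new PriorityQueue<>(byDoc);
        for (Cursor c : subs) {
            if (c.next()) queue.add(c);   // only non-empty cursors enter the queue
        }
        List<Integer> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            int current = queue.peek().doc();   // smallest doc among all cursors
            result.add(current);
            // Advance every cursor sitting on the current doc, so it is reported once.
            while (!queue.isEmpty() && queue.peek().doc() == current) {
                Cursor top = queue.poll();
                if (top.next()) queue.add(top);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(union(new Cursor(1, 5, 9), new Cursor(2, 5, 7)));  // prints [1, 2, 5, 7, 9]
    }
}
```

Only the current position of each list is held in the queue, so no complete
list is sorted or copied during the merge.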

The only exception to document ordering is BooleanScorer.next(),
which BooleanQuery uses for some cases of top-level disjunctions,
and then only when documents are allowed to be scored out of order.
The reason is performance: BooleanScorer uses a faster data structure
than a priority queue, but it does not implement skipTo().

Paul Elschot

On Thursday 04 October 2007 09:12, Dan Rich wrote:

I have a custom Query class that provides a long list of Lucene docIds (not
for filtering purposes), which is one clause in a standard BooleanQuery
(which also contains TermQuery instances).

I have a custom Scorer that goes along with the custom Query class.

What (if any) document ordering requirements does the Scorer class have for
its skipTo(int docId) method?

In particular, currently I'm sorting/returning the docIds in ascending
order from my custom Query class. That can be expensive for large docId
lists; is sorting necessary? It looks like skipTo() might expect the
documents it gets to be in ascending order to behave correctly as part of a
BooleanQuery, but I can't tell for sure from the doc.

If the document list from my custom Scorer class is not in ascending
order (e.g. 10, 80, 40, 60, 50), will whatever uses skipTo() potentially
lose hits? If not, is there any performance concern with having the
docIds unordered?

Discussion Overview
group: java-user
posted: Oct 4, '07 at 7:12a
active: Oct 4, '07 at 5:36p