Hi,
In Java I am using a RAM-based index.
For a small test case, with the document load commented out:
for (int i = 0; i < hits.length; ++i) {
// Document D = searcher.doc(hits[i].doc);
}
Found 37 hits.
0 total milliseconds
==================================
When I uncomment the line:
for (int i = 0; i < hits.length; ++i) {
Document D = searcher.doc(hits[i].doc);
}
Found 37 hits.
17 total milliseconds
How can I improve this? Am I doing something wrong here?
The same search in CLucene takes just under 1 ms, and that is with a
file-based index.
Regards,
Suman
-----Original Message-----
From: suman.holani
Sent: Friday, March 11, 2011 11:35 AM
To: '[email protected]'
Subject: RE: document object
Hello Erick,
hits.length is 1800.
The version is Lucene 3.0.3.
I need the entire result set, as I will be fetching all records that satisfy
the search conditions, validating them against the current counts, and then
scheduling the records that pass validation. One of them is selected on the
basis of random scheduling.
I cannot take page-wise results, as that would lead to starvation of the
documents at the end.
I cannot put the current counts into the index for validation, as they change
very frequently, so it is not feasible to update the entire index every time.
Let me know of some solution.
Let's say there are 5 fields in the index: A, B, C, D, E.
When I search, 1000 records are fetched.
For the time being I want to use only A and D to validate the records against
the counts.
Note: fields B, C, E are not required at this point, but I am fetching them
and storing them in a list.
A and D from the list are given to another process for validation.
After validation, 700 records remain in the list,
of which one record is displayed after scheduling, with all fields A, B, C,
D, E.
Regards,
Suman
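
A possible shape for the workflow described above, as a minimal plain-Java
sketch rather than a definitive implementation: load only A and D for every
hit, validate, pick one survivor at random, and load the full record for that
one document only. FieldStore, Candidate and pickOne are hypothetical names
introduced for illustration and are not Lucene types. In Lucene 3.0.x the
light load would presumably correspond to searcher.doc(docId, fieldSelector)
with a selector restricted to A and D (for example a MapFieldSelector), and
the full load to searcher.doc(docId).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Sketch only: FieldStore stands in for the searcher; it is NOT a Lucene type.
class TwoPhaseFetch {

    /** Hypothetical stand-in for the index/searcher. */
    interface FieldStore {
        String[] loadLight(int docId); // returns only fields A and D
        String[] loadFull(int docId);  // returns all fields A, B, C, D, E
    }

    /** Light projection kept per hit while validating. */
    static final class Candidate {
        final int docId;
        final String a;
        final String d;

        Candidate(int docId, String a, String d) {
            this.docId = docId;
            this.a = a;
            this.d = d;
        }
    }

    /**
     * Validates every hit using A and D only, then loads the full record
     * for a single randomly chosen survivor. Returns null if none survive.
     */
    static String[] pickOne(int[] hitDocIds, FieldStore store,
                            Predicate<Candidate> isValid, Random rng) {
        List<Candidate> valid = new ArrayList<>();
        for (int docId : hitDocIds) {
            String[] ad = store.loadLight(docId); // cheap: two fields
            Candidate c = new Candidate(docId, ad[0], ad[1]);
            if (isValid.test(c)) {
                valid.add(c);
            }
        }
        if (valid.isEmpty()) {
            return null;
        }
        // Uniform random choice, so documents at the end are not starved.
        Candidate chosen = valid.get(rng.nextInt(valid.size()));
        return store.loadFull(chosen.docId); // the only full load
    }

    public static void main(String[] args) {
        // Toy in-memory store: docId n has fields "a"+n ... "e"+n.
        FieldStore store = new FieldStore() {
            public String[] loadLight(int id) {
                return new String[] {"a" + id, "d" + id};
            }
            public String[] loadFull(int id) {
                return new String[] {"a" + id, "b" + id, "c" + id, "d" + id, "e" + id};
            }
        };
        // Placeholder validation rule (an assumption): keep even docIds only.
        String[] chosen = pickOne(new int[] {0, 1, 2, 3}, store,
                c -> c.docId % 2 == 0, new Random());
        System.out.println(String.join(",", chosen));
    }
}
```

With 1000 hits this performs 1000 light loads and one full load, instead of
1000 full loads.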
-----Original Message-----
From: Erick Erickson
Sent: Thursday, March 10, 2011 7:46 PM
To: [email protected]
Subject: Re: document object
If you're loading 100,000 documents, you can expect it to be slow. If
you're loading 10 documents, it should be quite fast... So how big is
hits.length?
And what version of Lucene are you using? The Hits object has been
deprecated for quite some time I believe.....
The problem here is that you're loading the entire result set. This is
rarely the right thing to do, which is why paging is used normally.
Why do you need to load the entire result set? That seems to be the
crux of the issue.
Best
Erick
On Thu, Mar 10, 2011 at 5:22 AM, Anshum wrote:
Depends on your data. I know that's a vague answer but that's the point.
What you could do is use FieldCache if memory and data let you do so. Would
it?
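
To make the FieldCache idea concrete, here is a minimal sketch in plain Java,
assuming each document has exactly one value for the field. The column is
built once and then indexed by document id for every hit, so no Document
object is loaded during the result loop. ColumnCacheSketch, buildColumn and
valuesForHits are hypothetical names, not Lucene APIs. In Lucene 3.0.x,
FieldCache.DEFAULT.getStrings(reader, field) returns a comparable String[]
indexed by document id; as far as I know it needs the field to be indexed
with a single term per document, and it trades memory for speed.

```java
import java.util.Arrays;

// Sketch only: plain Java, no Lucene types. The array plays the role of a
// per-field cache indexed by document id.
class ColumnCacheSketch {

    /** Build once per index snapshot: one value per document id. */
    static String[] buildColumn(String[][] storedDocs, int fieldIndex) {
        String[] column = new String[storedDocs.length];
        for (int docId = 0; docId < storedDocs.length; docId++) {
            column[docId] = storedDocs[docId][fieldIndex];
        }
        return column;
    }

    /** Per search: resolve hit ids against the cached column. */
    static String[] valuesForHits(int[] hitDocIds, String[] column) {
        String[] out = new String[hitDocIds.length];
        for (int i = 0; i < hitDocIds.length; i++) {
            out[i] = column[hitDocIds[i]]; // array lookup, no document load
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy data (an assumption): each row is {A, D} for one document.
        String[][] docs = { {"a0", "d0"}, {"a1", "d1"}, {"a2", "d2"} };
        String[] colA = buildColumn(docs, 0);
        // prints [a2, a0]
        System.out.println(Arrays.toString(valuesForHits(new int[] {2, 0}, colA)));
    }
}
```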
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Mar 10, 2011 at 3:12 PM, suman.holani wrote:
Hi Anshum,
Thanks for the prompt reply.
I am only storing the fields in the index that I want to fetch after the
search.
What I am not sure about is whether initializing the Document object through
the searcher/reader class is heavy.
Can we use something else in its place that does not need to load the whole
doc again?
Regards,
Suman
-----Original Message-----
From: Anshum
Sent: Thursday, March 10, 2011 3:11 PM
To: [email protected]
Subject: Re: document object
Hi Suman,
Do you need to load/use all fields that you have stored in the index? If
that's not the case I'd suggest you use the function
public Document doc(int i, FieldSelector fieldSelector)
http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/search/IndexSearcher.html#doc(int, org.apache.lucene.document.FieldSelector)
where Document is
http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/document/Document.html
This should help you. Also, if you're only using a very selective field
that can be served through a FieldCache, that would be a nice thing to do.
Hope that helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Mar 10, 2011 at 3:01 PM, suman.holani wrote:
Hi,
I am facing a problem: the marked line in the loop below is very slow and is
giving me a performance hit.
for (int i = 0; i < hits.length; ++i) {
int docId = hits[i].doc;
Document d = searcher.doc(docId); //problem
}
How can I improve this? Please give me an example of the improved code.
Thanks,
Suman
PS:
In one of his posts Erick said:
This line is really suspicious:
Document document = this.indexReader.document(doc)
From the Javadoc for HitCollector.collect:
Note: This is called in an inner search loop. For good search performance,
implementations of this method should not call Searcher.doc(int) or
IndexReader.document(int) on every document number encountered. Doing so
can slow searches by an order of magnitude or more.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]