The number of documents that we are going to index could potentially exceed
2G. So I guess I have to split the index into multiple indices, each
containing up to 2G documents. Any other suggestions?

Thanks.
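
One way to picture the split: if each index keeps its own int doc ids, the application can address a document across all indices with a single long. The sketch below is only an illustration under that assumption. GlobalDocId and its methods are hypothetical names, not part of Lucene, and merging search results across the separate indices would still be up to the application.

```java
/** Hypothetical helper, not part of Lucene: packs (index number, local doc id) into one long. */
final class GlobalDocId {
    private GlobalDocId() {}

    /** Combines an index (shard) number and a per-index int doc id into a single long. */
    static long encode(int shard, int localDocId) {
        if (shard < 0 || localDocId < 0) {
            throw new IllegalArgumentException("shard and localDocId must be non-negative");
        }
        // High 32 bits hold the shard number, low 32 bits hold the doc id within that shard.
        return ((long) shard << 32) | localDocId;
    }

    /** Recovers the shard number from a packed id. */
    static int shardOf(long globalId) {
        return (int) (globalId >>> 32);
    }

    /** Recovers the per-index doc id from a packed id. */
    static int localDocIdOf(long globalId) {
        return (int) globalId;
    }

    public static void main(String[] args) {
        long id = encode(3, Integer.MAX_VALUE - 1);
        System.out.println(shardOf(id) + " / " + localDocIdOf(id));
    }
}
```

Each individual index stays under the int limit, and only the application-level id is widened to a long.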

-----Original Message-----
From: Karl Wettin
Sent: Thursday, May 08, 2008 11:00 AM
To: java-user@lucene.apache.org
Subject: Re: Limit of Lucene

Michael Siu wrote:
> What is the limit of Lucene: # of docs per index?
Integer.MAX_VALUE

Multiple indices joined in a single MultiWhatNot is still limited to
that number.
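
To make the limit concrete, here is a small sketch. It assumes, as the reply implies, that a combined reader numbers all documents with int ids, so the sum of the sub-index sizes has to fit in an int. DocCountLimit and fitsInOneReader are made-up names for illustration, not Lucene API.

```java
/** Hypothetical check, not Lucene API: can these indices sit behind one int-addressed reader? */
final class DocCountLimit {
    private DocCountLimit() {}

    /** Sums the per-index maxDoc values in a long so the sum itself cannot overflow. */
    static boolean fitsInOneReader(int... maxDocs) {
        long total = 0;
        for (int maxDoc : maxDocs) {
            total += maxDoc;
        }
        return total <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Integer.MAX_VALUE is 2^31 - 1, roughly 2.1 billion.
        System.out.println(Integer.MAX_VALUE);
        // Two indices of 1.5 billion docs each exceed the int range when combined.
        System.out.println(fitsInOneReader(1_500_000_000, 1_500_000_000));
    }
}
```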

> RangeFilter.bits(), for example, initializes a bitset to the size of
> maxDoc from the IndexReader. I wonder what happens if the # of docs is huge,
> say MaxInt (4G in 32bit or 2^63 in 64 bit)?
ArrayIndexOutOfBoundsException?
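
For a rough sense of scale, the sketch below estimates the memory behind a java.util.BitSet sized to maxDoc. Note that a Java int is 32 bits on every platform, so the largest possible maxDoc is 2^31 - 1 regardless of the JVM's word size. BitSetSizing and approxBytes are illustrative names, the estimate assumes one bit per document stored in 64-bit words, and which exception would actually surface inside Lucene is not something this sketch settles.

```java
import java.util.BitSet;

/** Illustrative sizing helper, not Lucene code. */
final class BitSetSizing {
    private BitSetSizing() {}

    /** Approximate backing-array size in bytes for a bitset with one bit per document. */
    static long approxBytes(int maxDoc) {
        // Widen to long before adding so the rounding step cannot overflow.
        long words = ((long) maxDoc + 63) / 64;
        return words * 8;
    }

    public static void main(String[] args) {
        // A bitset for one million documents is small.
        BitSet bits = new BitSet(1_000_000);
        bits.set(999_999);
        System.out.println(bits.cardinality());

        // At the int limit the bitset needs about 256 MiB.
        System.out.println(approxBytes(Integer.MAX_VALUE));

        // One more document wraps the int to a negative value,
        // which the BitSet constructor rejects.
        int wrapped = Integer.MAX_VALUE + 1;
        try {
            new BitSet(wrapped);
        } catch (NegativeArraySizeException e) {
            System.out.println("negative size rejected");
        }
    }
}
```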

It should not be that difficult to upgrade ints to longs, but it is a
rather large job.

How many documents do you have? You might want to consider alternative
ways to represent your corpus in the index so it takes fewer documents.


karl

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Discussion Overview
group: java-user @
categories: lucene
posted: May 8, '08 at 5:23p
active: May 8, '08 at 6:29p
posts: 4
users: 3
website: lucene.apache.org