We have 100 GB of data in mixed formats (doc, txt, pdf, ppt, etc.), with a
separate parser for each file format, and we plan to index it all directly
with Lucene. (We didn't use Nutch because its setup scared us off.) My
question is whether this will scale once we index those documents. Our plan
is to build a separate index per file format and search them all together
through a MultiReader, roughly as in the sketch below.
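
To make the plan concrete, here is a minimal sketch of what we have in
mind, written against the Lucene 2.4 API; the index paths, the "contents"
field name, and the StandardAnalyzer are illustrative assumptions, not our
actual configuration:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class MultiIndexSearch {
    public static void main(String[] args) throws Exception {
        // One index directory per file format (paths are made up).
        String[] indexDirs = { "index/doc", "index/txt", "index/pdf", "index/ppt" };

        IndexReader[] readers = new IndexReader[indexDirs.length];
        for (int i = 0; i < indexDirs.length; i++) {
            readers[i] = IndexReader.open(FSDirectory.getDirectory(indexDirs[i]));
        }

        // MultiReader presents the per-format indexes as one logical index.
        MultiReader multiReader = new MultiReader(readers);
        IndexSearcher searcher = new IndexSearcher(multiReader);

        QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
        TopDocs top = searcher.search(parser.parse("test"), null, 10);
        System.out.println("total hits: " + top.totalHits);

        searcher.close();
        multiReader.close(); // also closes the per-format sub-readers
    }
}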
Could anyone advise on the following?
1. Are we on the right track with this design?
2. What should we know about mergeFactor and segments? (See the tuning sketch after this list.)
3. How large an index can Lucene handle?
4. Could indexing this much data cause a Java OutOfMemoryError?
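
On questions 2 and 4, this is the kind of IndexWriter tuning I am asking
about, again sketched against Lucene 2.4; the RAM buffer size, mergeFactor
value, and index path below are placeholder guesses on our part, not
validated settings:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class IndexTuning {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter(
                FSDirectory.getDirectory("index/pdf"),   // placeholder path
                new StandardAnalyzer(),
                true,                                    // create a new index
                IndexWriter.MaxFieldLength.UNLIMITED);

        // Flush segments by RAM usage instead of document count; a bigger
        // buffer gives faster indexing but uses more heap.
        writer.setRAMBufferSizeMB(64.0);

        // mergeFactor controls how many segments accumulate before they are
        // merged; higher values speed up indexing but slow searching and use
        // more open files. 10 is the default.
        writer.setMergeFactor(10);

        // ... writer.addDocument(...) for each parsed file would go here ...

        writer.optimize(); // merge down to one segment (optional, expensive)
        writer.close();
    }
}

My understanding is that heap pressure during indexing comes mostly from
the RAM buffer and from very large individual documents rather than from
total index size, but please correct me if that is wrong.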