What is the file you have attached? It is not safe.

I don't know the format of a Lucene index. Could you please give an example?

On Sat, Dec 25, 2010 at 12:34 AM, Black, Michael (IS) wrote:

Using hadoop-0.20


I'm doing custom input splits from a Lucene index.

I want to split the document IDs across N mappers (I'm testing the
scalability of the problem across 4 nodes and 8 cores).

So the key is the document number, and the numbers are not sequential.

At this point I'm using splits.add to add each document...but that sets up
one task for every document...not something I want to do of course.

How can I add a group of documents to each split? I found a scant
reference to PrimeInputSplit, but that doesn't seem to resolve on
hadoop-0.20.
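
The grouping step asked about above can be sketched without any Hadoop or Lucene classes. The following is only an illustration, not a tested Hadoop job: DocIdGrouper and partition are hypothetical names, and the document IDs are made-up sample values. It divides an array of non-sequential document IDs into at most N groups of near-equal size. In an actual job, each group would presumably be carried by a custom InputSplit implementation returned from getSplits, which is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Hadoop or Lucene): groups document IDs
// so that one split can carry many documents instead of just one.
final class DocIdGrouper {

    // Divides docIds into at most numSplits contiguous groups whose sizes
    // differ by at most one. Empty groups are never produced.
    static List<int[]> partition(int[] docIds, int numSplits) {
        if (numSplits <= 0) {
            throw new IllegalArgumentException("numSplits must be positive");
        }
        List<int[]> groups = new ArrayList<>();
        int n = docIds.length;
        int k = Math.min(numSplits, n);   // never more groups than documents
        if (k == 0) {
            return groups;                // no documents, no groups
        }
        int base = n / k;                 // minimum size of each group
        int extra = n % k;                // the first 'extra' groups get one more
        int start = 0;
        for (int i = 0; i < k; i++) {
            int size = base + (i < extra ? 1 : 0);
            groups.add(Arrays.copyOfRange(docIds, start, start + size));
            start += size;
        }
        return groups;
    }

    public static void main(String[] args) {
        // Sample, non-sequential document IDs (assumed values for illustration).
        int[] docIds = {3, 17, 42, 58, 101, 250, 251, 999, 1024, 4096};
        List<int[]> groups = partition(docIds, 4);
        for (int i = 0; i < groups.size(); i++) {
            System.out.println("split " + i + ": " + Arrays.toString(groups.get(i)));
        }
    }
}
```

With this approach the number of map tasks follows the number of groups rather than the number of documents, so N can be chosen to match the mappers available on the cluster.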


Michael D. Black
Senior Scientist
Northrop Grumman Information Systems
Advanced Analytics Directorate


Discussion Overview
group: common-user @
categories: hadoop
posted: Dec 22, '10 at 6:46p
active: Dec 27, '10 at 2:34a
posts: 11
users: 9
website: hadoop.apache.org...
irc: #hadoop