Hi,

When RawLocalFileSystem or FTPFileSystem is used as the input of a
MapReduce job, how does the JobTracker apply its data locality logic, i.e.,
how many map tasks does it start, and on which machines?

I would like to understand this with two scenarios in mind:

Scenario 1: RawLocalFileSystem
- Every data node has a local directory called /fooLocalBar, each
containing 10 files (200 MB each) to be processed.

Scenario 2: FTPFileSystem
- A common external machine has a directory called /fooRemoteBar, which
contains 10 files (200 MB each) to be processed.
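
To make the "how many map tasks" part concrete, here is a minimal sketch of how I assume the split count is derived: one map task per input split, with the split size taken as max(minSize, min(maxSize, blockSize)) and a 1.1 slack factor on the last split, in the style of FileInputFormat. The block size of 64 MB is purely a placeholder, since I do not know what block size either file system reports, and the function names are hypothetical. The sketch says nothing about which machines the tasks land on.

```python
# Rough estimate of the number of input splits (and hence map tasks).
# Assumption: splitting follows the FileInputFormat-style rule
#   split_size = max(min_size, min(max_size, block_size))
# and a trailing remainder is folded into the last split when it is
# within a 1.1 slack factor.

SPLIT_SLOP = 1.1  # slack factor applied when deciding on the final split
MB = 1024 * 1024


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Clamp the block size between the configured min and max split size."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_size, split_size):
    """Count the splits produced for a single file of file_size bytes."""
    splits = 0
    remaining = file_size
    # Carve off full-size splits while clearly more than one split remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left (if anything) becomes the final split.
    if remaining > 0:
        splits += 1
    return splits


# Both scenarios: 10 files of 200 MB each.
file_sizes = [200 * MB] * 10

# Placeholder block size; the value each file system actually reports
# is exactly what I am unsure about.
assumed_block_size = 64 * MB

split_size = compute_split_size(assumed_block_size)
total_splits = sum(count_splits(size, split_size) for size in file_sizes)
print("estimated map tasks:", total_splits)
```

Under that placeholder, each 200 MB file yields 4 splits, so 40 map tasks for the 10 files. A different reported block size changes the count accordingly.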


./zahoor

Discussion Overview
group: common-user @
categories: hadoop
posted: Oct 15, '10 at 1:41p
active: Oct 15, '10 at 1:41p
posts: 1
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion

Zooni Zooni: 1 post
