I tried to figure out how Pig sets the number of Map and Reduce tasks for a job.
The number of Map tasks always seems tied to the number of input files.
Since there was one input file, the number of Map tasks was 1, even though the
file was 5.4 GB and spanned more than 1000 blocks.
Setting mapred.map.tasks has no effect whatsoever:
<property>
<name>mapred.map.tasks</name>
<value>7</value>
<description>The default number of map tasks per job. Typically set
to a prime close to the number of available hosts. Ignored when
mapred.job.tracker is "local".
</description>
</property>
The number of Reduce tasks can be set in hadoop-site.xml:
<property>
<name>mapred.reduce.tasks</name>
<value>2</value>
<description>The default number of reduce tasks per job. Typically set
to a prime close to the number of available hosts. Ignored when
mapred.job.tracker is "local".
</description>
</property>
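I also came across the PARALLEL clause in the Pig Latin docs, which, if I
understand it correctly, requests a reducer count for a single operator rather
than for the whole cluster (the alias names, input/output paths, and the count
of 2 below are just an illustration, not my actual script):

grunt> A = LOAD 'input' USING PigStorage('\t');
grunt> B = GROUP A BY $0 PARALLEL 2;  -- ask for 2 reduce tasks for this GROUP
grunt> STORE B INTO 'output';

Is that the recommended way to control reducers, with mapred.reduce.tasks
acting only as the cluster-wide default?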
Please advise,
Mickey Hsieh