Grokbase Groups Pig user July 2011
I have a Hadoop job running through Pig for which I would like to limit the
number of concurrently running mappers per task tracker. The
mapred.tasktracker.map.tasks.maximum property seems to be just what I want
to modify, but unfortunately I cannot modify it in mapred-site.xml as this
configuration is shared by many different jobs, most of which don't need to
be limited in the same way.
I'm wondering what the best way to set this option would be. I noticed that
using the Configuration returned by UDFContext.getJobConf() will not work,
as it is a copy of the configuration, so writing the property there will
not get passed back to the system. I'm given access to the Job object in my
store func's setStoreLocation method; would setting the property on this
Job's configuration get passed back to the system? If not, is there a good
way to set a property like this from within a Pig UDF?


  • Daniel Dai at Jul 12, 2011 at 4:42 am
    One thing I am not sure of is whether "mapred.tasktracker.map.tasks.maximum"
    is a client-side setting. If it is, you can create
    "pig-cluster-hadoop-site.xml" and put the directory containing it on the
    classpath. pig-cluster-hadoop-site.xml holds additional Hadoop settings
    specific to Pig. It has the same format as other Hadoop config files.

    Daniel
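
    A minimal sketch of the file described above, assuming the property is
    honored as a client-side setting, which is exactly the open question in
    this reply. The value 2 is a hypothetical limit and does not come from the
    thread. On Hadoop releases of this era the TaskTracker daemon generally
    reads its slot count from its own configuration at startup, so a per-job
    override like this may simply be ignored.

    ```xml
    <?xml version="1.0"?>
    <!-- pig-cluster-hadoop-site.xml: extra Hadoop settings picked up by Pig
         when the directory containing this file is on the classpath. -->
    <configuration>
      <property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <!-- Hypothetical value: at most 2 concurrent map tasks per task tracker -->
        <value>2</value>
      </property>
    </configuration>
    ```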
    On Fri, Jul 8, 2011 at 10:08 AM, Dylan Scott wrote:
    I have a Hadoop job running through Pig for which I would like to limit the
    number of concurrently running mappers per task tracker. [...]
  • Thejas Nair at Jul 12, 2011 at 10:09 pm
    If you want to set any property for a specific script (including MR
    client-side settings), you can add a set statement to your Pig script.
    See http://pig.apache.org/docs/r0.8.1/piglatin_ref2.html#set
    -Thejas
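
    As a rough illustration of the set statement in a script: the value 2,
    the relation name, and the paths below are placeholders, not taken from
    the thread. Whether this particular property takes effect when set per
    script depends on whether Hadoop treats it as a client-side setting,
    which the thread leaves unresolved.

    ```pig
    -- Hypothetical limit: at most 2 concurrent map tasks per task tracker.
    -- The setting applies only to jobs launched by this script.
    SET mapred.tasktracker.map.tasks.maximum 2;

    -- Rest of the script as usual (placeholder relation and paths).
    A = LOAD 'input' AS (line:chararray);
    STORE A INTO 'output';
    ```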

