I have a Hadoop job running through Pig for which I would like to limit the
number of concurrently running mappers per task tracker. The
mapred.tasktracker.map.tasks.maximum property seems to be just what I want
to modify, but unfortunately I cannot modify it in mapred-site.xml as this
configuration is shared by many different jobs, most of which don't need to
be limited in the same way.
I'm wondering what the best way to set this option would be. I noticed that
using the Configuration returned by UDFContext.getJobConf() will not work,
as it is a copy of the configuration and so writing the property here will
not get passed back to the system. I am given access to the Job object in my
StoreFunc's setStoreLocation method. Would setting the property on this
Job's configuration get passed back to the system? If not, is there a good
way to set a property like this from within a Pig UDF?
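
To make the copy problem described above concrete, here is a minimal sketch of the behaviour I am seeing. It does not use the Pig or Hadoop APIs at all: java.util.Properties stands in for the Hadoop Configuration, and the class and method names (ConfCopyDemo, getCopy, writeToCopy) are hypothetical, made up purely for illustration. I am assuming the getter hands back a defensive copy, which matches what I observe.

```java
import java.util.Properties;

public class ConfCopyDemo {
    static final String KEY = "mapred.tasktracker.map.tasks.maximum";

    // Hypothetical stand-in for a getter that returns a copy of the
    // configuration rather than the live object.
    static Properties getCopy(Properties original) {
        Properties copy = new Properties();
        copy.putAll(original);
        return copy;
    }

    // Writes the property on the copy, then reports what the original holds.
    static String writeToCopy(Properties system) {
        Properties copy = getCopy(system);
        copy.setProperty(KEY, "2");      // the change lands only on the copy
        return system.getProperty(KEY);  // the original is left untouched
    }

    public static void main(String[] args) {
        Properties system = new Properties();
        system.setProperty(KEY, "8");
        System.out.println(writeToCopy(system));
    }
}
```

The original still holds its old value after the write, which is the situation I seem to be in with the Configuration from UDFContext.getJobConf().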