Setting default tmp directory for java createTempFile (java.io.tmpdir)
----------------------------------------------------------------------

Key: HADOOP-2735
URL: https://issues.apache.org/jira/browse/HADOOP-2735
Project: Hadoop Core
Issue Type: New Feature
Components: mapred
Reporter: Koji Noguchi
Priority: Minor


On our cluster, we've seen Pig (http://incubator.apache.org/pig/) fill up /tmp and fail.
(It is also inefficient, since all the local tasks were spilling to the same disk.)

Pig is simply using the Java API createTempFile:

http://java.sun.com/j2se/1.5.0/docs/api/java/io/File.html#createTempFile(java.lang.String,%20java.lang.String,%20java.io.File

Can we add -Djava.io.tmpdir="./tmp" somewhere ?

so that,

1) Tasks can utilize all disks when using tmp
2) Any undeleted tmp files will be deleted by the tasktracker when task(job?) is done.


The easiest way is to set it inside mapred.child.java.opts in the config, but that setting is lost if users override mapred.child.java.opts to set their own task heap size.


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Marco Nicosia (JIRA) at Jan 31, 2008 at 3:09 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Marco Nicosia updated HADOOP-2735:
    ----------------------------------

    Fix Version/s: 0.16.1

    Setting Fix Version to Hadoop 0.16.1, this bug is affecting user jobs.

  • Sameer Paranjpye (JIRA) at Jan 31, 2008 at 4:39 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Sameer Paranjpye updated HADOOP-2735:
    -------------------------------------

    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical (was: Minor)
  • Amareshwari Sri Ramadasu (JIRA) at Feb 5, 2008 at 12:08 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565718#action_12565718 ]

    Amareshwari Sri Ramadasu commented on HADOOP-2735:
    --------------------------------------------------

    Yes. You can add -Djava.io.tmpdir=./tmp as part of mapred.child.java.opts. Moreover, mapred.child.java.opts can take multiple space-separated values, for example:
    {noformat}
    <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx200m -Djava.io.tmpdir=./tmp</value>
    </property>
    {noformat}

    However, the task's working directory does not already contain a 'tmp' directory. I'm uploading a patch that creates it.

    Koji, can you confirm whether this is what you wanted? I tested this by creating a temporary file in ./tmp from the map task.
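
    A minimal sketch of why the directory has to exist first, not the actual patch: File.createTempFile does not create a missing parent directory, so something must create 'tmp' before the task writes to it. The class name, the helper ensureTmpDir, and the scratch directory standing in for the task's working directory are all hypothetical.

```java
import java.io.File;
import java.io.IOException;

public class TaskTmpDirSketch {
    // Hypothetical helper: make sure <workDir>/tmp exists and return it.
    static File ensureTmpDir(File workDir) {
        File tmp = new File(workDir, "tmp");
        if (!tmp.isDirectory() && !tmp.mkdirs()) {
            throw new IllegalStateException("Could not create " + tmp);
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a task working directory (an assumption for this demo).
        File workDir = new File(System.getProperty("java.io.tmpdir"), "task-workdir-demo");
        File tmp = ensureTmpDir(workDir);

        // Passing the directory explicitly; with -Djava.io.tmpdir set to the
        // same path, the two-argument createTempFile would use it as well.
        File spill = File.createTempFile("spill", ".tmp", tmp);
        System.out.println("Created " + spill);
        spill.delete();
    }
}
```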
  • Amareshwari Sri Ramadasu (JIRA) at Feb 5, 2008 at 12:10 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Attachment: patch-2735.txt
  • Amareshwari Sri Ramadasu (JIRA) at Feb 5, 2008 at 12:12 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Status: Patch Available (was: Open)

    This patch creates a 'tmp' directory in the task's working directory.
  • Hadoop QA (JIRA) at Feb 5, 2008 at 2:08 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565752#action_12565752 ]

    Hadoop QA commented on HADOOP-2735:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12374765/patch-2735.txt
    against trunk revision 616796.

    @author +1. The patch does not contain any @author tags.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests -1. The patch failed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1742/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1742/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1742/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1742/console

    This message is automatically generated.
  • Koji Noguchi (JIRA) at Feb 5, 2008 at 2:38 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565757#action_12565757 ]

    Koji Noguchi commented on HADOOP-2735:
    --------------------------------------

    Thanks Amareshwari!

    Even though I created this ticket, I don't know what should be done...

    The problem I'm having is,

    - heapsize is sometimes set by users
    - tmp is set by ops/admins


    And when users set the heapsize, this tmp entry would be overwritten.
    Also, on Windows, I'm not sure which tmp to set.

    May I ask for a new conf entry?
    <name>mapred.child.java.tmp</name>

    And if mapred.child.java.tmp is set,
    1) Append -Djava.io.tmpdir=<mapred.child.java.tmp> to the child JVM options.
    2) If the mapred.child.java.tmp directory doesn't exist, create it.
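
    A rough sketch of step 1, assuming the proposed mapred.child.java.tmp entry existed. java.util.Properties stands in for the Hadoop job configuration here, and the class and method names are hypothetical. Step 2 (creating the directory) is left out.

```java
import java.util.Properties;

public class ChildOptsSketch {
    // Builds the child JVM options. The tmp flag is appended separately, so it
    // survives even when a user replaces mapred.child.java.opts to set the heap.
    static String childJavaOpts(Properties conf) {
        String opts = conf.getProperty("mapred.child.java.opts", "").trim();
        String tmp = conf.getProperty("mapred.child.java.tmp");
        if (tmp == null || tmp.isEmpty()) {
            return opts; // entry not set: leave the options untouched
        }
        String flag = "-Djava.io.tmpdir=" + tmp;
        return opts.isEmpty() ? flag : opts + " " + flag;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("mapred.child.java.opts", "-Xmx512m");   // set by a user
        conf.setProperty("mapred.child.java.tmp", "./tmp");       // set by ops/admins
        System.out.println(childJavaOpts(conf));
    }
}
```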


  • Amareshwari Sri Ramadasu (JIRA) at Feb 6, 2008 at 8:51 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566054#action_12566054 ]

    Amareshwari Sri Ramadasu commented on HADOOP-2735:
    --------------------------------------------------

    bq. And when users set the heapsize, this tmp entry would be overwritten. Also, on Windows, I'm not sure which tmp to set.

    Shall we create tmp directory in task's working directory and make java.io.tmpdir=./tmp always?
    That would remove the need for a config item, on the assumption that tasks always use ./tmp for java.io.tmpdir.

  • Pi Song (JIRA) at Feb 6, 2008 at 11:41 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566095#action_12566095 ]

    Pi Song commented on HADOOP-2735:
    ---------------------------------

    That might be caused by big databags being spilled to disk in the tmp dir.

    bq. Can we add -Djava.io.tmpdir="./tmp" somewhere ? so that, 1) Tasks can utilize all disks when using tmp

    I don't understand this, Koji. Is your intention just to point the tmp directory at another physical disk? If that is the case, how will setting java.io.tmpdir=./tmp help? Then you also have to set the mapred working dir to the new physical disk as well right?

    Actually I think the way Pig handles big databags should be revisited. Spilling to local disk and reading it back doesn't sound efficient to me. If the databag is really too big, why can't it just spill to HDFS to take advantage of disk parallelism?
  • Koji Noguchi (JIRA) at Feb 6, 2008 at 6:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566245#action_12566245 ]

    Koji Noguchi commented on HADOOP-2735:
    --------------------------------------
    bq. Then you also have to set the mapred working dir to the new physical disk as well right?

    I believe each task is already assigned its own working directory, and these directories are spread across all the disks.

    bq. Actually I think the way Pig handles big databags should be revisited.

    I'm not qualified to comment on Pig. Pig just happened to be using File.createTempFile.
    I could have asked Pig to add 'java.io.tmpdir=./tmp', but any application can be using tmpdir.

    bq. Shall we create tmp directory in task's working directory and make java.io.tmpdir=./tmp always?

    It would work for our setting.
    However, I don't know how tmp is used elsewhere. Maybe people use it to share data among tasks (not likely, but it can happen).



  • Allen Wittenauer (JIRA) at Feb 6, 2008 at 6:39 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566261#action_12566261 ]

    Allen Wittenauer commented on HADOOP-2735:
    ------------------------------------------

    For the most part, I agree with Pi's comments.

    Koji and I just had a quick discussion about this and I think we've come up with a good idea. Now we want to toss it to the wolves. :)

    Quick summary of the issue as I understand it:

    1) We have applications that depend on the java.io.tmpdir property being set.

    2) These applications may be independently/inadvertently writing data to the same place. If this data is large, there may be a disk overflow issue. On UNIX, this may have dire consequences (/tmp being either on / or in swap).

    3) Hard coding is generally bad, as it makes assumptions about task behavior and file system layout. In particular, ./tmp is bad because it assumes that the task hasn't changed its cwd.

    So this is what we propose:

    We create a new Hadoop property called mapred.child.tmp. This property takes one of three kinds of value:

    default == we leave java.io.tmpdir alone

    dynamic == we dynamically calculate the full path of the tmp dir under our mapred task directory (the end result would be the equivalent of ./tmp, except that instead of depending upon '.', it would be the actual path to where mapred normally cwd's to: mapred.local.dir/blah/blah/blah/.../tmp)

    anything else == a path provided by the user

    With this type of change, we can cover a wide variety of cases, such as applications that assume java.io.tmpdir is the same across all tasks and applications that require a separate java.io.tmpdir for each task. It also gives ops the benefit of being able to 'spread the load' across multiple drives.

    Thoughts?
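
    The three cases above could be sketched roughly as follows. This only illustrates the proposal and is not code from any patch; the class name, the resolve method, and the convention of returning null for "leave the JVM default alone" are assumptions.

```java
import java.io.File;

public class ChildTmpPolicySketch {
    // Returns the directory to pass as -Djava.io.tmpdir,
    // or null when the JVM default should be left alone.
    static String resolve(String value, File taskWorkDir) {
        if (value == null || value.equals("default")) {
            return null;                                   // default: do nothing
        }
        if (value.equals("dynamic")) {
            // Full path of <task working dir>/tmp, independent of the task's cwd.
            return new File(taskWorkDir, "tmp").getAbsolutePath();
        }
        return value;                                      // anything else: user-provided path
    }

    public static void main(String[] args) {
        // Stand-in for a task working directory (an assumption for this demo).
        File workDir = new File(System.getProperty("java.io.tmpdir"), "task-workdir-demo");
        System.out.println(resolve("default", workDir));
        System.out.println(resolve("dynamic", workDir));
        System.out.println(resolve("/shared/tmp", workDir));
    }
}
```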
  • Owen O'Malley (JIRA) at Feb 7, 2008 at 7:01 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566476#action_12566476 ]

    Owen O'Malley commented on HADOOP-2735:
    ---------------------------------------

    This sounds way too complicated.

    I propose java.io.tmp with a default of "./tmp". Whatever the value is, it is made absolute in localizeTask. So, if the user sets it to something absolute, it is left alone. But the default will be made absolute before the application's Task gets control.
  • Devaraj Das (JIRA) at Feb 7, 2008 at 7:17 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566479#action_12566479 ]

    Devaraj Das commented on HADOOP-2735:
    -------------------------------------

    +1
  • Pi Song (JIRA) at Feb 7, 2008 at 10:37 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566544#action_12566544 ]

    Pi Song commented on HADOOP-2735:
    ---------------------------------

    +1 for setting java.io.tmp to "mapred working dir/tmp". I've just found out that the mapred working dir can be spread across the list of paths given in the "mapred.local.dir" setting.

  • Amareshwari Sri Ramadasu (JIRA) at Feb 7, 2008 at 10:47 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Status: Open (was: Patch Available)
  • Amareshwari Sri Ramadasu (JIRA) at Feb 7, 2008 at 10:47 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Attachment: patch-2735.txt
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt


  • Amareshwari Sri Ramadasu (JIRA) at Feb 7, 2008 at 10:53 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Status: Patch Available (was: Open)

    Here is the patch, which adds the config item "mapred.child.tmp" with a default value of ./tmp.
    If the value of mapred.child.tmp is not an absolute path, the task's working directory is prepended to it.
    The task is then run with the option -Djava.io.tmpdir=<the absolute path of tmp dir>.
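
    The resolution rule described in this comment can be sketched as follows. This is only an illustration, not the code from patch-2735.txt: the class name ChildTmpDir and both method names are hypothetical, and the working directory is passed in explicitly instead of being looked up from the task.

```java
import java.io.File;

// Hypothetical helper, not taken from the actual patch.
final class ChildTmpDir {

    // A relative value (such as the default "./tmp") is resolved against the
    // task's working directory; an absolute value is used unchanged.
    static File resolve(String configuredValue, File taskWorkDir) {
        File tmp = new File(configuredValue);
        if (!tmp.isAbsolute()) {
            tmp = new File(taskWorkDir, configuredValue);
        }
        return tmp.getAbsoluteFile();
    }

    // Builds the JVM option that would be passed to the child task.
    static String javaOption(File tmpDir) {
        return "-Djava.io.tmpdir=" + tmpDir.getPath();
    }

    public static void main(String[] args) {
        // Assumption: the current directory stands in for the task's working directory.
        File workDir = new File(System.getProperty("user.dir"));
        System.out.println(javaOption(resolve("./tmp", workDir)));
    }
}
```

    Because the option always carries an absolute path, the child JVM does not depend on its own current directory when it creates temp files.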
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt


  • Hadoop QA (JIRA) at Feb 7, 2008 at 12:02 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566565#action_12566565 ]

    Hadoop QA commented on HADOOP-2735:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12374957/patch-2735.txt
    against trunk revision 616796.

    @author +1. The patch does not contain any @author tags.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1755/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1755/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1755/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1755/console

    This message is automatically generated.
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt


  • Hadoop QA (JIRA) at Feb 7, 2008 at 10:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566831#action_12566831 ]

    Hadoop QA commented on HADOOP-2735:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12374957/patch-2735.txt
    against trunk revision 619499.

    @author +1. The patch does not contain any @author tags.

    tests included -1. The patch doesn't appear to include any new or modified JUnit tests.
    Please justify why no unit tests are needed for this patch.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1756/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1756/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1756/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1756/console

    This message is automatically generated.
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt


  • Owen O'Malley (JIRA) at Feb 8, 2008 at 12:13 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley updated HADOOP-2735:
    ----------------------------------

    Status: Open (was: Patch Available)

    -1

    I think the task should fail if the create fails.

    As a style point, it is better to just do
    {code}
    if (a != null && a.meth()) { ... }
    {code}

    rather than

    {code}
    if (a != null) {
      if (a.meth()) { ... }
    }
    {code}
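
    As a small runnable illustration of the style point: the combined form is safe because && short-circuits, so the method is never called on a null reference. The class and method names below are hypothetical, and String.isEmpty() simply stands in for a.meth().

```java
// Hypothetical demo class; String.isEmpty() stands in for a.meth().
final class ShortCircuitDemo {

    // Preferred form: one condition, evaluated left to right.
    static boolean combined(String a) {
        return a != null && a.isEmpty();
    }

    // Equivalent nested form that the review comment advises against.
    static boolean nested(String a) {
        if (a != null) {
            if (a.isEmpty()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String s : new String[] {null, "", "x"}) {
            System.out.println(combined(s) == nested(s));
        }
    }
}
```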
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt


  • Amareshwari Sri Ramadasu (JIRA) at Feb 8, 2008 at 6:01 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Attachment: patch-2735.txt
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt, patch-2735.txt


  • Amareshwari Sri Ramadasu (JIRA) at Feb 8, 2008 at 6:07 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Status: Patch Available (was: Open)

    I changed the patch to throw an exception when it fails to create the directory.
    Since this feature only adds a config variable and creates that directory if it doesn't exist, I don't think it needs a test case.
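
    A minimal sketch of the fail-fast behaviour described here, assuming the directory is created with File.mkdirs(). The class and method names are hypothetical, and the real patch may raise a different exception or use a different message.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not taken from the actual patch.
final class TaskTmpDir {

    // Creates the directory if needed and throws when that is not possible,
    // so the caller fails instead of silently continuing without it.
    static void ensureExists(File dir) throws IOException {
        // mkdirs() also returns false when the directory already exists,
        // so isDirectory() is checked before treating that as a failure.
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Failed to create tmp directory: " + dir);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(System.getProperty("java.io.tmpdir"), "task-tmp-demo");
        ensureExists(dir);
        System.out.println(dir.isDirectory());
    }
}
```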

    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt, patch-2735.txt


  • Hadoop QA (JIRA) at Feb 8, 2008 at 7:11 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566938#action_12566938 ]

    Hadoop QA commented on HADOOP-2735:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12375038/patch-2735.txt
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included -1. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1761/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1761/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1761/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1761/console

    This message is automatically generated.
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt, patch-2735.txt


  • Nigel Daley (JIRA) at Feb 9, 2008 at 6:16 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Nigel Daley updated HADOOP-2735:
    --------------------------------

    Status: Open (was: Patch Available)

    -1. This needs a regression test.
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt, patch-2735.txt


  • Amareshwari Sri Ramadasu (JIRA) at Feb 11, 2008 at 10:10 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Attachment: patch-2735.txt
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt, patch-2735.txt, patch-2735.txt


  • Amareshwari Sri Ramadasu (JIRA) at Feb 11, 2008 at 10:14 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Status: Patch Available (was: Open)

    Adding the patch with a test case. The test case sets mapred.child.tmp to different values,
    both relative and absolute, and checks that the temp directory is created. It also checks that the
    java.io.tmpdir value matches the specified directory. Finally, it creates a temp file and checks that it is
    created in the specified directory.
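
    The last of these checks can be sketched in plain Java as follows. This is not the test from the patch: the class and method names are hypothetical, and the sketch only inspects the java.io.tmpdir value the JVM was started with. It does not set mapred.child.tmp or launch a task.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical check, not the test case attached to the issue.
final class TmpDirCheck {

    // Creates a temp file with the default File.createTempFile overload and
    // reports whether it was placed directly under java.io.tmpdir.
    static boolean tempFileLandsInTmpDir() {
        try {
            File expected = new File(System.getProperty("java.io.tmpdir")).getCanonicalFile();
            File tmpFile = File.createTempFile("tmpdir-check", ".tmp");
            try {
                // Canonical paths are compared so symlinked tmp locations still match.
                return tmpFile.getCanonicalFile().getParentFile().equals(expected);
            } finally {
                tmpFile.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tempFileLandsInTmpDir());
    }
}
```

    Running the same check inside a task started with -Djava.io.tmpdir=<dir> would confirm that createTempFile honours the configured directory.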
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt, patch-2735.txt, patch-2735.txt


  • Hadoop QA (JIRA) at Feb 11, 2008 at 11:22 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567589#action_12567589 ]

    Hadoop QA commented on HADOOP-2735:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12375212/patch-2735.txt
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 3 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac -1. The applied patch generated 613 javac compiler warnings (more than the trunk's current 612 warnings).

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1770/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1770/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1770/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1770/console

    This message is automatically generated.
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt, patch-2735.txt, patch-2735.txt


  • Amareshwari Sri Ramadasu (JIRA) at Feb 11, 2008 at 11:56 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Status: Open (was: Patch Available)
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt, patch-2735.txt, patch-2735.txt


  • Amareshwari Sri Ramadasu (JIRA) at Feb 11, 2008 at 11:58 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Attachment: patch-2735.txt
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt, patch-2735.txt, patch-2735.txt, patch-2735.txt


  • Amareshwari Sri Ramadasu (JIRA) at Feb 11, 2008 at 11:58 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Status: Patch Available (was: Open)

    Fixed the javac warning.
    Setting default tmp directory for java createTempFile (java.io.tmpdir)
    ----------------------------------------------------------------------

    Key: HADOOP-2735
    URL: https://issues.apache.org/jira/browse/HADOOP-2735
    Project: Hadoop Core
    Issue Type: New Feature
    Components: mapred
    Reporter: Koji Noguchi
    Assignee: Amareshwari Sri Ramadasu
    Priority: Critical
    Fix For: 0.16.1

    Attachments: patch-2735.txt, patch-2735.txt, patch-2735.txt, patch-2735.txt, patch-2735.txt


  • Hadoop QA (JIRA) at Feb 11, 2008 at 1:31 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567633#action_12567633 ]

    Hadoop QA commented on HADOOP-2735:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12375215/patch-2735.txt
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 3 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1771/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1771/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1771/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1771/console

    This message is automatically generated.
  • Nigel Daley (JIRA) at Feb 11, 2008 at 4:34 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567690#action_12567690 ]

    Nigel Daley commented on HADOOP-2735:
    -------------------------------------

    Is this supposed to work with Pipes and Streaming? Do we know if it does?
  • Allen Wittenauer (JIRA) at Feb 11, 2008 at 7:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567751#action_12567751 ]

    Allen Wittenauer commented on HADOOP-2735:
    ------------------------------------------

    Given this is supposed to be setting a Java-specific property, I'm not sure it really matters. Although, it might be worthwhile setting the shell TMPDIR environment variable so that other languages have something they could use as well.
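To make the "Java-specific" point concrete, here is a minimal sketch (the class name `TmpDirProbe` is made up for illustration). `File.createTempFile(prefix, suffix)` without a directory argument places the file in the directory named by the `java.io.tmpdir` system property, which is why non-Java children need a separate mechanism such as TMPDIR.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical probe class: shows where createTempFile puts its files.
public class TmpDirProbe {
    public static File createScratch(String prefix) {
        try {
            // No directory argument: the JVM uses the java.io.tmpdir property.
            File scratch = File.createTempFile(prefix, ".tmp");
            scratch.deleteOnExit();
            return scratch;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));
        System.out.println("scratch file   = " + createScratch("probe"));
    }
}
```

Running it as `java -Djava.io.tmpdir=./tmp TmpDirProbe` should place the scratch file under `./tmp`, provided that directory already exists; createTempFile does not create it.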
  • Allen Wittenauer (JIRA) at Feb 11, 2008 at 7:02 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567756#action_12567756 ]

    Allen Wittenauer commented on HADOOP-2735:
    ------------------------------------------

    BTW, has this patch been tested on Windows? Given the / vs. \ issue...


  • Amareshwari Sri Ramadasu (JIRA) at Feb 12, 2008 at 9:47 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Status: Open (was: Patch Available)

    That's a good catch, Nigel. I hadn't thought of Pipes and Streaming. I will also set the tmp dir path in the TMPDIR environment variable.
  • Amareshwari Sri Ramadasu (JIRA) at Feb 13, 2008 at 7:08 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Attachment: patch-2735.txt
  • Amareshwari Sri Ramadasu (JIRA) at Feb 13, 2008 at 7:10 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Status: Patch Available (was: Open)

    Added the tmp dir path to the TMPDIR environment variable for Pipes and Streaming. Tested both Pipes and Streaming applications that create temp files.
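As a rough illustration of what setting TMPDIR for a child process looks like from Java (this is not the code in patch-2735.txt; `TmpDirEnv` and `withTmpDir` are hypothetical names), a launcher can put the task's tmp dir into the child's environment through `ProcessBuilder`:

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: prepares a child process whose TMPDIR points
// at a task-local directory instead of the shared /tmp.
public class TmpDirEnv {
    public static ProcessBuilder withTmpDir(List<String> command, File taskTmpDir) {
        ProcessBuilder builder = new ProcessBuilder(command);
        // environment() returns a modifiable copy of the parent's environment.
        builder.environment().put("TMPDIR", taskTmpDir.getAbsolutePath());
        return builder;
    }

    public static void main(String[] args) {
        ProcessBuilder builder =
            withTmpDir(Arrays.asList("sh", "-c", "echo $TMPDIR"), new File("tmp"));
        // Prints the value the child would see; the process itself is not started here.
        System.out.println(builder.environment().get("TMPDIR"));
    }
}
```

Whether a given streaming program honors TMPDIR depends on the language runtime or library it uses to create temp files.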
  • Hadoop QA (JIRA) at Feb 13, 2008 at 8:16 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568463#action_12568463 ]

    Hadoop QA commented on HADOOP-2735:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12375454/patch-2735.txt
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 3 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1787/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1787/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1787/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1787/console

    This message is automatically generated.
  • Devaraj Das (JIRA) at Feb 15, 2008 at 8:17 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Devaraj Das updated HADOOP-2735:
    --------------------------------

    Status: Open (was: Patch Available)

    This doesn't work on Windows (cygwin) when the tmpdir is set to something absolute. Please resubmit a patch with that fixed.
  • Amareshwari Sri Ramadasu (JIRA) at Feb 15, 2008 at 10:15 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Attachment: patch-2735.txt
  • Amareshwari Sri Ramadasu (JIRA) at Feb 15, 2008 at 10:19 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amareshwari Sri Ramadasu updated HADOOP-2735:
    ---------------------------------------------

    Status: Patch Available (was: Open)

    Changed instances of file to Path.
    Now it works in cygwin also.
  • Hadoop QA (JIRA) at Feb 15, 2008 at 11:22 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12569238#action_12569238 ]

    Hadoop QA commented on HADOOP-2735:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12375659/patch-2735.txt
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 3 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1803/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1803/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1803/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1803/console

    This message is automatically generated.
  • Devaraj Das (JIRA) at Feb 16, 2008 at 12:39 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Devaraj Das updated HADOOP-2735:
    --------------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    I just committed this. Thanks, Amareshwari!
  • Hudson (JIRA) at Feb 17, 2008 at 12:23 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12569692#action_12569692 ]

    Hudson commented on HADOOP-2735:
    --------------------------------

    Integrated in Hadoop-trunk #403 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/403/])
  • Marco Nicosia (JIRA) at Mar 7, 2008 at 1:58 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576010#action_12576010 ]

    Marco Nicosia commented on HADOOP-2735:
    ---------------------------------------

    Any consideration for Allen's comment?
    aw> Although, it might be worthwhile setting the shell TMPDIR
    aw> environment variable so that other languages have something
    aw> they could use as well.

  • Koji Noguchi (JIRA) at Mar 7, 2008 at 10:31 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576417#action_12576417 ]

    Koji Noguchi commented on HADOOP-2735:
    --------------------------------------

    Marco,
    https://issues.apache.org/jira/browse/HADOOP-2735?focusedCommentId=12568023#action_12568023
    Any consideration for Allen's comment?
    aw> Although, it might be worthwhile setting the shell TMPDIR
    aw> environment variable so that other languages have something
    aw> they could use as well.
  • Amareshwari Sriramadasu (JIRA) at Mar 10, 2008 at 4:05 am
    [ https://issues.apache.org/jira/browse/HADOOP-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576855#action_12576855 ]

    Amareshwari Sriramadasu commented on HADOOP-2735:
    -------------------------------------------------

    bq. Any consideration for Allen's comment?
    aw> Although, it might be worthwhile setting the shell TMPDIR
    aw> environment variable so that other languages have something
    aw> they could use as well.

    This is done as part of the patch. The TMPDIR environment variable is set to the temp directory that is created, for both Pipes and Streaming.
    https://issues.apache.org/jira/browse/HADOOP-2735?focusedCommentId=12568453#action_12568453

Discussion Overview
group: common-dev
categories: hadoop
posted: Jan 29, '08 at 11:27p
active: Mar 10, '08 at 4:05a
posts: 48
users: 1
website: hadoop.apache.org...
irc: #hadoop
