Feng, Ao wrote:
I probably know what the problem is, as we are encountering the same issue on our prod cluster. Every once in a while, jobs start failing on the same task trackers, and the only error message is this exit status 1.

Go to the userlogs directory on the host where your tasks fail, and check whether there are 31,999 directories all looking like attempt_... Once you get to that point, the JVM would run out of file descriptors as it tries to create the 32,000th one. I confirmed that cleaning up the userlogs directory solves the problem... temporarily.
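For anyone who wants to check a host without eyeballing a huge listing, here is a rough Python sketch that counts the immediate subdirectories of a directory and reports how much headroom is left. The function names and the default limit are my own assumptions for illustration (the default of 31998 is the ext3 figure cited in the reply in this thread); the path to your userlogs directory depends on your setup and has to be passed in.

```python
import os

# Assumed default: the ext3 per-directory subdirectory limit cited in this thread.
EXT3_MAX_SUBDIRS = 31998


def count_subdirs(path):
    """Return the number of immediate subdirectories of `path`.

    Symlinks are not followed, so a symlink to a directory is not counted.
    """
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def headroom(path, limit=EXT3_MAX_SUBDIRS):
    """Return how many more subdirectories fit in `path` before reaching `limit`."""
    return limit - count_subdirs(path)
```

Calling `headroom()` on the userlogs directory of a failing host and getting a value at or near zero would be consistent with the symptom described above.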

So my questions are:

1. Where is the 32,000 limit imposed, and how do we change it?

As far as ext3 file system capabilities are concerned:

"There is a limit of 31998 sub-directories per one directory, stemming from its limit of 32000 links per inode"
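The gap between 32000 and 31998 comes from how directory link counts work: a directory starts with two links (its entry in the parent directory and its own "." entry), and every subdirectory adds one more through its ".." entry. A tiny sketch of that arithmetic, with names that are mine rather than anything from the kernel:

```python
# Per-inode link limit on ext3, as quoted above.
EXT3_LINK_MAX = 32000


def max_subdirs(link_max=EXT3_LINK_MAX):
    """Return the maximum number of subdirectories for a given link limit.

    Two links are already used by the directory itself: its name in the
    parent directory and its own "." entry. Each subdirectory consumes one
    further link via its ".." entry.
    """
    return link_max - 2


print(max_subdirs())  # 31998
```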

There is actually a funny story behind my personal experience with this (which I shall shorten for

After I typed "ls <tab>" (to get the list of files/directories via bash completion) one day in a directory,
the system came back (after a while) and said (from memory),

Display all 31998 possibilities? (y or n)

Hmm, where have I seen a number like (or close to) that before?

Cheers / Frank

group: common-user @
posted: Sep 23, '09 at 6:06p
active: Oct 27, '09 at 10:51p


