FAQ
Hi,

I'm getting this error on install:

MapReduce System Directory
mapred.system.dir = /tmp/mapred/system
The HDFS directory where the MapReduce service stores system files. This
directory must be accessible from both the server and client machines. For
example: /hadoop/mapred/system/

Any suggestions? Why didn't CM take care of this, since the machines are
certainly able to communicate with each other?

Thanks,

Gene


  • Chris at Oct 4, 2012 at 10:23 pm
    Not sure whether this would still be a problem with the new version of
    CM; it didn't seem to pose problems in 3u4. But /tmp is not a good
    directory to assign to the mapred.system.dir variable, simply because it
    is a system directory. Not saying it would happen, but some people
    insist on cleaning everything out of /tmp because they think the files
    there are temporary... if that were to happen, you would be in a world
    of hurt if you didn't have backups.

    Maybe check the permissions on that directory to see whether your
    relevant users (mapred, hdfs, hadoop...) are able to access it. I'm
    purely guessing here, so forgive me if I'm wrong; either way, the sure
    answer would be something good to know.
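    A sketch of that permission check, assuming a local path and typical CDH
    service-account names (mapred.system.dir itself is an HDFS path, so the
    HDFS side would need `hadoop fs -ls` rather than local `ls`):

```shell
# Example local-filesystem checks; substitute your own path and accounts.
DIR=/tmp/mapred/system

# Who owns the directory, and with what mode?
stat -c '%U:%G %a %n' "$DIR" 2>/dev/null || echo "$DIR does not exist"

# Do the Hadoop service accounts exist on this box?
for u in mapred hdfs hadoop; do
  id "$u" >/dev/null 2>&1 && echo "user $u exists" || echo "no local user $u"
done
```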

    Best of luck,

    -Chris


  • Gene at Oct 4, 2012 at 10:33 pm
    Hi, thanks for the reply. I am using CDH4 for the manager. I never set
    /tmp for anything, as I expected CM to either make smart choices as to
    where to put things, or at least prompt me for where to put things.

    My sysadmin has recommended creating an NFS or CIFS share (I don't know
    what these are, but he does). Do you think that's necessary with this?
    Doesn't CM take care of that?

    Thanks!
  • Chris at Oct 4, 2012 at 11:11 pm
    In my brief but ever-growing knowledge of all this stuff, I would
    probably stay away from using an NFS or CIFS share to have this data
    written to; you would be creating a potential for disaster if the NFS or
    CIFS mount went crazy or offline. Not to mention, you will most likely
    be passing a significant amount of data to this directory, which, in my
    opinion, should be a local directory.

    It seems the default for the variable in question lives under /tmp, that
    being Cloudera Manager's suggestion. While they do things correctly for
    the most part, they should have chosen a better default location for
    this mapred directory.

    Perhaps this is fixed in Cloudera Manager 4.0, or at least made not to
    default to /tmp.

    The only thing you should create NFS or CIFS shares for, again in my
    opinion, is backing up your critical HDFS catalog information. The
    relevant variable is called:

    NameNode Data Directory
    dfs.name.dir

    From the Configuration tab off the main HDFS menu, scroll down and you
    will see it. I currently have the secondary namenode and our hadooptools
    node mounted on the main namenode via NFS, and then I added these
    directories into the lineup, like so:

    /dfs/nn,/dfs/irvnamenode02-livecatalog,/dfs/irvhadooptools-livecatalog,/dfs/irvbackup01


    (irvbackup01 is a Windows backup server. I installed Samba onto the
    namenode and locked it down so that only irvbackup01 could mount that
    directory, read-only. On the Windows backup server, I created a mapped
    drive that points to the /dfs/irvbackup01 directory on the namenode.)
    So I have a total of three other places, aside from /dfs/nn, where I
    store the critical information for the cluster.

    Just remember: if you reboot your namenode and you have NFS mounts, make
    sure that everything is mounted properly BEFORE you try to start HDFS,
    or it will die with a horrible message. If you dig deep into the logs,
    you will see that unless Cloudera Manager and the namenode can talk to
    all the places listed in dfs.name.dir, it won't start up. Make sure you
    have these entries in your /etc/fstab set up properly.
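    That pre-start mount check can be scripted; this is a sketch using the
    example paths above (substitute your own dfs.name.dir entries):

```shell
# Refuse to start HDFS unless every dfs.name.dir mount point is present.
# The paths below are the example NFS mounts, not a general default.
ok=1
for d in /dfs/irvnamenode02-livecatalog /dfs/irvhadooptools-livecatalog; do
  if grep -qs " $d " /proc/mounts; then
    echo "mounted:     $d"
  else
    echo "NOT mounted: $d"
    ok=0
  fi
done
[ "$ok" -eq 1 ] && echo "safe to start HDFS" || echo "fix mounts first"
```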

    I hope this helps,

    -Chris


  • Gene at Oct 6, 2012 at 12:45 am
    Thank you very much for this info, it's great.

    We have a disk mounted -- /disk2 -- that was added just for Hadoop. In
    disk2 there are 3 directories: mapred, dfs, and lost+found. Here are the
    files with permissions:

    drwxr-xr-x 4 mapred hadoop 4.0K Sep 24 22:55 mapred
    drwxr-xr-x 5 root root 4.0K Sep 24 22:42 dfs
    drwx------ 2 root root 16K Sep 21 12:21 lost+found

    Under dfs there are 3 directories: nn, snn and dn.
    Why it's trying to access /tmp/mapred/system, I don't know. Here's the
    mapred directory:

    drwxr-xr-x 7 mapred hadoop 4.0K Oct 3 23:13 local
    drwxr-xr-x 3 mapred hadoop 4.0K Sep 24 22:55 jt
    drwxr-xr-x 4 mapred hadoop 4.0K Sep 24 22:55 .

    'hadoop' is a superuser set up for the installation.

    There is no 'system' file or subdirectory, but it seems like
    /tmp/mapred/system should be something like /disk2/mapred/system. Does
    that seem logical to you? Where would I change that?
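    For reference, the setting lives in CM under the MapReduce service's
    Configuration tab ("MapReduce System Directory"). A sketch of the
    equivalent mapred-site.xml fragment is below; note the value is a path
    inside HDFS, so a local path like /disk2/mapred/system would not be the
    right form, whereas the documentation's example HDFS path is:

```xml
<!-- Sketch of the equivalent mapred-site.xml entry; in a CM-managed
     cluster, change the value through CM rather than editing files. -->
<property>
  <name>mapred.system.dir</name>
  <!-- An HDFS path (the docs' example), not a local /disk2 path -->
  <value>/hadoop/mapred/system</value>
</property>
```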
  • Vinithra Varadharajan at Oct 5, 2012 at 2:14 am
    Gene,

    At which point of the install are you seeing an error? Also, can you paste
    the full body of the error?

    mapred.system.dir refers to an HDFS directory, not a local filesystem directory.

    -Vinithra
  • Gene at Oct 5, 2012 at 11:48 pm
    Hi Vinithra,

    The error actually appears in CM under Services -> mapreduce1 ->
    Configuration, under "MapReduce System Directory". The service shows
    'Currently Started with Bad Health', and the value I posted earlier is
    what it flags as wrong. That's actually all there is to it; I posted the
    full text.

    Thanks!
  • Gene at Oct 5, 2012 at 10:43 pm
    Hi,

    It's been a real struggle getting Cloudera Manager to install Hadoop and
    we've had to call in experts beyond my knowledge to help. We've gotten
    pretty far, but are now getting the following error:

    *MapReduce System Directory*
    mapred.system.dir

    /tmp/mapred/system

    The HDFS directory where the MapReduce service stores system files. This
    directory must be accessible from both the server and client machines. For
    example: /hadoop/mapred/system/

    Isn't this something CM should have taken care of? Is this some manual
    configuration thing I'm not aware of?

    I also noticed that the HADOOP_HOME and HADOOP_CLASSPATH variables have
    not been set. I expected CM to do that too; is that not what it does?

    Thanks for the help!

    Gene
  • Philip Zeyliger at Oct 5, 2012 at 10:39 pm
    I don't see the error that you're actually experiencing.

    I suspect that you're not downloading or deploying the client
    configurations appropriately.

    -- Philip
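    One way to check what Philip describes is to inspect the client
    configuration on a gateway node; the paths below are typical of a
    CM-managed CDH install, but verify them against your own machines:

```shell
# Sketch: inspect the deployed client configuration (typical CM/CDH paths;
# confirm against your own installation).
ls -l /etc/hadoop/conf 2>/dev/null || echo "no client config at /etc/hadoop/conf"

# Does the client config define mapred.system.dir?
msg=$(grep -s -A1 'mapred.system.dir' /etc/hadoop/conf/mapred-site.xml \
      || echo "mapred.system.dir not found in client mapred-site.xml")
echo "$msg"
```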

Discussion Overview
group: cm-users
categories: hadoop
posted: Oct 4, '12 at 9:48p
active: Oct 6, '12 at 12:45a
posts: 9
users: 4
website: cloudera.com
irc: #hadoop
