Some HBase M/R confusion
According to the documentation there are two ways to run HBase M/R jobs:


1. The HBase book states to run M/R jobs like export here: http://hbase.apache.org/book/ops_mgt.html#export
bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]

2. Whereas the Javadoc says here: http://hbase.apache.org/docs/current/api/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.0.jar export ...


In the first case (#1) I find that the job always fails to create the output dir:
java.io.IOException: Mkdirs failed to create file:/exports/_temporary/_attempt_local_0001_m_000000_0
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:378)

...


In the 2nd case (#2) I get past the creation of the output dir, and then it fails because it cannot find class com.google.protobuf.Message.
I am using the HBase security branch and find that I need to add com.google.protobuf.Message.class in TableMapReduceUtil.addDependencyJars.
If I do that, I can successfully run an export job using method #2.
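
A minimal sketch of that workaround from the job-setup side; the class and job names here are hypothetical, and addDependencyJars(Configuration, Class...) ships the jar containing each listed class with the job:

    // Hypothetical job setup showing the workaround: explicitly ship the
    // protobuf jar so map tasks can resolve com.google.protobuf.Message.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.mapreduce.Job;

    public class ExportJobSetup {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "export");
        // ... usual initTableMapperJob(...) / output setup elided ...
        TableMapReduceUtil.addDependencyJars(job.getConfiguration(),
            com.google.protobuf.Message.class);
      }
    }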


The 2nd issue I found looks like a bug with the HBase security branch.
I am not sure about the first issue; is the documentation in the HBase book outdated?


-- Lars

  • Ted Yu at Feb 23, 2012 at 3:44 am
    Lars:
    Is the second problem present in insecure HBase?

    Thanks
  • Lars hofhansl at Feb 23, 2012 at 4:55 am
    I saw a bunch of security-related classes in the trace. I'll try with a non-secure branch and file a jira if the problem is not present there.
    Any input on the first issue?


    -- Lars
  • Michael Stack at Feb 23, 2012 at 4:59 am

    On Wed, Feb 22, 2012 at 6:36 PM, lars hofhansl wrote:
    > 1. The HBase book states to run M/R jobs like export here: http://hbase.apache.org/book/ops_mgt.html#export
    > bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
    This is running the Export tool, i.e. the Export class's main. The
    CLASSPATH is the one built by bin/hbase.

    > 2. Whereas the Javadoc says here: http://hbase.apache.org/docs/current/api/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
    > HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.0.jar export ...
    Here we're loading HADOOP_CLASSPATH with the hbase classpath. We then
    pass the hbase jar as a 'mapreduce fat jar' for bin/hadoop to run.
    When we build the hbase jar, we set its Main-Class to be the Driver
    class under mapreduce, which parses the args to figure out which of our
    selection of common mapreduce programs to run. Here you've chosen
    export (leave off the 'export' arg to see the complete list).

    Either means should work, but #2 is a bit more palatable (excepting the
    ugly CLASSPATH preamble).
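
    As an aside, a Driver of that kind can look roughly like the sketch
    below; the program names and descriptions are assumptions, not the
    actual HBase source. Hadoop's ProgramDriver maps the first argument to
    a class and invokes its main():

        // Sketch of a Main-Class "driver" (assumed names, not HBase's source).
        import org.apache.hadoop.hbase.mapreduce.Export;
        import org.apache.hadoop.hbase.mapreduce.Import;
        import org.apache.hadoop.hbase.mapreduce.RowCounter;
        import org.apache.hadoop.util.ProgramDriver;

        public class Driver {
          public static void main(String[] args) throws Throwable {
            ProgramDriver pgd = new ProgramDriver();
            pgd.addClass("export", Export.class, "Write table data to HDFS.");
            pgd.addClass("import", Import.class, "Import data written by export.");
            pgd.addClass("rowcounter", RowCounter.class, "Count rows in a table.");
            // With no (or an unknown) program name, ProgramDriver prints the
            // list above, which is why leaving off 'export' shows all programs.
            pgd.driver(args);
          }
        }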

    > In the first case (#1) I find that the job always fails to create the output dir:
    > java.io.IOException: Mkdirs failed to create file:/exports/_temporary/_attempt_local_0001_m_000000_0
    > at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:378)
    It's running local? It's trying to write to /exports on your local
    disk? It's probably not picking up the hadoop configs and so is using
    local mapreducing.

    > In the 2nd case (#2) I get past the creation of the output dir, and then it fails because it cannot find class com.google.protobuf.Message.
    It's not adding protobufs to the CLASSPATH? Or the versions disagree? The
    hbase-included protobufs is being found first and it's not what the Hadoop
    protobuffing wants?

    > I am using the HBase security branch and find that I need to add com.google.protobuf.Message.class in TableMapReduceUtil.addDependencyJars.
    > If I do that, I can successfully run an export job using method #2.
    This is probably a bug.

    This is 0.92.x? Or trunk? Is protobuf a new dependency hbase needs?

    > The 2nd issue I found looks like a bug with the HBase security branch.
    > I am not sure about the first issue; is the documentation in the HBase book outdated?
    I think yeah, we should encourage #2, since it will use the proper config
    and find the cluster. We would have to add the hadoop config to #1 to make
    it work.

    My guess is it's not just the security branch.

    St.Ack
  • Lars hofhansl at Feb 23, 2012 at 5:15 am
    Thanks Stack.

    Missed the "file:" part in the first case... stupid... It must be picking up an hbase-site.xml from somewhere else (or, more likely, just using the defaults because it can't find one).

    Either way we need to update the book I think.


    As for the protobufs: this is trunk, and it looks like this is related to HBASE-5394. It happens also in the non-secure branch.
    Filed HBASE-5460. I assume we just add protobufs as a jar dependency; I will do that tonight.
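
    The fix would presumably be a small change in
    TableMapReduceUtil.addDependencyJars(Job), something like the sketch
    below (the idea only, not the committed HBASE-5460 patch):

        // Inside TableMapReduceUtil: always ship the protobuf jar along
        // with the other job dependency jars.
        public static void addDependencyJars(Job job) throws IOException {
          try {
            addDependencyJars(job.getConfiguration(),
                org.apache.zookeeper.ZooKeeper.class,
                com.google.protobuf.Message.class,  // new: needed since HBASE-5394
                job.getMapOutputKeyClass(),
                job.getMapOutputValueClass(),
                job.getInputFormatClass(),
                job.getOutputKeyClass(),
                job.getOutputValueClass(),
                job.getOutputFormatClass());
          } catch (ClassNotFoundException e) {
            throw new IOException(e);
          }
        }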

    -- Lars
  • Michael Stack at Feb 23, 2012 at 5:20 am

    On Wed, Feb 22, 2012 at 9:14 PM, lars hofhansl wrote:
    > Either way we need to update the book I think.
    It's not 'the book'; it's the 'reference guide'. The former is published
    by O'Reilly, the latter by our man Doug Meil.

    > As for the protobufs: this is trunk, and it looks like this is related to HBASE-5394. It happens also in the non-secure branch.
    > Filed HBASE-5460. I assume we just add protobufs as a jar dependency; I will do that tonight.
    Yeah, that should do it (the above sounds reasonable).

    St.Ack
  • Lars hofhansl at Feb 23, 2012 at 5:35 am
    Then we should rename it already :)

    On the site it is still called "The HBase Book".


    For the record: for the first approach I had forgotten to add hadoop/conf to my hbase classpath.
    It also works if I add -conf hadoop/conf/core-site.xml to the export command.
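
    Why the -conf flag works, for reference: tools that go through Hadoop's
    GenericOptionsParser merge each -conf file into the job Configuration
    before anything runs, which points the job at the real cluster instead
    of the local-mode defaults. A sketch of the usual wiring, assumed here
    rather than taken from Export's actual source:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.util.GenericOptionsParser;

        public class ConfExample {
          public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // Consumes the generic options (-conf, -D, -fs, -jt, ...); each
            // "-conf <file>" is added as a resource, so the cluster's
            // fs.default.name and mapred.job.tracker override the local
            // defaults (file:/// and "local").
            String[] remaining = new GenericOptionsParser(conf, args).getRemainingArgs();
            System.out.println("fs.default.name = " + conf.get("fs.default.name"));
          }
        }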


    -- Lars
  • Doug Meil at Feb 23, 2012 at 7:47 pm
    Hi folks-

    Regarding the rename: the deployment will happen by end of day.

    https://issues.apache.org/jira/browse/HBASE-5465

    The files on the website will still have the same names (e.g., book.html,
    /book/book.html), so this is content-only at this point.


  • Doug Meil at Feb 23, 2012 at 11:24 pm
    It's official... it is now The Reference Guide.




