Hi

I have done MapReduce programming using Eclipse before but now I need to
learn the Hadoop code internals for one of my projects.

I have forked Hadoop from GitHub (https://github.com/apache/hadoop-common)
and need to configure it to work with Eclipse. All the links I could
find list steps for earlier versions of Hadoop. Right now I am following
the instructions given in these links:
- http://wiki.apache.org/hadoop/GitAndHadoop
- http://wiki.apache.org/hadoop/EclipseEnvironment
- http://wiki.apache.org/hadoop/HowToContribute

Can someone please give me a link to the steps to be followed for getting
Hadoop (latest from trunk) started in Eclipse? I need to be able to commit
changes to my forked repository on github.

Thanks in advance.
Regards,
Prajakta

  • Harsh J at Jun 13, 2012 at 12:39 pm
    Hi Prajakta,

    I have Eclipse set up with the M2E plugins. Once that's done, I merely
    clone a repo and import the projects via M2E's "Import existing maven
    projects" feature. This seems to work just fine for apache/hadoop's
    trunk.

    --
    Harsh J
  • Prajakta Kalmegh at Jun 13, 2012 at 1:10 pm
    Hi Harsh

    Appreciate the response. I was able to configure and implement basic JUnit
    tests within Eclipse and get some code running. I am still getting familiar
    with the new YARN and federation architecture.

    However, I was not able to see the MR jobs submitted from within Eclipse
    (a sample WordCount program) on the
    http://localhost:8088/ <http://localhost:8080/> page. I am starting my
    namenode/datanode/resourcemanager/nodemanager/historyserver as instructed
    on the wiki page, and then executing JUnit tests from Eclipse.

    I believe a single MR job will be submitted as a single application in the
    new framework, right? The Eclipse console shows a successful execution (the
    details are pretty neat). However, the web page shows 'No applications
    submitted'. Do I have to tweak any config properties to get this done?

    Please let me know.

    Regards,
    Prajakta
  • Harsh J at Jun 13, 2012 at 2:48 pm
    Good to know about your progress, Prajakta!

    Are you sure your submission went via the RM/NM, or did it execute via the
    LocalJobRunner (the logs would show this class name)?

    You would ideally want to set the config "mapreduce.framework.name" to
    value "yarn" (in either config object before you use it, or in local
    mapred-site.xml), for it to use the YARN framework. A set of general
    configs for YARN deployment may be found at http://bit.ly/M2Eobz or at
    http://bit.ly/LW3Var.
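
As a rough illustration of the check discussed here (not code from the thread), the sketch below reads a Hadoop-style site file with the JDK's own XML parser and reports which value it carries for "mapreduce.framework.name". The class and method names (FrameworkNameCheck, frameworkName) are hypothetical, and the fallback to "local" assumes the default from Hadoop's mapred-default.xml, which is the value that selects the LocalJobRunner.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Hadoop): inspects the text of a
// *-site.xml file and reports the value of mapreduce.framework.name.
final class FrameworkNameCheck {
    static final String KEY = "mapreduce.framework.name";

    // Returns the last value set for KEY, or "local" when the property is
    // absent (assumed default; "local" means the LocalJobRunner is used).
    static String frameworkName(String xml) {
        String value = "local";
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element property = (Element) props.item(i);
                if (KEY.equals(childText(property, "name"))) {
                    value = childText(property, "value");
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable configuration XML", e);
        }
        return value;
    }

    // Trimmed text of the first descendant element with the given tag, or "".
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: FrameworkNameCheck <path to mapred-site.xml>");
            return;
        }
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println(KEY + " = " + frameworkName(xml));
    }
}
```

Running it against the mapred-site.xml that the IDE launcher is supposed to use shows whether the file itself says "yarn"; it does not prove that the running job actually loaded that file.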

    Does this help?

    --
    Harsh J
  • Prajakta Kalmegh at Jun 15, 2012 at 7:01 am
    Hi Harsh

    You were right - it was executed via the LocalJobRunner. What settings do I
    need to change to ensure it goes through the RM/NM? I did specify the
    yarn framework in mapred-site.xml. The Eclipse console even says to
    check http://localhost:8080 to track job progress.

    I will cross-check all the parameters again today against the ones specified
    in the URLs you gave me. By the way, I found this link
    <http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-project-dist/hadoop-common/DeprecatedProperties.html>
    which gives a list of deprecated properties. I have, in fact, specified a
    lot of old properties in my core-site.xml, hdfs-site.xml and mapred-site.xml
    files. Will that cause a problem?

    Regards,
    Prajakta
  • Harsh J at Jun 15, 2012 at 11:14 am
    Hey,

    The presence of deprecated properties should not really hamper your
    submissions. The configs for YARN (pseudo- or fully distributed) that I
    usually follow are all listed at the links I provided earlier. When
    running programs from within the IDE, I place the directory that
    contains the config files onto my launcher's classpath (called 'Run
    Configuration' in Eclipse, for instance) and they are picked up by my
    program.
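
Hadoop's Configuration class loads the *-site.xml files as classpath resources, which is why the directory has to be on the launcher's classpath. A minimal sketch, using only the JDK, of how one might confirm from inside the IDE that the files are visible; the class name ConfigOnClasspath and the list of file names are assumptions for illustration, not something given in the thread.

```java
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): reports where, if anywhere,
// each config file is found on the classpath of the current launcher.
final class ConfigOnClasspath {
    // Assumed set of site files for a YARN setup.
    static final String[] FILES = {
        "core-site.xml", "hdfs-site.xml", "mapred-site.xml", "yarn-site.xml"
    };

    // Maps each resource name to its URL, or to null when it is not visible.
    static Map<String, URL> locate(ClassLoader loader, String... names) {
        Map<String, URL> found = new LinkedHashMap<>();
        for (String name : names) {
            found.put(name, loader.getResource(name));
        }
        return found;
    }

    public static void main(String[] args) {
        ClassLoader loader = ConfigOnClasspath.class.getClassLoader();
        locate(loader, FILES).forEach((name, url) ->
            System.out.println(name + " -> " + (url == null ? "NOT ON CLASSPATH" : url)));
    }
}
```

If this is run with the same Run Configuration as the job and mapred-site.xml is reported as missing, the job would not see the "yarn" setting from that file.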

    --
    Harsh J
  • Prajakta Kalmegh at Jun 18, 2012 at 8:46 am
    Hi Harsh

    I can actually run the jobs (and browse them on localhost:8088) when they
    are executed from the command line. The same program gives me problems from
    Eclipse. :(

    Moreover, I cannot even browse DFS locations from Eclipse (Indigo). I am
    using hadoop-0.20.3-dev-eclipse-plugin.jar and following the steps from
    Shashwat's video to set up the MapReduce environment in Eclipse. It gives me
    "Error: IPC server version 7 trying to communicate with client version 3".

    I believe it is because I am using the old (0.20.3) plugin, and I could not
    find a newer plugin for the latest Hadoop versions. Is there a workaround?

    I think (not sure) this is the reason that my jobs launched from Eclipse
    are getting executed by the LocalJobRunner (in spite of specifying the
    classpath details in the Eclipse run configurations as you suggested).

    One more thing: does Maven automatically assume the JRE path even if I
    point JAVA_HOME to my JDK? I had to comment out the following lines and
    create the artifact for jdk.tools in ~/.m2/repository manually, as Eclipse
    kept converting the JDK path to the JRE path.
    <!-- scope>system</scope>
    <systemPath>${java.home}/../lib/tools.jar</systemPath-->

    The 'effective pom' view for the hadoop-project/pom.xml showed the jre path
    instead of jdk.
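
On JDKs of that era, the java.home system property points at the JRE directory (for a JDK install, its jre subdirectory), so ${java.home}/../lib/tools.jar only resolves when the JVM in use belongs to a JDK. A minimal sketch, JDK-only, that prints how the expression resolves for whichever JVM runs it; the class and method names (ToolsJarCheck, toolsJarFor) are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: resolves the pom expression
// ${java.home}/../lib/tools.jar for a given java.home directory.
final class ToolsJarCheck {
    // Mirrors the systemPath expression and collapses the ".." segment.
    static Path toolsJarFor(Path javaHome) {
        return javaHome.resolve("..").resolve("lib").resolve("tools.jar").normalize();
    }

    public static void main(String[] args) {
        Path javaHome = Paths.get(System.getProperty("java.home"));
        Path toolsJar = toolsJarFor(javaHome);
        System.out.println("java.home = " + javaHome);
        System.out.println("tools.jar = " + toolsJar
                + (Files.exists(toolsJar) ? " (present)" : " (missing)"));
    }
}
```

Running this with the same JVM that Eclipse uses and seeing "missing" would suggest that Eclipse is running on a standalone JRE rather than on the JDK that JAVA_HOME points to.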

    I saw your old post about such problems at
    <http://www.harshj.com/2010/07/18/making-the-eclipse-plugin-work-for-hadoop/>
    but I believe the patch is already committed in the trunk now. Any idea why
    Eclipse is troubling me like this? :(

    Regards,
    Prajakta
  • Shant..... at Jun 14, 2012 at 4:45 am
    Hi Prajakta,

    Can you please tell me the steps to do MapReduce programming using Eclipse?
    I would really appreciate your help. I am new to Hadoop and want to learn.

    Thanks,
    Shant

    --
    Life is Just a dream on the way to death...........
    Regards
    Shantlingayya
  • Shashwat shriparv at Jun 14, 2012 at 7:03 am
    Follow this video:

    http://www.youtube.com/watch?v=TavehEdfNDk



    --
    Shashwat Shriparv
  • Prajakta Kalmegh at Jun 15, 2012 at 6:50 am
    Hi Shant

    Sorry for the delay in replying (I was not well). Did you manage to install
    using Shashwat's video? The video gives a good demo of setting up a basic
    MapReduce programming environment.

    If you need steps for cloning and setting up Hadoop in Eclipse, please
    follow this link: <http://wiki.apache.org/hadoop/EclipseEnvironment>
    For setting up properties, follow this link:
    <http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/SingleCluster.html>
    Please make sure to specify appropriate values for the host:port combinations.

    Please let me know if this helps.

    Regards,
    Prajakta

Discussion Overview
group: common-dev @
categories: hadoop
posted: Jun 7, '12 at 11:50a
active: Jun 18, '12 at 8:46a
posts: 10
users: 5
website: hadoop.apache.org...
irc: #hadoop
