FAQ
I'm building a Hadoop project using Maven. I want to add
Maven dependencies to my project. What do I do?

I think the answer is that I add a <dependency></dependency> section to my
pom.xml file, but I'm not sure what the contents of this section (groupId,
artifactId, etc.) should be. Googling does not turn up a clear answer. Is
there a canonical Hadoop Maven dependency specification?


  • Luke Lu at Aug 12, 2011 at 9:48 pm
    Pre-0.21 (sustaining releases, large-scale tested) hadoop:
    <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>0.20.203.0</version>
    </dependency>

    Pre-0.23 (small scale tested) hadoop:
    <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapred</artifactId>
    <version>...</version>
    </dependency>

    Trunk (currently targeting 0.23.0, large-scale tested) hadoop WILL be:
    <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce</artifactId>
    <version>...</version>
    </dependency>
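    For illustration, a complete minimal pom.xml built around the 0.20.203.0
    coordinates above might look like the sketch below; the project groupId,
    artifactId, and the provided scope are placeholders/assumptions, not part
    of the reply:

    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                                 http://maven.apache.org/maven-v4_0_0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <!-- placeholder coordinates for your own project -->
      <groupId>com.example</groupId>
      <artifactId>hadoop-example</artifactId>
      <version>1.0-SNAPSHOT</version>
      <packaging>jar</packaging>
      <dependencies>
        <!-- Hadoop coordinates from the reply above (0.20.x sustaining line) -->
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-core</artifactId>
          <version>0.20.203.0</version>
          <!-- provided scope assumes the cluster supplies the Hadoop jars at runtime -->
          <scope>provided</scope>
        </dependency>
      </dependencies>
    </project>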
  • W.P. McNeill at Aug 12, 2011 at 11:08 pm
    I want the latest version of Hadoop (with the new API). I guess that's the
    trunk version, but I don't see the hadoop-mapreduce artifact listed on
    https://repository.apache.org/index.html#nexus-search;quick~hadoop
  • Luke Lu at Aug 13, 2011 at 12:34 am
    There is a reason I capitalized WILL (SHALL) :) The current trunk
    mapreduce code is in flux. Once MR2 (MAPREDUCE-279) is merged into
    trunk (soon!), we'll be producing hadoop-mapreduce-0.23.0-SNAPSHOT,
    which depends on hadoop-hdfs-0.23.0-SNAPSHOT, which depends on
    hadoop-common-0.23.0-SNAPSHOT.

    If you just want to play with the "new" API, you can use the
    0.22.0-SNAPSHOT artifacts. 0.23.0 is supposedly source compatible with
    previous Hadoop versions, including 0.20.x (for the legacy API).
  • Steve Loughran at Aug 16, 2011 at 10:06 am

    I have a setup elsewhere, POM-less:

    http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/antbuild/repository/org.apache.hadoop/

    Some are tagged .nolog to indicate that the log4j.properties file has been
    stripped out.
  • W.P. McNeill at Aug 16, 2011 at 3:57 pm
    Just to make sure I understand: the drop at smartfrog.svn.sourceforge.net is
    just a build of the latest Hadoop JARs, right? I can't use it as a Maven
    repository (because it's POM-less).
  • Steve Loughran at Aug 17, 2011 at 10:39 am

    It's an example of what to do.

    I use it with Ivy, where its POM-less-ness is unimportant, and I can
    set up the dependencies downstream ( http://bit.ly/n5hbuB )

    They aren't private builds; they're the official releases, though by
    stripping out the log4j files I have diverged slightly.

    You can do something similar for your own project in the absence of a
    0.21 release.
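    For illustration, consuming hadoop-core through Ivy amounts to a dependency
    declaration along these lines (the organisation and module of the enclosing
    project are placeholders, and a resolver pointing at a repository such as the
    one above would still need to be configured in ivysettings.xml):

    <ivy-module version="2.0">
      <info organisation="com.example" module="hadoop-example"/>
      <dependencies>
        <!-- same coordinates as the Maven hadoop-core dependency -->
        <dependency org="org.apache.hadoop" name="hadoop-core" rev="0.20.203.0"/>
      </dependencies>
    </ivy-module>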
  • W.P. McNeill at Aug 14, 2011 at 12:53 am
    I'm trying to build a simple Hadoop word count application. I have the
    following pom.xml file:

    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                                 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>wpmcn</groupId>
    <artifactId>WordCountTestAdapter</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>
    <name>WordCountTestAdapter</name>
    <url>http://maven.apache.org</url>
    <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
    <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>3.8.1</version>
    <scope>test</scope>
    </dependency>
    <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop</artifactId>
    <version>0.22.0</version>
    <type>POM</type>
    </dependency>
    </dependencies>
    </project>

    When I run "mvn install" I see the following error:

    [ERROR] Failed to execute goal on project WordCountTestAdapter: Could not
    resolve dependencies for project
    wpmcn:WordCountTestAdapter:jar:1.0-SNAPSHOT: Could not find artifact
    org.apache.hadoop:hadoop:POM:0.22.0 in central (
    http://repo1.maven.org/maven2) -> [Help 1]

    I've tried various things in the hadoop entry, to no avail.

    This is a vanilla Maven 3 install which works fine for building simple
    non-Hadoop Hello World applications, and I'm a Maven newbie so I may be
    missing something obvious. Can someone tell me what I'm doing wrong or
    direct me to a pom.xml that builds a simple Hadoop application?

  • W.P. McNeill at Aug 14, 2011 at 1:02 am
    More experimenting:

    <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>0.20.203.0</version>
    <type>POM</type>
    </dependency>

    Works, but (as you indicate) gives the old Hadoop API.

    <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapred</artifactId>
    <version>...</version>
    <type>POM</type>
    </dependency>

    Doesn't work. I can't find a hadoop-mapred artifact when I search for one on
    Maven Central <http://search.maven.org/>.

  • W.P. McNeill at Aug 16, 2011 at 5:22 pm
    Here is my specific problem:

    I have a sample word count Hadoop program up on github (
    https://github.com/wpm/WordCountTestAdapter) that illustrates unit testing
    techniques for Hadoop. This code uses the new API. (On my development
    machine I'm using version 0.20.2.) I want to use Maven as its build
    framework because that seems to be the way the Java world is going. Currently
    the pom.xml for this project makes no mention of Hadoop. If you try to do a
    "mvn install" you get the errors I described earlier. I want to change this
    project so that "mvn install" builds it.

    I can find the pre-0.21 (old API) hadoop-core JARs on
    http://mvnrepository.com, but I can't find the post-0.21 (new API)
    hadoop-mapred here. Do I need to add another Maven repository server to get
    the new API JARs?
  • Joey Echeverria at Aug 16, 2011 at 5:29 pm
    If you're talking about the org.apache.hadoop.mapreduce.* API, that
    was introduced in 0.20.0. There should be no need to use the 0.21
    version.

    -Joey


    --
    Joseph Echeverria
    Cloudera, Inc.
    443.305.9434
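    For context, the "new API" referred to here is the org.apache.hadoop.mapreduce
    package (as opposed to the older org.apache.hadoop.mapred one). As a minimal
    sketch against a 0.20.x hadoop-core artifact, a word-count mapper written with
    it looks roughly like this (class name and token handling are illustrative):

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Word-count mapper using the "new" org.apache.hadoop.mapreduce API.
    public class WordCountMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit (token, 1) for each whitespace-separated token in the line.
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }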
  • W.P. McNeill at Aug 18, 2011 at 4:54 pm
    The versioning issue was a red herring. My problem turned out to be the
    fact that I had a <type>POM</type> in my Hadoop dependencies section, which
    was causing the JAR files not to be downloaded. I now have this project
    building.

    Other people trying to set up a simple Hadoop project in Maven can use Word
    Count Test Adapter <https://github.com/wpm/WordCountTestAdapter> as an
    example.

    Thanks everybody for your help.
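    In other words, dropping the <type> element (jar is the default type)
    presumably leaves the working dependency as simply:

    <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>0.20.203.0</version>
    </dependency>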
  • Dhodapkar, Chinmay at Aug 13, 2011 at 3:42 pm
    I am trying to automate the installation/bringup of a complete hadoop/hbase cluster from a single script. I have run into a very small issue...
    Before bringing up the namenode, I have to format it with the usual "hadoop namenode -format"

    Executing the above command prompts the user for Y/N. Is there an option that can be passed to force the format without prompting?
    The aim is for the script to complete without any human intervention...
  • Giridharan Kesavan at Aug 15, 2011 at 12:42 am
    this should help.

    echo Y | ${hadoophdfshome}/bin/hdfs namenode -format

    -giri

  • Dhodapkar, Chinmay at Aug 16, 2011 at 12:02 am
    Perfect :)

  • Harsh J at Aug 16, 2011 at 4:59 am
    Generally, though, the coreutils 'yes' command lets you accomplish these
    kinds of tasks where you need to repeatedly emit a string in order to get
    through some interaction.

    --
    Harsh J
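    For example, piping yes into the format command keeps answering Y to the
    prompt (${hadoophdfshome} is the same placeholder used earlier in the thread):

    yes Y | ${hadoophdfshome}/bin/hdfs namenode -format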
