FAQ
I am running basic Hadoop examples on Amazon EMR and I am stuck at a very
simple place: I am apparently not passing the right class name for the
input format. From the Hadoop documentation it seems like "TextInputFormat"
is a valid option for the input format.

I am running a simple sort example using MapReduce.

Here are the command variations I tried, all in vain:


/usr/local/hadoop/bin/hadoop jar /path to hadoop
examples/hadoop-0.18.0-examples.jar sort -inFormat TextInputFormat
-outFormat TextOutputFormat /path to datainput/datain/ /path to data
output/dataout

The sort example does not declare "TextInputFormat" in its import list.
Could that be the problem?
Could it be a version problem?
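For what it's worth, the examples sort in that era loads the format classes reflectively (roughly Class.forName on the argument), so a fully qualified class name may be required rather than the short name. A hedged sketch of that variation, reusing the placeholder paths from the command above:

```
# Guess: pass fully qualified class names to -inFormat / -outFormat,
# since the sort example resolves them by reflection, not by short name.
# Paths below are the placeholders from the original command.
/usr/local/hadoop/bin/hadoop jar /path to hadoop examples/hadoop-0.18.0-examples.jar sort \
  -inFormat org.apache.hadoop.mapred.TextInputFormat \
  -outFormat org.apache.hadoop.mapred.TextOutputFormat \
  /path to datainput/datain/ /path to data output/dataout
```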


Any help is appreciated!
Shivani



--
Research Scholar,
School of Electrical and Computer Engineering
Purdue University
West Lafayette IN
web.ics.purdue.edu/~sgrao


  • Shivani Rao at Mar 3, 2011 at 4:55 am
    Problems running local installation of hadoop on single-node cluster

    I followed the instructions given by tutorials to run hadoop-0.21 on a single-node cluster.

    The first problem I encountered was HADOOP-6953. Thankfully, that has been fixed.

    The other problem I am facing is that the datanode does not start. I guess this is because, when I run stop-dfs.sh, I get the message
    "no datanode to stop" for the datanode.

    I am wondering if it is related, even remotely, to the difference in the IP addresses on my computer:

    127.0.0.1 localhost
    127.0.1.1 my-laptop

    Although I am aware of this, I do not know how to fix it.
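    The 127.0.1.1 entry is a Debian/Ubuntu convention, and a workaround people often try (an assumption here, not something verified against this setup) is to make the hostname resolve to 127.0.0.1, so that the daemons and clients agree on the machine's address:

    ```
    # /etc/hosts -- sketch of the commonly suggested change
    # (assumes an Ubuntu-style file; "my-laptop" is the hostname above)
    127.0.0.1 localhost my-laptop
    # 127.0.1.1 my-laptop   <- commented out / removed
    ```

    After editing, the daemons would need a restart (stop-dfs.sh, then start-dfs.sh) for the change to take effect.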

    I am unable to run even a simple pi-estimate example on the Hadoop installation.

    This is the output I get:

    bin/hadoop jar hadoop-mapred-examples-0.21.0.jar pi 10 10
    Number of Maps = 10
    Samples per Map = 10
    11/03/02 23:38:47 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000

    And nothing else for a long, long time.

    I have not set dfs.name.dir and dfs.data.dir in my hdfs-site.xml. But after running bin/hadoop namenode -format, I see that hadoop.tmp.dir has a folder with dfs/name and dfs/data for the two directories.
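    In 0.21 those storage directories default to ${hadoop.tmp.dir}/dfs/name and ${hadoop.tmp.dir}/dfs/data, so pinning them explicitly in hdfs-site.xml is one way to rule out tmp-dir surprises. A sketch (the /usr/local/hadoop-datastore paths are made up for illustration; dfs.name.dir and dfs.data.dir are the deprecated aliases of these 0.21 property names):

    ```
    <!-- hdfs-site.xml: sketch of explicit storage directories -->
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/usr/local/hadoop-datastore/dfs/name</value>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/usr/local/hadoop-datastore/dfs/data</value>
    </property>
    ```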

    What am I doing wrong? Any help is appreciated.

    Here are my configuration files:

    Regards,
    Shivani

    hdfs-site.xml

    <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
    </property>


    core-site.xml

    <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
    </property>

    <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose
    scheme and authority determine the FileSystem implementation. The
    uri's scheme determines the config property (fs.SCHEME.impl) naming
    the FileSystem implementation class. The uri's authority is used to
    determine the host, port, etc. for a filesystem.</description>
    </property>



    mapred-site.xml

    <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs
    at. If "local", then jobs are run in-process as a single map
    and reduce task.
    </description>
    </property>

Discussion Overview
group: common-dev @ hadoop.apache.org
categories: hadoop
posted: Feb 25, '11 at 10:58p
active: Mar 3, '11 at 4:55a
posts: 2
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion

Shivani Rao: 2 posts
