184 discussions - 729 posts

  • Hi, We've just upgraded our cluster from Hadoop 0.20.203 to 1.0.4 and have hit performance problems. Before the upgrade a 15TB terasort took about 45 minutes, afterwards it takes just over an hour ...
    Jon Allen
    Nov 23, 2012 at 12:03 pm
    Dec 21, 2012 at 1:22 am
  • Hi, I am wondering how I can assign reduce tasks to specific nodes. What I want to do is, for example, assign the reducer which produces part-00000 to node xxx000, and part-00001 to node xxx001 and so ...
    Hiroyuki Yamada
    Nov 28, 2012 at 4:04 am
    Dec 8, 2012 at 1:19 pm
  • Unsubscribe The contents of this e-mail and any attachment(s) may contain confidential or privileged information for the intended recipient(s). Unintended recipients are prohibited from taking action ...
    Jibins Joseph
    Nov 19, 2012 at 4:01 pm
    Nov 27, 2012 at 6:55 am
  • I am using Hadoop 0.20.2 and now want to use the HDFS HA function. I am researching AvatarNode. I find that if the StandbyNN fails a checkpoint, then the next time the StandbyNN does a checkpoint, the same edits file is ...
    Lei liu
    Nov 4, 2012 at 8:07 am
    Nov 10, 2012 at 12:45 pm
  • Hi, I'm reading a bit about hadoop and I'm trying to increase the HA of my current cluster. Today I have 8 datanodes and one namenode. By reading here: http://www.aosabook.org/en/hdfs.html I can see ...
    Jean-Marc Spaggiari
    Nov 22, 2012 at 4:28 pm
    Dec 1, 2012 at 4:30 am
  • Hello everyone, Like most others I am also running into some problems while running my word count example. I tried the various suggestions available on the internet, but I guess it's time to go on email ...
    Sandeep Jangra
    Nov 29, 2012 at 3:31 pm
    Nov 30, 2012 at 1:23 am
  • I call the statement "JobID id = new JobID()" of the hadoop API with JNI. But when my program runs to this statement, it exits, and no errors are output. I can't make any sense of this. The hadoop is ...
    Nov 27, 2012 at 12:08 pm
    Dec 4, 2012 at 8:00 am
  • Hi, I am setting up HDFS security with Kerberos: When I manually started the first datanode, I got the following messages (the namenode is started): 1) INFO ...
    Nov 26, 2012 at 11:51 am
    Nov 26, 2012 at 2:52 pm
  • By default the number of reducers is set to 1. Is there a good way to guess the optimal number of reducers? Or let's say I have TBs worth of data... mappers are of the order of 5000 or so... But ultimately ...
    Jamal sasha
    Nov 21, 2012 at 4:39 pm
    Nov 22, 2012 at 4:46 am
  • Hi, Please help! I have installed a Hadoop Cluster with a single master (master1) and have HBase running on the HDFS. Now I am setting up the second master (master2) in order to form HA. When I used ...
    Nov 16, 2012 at 4:29 am
    Nov 16, 2012 at 2:15 pm
  • Hi, I have a cluster with 4 nodes and 32 cores on each. My default value for the maximum number of mappers per slot is 1: <property> <name>mapred.tasktracker.map.tasks.maximum</name> <!-- see ...
    Mark Kerzner
    Nov 13, 2012 at 3:16 pm
    Nov 13, 2012 at 3:52 pm
  • Guys, I am learning that NN doesn't persistently store block locations. Only file names and their permissions as well as file blocks. It is said that locations come from DataNodes when NN starts. So, ...
    Kartashov, Andy
    Nov 19, 2012 at 2:27 pm
    Nov 19, 2012 at 4:38 pm
  • Hi, Quick question: What's the best way to turn on Map Output Compression in Hadoop 1.0.3? The tutorial at http://hadoop.apache.org/docs/r1.0.3/mapred_tutorial.html says to use ...
    Tony Burton
    Nov 28, 2012 at 11:13 am
    Dec 19, 2012 at 9:24 am
  • Hi, can I remove one hard drive from a slave but tell Hadoop not to replicate missing blocks for a few minutes, because I will return it back? Or will this not work at all, and will Hadoop continue ...
    Mark Kerzner
    Nov 28, 2012 at 2:15 pm
    Nov 28, 2012 at 4:39 pm
  • Hi, Every time I query hbase or hive, there is a significant growth in my log files and it consumes a lot of space on my hard disk... (approx. 40 GB). So I stop the cluster, delete all the logs and free ...
    Iwannaplay games
    Nov 23, 2012 at 6:59 am
    Nov 23, 2012 at 9:45 pm
  • Hi All, I am trying to upgrade Apache hadoop-0.20.2 to hadoop-1.0.4. I have given the same dfs.name.dir, etc. in hadoop-1.0.4's conf files as were in hadoop-0.20.2. Now I am starting dfs and mapred ...
    Yogesh dhari
    Nov 22, 2012 at 6:54 am
    Nov 22, 2012 at 1:15 pm
  • Hey all, When setting up the namenode, some of the commands that we run are: hadoop fs -mkdir /tmp hadoop fs -chmod -R 1777 /tmp This has worked for previous CDH releases of Hadoop. We recently ...
    Brian Derickson
    Nov 7, 2012 at 5:58 pm
    Nov 9, 2012 at 7:42 pm
  • Hi all, I wonder why there is a difference between "du" on HDFS and "get" + "du" on my local machine. Here is an example: hadoop fs -du myfile.txt hadoop fs -get myfile.txt . du myfile.txt --- ...
    Nov 28, 2012 at 2:14 pm
    Nov 29, 2012 at 4:07 pm
  • Hello everyone, I have a very basic question. Besides sequence file format ( http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/SequenceFile.html), are there any other built-in file ...
    Lin Ma
    Nov 28, 2012 at 12:59 pm
    Nov 29, 2012 at 3:28 pm
  • Hi All, I'm coming from the RDBMS world and am looking at hdfs for long term data storage and analysis. I've done some research and set up some smallish hdfs clusters with hive for testing but I'm ...
    Jeff l
    Nov 24, 2012 at 8:56 pm
    Nov 28, 2012 at 7:02 pm
  • Guys, I know that there is old and new API for MapReduce. The old API is found under org.apache.hadoop.mapred and the new is under org.apache.hadoop.mapreduce I successfully used both (the old and ...
    Kartashov, Andy
    Nov 23, 2012 at 9:36 pm
    Nov 27, 2012 at 3:07 pm
  • Hello Ted, Thanks yours below. I do run hadoop on EC2 as well. And I use internal host-addresses ($hostname -f) for configuration of core-site/hdfs-site/slaves/master files. The only time I switch to ...
    Kartashov, Andy
    Nov 9, 2012 at 9:04 pm
    Nov 12, 2012 at 1:58 pm
  • Hello, I use hadoop-1.0.4 I have followed instruction to install hadoop-snappy at http://code.google.com/p/hadoop-snappy/ When I run a mapred job I see FATAL org.apache.hadoop.mapred.TaskTracker ...
    Nov 6, 2012 at 1:10 am
    Nov 7, 2012 at 6:57 pm
  • What do folks do to backup hdfs data? Has anyone experience in trying to use enterprise solutions such as netbackup with datadomain D-2-D appliance for doing backups of data in hdfs? If so, what is ...
    Uday chopra
    Nov 6, 2012 at 12:19 am
    Nov 6, 2012 at 4:56 pm
  • Guys, Came across this error like many others who tried to run Oozie examples. Searched and read a bunch of posts on this topic. Even came across Harsh's response stipulating that oozie user must be ...
    Kartashov, Andy
    Nov 9, 2012 at 2:20 pm
    Nov 28, 2012 at 11:32 pm
  • Hi All, I am working on XML processing in hadoop and followed the steps from the blog http://xmlandhadoop.blogspot.in/. I have added all jars in the classpath but am still getting the below exception. Is there ...
    Dyuti a
    Nov 27, 2012 at 1:00 pm
    Nov 28, 2012 at 12:57 pm
  • Can you please share your code over pastebin.com or GH gists so? Also, the mapreduce-dev@ address is for MR project development, not user help. I've moved your question to the proper users list ...
    Harsh J
    Nov 24, 2012 at 11:03 am
    Nov 26, 2012 at 8:50 pm
  • Hello! I'm currently writing an importer to import our MySQL data into hadoop ( as Avro files ). Attached you can find the schema i'm converting to Avro, along with the corresponding Avro schema i ...
    Bart Verwilst
    Nov 23, 2012 at 1:13 pm
    Nov 24, 2012 at 6:27 pm
  • I am not sure what's happening, but I wrote a simple mapper and reducer script, and I am testing it against a small dataset (a few lines long). For some reason the reducer is just not starting.. and ...
    Jamal sasha
    Nov 20, 2012 at 4:53 pm
    Nov 21, 2012 at 2:43 pm
  • Hi All, I'm wondering if there is an additional overhead when storing some data into HDFS? For example, I have a 2GB file, the replication factor of HDFS is 2, when the file is uploaded to HDFS, should ...
    Nov 21, 2012 at 7:01 am
    Nov 21, 2012 at 9:02 am
  • Hi, I wrote a simple map reduce job in hadoop streaming. I am wondering if I am doing something wrong .. While number of mappers are projected to be around 1700.. reducers.. just 1? It’s couple of ...
    Jamal sasha
    Nov 20, 2012 at 7:39 pm
    Nov 21, 2012 at 4:09 am
  • Is it necessary to add hadoop and hbase site xmls in the classpath of the java client? Is there any other way we can configure it using general properties file using key=value?
    Mohit Anchlia
    Nov 12, 2012 at 11:35 pm
    Nov 13, 2012 at 9:19 am
  • Yinghua, What mode are you running your hadoop in: Local/Pseudo/Fully...? Your hostname is not recognised. Your configuration setting seems to be wrong. Hi, all Could someone help looking at this problem? ...
    Kartashov, Andy
    Nov 9, 2012 at 5:37 pm
    Nov 10, 2012 at 12:11 am
  • Guys, When running examples, you bring them into HDFS. Say, you need to make some correction to a file, you need to make them on local FS and run $hadoop fs -put ... again. You cannot just make ...
    Kartashov, Andy
    Nov 8, 2012 at 9:48 pm
    Nov 8, 2012 at 11:33 pm
  • Hello Hadoop Champs, Please give some suggestion.. As Hadoop Ecosystem(Hive, Pig...) internally do Map-Reduce to process. My Question is 1). where Map-Reduce program(written in Java, python etc) are ...
    Yogesh Kumar13
    Nov 7, 2012 at 3:33 pm
    Nov 7, 2012 at 8:48 pm
  • Hi, I came across the following question in some sites and the answer that they provided seems to be wrong according to me... I might be wrong... Can some one help on confirming the right answers for ...
    Ramasubramanian Narayanan
    Nov 7, 2012 at 5:21 pm
    Nov 7, 2012 at 6:54 pm
  • Hi guys, I've encountered a situation where the ratio between "Map output bytes" and "Map output materialized bytes" is quite huge and during the map-phase data is spilled to disk quite a lot. This ...
    Sigurd Spieckermann
    Nov 7, 2012 at 12:33 pm
    Nov 7, 2012 at 3:14 pm
  • Hi, Can anyone suggest sample model questions for taking the Cloudera Hadoop developer exam for CDH3 pls.. regards, Rams
    Ramasubramanian Narayanan
    Nov 7, 2012 at 12:51 pm
    Nov 7, 2012 at 2:05 pm
  • Harsh/Ravi, I wrote my own scripts to [start|stop|restart] [hdfs|mapred] daemons. The content of the script is $sudo service /etc/init.d/hadoop-[hdfs-*|mapred-*] I have no problem starting Daemons ...
    Kartashov, Andy
    Nov 6, 2012 at 8:08 pm
    Nov 7, 2012 at 2:56 am
  • Hi, I am using a Secondary Sort for my Hadoop program. My map function emits (Text,NullWritable) where Text contains the composite key and appropriate comparison functions are made and a custom ...
    Aseem Anand
    Nov 4, 2012 at 8:18 pm
    Nov 5, 2012 at 10:40 am
  • I have a check monitoring the page jobtracker:50030/jobtracker.jsp, and the check shows timeout (180 sec) pretty often. Once I jump and browse to the page it actually take me from 5 sec to 5 minutes ...
    Patai Sangbutsarakum
    Nov 1, 2012 at 8:48 pm
    Nov 2, 2012 at 2:03 pm
  • Hi I understand that the maximum number of concurrent map tasks is set by mapred.tasktracker.map.tasks.maximum - however I wish to run with a smaller number of maps (am testing disk IO). I thought ...
    Cogan, Peter (Peter)
    Nov 1, 2012 at 4:47 pm
    Nov 1, 2012 at 9:34 pm
  • Hello list, Although a lot of similar discussions have been done here, I still seek some of your able guidance. Till now I have worked only on small or mid-sized clusters. But this time situation is ...
    Mohammad Tariq
    Nov 28, 2012 at 10:10 pm
    Nov 30, 2012 at 12:01 am
  • Hello : I see the below exception when I submit a MapReduce Job from standalone java application to a remote Hadoop cluster. Cluster authentication mechanism is Kerberos. Below is the code. I am ...
    Erravelli, Venkat
    Nov 28, 2012 at 4:05 pm
    Nov 29, 2012 at 6:56 pm
  • Hi Hadoop user community, I am trying to setup my first Hadoop cluster and I've found most of the instructions a little confusing. I've seen how-to's that say "core-site.xml" should have ...
    Michael Namaiandeh
    Nov 27, 2012 at 8:16 pm
    Nov 28, 2012 at 6:40 am
  • Hello everyone.. I have a very weird issue at hand.. I am using CDH4.0.0 on one of my clusters and it works perfectly fine.. Now I tried deploying the same packages to another cluster with very ...
    Dhaval Shah
    Nov 23, 2012 at 8:28 pm
    Nov 26, 2012 at 9:44 pm
  • Hi, We would like to integrate hive with hbase, so we are following the instructions listed here https://cwiki.apache.org/Hive/hbaseintegration.html . Hadoop 1.0.4 Hbase 0.94.2 Hive 0.9.0 I’ve ...
    Barak Yaish
    Nov 26, 2012 at 12:43 pm
    Nov 26, 2012 at 1:38 pm
  • It's not clear to me how to stitch together multiple map reduce jobs. Without using cascading or something else like it, is the method basically to write to a intermediate spot, and have the next ...
    Sean McNamara
    Nov 23, 2012 at 10:22 pm
    Nov 25, 2012 at 9:35 pm
  • Hi, I have a 2-node cluster (v1.04), master and slave. On the master, in Tool.run() we add two files to the DistributedCache using addCacheFile(). The files do exist in HDFS. In the Mapper.setup() we want ...
    Barak Yaish
    Nov 22, 2012 at 8:34 pm
    Nov 22, 2012 at 9:09 pm
  • Hi, We have observed some data loss when we enabled speculative execution with multipleoutputs. But when we disabled speculative execution, multipleOutputs working fine. I am also trying to find the ...
    AnilKumar B
    Nov 21, 2012 at 2:21 pm
    Nov 21, 2012 at 3:45 pm
Group Overview
group: common-user @

187 users for November 2012

  • Harsh J: 93 posts
  • Kartashov, Andy: 72 posts
  • Mohammad Tariq: 28 posts
  • Michel Segel: 17 posts
  • Ac: 16 posts
  • Mahesh Balija: 14 posts
  • Bejoy KS: 13 posts
  • Jay Vyas: 13 posts
  • Ramasubramanian Narayanan: 13 posts
  • Visioner Sadak: 13 posts
  • Jamal sasha: 11 posts
  • Mark Kerzner: 11 posts
  • Yogesh dhari: 11 posts
  • Mohit Anchlia: 10 posts
  • Yinghua hu: 10 posts
  • Jean-Marc Spaggiari: 9 posts
  • Andy Isaacson: 8 posts
  • Bertrand Dechoux: 8 posts
  • Ted Dunning: 8 posts
  • Lei liu: 7 posts