172 discussions - 677 posts

  • Hi. I measured the Datanode memory usage, and noticed they take up to 700 MB of RAM. As their main job is to store files to disk, any idea why they take so much RAM? Thanks for any information.
    Stas Oskin
    Aug 31, 2009 at 10:41 am
    Sep 2, 2009 at 2:59 pm
  • Hi. I checked this ticket and I like what I found. I had a question about it and hoped someone could answer it: if I have a NN and a BN, and the NN fails, how will the DFS clients know how to connect to ...
    Stas Oskin
    Aug 6, 2009 at 5:46 pm
    Sep 21, 2009 at 4:08 pm
  • I am trying to download binary files stored in Hadoop, but there is about a 2-minute wait on a 20 MB file when I try to execute in.read(buf). Is there a better way to do this? private void ...
    Ananth T. Sarathy
    Aug 18, 2009 at 4:01 pm
    Aug 21, 2009 at 1:18 pm
  • Hi, I want to output two text files from my MapReduce job but I am having trouble understanding how to use the MultipleTextOutputFormat class to do so. I want to write to the two files depending on ...
    John Clarke
    Aug 14, 2009 at 2:11 pm
    Aug 19, 2009 at 11:46 pm
  • Hi all, Is Fedora a decent choice of OS for a new Hadoop cluster? All our other stuff is Fedora, but is there a strong case to move to something else? Cheers Tim
    Tim robertson
    Aug 12, 2009 at 11:05 am
    Aug 17, 2009 at 10:07 am
  • Hi all, We have an application where we pull logs from an external server (far from the Hadoop cluster) to the Hadoop cluster. Sometimes we see a huge delay (of 1 hour or more) in actually seeing ...
    Pallavi Palleti
    Aug 11, 2009 at 7:49 am
    Aug 13, 2009 at 9:01 am
  • Hi all, Can anyone tell me how the MR scheduler schedules MR jobs? How does it decide where to create map tasks and how many to create? Once the map tasks are over, how does it decide to move the ...
    Bharath vissapragada
    Aug 20, 2009 at 4:01 pm
    Aug 21, 2009 at 4:58 pm
  • Dear all, I'm sorry to disturb you. Our cluster has 200 nodes now. To improve its capacity, we hope to add 60 nodes to the current cluster. However, none of us knows what will happen if we ...
    Yang song
    Aug 12, 2009 at 6:01 am
    Aug 13, 2009 at 10:22 pm
  • Hello, everyone. When I submit a big job (e.g. maptasks:10000, reducetasks:500), I find that the copy phase lasts a very long time. From the WebUI, the message "reduce copy (xxxx of 10000 at 0.01 ...
    Yang song
    Aug 24, 2009 at 6:49 am
    Aug 28, 2009 at 6:19 pm
  • Hi all, I know this has been filed as a JIRA improvement already http://issues.apache.org/jira/browse/HDFS-343, but is there any good workaround at the moment? What's happening is I have added a few ...
    Kris Jirapinyo
    Aug 25, 2009 at 7:51 pm
    Aug 26, 2009 at 5:40 pm
  • Hello, all. I have run into the "too many fetch failures" problem when I submit a big job (e.g. 10000 tasks). I know this error occurs when several reducers are unable to fetch the given map output. ...
    Yang song
    Aug 19, 2009 at 5:23 am
    Aug 20, 2009 at 5:15 pm
  • I want to compress the data first and then place it in HDFS. Then, when retrieving it, I want to uncompress it and place it at the desired destination. Is this possible? How do I get started? ...
    Sugandha Naolekar
    Aug 3, 2009 at 5:15 am
    Aug 3, 2009 at 5:19 pm
  • I'm running a test Hadoop cluster, which had
    Andy Liu
    Aug 27, 2009 at 2:00 pm
    Oct 13, 2009 at 10:41 am
  • Does anyone have any up-to-date data on the memory consumption per block/file on the NN on a 64-bit JVM with compressed pointers? The best documentation on consumption is ...
    Steve Loughran
    Aug 20, 2009 at 10:41 am
    Aug 25, 2009 at 6:32 pm
  • Hi. How can I get the free / used space on DFS, via Java? What are the functions that can be used for that? Note, I'm using a regular (non-super) user, so I need to do it in a similar way to ...
    Stas Oskin
    Aug 23, 2009 at 11:22 am
    Aug 25, 2009 at 11:33 pm
  • Hello everyone, could anyone please tell me in which class and which method does Hadoop download the file chunk from HDFS and associate it with the thread that executes the Map function on given ...
    Roman kolcun
    Aug 20, 2009 at 1:55 am
    Aug 20, 2009 at 6:04 pm
  • I had 3 jobs running and I saw something a bit odd. Two of the tasks are reducing, one of them is using all the reducers so the other is waiting, this is OK. However the 3rd job is still in the ...
    Mayuran Yogarajah
    Aug 12, 2009 at 8:15 pm
    Aug 12, 2009 at 10:24 pm
  • We are inviting gurus or major contributors of Hive and/or HBase (or anything related to Hadoop) to give us presentations about the products. Would you name a few names? The gurus must be in bay ...
    Gopal Gandhi
    Aug 28, 2009 at 6:26 pm
    Aug 28, 2009 at 7:32 pm
  • Hello, everybody I feel puzzled about setting properties in hadoop-site.xml. Suppose I submit the job from machine A, and JobTracker runs on machine B. So there are two hadoop-site.xml files. Now, I ...
    Yang song
    Aug 19, 2009 at 2:21 pm
    Aug 21, 2009 at 10:11 am
  • Hi, I'm having trouble running Hadoop on RHEL 5. I did everything as documented in: http://hadoop.apache.org/common/docs/r0.20.0/quickstart.html and configured: conf/core-site.xml, ...
    Onur AKTAS
    Aug 3, 2009 at 11:57 pm
    Aug 4, 2009 at 1:25 am
  • Has anybody had any luck setting up the log4j.properties file to send logs to a syslog-ng server? My log4j.properties excerpt: log4j.appender.SYSLOG=org.apache.log4j.net.SyslogAppender ...
    Mike Anderson
    Aug 19, 2009 at 11:32 pm
    Aug 20, 2009 at 8:29 pm
  • Hello I was wondering how I could locate the source code files for the fair scheduler. Thanks Mithila
    Mithila Nagendra
    Aug 19, 2009 at 9:48 pm
    Aug 20, 2009 at 7:59 pm
  • Hi, I'm having some problems (Hadoop 0.20.0) where map tasks fail to report status for 10 minutes and get killed eventually. All of the tasks output around the same amount of data, some only take a ...
    Mathias De Maré
    Aug 5, 2009 at 7:03 am
    Aug 17, 2009 at 2:10 pm
  • I have a map/reduce job that has a total of 6000 map tasks. The issue is that the number of maps "running" at any given time is 6 (the number of nodes) and the rest are pending. Does anyone know how ...
    Zeev Milin
    Aug 5, 2009 at 10:59 pm
    Aug 6, 2009 at 8:46 pm
  • Hi all, We need to dump data from a MySQL cluster with about 50 nodes to an HDFS file. Because of security concerns, we can't use tools like Sqoop, where all datanodes must hold a ...
    Min Zhou
    Aug 4, 2009 at 3:16 am
    Aug 6, 2009 at 10:58 am
  • Hi, I have, say, 800 sequence files written using SequenceFileOutputFormat. Is there any way to find the number of unique keys in those sequence files? Thanks, Prashant.
    Prashant ullegaddi
    Aug 2, 2009 at 4:54 am
    Aug 4, 2009 at 2:26 pm
  • Hadoop Fans, please pardon the short notice, but we wanted to let you know that we are offering a 3 day training program at the end of the month in San Francisco. There is a $300 discount for those ...
    Christophe Bisciglia
    Aug 14, 2009 at 2:03 am
    Aug 21, 2009 at 12:21 am
  • Hi! I'm currently evaluating different Hadoop versions for a new project. I'm tempted by the Cloudera distribution, since it's neatly packaged into .deb files, and is the stable distribution but with ...
    Erik Forsberg
    Aug 19, 2009 at 10:56 am
    Aug 19, 2009 at 11:29 pm
  • Hi, I just wanted to know what happens if we set the replication factor greater than the number of nodes in the cluster. For example, I have only 3 nodes in my cluster but I set the replication factor ...
    Rakhi Khatwani
    Aug 12, 2009 at 6:43 pm
    Aug 12, 2009 at 8:34 pm
  • Hi all, I was wondering if anyone's encountered 4 extra bytes at the beginning of the serialized object file using MultipleOutputFormat. Basically, I am using BytesWritable to write the serialized ...
    Kris Jirapinyo
    Aug 12, 2009 at 1:34 am
    Aug 12, 2009 at 6:35 pm
  • Hi, We had a cluster of 9 machines with one name node, and 8 data nodes (2 had 220GB hard disk space, rest had 450GB). Most of the space on first machines with 250GB disk space was consumed. Now we ...
    Prashant ullegaddi
    Aug 7, 2009 at 5:38 pm
    Aug 8, 2009 at 5:42 am
  • Hi everybody, I am trying to run Hadoop from Eclipse, but when I run NameNode.java as a Java application I get the following error. Please help me get rid of this problem. 2009-08-05 23:42:00,760 ...
    Ashish pareek
    Aug 6, 2009 at 4:52 am
    Aug 7, 2009 at 4:23 pm
  • Hi all, I noticed a problem in my cluster when I changed the Hadoop version on the same DFS directory. The namenode log on the master says the following: File system image contains an old ...
    Bharath vissapragada
    Aug 3, 2009 at 7:09 pm
    Aug 5, 2009 at 7:22 pm
  • Hi, can you suggest a Hadoop unit testing framework apart from MRUnit? I have used MRUnit but I'm not sure about its feasibility and its support for Hadoop 0.20. I could not find a proper ...
    Nikhil Sawant
    Aug 26, 2009 at 11:20 am
    Sep 1, 2009 at 5:38 am
  • Hey, I want to start learning about and using Hadoop (not the source code), but I don't know where to start; there are many projects. http://hadoop.apache.org/ Thanks.
    HHB
    Aug 30, 2009 at 11:32 am
    Aug 30, 2009 at 10:11 pm
  • Hi, when I run Hadoop in pseudo-distributed mode, I can't find the log where System.out.println() output goes. When I run in the IDE, I see it. When I run on EC2, it's part of the output logs. But here - do ...
    Mark Kerzner
    Aug 25, 2009 at 1:23 am
    Aug 30, 2009 at 2:22 am
  • Hi all, Is there any general cost model that can be used to guess the run time of a program (similar to Page IO/s, selectivity factors in RDBMS) in terms of any config aspects such as number of ...
    Bharath vissapragada
    Aug 25, 2009 at 3:00 am
    Aug 29, 2009 at 6:20 am
  • Is there any way to concatenate/append a local file to a file on HDFS without copying down the HDFS file locally first? I tried: bin/hadoop dfs -cat file:///[local file] hdfs://[hdfs file] But it ...
    Turner Kunkel
    Aug 26, 2009 at 4:33 pm
    Aug 27, 2009 at 2:13 pm
  • Hi everyone. I installed Hadoop on three PCs. When I ran the command 'start-all.sh', I could only start the jobtracker and tasktrackers. I use 192.*.*.x as master and use 192.*.*.y and 192.*.*.z ...
    Qiu tian
    Aug 21, 2009 at 11:31 am
    Aug 25, 2009 at 11:47 pm
  • Hi all, I have two computers, and in hadoop-site.xml I define fs.default.name as localhost:9000; then I cannot access the cluster with the Java API from another machine. But if I change it to its ...
    Zhang jianfeng
    Aug 25, 2009 at 12:47 am
    Aug 25, 2009 at 11:53 am
  • I'm new to Hadoop. I'm running 0.19.2 on a CentOS 5.2 cluster. I have been having problems with the nodes connecting to the master (even when the firewall is off) using the hostname in the ...
    Nelson, William
    Aug 24, 2009 at 4:33 pm
    Aug 24, 2009 at 5:36 pm
  • Hi, I am a beginner trying to set up a few simple Hadoop tests on a single node before moving on to a cluster. I am just using the simple wordcount example for now. My question is what's the best way ...
    Vasilis Liaskovitis
    Aug 18, 2009 at 12:18 am
    Aug 23, 2009 at 11:57 pm
  • I am new to Hadoop (I have not yet installed/configured), and I want to make sure that I have the correct tool for the job. I do not "currently" have a need for the Map/Reduce functionality, but I am ...
    Poole, Samuel [USA]
    Aug 18, 2009 at 11:09 pm
    Aug 19, 2009 at 11:44 pm
  • Hi, all. When I add another 50 nodes to the current cluster (200 nodes) at the same time, the jobs run very smoothly at first. However, after a while, all the jobs are suspended and never continue. I ...
    Yang song
    Aug 17, 2009 at 6:36 am
    Aug 19, 2009 at 2:26 pm
  • I want to encrypt the data that would be placed in HDFS. So I will have to use some kind of encryption algorithm, right? Also, this encryption is to be done on the data before placing it in HDFS. How ...
    Sugandha Naolekar
    Aug 3, 2009 at 10:16 am
    Aug 18, 2009 at 7:24 pm
  • Hi, I am using Hadoop 0.18.3. I'm trying to get my app to output DEBUG messages to the console using a custom conversion pattern. I'm editing the log4j.properties file in the conf folder but the ...
    John Clarke
    Aug 7, 2009 at 4:10 pm
    Aug 14, 2009 at 2:08 pm
  • Hello All When the fair scheduler switches between two jobs, what does it do with the intermediary data? Does it dump the data/job states onto the disk (DFS)? Or does it do a context switch (i.e. ...
    Mithila Nagendra
    Aug 13, 2009 at 5:45 pm
    Aug 13, 2009 at 6:50 pm
  • I have a 6 node cluster running Hadoop 0.18.3. I'm trying to figure out how the data was spread out like this: node001 94.15% node002 94.16% node003 48.22% node004 47.85% node005 48.12% node006 ...
    Mayuran Yogarajah
    Aug 11, 2009 at 6:09 pm
    Aug 12, 2009 at 5:02 pm
  • Hello all, What can cause HDFS to become corrupt? I was running some jobs which were failing. When I checked logs I saw that some files were corrupt so I ran 'hadoop fsck /' which showed that a few ...
    Mayuran Yogarajah
    Aug 10, 2009 at 10:08 pm
    Aug 11, 2009 at 4:44 pm
  • I can't seem to get HBase to run using the Hadoop I have connected to my S3 bucket. Running HBase 0.19.2, Hadoop 0.19.2. hadoop-site.xml: <configuration> <property> <name>fs.default.name</name> <value> ...
    Ananth T. Sarathy
    Aug 7, 2009 at 2:51 pm
    Aug 7, 2009 at 5:34 pm
Group Overview
group: common-user @
period: Aug 2009
categories: hadoop
discussions: 172
posts: 677
users: 181
website: hadoop.apache.org...
irc: #hadoop

181 users for August 2009

  • Aaron Kimball: 31 posts
  • Yang song: 20 posts
  • Raghu Angadi: 19 posts
  • Stas Oskin: 18 posts
  • Ted Dunning: 18 posts
  • Sugandha Naolekar: 17 posts
  • Bharath vissapragada: 16 posts
  • Edward Capriolo: 15 posts
  • Todd Lipcon: 15 posts
  • Brian Bockelman: 14 posts
  • Jason Venner: 14 posts
  • Amogh Vasekar: 13 posts
  • Ananth T. Sarathy: 13 posts
  • Steve Loughran: 13 posts
  • Harish Mallipeddi: 12 posts
  • Arvind Sharma: 11 posts
  • Prashant ullegaddi: 11 posts
  • Amandeep Khurana: 10 posts
  • Mayuran Yogarajah: 10 posts
  • Mithila Nagendra: 10 posts