120 discussions - 468 posts

  • Hi all, our Hadoop cluster has 22 nodes, including one namenode, one jobtracker and 20 datanodes. Each node has 2 * 12 cores with 32 GB RAM. Can anyone tell me how to configure the following parameters: ...
    Hao.wang
    Jan 9, 2012 at 12:22 pm
    Jan 10, 2012 at 6:01 pm
  • We have been having problems with mappers timing out after 600 sec when the mapper writes many more, say thousands of records for every input record - even when the code in the mapper is small and ...
    Steve Lewis
    Jan 20, 2012 at 5:16 pm
    Jan 23, 2012 at 6:29 pm
  • Hi, I've been googling, but haven't been able to find an answer. I'm currently trying to set up Hadoop in pseudo-distributed mode as a first step. I'm using the Cloudera distro and installed ...
    Eli Finkelshteyn
    Jan 9, 2012 at 6:14 pm
    Jan 9, 2012 at 11:37 pm
  • Hi guys, I get this error from a job trying to process 3 million records. java.io.IOException: Bad connect ack with firstBadLink 192.168.1.20:50010 at ...
    Mark question
    Jan 26, 2012 at 11:13 am
    Jan 27, 2012 at 5:54 pm
  • Understanding Fair Schedulers better. Can we create multiple pools in the Fair Scheduler? I guess yes; please correct me. Suppose I have 2 pools in my fair-scheduler.xml 1. Hadoop-users : Min map : 10, ...
    Praveenesh kumar
    Jan 25, 2012 at 11:24 am
    Jan 25, 2012 at 3:36 pm
  • The map tasks fail, timing out after 600 sec. I am processing one 9 GB file with 16,000,000 records. Each record (think of it as a line) generates hundreds of key value pairs. The job is unusual in ...
    Steve Lewis
    Jan 18, 2012 at 10:06 pm
    Jan 19, 2012 at 12:08 pm
  • Hi, whatever I do, I can't make it work, that is, I cannot use s3://host or s3n://host as a replacement for HDFS while running an EC2 cluster. I change the settings in the core-file.xml, in ...
    Mark Kerzner
    Jan 18, 2012 at 6:26 am
    Jan 18, 2012 at 4:57 pm
  • Hello, in HDFS we have set the block size to 40 bytes. The input data set is as below, terminated with line feeds: data1 (5*8=40 bytes) data2 ...... ....... data10 But still we see only 2 map tasks spawned, ...
    Sset
    Jan 9, 2012 at 4:47 pm
    Jan 13, 2012 at 8:47 am
  • mvn compile and failed:( jdk version is "1.6.0_23" maven version is Apache Maven 3.0.3 [ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (compile-proto) on project ...
    Smith jack
    Jan 15, 2012 at 2:47 am
    Jul 16, 2012 at 9:34 am
  • Hey, can anyone explain to me what the reduce copy phase is in the reducer section? The (K,List(V)) is passed to the reducer. Does reduce copy represent the copying of K,List(V) on the reducer from all ...
    Praveenesh kumar
    Jan 25, 2012 at 10:04 am
    Feb 2, 2012 at 7:35 am
  • I have a Hadoop cluster on which I have generated some data using Teragen. But while running Terasort on this data, it gives the following error. java.lang.RuntimeException: Error in configuring object ...
    Utkarsh Rathore
    Jan 24, 2012 at 10:31 am
    Jan 25, 2012 at 12:31 pm
  • Hi guys! When we get the CPU utilization value of a node in a Hadoop cluster, what percentage can be considered overloaded? Say, for example: CPU utilization 85%, node status Overloaded; CPU utilization 20%, node status Normal. Arun -- ...
    ArunKumar
    Jan 17, 2012 at 6:17 am
    Jan 18, 2012 at 5:48 pm
  • Hi, can someone help me ASAP? When I run my mapred job, it fails with this error - 12/01/12 02:58:36 INFO mapred.JobClient: Task Id : attempt_201112151554_0050_m_000071_0, Status : FAILED Error: Java ...
    T Vinod Gupta
    Jan 12, 2012 at 3:15 am
    Jan 12, 2012 at 8:57 am
  • I have been compiling my mapreduce with the jars in the classpath, and I believe I need to also add the jars as an option to -libjars to hadoop. However, even when I do this, I still get an error ...
    Daniel Quach
    Jan 31, 2012 at 6:34 am
    Feb 1, 2012 at 1:51 pm
  • Hello, How much memory/JVM heap does NameNode use for each block? I've tried locating this in the FAQ and on search-hadoop.com, but couldn't find a ton of concrete numbers, just these two: ...
    Otis Gospodnetic
    Jan 17, 2012 at 3:09 pm
    Jan 31, 2012 at 5:31 am
  • Is there any way to kill Hadoop jobs that are taking too long to execute? What I want to achieve is - If some job is running more than "_some_predefined_timeout_limit", it should ...
    Praveenesh kumar
    Jan 30, 2012 at 7:06 am
    Jan 31, 2012 at 1:15 am
  • Hi all, I am new to Hadoop. Can anyone tell me which Linux operating system is best for installing and running Hadoop? At the moment I am using Ubuntu 11.4 and have installed Hadoop on it, but it ...
    Sujit Dhamale
    Jan 27, 2012 at 9:16 am
    Jan 30, 2012 at 12:34 pm
  • Dear list, we're trying to use central HDFS storage that can be accessed from various other Hadoop distributions. Do you think this is possible? We're having trouble, but not related to ...
    Romeo Kienzler
    Jan 25, 2012 at 10:38 am
    Jan 25, 2012 at 2:57 pm
  • Hello, I'm trying to develop an application, where Reducer has to produce multiple outputs. In detail I need the Reducer to produce two types of files. Each file will have different output. I found ...
    Ondřej Klimpera
    Jan 25, 2012 at 11:43 am
    Jan 25, 2012 at 1:08 pm
  • After reading this article, http://www.cloudera.com/blog/2012/01/caching-in-hbase-slabcache/ , I was wondering if there was a filesystem cache for HDFS. For example, if a large file (10 gigabytes) was ...
    Rita
    Jan 15, 2012 at 1:30 am
    Jan 17, 2012 at 12:27 pm
  • Hello, I'm running two jobs on Hadoop-0.20.2 consecutively, such that the second one reads the output of the first which would look like: outputPath/part-00000 outputPath/_logs .... But I get the ...
    Mark question
    Jan 6, 2012 at 12:16 pm
    Jan 9, 2012 at 1:17 am
  • Hello, I have a large text file (1 GB) that I want to sort. So far, I know of Hadoop examples that take a sequence file as input to the sort program. Does anyone know of any implementation that ...
    Sangroya
    Jan 30, 2012 at 3:12 pm
    Feb 8, 2012 at 5:05 pm
  • Hi, Our group is trying to set up a prototype for what will eventually become a cluster of ~50 nodes. Anyone have experiences with a stateless Hadoop cluster setup using this method on CentOS? Are ...
    Aaron Tokhy
    Jan 30, 2012 at 10:40 pm
    Jan 31, 2012 at 4:42 pm
  • Hello all, I have a mapred job that does a transformation and outputs to a compressed SequenceFile (using org.apache.hadoop.mapred.SequenceFileOutputFormat). I am able to attach the output to a ...
    Rk vishu
    Jan 26, 2012 at 10:10 pm
    Jan 27, 2012 at 6:53 am
  • I am running a hadoop jar and keep getting this error - java.lang.NoSuchMethodError: org.codehaus.jackson.JsonParser.getValueAsLong() on digging deeper, this is what I can find:- my jar packages ...
    Vvkbtnkr
    Jan 13, 2012 at 9:59 pm
    Jan 26, 2012 at 10:31 pm
  • Hi guys, I just faced a weird situation in which one of my hard disks on a DN went down. Because of this, when I restarted the namenode, some of the blocks went missing and it was saying my namenode is ...
    Praveenesh kumar
    Jan 17, 2012 at 6:38 am
    Jan 20, 2012 at 9:03 am
  • Hi, If I have one username on a hadoop cluster and would like to set myself up to use that same username from every client from which I access the cluster, how can I go about doing that? I found ...
    Eli Finkelshteyn
    Jan 12, 2012 at 8:55 pm
    Jan 19, 2012 at 7:45 pm
  • Hi, I was wondering if anyone knows any paper discussing and comparing the mentioned topic. I am a little bit confused about the classification of hadoop.. Is it a /cluster/comp grid/ a mix of them? ...
    Merto Mertek
    Jan 11, 2012 at 2:43 pm
    Jan 12, 2012 at 2:52 am
  • java version 1.6.0_29 hadoop: 0.20.203.0 I'm attempting to set up the pseudo-distributed config on a Mac running 10.6.8. I followed the steps from the QuickStart (http://wiki.apache.org./hadoop/QuickStart) ...
    Dave Kelsey
    Jan 4, 2012 at 10:50 pm
    Jan 10, 2012 at 5:48 am
  • Has anyone managed to get the Eclipse plugin working with Hadoop 1.0.0? I keep getting errors such as: Caused by: java.io.IOException: Call to /192.168.1.200:50010 failed on local exception: ...
    Chris 0
    Jan 3, 2012 at 12:55 pm
    Jan 4, 2012 at 7:28 am
  • Hey folks, I'm facing a problem with the JobTracker URL. I added a node to the cluster and after some time I restarted the cluster; then I found that my JobTracker is showing the recently added node ...
    Hadoop hive
    Jan 27, 2012 at 11:18 am
    Jan 30, 2012 at 7:15 am
  • Hi, I want to run an MR procedure under Hadoop, then send some messages and data to all of the nodes, and after that run another MR. What's the easiest way of sending data to all or some nodes? Or "Is ...
    Oliaei
    Jan 28, 2012 at 1:02 pm
    Jan 29, 2012 at 2:18 pm
  • In working a sample issue I used a combiner - I noticed that the Combiner output records were 90% of the Combiner Input records and when looking at the data found relatively few duplicated keys. This ...
    Steve Lewis
    Jan 24, 2012 at 5:34 pm
    Jan 25, 2012 at 4:06 pm
  • Gurus, I'm setting up a secure cluster of Hadoop 0.23, but the communication between Data Node and Name Node, and between Node Manager and Resource Manager, has problems. When I start the Node Manager, it ...
    Emma Lin
    Jan 20, 2012 at 4:53 am
    Jan 20, 2012 at 7:29 pm
  • -- View this message in context: http://old.nabble.com/Getting-error-tp33123705p33123705.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
    Arusarka
    Jan 11, 2012 at 7:59 pm
    Jan 11, 2012 at 8:27 pm
  • Hi guys! When a job is submitted, it is given an ID, say job_200904211745_0018, in Hadoop. But for some reason I want to submit it with an ID such as "job1". How can I do that? Arun -- View this message in ...
    ArunKumar
    Jan 4, 2012 at 4:02 pm
    Jan 5, 2012 at 6:07 pm
  • Hi, I'm having trouble trying to handle lzo compressed files. The input files are compressed by LzopCodec provided by hadoop-lzo package. And I am using Cloudera 3 update 2 version Hadoop. I don't ...
    Edward choi
    Jan 2, 2012 at 5:35 am
    Jan 2, 2012 at 8:02 am
  • Hi, (hadoop-0.19) I have a working Hadoop cluster which has been running perfectly for months. But today, after restarting the cluster, the jobtracker UI is showing state INITIALIZING for a long time and ...
    Gaurav Bagga
    Jan 12, 2012 at 7:13 pm
    Apr 2, 2012 at 8:25 am
  • Hi, I have 100 .txt input files and I want my mapper output to be in an orderly manner. I am not using any reducer. Any idea? Regards,
    Daniel Yehdego
    Jan 19, 2012 at 7:19 pm
    Apr 2, 2012 at 6:01 am
  • hi Hadoops & Nutchs, I'm trying to run Nutch 1.4 *locally*, on Windows 7, using Hadoop 0.20.203.0. I run with: fs.default.name = D:\fs hadoop.tmp.dir = D:\tmp dfs.permissions = false PATH environment ...
    Shlomi java
    Jan 11, 2012 at 3:55 pm
    Feb 2, 2012 at 1:30 am
  • Hi all, I am using hadoop-0.20.2 and doing a fresh installation of a distributed Hadoop cluster along with HBase. I have virtualized nodes running on top of a VMware ESXi 5.0 server. The VM on which ...
    Anil gupta
    Jan 30, 2012 at 9:48 pm
    Jan 31, 2012 at 10:38 pm
  • Hi all, I am new to Hadoop. Please let me know the details of the hardware required for a Hadoop cluster setup (minimum 3-node cluster). I would like to know the OS, memory, CPU, network and storage details ...
    Renuka
    Jan 30, 2012 at 6:01 am
    Jan 30, 2012 at 6:19 am
  • I am looking to use Hadoop for parallel loading of CSV file into a non-Hadoop, parallel database. Is there an existing utility that allows one to pick entries, row-by-row, synchronized and in ...
    Edmon Begoli
    Jan 24, 2012 at 5:19 pm
    Jan 25, 2012 at 1:32 am
  • Hi, I often run into situations like this: I am running a very heavy job (let's say job 1) on a Hadoop cluster, which takes many hours. Then something comes up that needs to be done very quickly (let's ...
    Edward choi
    Jan 18, 2012 at 6:58 am
    Jan 20, 2012 at 1:54 am
  • Hi, we have just started implementing Hadoop on our system for the first time (Cloudera CDH3u2). After reformatting the namenode a few times, the DataNode is not coming up, with the error "Incompatible ...
    Gdan2000
    Jan 15, 2012 at 8:46 am
    Jan 15, 2012 at 11:02 pm
  • Hi everyone. I am running C++ code using the PIPES wrapper and I am looking for some tutorials, examples or any kind of help with regard to using binary data. My problem is that I am working with ...
    GorGo
    Jan 10, 2012 at 4:31 pm
    Jan 13, 2012 at 8:48 am
  • I need a value from the core-site.xml property file within FileInputFormat. I'm overriding the getSplits method: public List<InputSplit> getSplits(JobContext job); Using the job I can access the job ...
    Marcel Holle
    Jan 12, 2012 at 6:59 pm
    Jan 12, 2012 at 10:41 pm
  • The JobTracker web UI suddenly stopped showing. It was working fine before. What could be the issue? Can anyone guide me on how I can recover my web UI? Thanks, Praveenesh
    Praveenesh kumar
    Jan 11, 2012 at 1:39 pm
    Jan 12, 2012 at 5:44 am
  • What are the thoughts on running a Hadoop cluster in a datacenter with respect to power? Should all the boxes have redundant power supplies and be on dual power? Or just dual power for the namenode, ...
    Koert Kuipers
    Jan 7, 2012 at 7:24 pm
    Jan 9, 2012 at 5:03 pm
  • Hey guys, I just wanted to ask: are there any best practices to follow for improving Hadoop shuffling? I am running Hadoop 0.20.205 on an 8-node cluster. Each node is 24 cores/CPUs with ...
    Praveenesh kumar
    Jan 30, 2012 at 12:51 pm
    Jan 31, 2012 at 9:32 pm
Group Overview
group: common-user @
categories: hadoop
discussions: 120
posts: 468
users: 145
website: hadoop.apache.org...
irc: #hadoop

145 users for January 2012

  • Harsh J: 57 posts
  • Praveenesh kumar: 28 posts
  • Eli Finkelshteyn: 14 posts
  • Steve Lewis: 14 posts
  • Mark question: 12 posts
  • Raghavendhra rahul: 11 posts
  • Michel Segel: 10 posts
  • Alo.alt: 8 posts
  • Hadoop hive: 8 posts
  • Joey Echeverria: 8 posts
  • Raj Vishwanthan: 8 posts
  • ArunKumar: 7 posts
  • Prashant Kommireddi: 7 posts
  • Robert Evans: 7 posts
  • Srinivas Surasani: 7 posts
  • W.P. McNeill: 7 posts
  • Bejoy Ks: 6 posts
  • Koji Noguchi: 6 posts
  • Rk vishu: 6 posts
  • Utkarsh Rathore: 6 posts