153 discussions - 595 posts

  • http://hadoop.apache.org/#What+Is+Apache%E2%84%A2+Hadoop%E2%84%A2%3F March 2011 - Apache Hadoop takes top prize at Media Guardian Innovation Awards The Hadoop project won the "innovator of the ...
    Edward Capriolo
    May 18, 2011 at 4:53 pm
    May 23, 2011 at 2:27 pm
  • I have installed hadoop-0.20.2 (using quick start guide) and mahout. I am running OpenSuse Linux 11.1 (but am a newbie to Linux). My JAVA_HOME is set to usr/java/jdk1.6.0_21. When I run bin/hadoop ...
    Keith Thompson
    May 10, 2011 at 3:20 pm
    May 12, 2011 at 9:12 am
  • Hi, I have a 10 node cluster (IBM blade servers, 48GB RAM, 2x500GB Disk, 16 HT cores). I've uploaded 10 files to HDFS. Each file is 10GB. I used the streaming jar with 'wc -l' as mapper and 'cat' as ...
    May 30, 2011 at 12:28 pm
    Jul 17, 2011 at 2:15 am
  • Hello, I am trying to set up Hadoop HDFS in a cluster for the first time. So far I was using pseudo-distributed mode on my PC at home and everything was working perfectly. The NameNode starts but the ...
    Panayotis Antonopoulos
    May 13, 2011 at 1:08 am
    May 14, 2011 at 2:31 am
  • Hi Folks, We are trying to get HBase and Hadoop running on a cluster, using 2 Solaris servers for now. Because of the incompatibility issue between HBase and Hadoop, we have to stick with hadoop 0.20.2-append ...
    Xu, Richard
    May 27, 2011 at 1:54 am
    Jan 25, 2012 at 7:18 am
  • http://stackoverflow.com/q/6015818/300248 -- Regards, K. Gabriele --- unchanged since 20/9/10 --- P.S. If the subject contains "[LON]" or the addressee acknowledges the receipt within 48 hours then I ...
    Gabriele Kahlout
    May 16, 2011 at 10:20 am
    May 19, 2011 at 3:34 pm
  • Hi, My question is: when I run a command from the HDFS client, e.g. hadoop fs -copyFromLocal, or create a sequence file writer in Java code and append key/values to it through Hadoop APIs, does it ...
    Mapred Learn
    May 18, 2011 at 12:44 am
    May 27, 2011 at 5:07 am
  • I'm working on a cluster with 360 reducer slots. I've got a big job, so when I launch it I follow the recommendations in the Hadoop documentation and set mapred.reduce.tasks=350, i.e. slightly less ...
    W.P. McNeill
    May 18, 2011 at 9:25 pm
    May 19, 2011 at 11:53 am
  • Hello, I am trying to debug a Hadoop MapReduce job under Eclipse in Windows and I am running into a problem when the Hadoop framework tries to set up the staging directory (see the stack trace below) ...
    Iwona Bialynicka-Birula
    May 7, 2011 at 8:56 pm
    Mar 11, 2012 at 7:35 am
  • Hello guys, In case any of you are working on HBase: I just wrote a program by reading some tutorials, but nowhere is it mentioned how to run code on HBase. In case any of you has done some ...
    Praveenesh kumar
    May 24, 2011 at 9:08 am
    May 24, 2011 at 1:35 pm
  • I have been reviewing quite a few presentations on the web from various businesses, in addition to the ones I watched first hand at the cloudera data summit last week, and I am curious as to others ...
    Matt Goeke
    May 4, 2011 at 5:31 pm
    May 6, 2011 at 5:16 pm
  • I have a new-API Partitioner <http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/mapreduce/Partitioner.html> object whose behavior needs to change based on values passed in from the job ...
    W.P. McNeill
    May 3, 2011 at 11:43 pm
    May 6, 2011 at 4:18 pm
  • When I want to launch a hadoop job I use SCP to execute a command on the Name node machine. I am wondering if there is a way to launch a Hadoop job from a machine that is not on the cluster. How to ...
    Steve Lewis
    May 28, 2011 at 7:26 pm
    Jun 6, 2011 at 9:36 am
  • How can I tell how the map and reduce tasks were spread across the cluster? I looked at the jobtracker web page but can't find that info. Also, can I specify how many map or reduce tasks I want to ...
    Mohit Anchlia
    May 26, 2011 at 9:48 pm
    May 31, 2011 at 7:18 pm
  • I'm trying to sort Sequence files using the Hadoop-Example TeraSort. But after taking a couple of minutes .. output is empty. HDFS has the following Sequence files: -rw-r--r-- 1 Hadoop supergroup ...
    Mark question
    May 22, 2011 at 1:22 am
    May 26, 2011 at 7:35 pm
  • I sent this to the Pig Apache user mailing list but have got no response. Not sure if that list is still active. I thought I would post here in case someone is able to help me. I am in the process of installing and ...
    Mohit Anchlia
    May 26, 2011 at 4:56 pm
    May 26, 2011 at 7:01 pm
  • Hello everyone, I am using wordcount application to test on my hadoop cluster of 5 nodes. The file size is around 5 GB. It's taking around 2 min - 40 sec for execution. But when I am checking the ...
    Praveenesh kumar
    May 20, 2011 at 1:20 pm
    May 23, 2011 at 5:06 am
  • For one long-running job we are noticing that the mapper JVMs do not exit even after the mapper is done. Any suggestions on why this could be happening? The java processes get cleaned up if I do a ...
    May 12, 2011 at 8:40 pm
    May 13, 2011 at 2:17 pm
  • I have a set of (key, value) pairs. For each value there is a function f(value) that returns an integer. I want to generate a histogram over f(value) for my data set. For example, representing the ...
    W.P. McNeill
    May 9, 2011 at 8:13 pm
    May 9, 2011 at 10:06 pm
  • Hi Guys, I recently configured my cluster to have 2 VMs. I configured 1 machine (slave3) to be the namenode and another to be the jobtracker (slave2). They both work as datanode/tasktracker as well. ...
    Juan P.
    May 31, 2011 at 9:08 pm
    Jun 1, 2011 at 3:55 pm
  • I am new to Hadoop and, from what I understand, by default Hadoop splits the input into blocks. Now this might result in a line of a record being split into 2 pieces and spread across 2 maps. For ...
    Mohit Anchlia
    May 27, 2011 at 4:56 pm
    May 29, 2011 at 11:55 pm
  • I just started learning hadoop and got done with wordcount mapreduce example. I also briefly looked at hadoop streaming. Some questions 1) What should be my first step now? Are there more examples ...
    Mohit Anchlia
    May 24, 2011 at 11:17 pm
    May 25, 2011 at 4:42 pm
  • Hello Hadoop Gurus, We are running a 4-node cluster. We just upgraded the RAM to 48 GB. We have allocated around 33-34 GB per node for hadoop processes. Leaving the rest of the 14-15 GB memory for OS ...
    May 11, 2011 at 5:31 pm
    May 11, 2011 at 9:48 pm
  • n00b here, just started playing around with pipes. I'm getting linker errors while compiling a simple WordCount example using hadoop-0.20.203 (current most recent version) that did not appear for the ...
    May 17, 2011 at 3:11 am
    Jun 9, 2011 at 9:51 am
  • Hello All, I am planning to start a project where I have to do extensive storage of XML and text files. On top of that I have to implement an efficient algorithm for searching over thousands or millions ...
    May 31, 2011 at 5:51 pm
    Jun 1, 2011 at 5:58 am
  • Hi guys, I'm using an NFS cluster consisting of 30 machines, but only specified 3 of the nodes to be my hadoop cluster. So my problem is this. Datanode won't start in one of the nodes because of the ...
    Mark question
    May 25, 2011 at 2:23 am
    May 25, 2011 at 8:37 am
  • Hello everybody, I'm a GSoC student for this year and I will be working on James [1]. My project is to implement email storage over HDFS. I am quite new to Hadoop and associates and I am looking for ...
    Ioan Eugen Stan
    May 18, 2011 at 11:03 pm
    May 24, 2011 at 3:01 am
  • Hi, can I use a Counter to give each record in all reducers a consecutive number? Currently I am using a single Reducer, but it is an anti-pattern. But I need to assign consecutive numbers to all ...
    Mark Kerzner
    May 20, 2011 at 4:56 pm
    May 20, 2011 at 6:38 pm
  • I've got a directory with a bunch of MapReduce data in it. I want to know how many <Key, Value> pairs it contains. I could write a mapper-only process that takes <Writable, Writable> pairs as input ...
    W.P. McNeill
    May 20, 2011 at 4:35 pm
    May 20, 2011 at 5:38 pm
  • Hello everybody ! This exception was thrown when I tried to copy a file from local file to HDFS. This is my program : *************************************************************** import ...
    Lạc Trung
    May 14, 2011 at 11:58 pm
    May 18, 2011 at 3:33 pm
  • Hi, I am using CDH3 in pseudo-distributed mode and getting the following error while starting the task tracker and job tracker. Any suggestions? Error for Task Tracker 2011-05-17 13:28:10,234 INFO ...
    Subhramanian, Deepak
    May 17, 2011 at 3:49 pm
    May 18, 2011 at 1:42 pm
  • Hi I need to use hadoop-tool-kit for monitoring. So I followed http://code.google.com/p/hadoop-toolkit/source/checkout and applied the patch in my hadoop.20.2 directory as: patch -p0 < patch.20.2 and ...
    Mark question
    May 17, 2011 at 8:02 pm
    May 17, 2011 at 10:55 pm
  • Hi I'm using FileInputFormat which will split files logically according to their sizes into splits. Can the mapper get a pointer to these splits? and know which split it is assigned ? I tried looking ...
    Mark question
    May 13, 2011 at 4:00 am
    May 13, 2011 at 3:57 pm
  • Hi all! I have been trying to figure out why I'm getting this error! All that I did was: 1) Use a single node cluster 2) Made some modifications in the core (in some MapRed modules). Successfully ...
    Matthew John
    May 11, 2011 at 1:27 pm
    May 11, 2011 at 2:17 pm
  • Hi, On our 15-node cluster (1GB ethernet and 4x1TB disk per node) I noticed that distcp does a much better job at rebalancing than the dedicated balancer does. We needed to decommission 11 nodes, so ...
    Ferdy Galema
    May 5, 2011 at 12:31 pm
    May 5, 2011 at 6:55 pm
  • Hello all, I'm attempting to set up a Hadoop 0.20.2 cluster (Yes I know it's old, but I've already got several programs written for it). I am very close to having it set up correctly. I can add files ...
    Travis Bolinger
    May 3, 2011 at 5:21 pm
    May 3, 2011 at 6:19 pm
  • hi, all. I got so many failures on a reducing step. see this error. java.io.IOException: Failed to delete earlier output of task: attempt_201105021341_0021_r_000001_0 at ...
    Jun Young Kim
    May 2, 2011 at 7:26 am
    May 3, 2011 at 2:20 am
  • I am trying to install hadoop in cluster env with multiple nodes. Following instructions from http://hadoop.apache.org/common/docs/r0.17.0/cluster_setup.html ...
    May 23, 2011 at 5:36 pm
    Aug 11, 2011 at 10:05 am
  • Hi all, I have hit a weird problem: I cannot access the Hadoop cluster from outside. I have a client machine, and I can telnet to the namenode's port 9000 from this client machine, but I cannot access the ...
    Jeff Zhang
    May 27, 2011 at 6:59 am
    Jun 3, 2011 at 1:14 am
  • Hi, I am not sure if this question has been asked. It's more of a hadoop fs question. I am trying to execute the following hadoop fs command: hadoop fs -copyToLocal s3n://<Access Key>:<Secret Key> ...
    Neeral beladia
    May 31, 2011 at 10:56 pm
    Jun 1, 2011 at 5:24 pm
  • Hi, I'm running a job with maps only and I want, by the end of each map (i.e. in the Close() function), to open the file that the current map has written using its output.collector. I know "job.getWorkingDirectory()" ...
    Mark question
    May 22, 2011 at 12:03 am
    May 25, 2011 at 8:50 am
  • Hello, I'm trying to pick up certain lines of a text file (say the 1st and 110th lines of a file with 10^10 lines). I need an InputFormat which gives the Mapper the line number as the key. I tried to implement ...
    May 18, 2011 at 6:42 pm
    May 22, 2011 at 1:58 am
  • I am running a Hadoop Java program in local single-JVM mode via an IDE (IntelliJ). I want to do performance profiling of it. Following the instructions in chapter 5 of *Hadoop: the Definitive Guide*, ...
    W.P. McNeill
    May 17, 2011 at 11:07 pm
    May 18, 2011 at 6:42 pm
  • Is the master the central controller, and are the slaves specific cluster nodes? What should be done if the master crashes (single point of failure)? Thanks
    May 7, 2011 at 8:36 am
    May 7, 2011 at 12:45 pm
  • Hi, I have a script something like this (simplified): for i in $(seq 1 200); do regenerate-files $dir $i hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar \ -D ...
    Dieter Plaetinck
    May 6, 2011 at 2:10 pm
    May 7, 2011 at 6:56 am
  • Hi all, I ran into a problem changing the block size from 64M to 128M. I am sure I modified the correct configuration file, hdfs-site.xml, because I can change the replication number correctly. However, ...
    He Chen
    May 4, 2011 at 7:29 pm
    May 4, 2011 at 9:41 pm
  • Hello, I have installed Hadoop 0.20.2 in a Cygwin environment. I'm getting the following error while trying to put some content on HDFS. Please suggest any solution. Abhay * $ bin/hadoop dfs -put ...
    Abhay ratnaparkhi
    May 2, 2011 at 6:41 pm
    May 3, 2011 at 1:29 am
  • Hi all, 1) I wanted to know how strong the coupling between HDFS and MapReduce (programming abstraction) in Hadoop is. Can someone throw some light on the protocols used between HDFS and ...
    Matthew John
    May 2, 2011 at 6:48 am
    May 2, 2011 at 11:32 am
  • Hey, Maybe someone can give an idea where to look for the bug... I have a cluster with 270 slots for mappers, and a FairScheduler configured for it.... Sometimes this cluster allocates only 80 or 50 ...
    Guy Doulberg
    May 1, 2011 at 4:36 pm
    May 2, 2011 at 11:11 am
  • Hello Geeks, I am a newbie to Hadoop and I have currently installed hadoop- I am running the sample programs that are part of this package but getting this error. Any pointers to fix this??? ...
    May 26, 2011 at 6:50 pm
    Aug 11, 2011 at 2:58 pm
Group Overview
group: common-user @

167 users for May 2011

  • Harsh J: 50 posts
  • Mark question: 33 posts
  • W.P. McNeill: 26 posts
  • Joey Echeverria: 22 posts
  • Mohit Anchlia: 21 posts
  • James Seigel Tynt: 19 posts
  • Praveenesh kumar: 16 posts
  • Matthew John: 13 posts
  • Luca Pireddu: 10 posts
  • Baran cakici: 9 posts
  • Mapred Learn: 9 posts
  • Adi: 8 posts
  • Dieter Plaetinck: 8 posts
  • Edward Capriolo: 8 posts
  • Gabriele Kahlout: 8 posts
  • Konstantin Boudnik: 8 posts
  • Panayotis Antonopoulos: 8 posts
  • Steve Loughran: 8 posts
  • Bharath Mundlapudi: 7 posts
  • Highpointe: 7 posts