134 discussions - 457 posts

  • We are continuing to see a small, consistent amount of block corruption leading to file loss. We have been upgrading our cluster lately, which means we've been doing a rolling de-commissioning of our ...
    Brian Bockelman
    Dec 6, 2008 at 1:00 am
    Dec 10, 2008 at 11:35 am
  • I'm having trouble finding a way to do what I want, so I'm wondering if I'm just not looking at the right place or if I'm thinking about the problem in the wrong way. Any insight would be ...
    Andy Sautins
    Dec 7, 2008 at 6:03 pm
    Dec 8, 2008 at 7:20 pm
  • We encountered a bottleneck during the shuffle phase. However, there is not much data to be shuffled across the network at all - total less than 10MBytes (the combiner aggregated most of the data). ...
    Songting Chen
    Dec 5, 2008 at 7:17 pm
    Dec 6, 2008 at 5:49 pm
  • I'm getting this error message when I am doing *bash-3.2$ bin/hadoop dfs -put urls urls*. Please let me know the resolution; I have a project submission in a few hours.
    Elangovan anbalahan
    Dec 4, 2008 at 6:20 pm
    Dec 4, 2008 at 7:50 pm
  • Hello, I had previously emailed regarding heap size issue and have discovered that the hadoop-site.xml is not loading completely, i.e Configuration defaults = new Configuration(); JobConf jobConf = ...
    Saptarshi Guha
    Dec 29, 2008 at 9:18 pm
    Jan 6, 2009 at 7:38 pm
  • I've defined a custom key class that implements writable. I've noticed that for use between the mapper and reducer the write and readFields are actually used. However, when I use an identity reducer, ...
    David Coe
    Dec 16, 2008 at 4:29 pm
    Dec 18, 2008 at 7:07 am
  • Couple of the datanodes crashed with the following error The /tmp is 15% occupied # # An unexpected error has been detected by Java Runtime Environment: # # SIGBUS (0x7) at pc=0xb4edcb6a, pid=10111, ...
    Sagar Naik
    Dec 1, 2008 at 9:01 pm
    Dec 1, 2008 at 11:35 pm
  • Hi, I am trying to create a hadoop cluster which can handle 2000 write requests per second. In each write request I would be writing a line of size 1KB in a file. I would be using machine having ...
    Sandeep Dhawan
    Dec 30, 2008 at 11:58 am
    Jan 20, 2009 at 5:14 am
  • Hello All, I am designing an architecture which should support 10 million records storage capacity and 1 million updates / minute. Data persistence is not that important as I will be purging this ...
    Aakash_j j_shah
    Dec 20, 2008 at 1:06 am
    Dec 31, 2008 at 5:28 pm
  • Hi everyone: How do I control the number of threads per mapreduce job. I am using bin/hadoop jar wordcount to run jobs and even though I have found these settings in hadoop-default.xml and changed ...
    Michael Miceli
    Dec 27, 2008 at 5:20 am
    Dec 31, 2008 at 10:11 am
  • I'm seeing some strange behavior with bzip2 files and release 0.19.0. I'm wondering if anyone can shed some light on what I'm seeing. Basically it _looks_ like the processing of a particular bzip2 ...
    Andy Sautins
    Dec 4, 2008 at 5:12 pm
    Dec 7, 2008 at 2:07 am
  • Hi I have a scenario where, while a map/reduce is working on a file, the input file may get deleted and copied with a new version of the file. All my files are compressed and hence each file is ...
    Sandhya E
    Dec 15, 2008 at 4:11 pm
    Jan 9, 2009 at 7:38 pm
  • Hello, I was wondering if Hadoop provides thread safe shared variables that can be accessed from individual mappers/reducers along with a proper locking mechanism. To clarify things, let's say that ...
    Jim Twensky
    Dec 24, 2008 at 8:29 am
    Jan 2, 2009 at 6:46 am
  • Hello, I am currently using hadoop-0.18.0. I am not able to append files in DFS. I came across a fix which was done on version 0.19.0 (http://issues.apache.org/jira/browse/HADOOP-1700). But I cannot ...
    Sandeep Dhawan, Noida
    Dec 1, 2008 at 1:21 pm
    Dec 30, 2008 at 11:26 am
  • Hi friends of Hadoop, we from ScaleUnlimited.com put together a video that visualizes the code commit history of the Hadoop core project. It is a neat way of visualizing who is behind the Hadoop ...
    Stefan Groschupf
    Dec 16, 2008 at 8:37 am
    Dec 17, 2008 at 9:28 pm
  • Hi, I want to build a little cluster of HBase & Hadoop. Strangely, I couldn't find any recommendations on the web. If you could share your experience, it would be great, especially in concern with ...
    Yossi Ittach
    Dec 14, 2008 at 8:19 am
    Dec 15, 2008 at 5:46 pm
  • Hi, I need to run my map-reduce routines for several iterations so that the output of an iteration becomes the input to the next iteration. Is there a standard pattern to do this instead of calling ...
    Delip Rao
    Dec 8, 2008 at 6:26 am
    Dec 26, 2008 at 10:07 pm
  • I have made a Hadoop platform on 15 machines recently. NameNode - DataNodes work properly but when I use bin/start-mapred.sh to start MapReduce framework only 3 or 4 TaskTrackers could be started ...
    Ascend1
    Dec 19, 2008 at 9:01 am
    Dec 23, 2008 at 4:53 am
  • I've noticed that if I put a system.out.println in the run() method I see the result on my console. If I put it in the map or reduce class, I never see the result. Where does it go? Is there a way to ...
    David Coe
    Dec 10, 2008 at 9:32 pm
    Dec 11, 2008 at 10:14 am
  • Hi I would like to know if Hadoop architecture more resembles SAN or NAS? -I'm guessing it is NAS. Or does it fall under a totally different category? If so, can you please email brief information? ...
    Sirisha Akkala
    Dec 7, 2008 at 5:07 am
    Dec 8, 2008 at 11:54 am
  • I've written a simple map/reduce job that demonstrates a problem I'm having. Please see attached example. Environment: hadoop 0.19.0 cluster resides across linux nodes client resides on cygwin To ...
    Stuart White
    Dec 11, 2008 at 5:05 pm
    Dec 30, 2008 at 10:12 pm
  • Hello, I have work machines with 32GB and allocated 16GB to the heap size ==hadoop-env.sh== export HADOOP_HEAPSIZE=16384 ==hadoop-site.xml== <property> <name>mapred.child.java.opts</name> <value> ...
    Saptarshi Guha
    Dec 28, 2008 at 9:01 pm
    Dec 29, 2008 at 3:57 am
  • Hi, I am stuck with some questions based on following scenario. 1) Hadoop normally splits the input file and distributes the splits across slaves(referred to as Psplits from now), in to chunks of 64 ...
    Amitsingh
    Dec 10, 2008 at 7:09 pm
    Dec 26, 2008 at 6:05 pm
  • Hello, I intend to start a mapreduce job from another java app, using ToolRunner.run method. This works fine on a local job. However when distributed i get java.lang.NoClassDefFoundError: ...
    Saptarshi Guha
    Dec 18, 2008 at 11:38 pm
    Dec 23, 2008 at 10:32 pm
  • Hey, I hit a bit of a roadbump in solving the "truncated block issue" at our site: namely, some of the blocks appear perfectly valid to the datanode. The block verifies, but it is still the wrong ...
    Brian Bockelman
    Dec 16, 2008 at 6:23 pm
    Dec 19, 2008 at 6:57 pm
  • Hello, We have two HOD questions: (1) For our current Torque PBS setup, the number of nodes requested by HOD (-l nodes=X) corresponds to the number of CPUs allocated, however these nodes can be ...
    Craig Macdonald
    Dec 17, 2008 at 4:47 pm
    Dec 19, 2008 at 1:07 pm
  • (I'm quite new to hadoop and map/reduce, so some of these questions might not make complete sense.) I want to perform simple data transforms on large datasets, and it seems Hadoop is an appropriate ...
    Stuart White
    Dec 14, 2008 at 2:33 am
    Dec 15, 2008 at 1:07 am
  • Hello all, I normally upload files into hadoop via bin/hadoop fs -put file dest. However, is there a way to somehow stream data into Hadoop? For example, I'd love to do something like this: zcat xxx ...
    Ryan LeCompte
    Dec 8, 2008 at 10:19 pm
    Dec 10, 2008 at 4:19 am
  • Is there support to tell hadoop servers (tasktracker, datanode, ....) to re-read configuration, or fully reset, or to shut down (without an external kill)?
    Christian Kunz
    Dec 9, 2008 at 3:47 am
    Dec 9, 2008 at 6:48 pm
  • If you're running the first job to do just the first pass (the output of which is the list of documents that you want to analyze properly in the second job), then yes, this is okay (and this is what ...
    Devaraj Das
    Dec 6, 2008 at 5:43 pm
    Dec 8, 2008 at 7:05 pm
  • I have set some variable using the JobConf object. jobConf.set("Operator", operator) etc. How can I get an instance of Configuration object/ JobConf object inside a map method so that I can retrieve ...
    Abhinit
    Dec 5, 2008 at 7:35 pm
    Dec 6, 2008 at 9:21 am
  • Hi, I'm trying to decommission some nodes. The process I tried to follow is: 1) add them to conf/excluding (hadoop-site points there) 2) invoke hadoop dfsadmin -refreshNodes This returns immediately, ...
    David Hall
    Dec 4, 2008 at 8:49 am
    Dec 4, 2008 at 6:33 pm
  • Dear, I want to config a 4-site hadoop cluster. but failed. Who can help me to know why? and how can i start it? thanks. environment: Cent OS 5.2, hadoop-site.xml setting: fs.default.name ...
    Leeau
    Dec 4, 2008 at 12:04 pm
    Dec 4, 2008 at 1:36 pm
  • Hi, I have an application which creates a simple text file on hdfs. There is a second application which processes this file. The second application picks up the file for processing only when the file ...
    Sandeep Dhawan
    Dec 30, 2008 at 11:14 am
    Jan 9, 2009 at 5:59 am
  • Hey When I tried to install hadoop in ubuntu 8.04 I got an error ssh connection refused to localhost at port 22. Please any one can tell me the solution. Thanks -- Vinayak Katkar Sun Campus ...
    Vinayak katkar
    Dec 29, 2008 at 4:29 pm
    Jan 5, 2009 at 6:56 pm
  • Hello, I am new to hadoop. I am running hadoop 0.17 in a Eucalyptus cloud instance (it's a CentOS image on Xen). bin/hadoop dfs -ls / gives the following Exception 08/12/31 08:58:10 WARN ...
    Sagar arlekar
    Dec 31, 2008 at 2:17 pm
    Dec 31, 2008 at 9:06 pm
  • Hi list, I've come up against a scenario like this: to finish the same task, one of my hadoop clusters only needs 5 seconds, and another one needs more than 2 minutes. It's a common phenomenon that will ...
    Jeremy Chow
    Dec 24, 2008 at 4:56 am
    Dec 24, 2008 at 10:02 am
  • Hello all, Somewhat of an off-topic question, but I know there are Hadoop + EC2 users here. Does anyone know if there is a programmatic API to find out how many machine time hours have ...
    Ryan LeCompte
    Dec 18, 2008 at 5:00 pm
    Dec 18, 2008 at 5:39 pm
  • Is anyone working on a JDBC RecordReader/InputFormat. I was thinking this would be very useful for sending data into mappers. Writing data to a relational database might be more application dependent ...
    Edward Capriolo
    Dec 8, 2008 at 5:11 pm
    Dec 12, 2008 at 10:42 am
  • hi all - can anyone comment on the performance cost of merging many small files into an increasingly large MapFile ? will that cost be dependent on the size of the larger MapFile (since I have to ...
    Yoav.morag
    Dec 9, 2008 at 1:09 pm
    Dec 10, 2008 at 8:30 am
  • Hi everybody, does anybody know if there exists a tool which writes a JobConf instance back to xml in hadoop-site.xml format ? cheers Johannes
    Johannens Zillmann
    Dec 3, 2008 at 8:44 am
    Dec 3, 2008 at 8:56 am
  • I have been experiencing some unusual behavior from Hadoop recently. When trying to run a job, some of the tasks fail with: java.io.IOException: Task process exit with nonzero status of 1. at ...
    Nathan Marz
    Dec 22, 2008 at 6:24 pm
    Jan 7, 2009 at 4:08 am
  • Hello, I am new to hadoop. I am using hadoop 0.17, I am trying to run it Pseudo-Distributed. I get NotReplicatedYetException while executing 'bin/hadoop dfs' commands. The following is the partial ...
    Sagar arlekar
    Dec 30, 2008 at 9:00 pm
    Dec 30, 2008 at 9:35 pm
  • Hi, When a task tracker kills a non-responsive task, it prints out a message "Task XXXXX not reported status for 600 seconds. Killing!". The stack trace it then dumps out is that of the task tracker ...
    Sriram Rao
    Dec 5, 2008 at 7:44 pm
    Dec 30, 2008 at 7:16 pm
  • Hi, I have a setup of a 2-node Hadoop cluster running on Windows using cygwin. When I open up the web gui to view the number of Live Nodes, it shows 2. But when I kill the slave node and refresh the ...
    Sandeep Dhawan
    Dec 29, 2008 at 11:35 am
    Dec 30, 2008 at 12:14 am
  • I've been trying to troubleshoot an OOME we've been having. When we run the job over a dataset that is about 700GB (~9000 files) or larger we will get an OOME on the map jobs. However if we run the job ...
    Philip
    Dec 17, 2008 at 6:45 pm
    Dec 29, 2008 at 11:46 pm
  • Hi, some queries, 1) If i set value of dfs.replication to 3 only in hadoop-site.xml of namenode(master) and then restart the cluster will this take effect. or i have to change hadoop-site.xml at all ...
    Amitsingh
    Dec 27, 2008 at 7:34 am
    Dec 29, 2008 at 10:44 pm
  • Hi, I get the following error when trying to mount the fuse dfs. [fuse-dfs]$ ./fuse_dfs_wrapper.sh -d dfs://mydevserver.com:9000 /mnt/hadoop/ fuse-dfs ignoring option -d ...
    Amit handa
    Dec 29, 2008 at 3:04 pm
    Dec 29, 2008 at 3:26 pm
  • To all, Version: hadoop-0.17.2.1-core.jar I have created a MapFile. What I don't seem to be able to do is correctly place the MapFile in the DistributedCache and then make use of it in a map method. I ...
    Sean Shanny
    Dec 26, 2008 at 10:21 pm
    Dec 29, 2008 at 9:39 am
  • Hi, I've built a Hadoop cluster from two computers( master and slave), using Hadoop 0.18.2/HBase 0.18.1. While running Map-Reduce jobs on 5-10 GB files I've noticed that reduce-copy tasks from master ...
    Genady
    Dec 27, 2008 at 1:59 pm
    Dec 27, 2008 at 4:33 pm
Group Overview
group: common-user @
categories: hadoop
discussions: 134
posts: 457
users: 150
website: hadoop.apache.org...
irc: #hadoop

150 users for December 2008

  • Brian Bockelman: 29 posts
  • Sagar Naik: 21 posts
  • Aaron Kimball: 17 posts
  • Owen O'Malley: 14 posts
  • Steve Loughran: 14 posts
  • Songting Chen: 11 posts
  • Arun C Murthy: 10 posts
  • Devaraj Das: 10 posts
  • Andy Sautins: 9 posts
  • Raghu Angadi: 9 posts
  • Ryan LeCompte: 9 posts
  • Sandeep Dhawan, Noida: 9 posts
  • Amareshwari Sriramadasu: 8 posts
  • Doug Cutting: 8 posts
  • Alex Loddengaard: 7 posts
  • Saptarshi Guha: 7 posts
  • Tim robertson: 7 posts
  • Delip Rao: 6 posts
  • Elangovan anbalahan: 6 posts
  • Jason Venner: 6 posts