

148 discussions - 577 posts

  • What would be a good hard drive for a 7 node cluster which is targeted to run a mix of IO and CPU intensive Hadoop workloads? We are looking for around 1 TB of storage on each node distributed ...
    Shrinivas Joshi
    Feb 10, 2011 at 8:27 pm
    Feb 19, 2011 at 1:14 am
  • Hello, I'm seeing frequent failures in reduce jobs with errors similar to this: 2011-02-15 15:21:10,163 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201102081823_0175_m_002153_0, compressed ...
    Kelly Burkhart
    Feb 16, 2011 at 3:00 pm
    Feb 17, 2011 at 3:23 am
  • Dear all, I am going to work on a project that includes "Working on CUDA in Hadoop Environment". I have worked on Hadoop platforms (Hadoop, Hive, HBase, MapReduce) for the past 8 months. If anyone has ...
    Adarsh Sharma
    Feb 9, 2011 at 1:04 pm
    Feb 10, 2011 at 5:39 pm
  • Hi, I have a 10-node Hadoop cluster, where I am running some benchmarks for experiments. Surprisingly, when I initialize the Hadoop cluster (hadoop/bin/start-mapred.sh), in many instances, only some ...
    Bikash sharma
    Feb 26, 2011 at 3:26 pm
    Mar 12, 2011 at 4:31 pm
  • Hi All, I have run the hadoop 0.20 append branch. Can someone please clarify the following behavior? A writer is writing a file but has not flushed the data or closed the file. Could a parallel ...
    Gokulakannan M
    Feb 10, 2011 at 3:12 pm
    Feb 15, 2011 at 6:29 am
  • I have been trying to understand how to write a simple custom writable class and I find the documentation available very vague and unclear about certain things. Okay, so here is the sample writable ...
    Adeel Qureshi
    Feb 2, 2011 at 5:22 pm
    Feb 2, 2011 at 8:58 pm
  • Hi, I ran my cluster this way: calling a command in the shell directly. String runInCommand ="/opt/hadoop-0.21.0/bin/hadoop jar testCluster.jar example"; Process proc = ...
    Jun Young Kim
    Feb 24, 2011 at 6:57 am
    Feb 26, 2011 at 7:15 am
  • Hi all, I want to check if the following statement is right: If I use TextInputFormat to process a text file with 2000 lines (each ending with \n) with 20 mappers, then each map will have a sequence ...
    Maha
    Feb 18, 2011 at 7:14 pm
    Feb 21, 2011 at 4:49 pm
  • Hi, is it accurate to say that - In 0.20 the Secondary NameNode acts as a cold spare; it can be used to recreate the HDFS if the Primary NameNode fails, but with the delay of minutes if not hours, ...
    Mark Kerzner
    Feb 14, 2011 at 4:48 pm
    Feb 16, 2011 at 7:31 pm
  • I have two Hadoop instances running on one cluster of machines for the purpose of upgrading. I'm trying to copy all the files from the old instance to the new one but have been having trouble with ...
    Korb, Michael [USA]
    Feb 8, 2011 at 5:06 pm
    Feb 8, 2011 at 10:08 pm
  • I am running two instances of Hadoop on a cluster and want to copy all the data from hadoop1 to the updated hadoop2. From hadoop2, I am running the command "hadoop distcp -update ...
    Korb, Michael [USA]
    Feb 7, 2011 at 3:52 pm
    Feb 8, 2011 at 2:05 am
  • Can Hadoop be used for Real time Applications such as banking solutions... -- With Regards, Karthik
    Karthik Kumar
    Feb 17, 2011 at 4:18 am
    Feb 18, 2011 at 3:53 am
  • Greetings all, I'm teaching an undergraduate Computer Science class that is using Hadoop quite heavily, and would like to include some case studies at various points during this semester. We are ...
    Ted Pedersen
    Feb 28, 2011 at 12:34 am
    Mar 2, 2011 at 8:16 pm
  • Hi, I am working on an open-source project that would be using Hadoop/HDFS/HBase/Tika/Lucene and would make all files on a hard drive searchable. Like Nutch, only applied to hard drives, and like ...
    Mark Kerzner
    Feb 28, 2011 at 9:02 pm
    Mar 1, 2011 at 2:58 pm
  • Hi, in an application, I read many files in many directories. Additionally, by using the MultipleOutputs class, I try to write thousands of output files in many directories. During reduce ...
    Jun Young Kim
    Feb 21, 2011 at 1:20 am
    Feb 22, 2011 at 2:19 am
  • I've seen this asked before, but haven't seen a response yet. If the input to a streaming job is not actual data splits but simple HDFS file names which are then read by the mappers, then how can ...
    Keith Wiley
    Feb 3, 2011 at 5:17 pm
    Feb 7, 2011 at 9:35 pm
  • I had started a thread recently to ask questions about custom writable implementations which is basically similar to this .. but that was more of an understanding of the concept and here I wanted to ...
    Adeel Qureshi
    Feb 3, 2011 at 10:46 pm
    Feb 7, 2011 at 2:16 pm
  • Hi, I am planning to make a search engine for news articles. It will probably have several billion news articles, so I thought HBase is the way to go. However, I am very new to CGI. All I ...
    Edward choi
    Feb 28, 2011 at 1:38 pm
    Mar 1, 2011 at 8:29 pm
  • Dear all, I want to set some extra jars in java.library.path, used while running a map-reduce program in a Hadoop cluster. I got an exception entitled "no jcuda in java.library.path" in each map task. I ...
    Adarsh Sharma
    Feb 28, 2011 at 10:28 am
    Mar 1, 2011 at 5:34 am
  • Which workloads are used for serious benchmarking of Hadoop clusters? Do you care about any of the following workloads: TeraSort, GridMix v1, v2, or v3, MalStone, CloudBurst, MRBench, NNBench, ...
    Shrinivas Joshi
    Feb 18, 2011 at 9:32 pm
    Feb 22, 2011 at 1:30 pm
  • Please keep user questions off of general and use the user lists instead. This is defined here <http://hadoop.apache.org/mailing_lists.html>. MRUnit is for testing user's MapReduce applications. ...
    Owen O'Malley
    Feb 2, 2011 at 4:47 pm
    Feb 10, 2011 at 10:25 pm
  • Hi, I have a script that I use to re-package all the jars (which are output in a dist directory by NetBeans) - and it structures everything correctly into a single jar for running a MapReduce job. ...
    Mark Kerzner
    Feb 18, 2011 at 10:18 pm
    Feb 21, 2011 at 2:28 am
  • Hi all, I wanted to know if the Map/Reduce (Mapper and Reducer) code incurs any fixed cost of bytecode execution. And what do the mappers (say of WordCount MR) look like in detail (in bytecode detail) ...
    Matthew John
    Feb 17, 2011 at 5:49 am
    Feb 18, 2011 at 3:43 am
  • Hello everybody, I am experiencing a weird issue: I have written a small hadoop program and I am launching it using this https://gist.github.com/808297 JobDriver. Strangely InvIndexReducer is never ...
    Marco Didonna
    Feb 2, 2011 at 8:23 pm
    Feb 3, 2011 at 4:00 pm
  • I am running basic hadoop examples on amazon emr and I am stuck at a very simple place. I am apparently not passing the right "classname" for inputFormat. I am running a simple sort ...
    Shivani Rao
    Feb 27, 2011 at 1:05 pm
    Mar 3, 2011 at 6:21 pm
  • Greetings to all, today I came across a strange problem with non-root users in Linux (CentOS). I am able to compile and run a Java program properly through the commands below: [root@cuda1 ...
    Adarsh Sharma
    Feb 28, 2011 at 6:31 am
    Mar 1, 2011 at 5:03 am
  • Hey all, I want to consult with you hadoopers about a Map/Reduce application I want to build. I want to build a map/reduce job that reads files from HDFS, performs some sort of transformation on the ...
    Guy Doulberg
    Feb 16, 2011 at 3:08 pm
    Feb 18, 2011 at 7:28 am
  • Hi, I would appreciate it if you could give me your thoughts on whether there is an effect on efficiency if: 1) Mappers were per line in a document or 2) Mappers were per block of lines in a document. I know ...
    Maha
    Feb 7, 2011 at 11:38 pm
    Feb 8, 2011 at 5:20 am
  • I would really appreciate any help people can offer on the following matters. When running a streaming job, -D, -files, -libjars, and -archives don't seem to work, but -jobconf, -file, -cacheFile, and ...
    Keith Wiley
    Feb 2, 2011 at 7:40 am
    Feb 7, 2011 at 11:39 pm
  • Hi all, I got errors from HDFS. 2011-02-18 11:21:29[WARN ][DFSOutputStream.java]run()(519) : DataStreamer Exception: java.io.IOException: Unable to create new block. at ...
    Jun Young Kim
    Feb 18, 2011 at 2:31 am
    Apr 16, 2011 at 3:13 pm
  • Hi, guys: I checked out the source code from http://svn.apache.org/repos/asf/hadoop/mapreduce/trunk/. Then I compiled using this script: #!/bin/bash export JAVA_HOME=/usr/share/jdk1.6.0_14 export ...
    朱韬
    Feb 28, 2011 at 3:00 am
    Feb 28, 2011 at 3:35 pm
  • Is it possible to set up hadoop in eclipse in windows only for browsing code without using cygwin? I see some unix specific commands being executed in the eclipse target of the build.xml file (tr and ...
    Hari Sreekumar
    Feb 26, 2011 at 4:19 pm
    Feb 27, 2011 at 8:34 pm
  • Hi, I got the latest trunk out of svn, ran this command ant -Djavac.args="-Xlint -Xmaxwarns 1000" clean test tar and got the following error /hadoop-common-trunk/build.xml:950: 'forrest.home' is not ...
    Mark Kerzner
    Feb 17, 2011 at 11:37 pm
    Feb 26, 2011 at 2:01 am
  • Hello everyone, I'm using "Runtime.getRuntime().freeMemory()" to see the current memory available before and after creation of an object, but this doesn't seem to work well with Hadoop. Why? And is ...
    Maha
    Feb 24, 2011 at 2:32 am
    Feb 25, 2011 at 2:58 am
  • Hello everyone, does "spilled records" mean that the sort-buffer size is not enough to sort all the input records, hence some records are written to local disk? If so, I tried setting my ...
    Maha
    Feb 22, 2011 at 2:52 am
    Feb 23, 2011 at 2:57 am
  • What is the main use of org.apache.hadoop.io.ObjectWritable? Thank you :)
    Weishung Chung
    Feb 21, 2011 at 4:04 pm
    Feb 22, 2011 at 4:43 am
  • Tried a simple example job with Yahoo M45. The job fails for non-existence of a default queue. Output is attached as below. From the Apache hadoop mailing list, found this post (specific to M45), ...
    Shivani Rao
    Feb 10, 2011 at 5:13 am
    Feb 20, 2011 at 11:33 pm
  • I'm submitting jobs via JobClient.submitJob(JobConf), and then waiting until it completes with RunningJob.waitForCompletion(). I then want to get how long the entire MR takes, which appears to need ...
    Aaron Baff
    Feb 16, 2011 at 6:40 pm
    Feb 18, 2011 at 4:35 pm
  • Dear All, I am trying to set up Hadoop for multiple users in a class, on our cluster. For some reason I don't seem to get it right. If only one user is running it works great. I would want to have all ...
    Kumar, Amit H.
    Feb 9, 2011 at 7:13 pm
    Feb 10, 2011 at 6:02 pm
  • This should be a straightforward question, but better safe than sorry. I wanted to add a second name node directory (on an NFS as a backup), so now my hdfs-site.xml contains: <property> <name ...
    Mike anderson
    Feb 10, 2011 at 5:20 pm
    Feb 10, 2011 at 5:56 pm
  • Dear All, please help. I have tried to start the data nodes with ./start-all.sh on a 7 node cluster; however, I receive an incompatible namespace error when I try to put any file on the HDFS. I tried the ...
    Ahmednagy
    Feb 9, 2011 at 12:13 am
    Feb 9, 2011 at 7:03 pm
  • Hi All, I am working on a task where I have to determine the count in the sequence and increment by one. My input to the job is multiple files input/abc.csv input/xyz.csv So for example if my mapper ...
    ANKITBHATNAGAR
    Feb 5, 2011 at 12:04 pm
    Feb 5, 2011 at 8:32 pm
  • Hi, I have set up a Hadoop cluster as per the instructions for CDH3. When I try to start the datanode on the slave, I get this error, org.apache.hadoop.hdfs.server.datanode.DataNode: ...
    Danoomistmatiste
    Feb 1, 2011 at 5:19 am
    Feb 4, 2011 at 3:55 pm
  • Hi, Even though I'm running a very simple job of 4 small records (2 per map) and one reducer .. I still get all 9 records output of map spilled. But map-logs shows spilled to be ZERO ! 2011-02-26 ...
    Maha
    Feb 27, 2011 at 4:01 am
    Feb 27, 2011 at 4:53 am
  • Hi, I'm facing a problem: I'm not able to run the k-means clustering algorithm on a single node. So far I have only run the wordcount program. What are the steps for doing so?
    MANISH SINGLA
    Feb 26, 2011 at 7:29 am
    Feb 26, 2011 at 5:34 pm
  • Hello all, I have a few mapreduce jobs that I am calling from a java driver. The problem I am facing is that when there is an exception in a mapred job, the exception is not propagated to the client so ...
    Praveen Peddi
    Feb 25, 2011 at 4:01 pm
    Feb 25, 2011 at 8:18 pm
  • Hi, when packaging additional libraries for an MR job, I can use a script or a Maven Hadoop plugin, but what about the Hadoop libraries themselves? Should I package them in, or should I rely on those ...
    Mark Kerzner
    Feb 25, 2011 at 1:06 pm
    Feb 25, 2011 at 1:49 pm
  • I'm trying to use the fair scheduler. I have jobs written using the new api and hadoop 0.20.2. I've seen that to associate a job with a queue you have to do: JobConf.setQueueName("xxxx") The Job ...
    Marc Sturlese
    Feb 22, 2011 at 3:42 pm
    Feb 22, 2011 at 4:31 pm
  • Hi, I am using Hadoop 0.21.0. I am getting Exception as java.lang.Throwable: Child Error at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:249) Caused by: java.io.IOException: Task process ...
    Nitin Khandelwal
    Feb 16, 2011 at 2:00 pm
    Feb 21, 2011 at 6:20 am
  • Greetings, I recently had a power failure, resulting in all my servers shutting down. Everything appears to have recovered, but I am now unable to run fsck: $ hadoop fsck / Exception in thread "main" ...
    Christian Stucchio
    Feb 20, 2011 at 2:38 pm
    Feb 20, 2011 at 7:15 pm
Group Overview
group: common-user @
categories: hadoop
discussions: 148
posts: 577
users: 146
website: hadoop.apache.org...
irc: #hadoop

146 users for February 2011

  • Harsh J: 45 posts
  • Ted Dunning: 33 posts
  • Maha: 29 posts
  • JunYoung Kim: 27 posts
  • Madhu phatak: 17 posts
  • Mark Kerzner: 17 posts
  • James Seigel: 15 posts
  • Keith Wiley: 15 posts
  • Praveen Peddi: 12 posts
  • Adarsh Sharma: 12 posts
  • Adeel Qureshi: 12 posts
  • Konstantin Boudnik: 11 posts
  • Matthew John: 11 posts
  • Korb, Michael [USA]: 10 posts
  • Shrinivas Joshi: 10 posts
  • Ahmed Said Nagy: 9 posts
  • Michael Segel: 9 posts
  • Allen Wittenauer: 8 posts
  • Kelly Burkhart: 8 posts
  • Ted Yu: 8 posts