-
I am attempting to speed up a mapping process whose input is GZIP-compressed CSV files. The files range from 1 to 2 GB. I am running on a cluster where each node has a total of 32GB memory available to ...
Hans Uhlig
Mar 11, 2012 at 4:00 am
Mar 13, 2012 at 2:04 am -
Hello, I have a couple of questions regarding mapreduce configurations. We install various platforms on data nodes that require a mixed set of native libraries. Part of the problem is that in general ...
Dmitriy Lyubimov
Mar 28, 2012 at 1:43 am
Mar 30, 2012 at 6:00 pm -
Hi all, We tried using mapreduce to execute simple map code which reads a txt file stored in HDFS and then writes the output. The file to read is a very small one. It was not split and written ...
Hassen Riahi
Mar 3, 2012 at 11:53 pm
Mar 4, 2012 at 8:32 pm -
Hi, I am quite new to Hadoop and to Java as well, and I have two questions. Question 1: I have an HDFS directory which contains the output files of the reducer. I want to read all the part-r-* files present ...
Piyush Kansal
Mar 5, 2012 at 9:47 am
Mar 16, 2012 at 7:37 am -
Hi, I've been trying to debug map and reduce tasks for quite a long time, and it seems that it's impossible. MR tasks are launched in a new process and there's no way to debug them. Even with IsolationRunner class ...
Pedro Costa
Mar 29, 2012 at 3:34 pm
Mar 29, 2012 at 4:33 pm -
The ReduceTask can save the file using several output formats: InternalFileOutputFormant, SequenceFileOutputFormat, TeraOutputFormat, etc. How can I read the keys and the values from the output file? ...
Pedro Costa
Mar 30, 2012 at 5:20 pm
Apr 2, 2012 at 2:41 am -
Hi all, I'd like to log and monitor average job completion times on hadoop. (For example, wall time from the perspective of the JobTracker) What might be the best way to do this? I see a lot of ...
Bharath Ravi
Mar 4, 2012 at 10:39 pm
Mar 6, 2012 at 4:22 am -
Hi, Is there any way to ensure the execution of a map on all nodes of a cluster in such a way that each node runs the map once and only once? That is, I would use Hadoop to execute a method on all nodes in ...
Luiz Carlos Muniz
Mar 29, 2012 at 12:25 am
Apr 3, 2012 at 11:07 am -
We have a weird performance problem with a hadoop job on our cluster. We have a 32-node experimental cluster of blades (2 hex-core), one dedicated job tracker, one dedicated namenode, with ...
GUOJUN Zhu
Mar 16, 2012 at 9:23 pm
Mar 19, 2012 at 1:57 pm -
I'm trying to write a Reducer which will eliminate duplicates from the list of values before writing them out. I have the following code for my Reducer: /*****************/ public class ...
Steven Willis
Mar 14, 2012 at 9:35 pm
Mar 16, 2012 at 2:54 pm -
I am using the hadoop 0.20.2 mapreduce API. The program is running fine, just slower than it could be. I sum values and then use job.setSortComparatorClass(LongWritable.DecreasingComparator.class) to sort ...
Henry Helgen
Mar 8, 2012 at 11:02 pm
Mar 9, 2012 at 9:42 pm -
Hi all, we are looking for a way to run map-reduce on *non-closed files*. We are currently able to run hadoop fs -cat <non-closed-file>. *Non-closed files* are files that are currently being written, and ...
Niv Mizrahi
Mar 4, 2012 at 2:37 pm
Mar 5, 2012 at 9:29 pm -
Hi All, I have a file in HDFS spanning many blocks. Say the file has many words in it: W1, W2, W3 ... Wn. I want to find the edit distance between all pairs of words. Is this possible ...
Praveen Kumar K J V S
Mar 28, 2012 at 2:13 am
Apr 3, 2012 at 12:25 pm -
Ashish vyas
Mar 30, 2012 at 8:30 am
Mar 30, 2012 at 9:56 am -
Hi All, I'm using Hadoop-0.20-append. The cluster contains 3 nodes, and for each node I have 14 map and 14 reduce slots. Here is the configuration: <property> <name> ...
WangRamon
Mar 10, 2012 at 10:40 am
Mar 12, 2012 at 8:05 am -
Hi, I tried to configure hadoop 0.23.1. I added all libs from the share folder to the lib directory, but I still get this error while formatting the namenode: Exception in thread "main" ...
Raghavendhra rahul
Mar 1, 2012 at 9:49 am
Mar 2, 2012 at 3:35 am -
Hi, I don't see the mapred.max.map.failures.percent property in the mapred-default.xml conf of hadoop version 0.20.1. Has it been removed? Is there any alternate property corresponding to it? -Ajit
Ajit Ratnaparkhi
Mar 29, 2012 at 11:53 am
Mar 31, 2012 at 12:09 pm -
All, sorry to bother you. As a new user, it seems that I cannot post anything. I tried twice yesterday, but I didn't receive my own post... Can anyone enlighten me? Thanks
Fang Xin
Mar 31, 2012 at 4:11 am
Mar 31, 2012 at 4:16 am -
Hi All, Can't we use special characters to get paths in HDFS using the Hadoop API? E.g. Path path = new Path("/data/input/20120321*"); I have files /data/input/20120321000000/a.txt , ...
Thamizhannal Paramasivam
Mar 21, 2012 at 10:31 am
Mar 22, 2012 at 7:14 am -
I am trying to write a map-reduce application in which the mapper function is aware of the input HDFS filename of its data split. Does anyone know how to do that?
Qu Chen
Mar 17, 2012 at 12:35 pm
Mar 17, 2012 at 12:57 pm -
Hi, In MapReduce, if the locations of the split are in {HostA, HostB, HostC}, and the respective map task runs on HostB, will the map task pick up the split from HostB? Who is responsible to ...
Pedro Costa
Mar 7, 2012 at 1:58 pm
Mar 7, 2012 at 10:58 pm -
I switched to the new mapreduce API. I need a replacement for job.getNumMapTasks() in the job driver.
Radim Kolar
Mar 4, 2012 at 4:25 pm
Mar 4, 2012 at 7:58 pm -
Hi all, The FileOutputFormat/FileOutputCommitter always treats an output path as a directory and writes files under it, even if there is only one Reducer. Is there any way to configure an OutputFormat ...
Jianhui Zhang
Mar 3, 2012 at 12:39 am
Mar 4, 2012 at 4:28 pm -
Hello Folks, Are there any pointers to such comparisons between Apache Pig and Hadoop Streaming Map Reduce jobs? Also there was a claim in our company that Pig performs better than Map Reduce jobs? ...
Subir S
Mar 2, 2012 at 4:48 am
Mar 2, 2012 at 7:08 am -
Hi, I have 5 dependent jobs. I'm running them with jobcontrol, and jobs 2 and 3 run at the same time (no dependency between them). Each job produces several pieces of information that are the input for the ...
Cornelio Iñigo
Mar 16, 2012 at 4:59 pm
Apr 2, 2012 at 7:27 am -
Hi, When I tried to run the randomwriter example under the capacity scheduler, it worked fine. But when I run the distributed shell example under the capacity scheduler, it shows the following exception. RemoteTrace ...
Raghavendhra rahul
Mar 27, 2012 at 9:23 am
Apr 2, 2012 at 3:33 am -
Hi all: I'm new to mapreduce, but familiar with the Collaborative Filtering recommendation framework. I tried to use mahout to do this work, but it disappointed me. My machine worked all day to do this job ...
Chao yin
Mar 31, 2012 at 11:17 am
Mar 31, 2012 at 4:51 pm -
Hi All, I just moved from Matlab to Hadoop. Can anyone kindly give me advice on how to deal with matrices? Maybe a starting point would be to calculate some stats for each column. What could a mapper and ...
Fang Xin
Mar 30, 2012 at 6:07 pm
Mar 31, 2012 at 11:04 am -
Hi All, I just moved from Matlab to Hadoop. Can anyone kindly give me advice on how to deal with matrices easily in Hadoop? Maybe a starting example would be to calculate some stats for each column. How ...
Fang Xin
Mar 30, 2012 at 6:15 pm
Mar 31, 2012 at 5:08 am -
Hello, I'm interested in writing a library, to be used with Node.js, that can ask the JobTracker for information about jobs. I see that this is possible using the Java API, with the JobClient ...
Ryan Cole
Mar 29, 2012 at 1:08 am
Mar 29, 2012 at 3:32 pm -
The current implementations of MapWritable and AbstractMapWritable do not track class usage. The class name is still serialized in write() to disk even if no instance of that class exists in the stored table ...
Radim Kolar
Mar 25, 2012 at 10:16 am
Mar 25, 2012 at 4:19 pm -
I have a mapper-only job, with the number of reducers set to 0. It's hadoop 0.22 and the output from the job is this: 2012-03-24 18:24:22,117 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop ...
Radim Kolar
Mar 24, 2012 at 5:32 pm
Mar 24, 2012 at 5:40 pm -
I want to add new MR applications into hadoop-0.20.2-examples.jar. How do I do that? I have set up Hadoop 0.20.2 development in Eclipse.
Qu Chen
Mar 19, 2012 at 7:05 pm
Mar 19, 2012 at 9:49 pm -
Hi, Is there any way I can dump the output of my Ruby Map Reduce jobs into HBase directly? In other words, does Hadoop Streaming with Ruby integrate with HBase, the way Pig has HBaseStorage etc.? Thanks ...
Subir S
Mar 16, 2012 at 8:05 am
Mar 17, 2012 at 1:39 pm -
Hi people, I would like to ask something a bit more high-level than programming for Hadoop. I will have some students working with Hive, Pig or HBase (I don't know which of them yet) and I would like ...
Luiz Antonio Falaguasta Barbosa
Mar 14, 2012 at 7:42 pm
Mar 15, 2012 at 3:51 pm -
Hi all, I am new to Hadoop and have just started coding in MapReduce. I've checked out the trunk and am able to build the MapReduce project. I have also imported the code into Eclipse. My very first goal is to ...
Viney Gupta
Mar 8, 2012 at 9:28 am
Mar 8, 2012 at 10:16 am -
Dear all, I'm sorry for my poor English. I'm confused about something. Recently, I read a paper named "TeraByte Sort on Apache Hadoop", which was written by Owen O'Malley, and I saw the graph "Running ...
张建
Mar 5, 2012 at 6:42 am
Mar 5, 2012 at 9:55 am -
Hi, I would like to chain multiple reducers without an intervening map phase. I looked at ChainReducer but it seems to support chains of the form M+RM*. What I am looking for is M+R*. What would be ...
IGZ Nick
Mar 4, 2012 at 2:23 pm
Mar 4, 2012 at 3:01 pm -
Hi all, Consider a hadoop cluster having 4 nodes, where on every node the maximum number of reduce slots is fixed at 5. When the mapreduce daemons are started, 1) Is there any restriction on the number of simultaneously ...
Vamshi Krishna
Mar 2, 2012 at 10:10 am
Mar 2, 2012 at 10:43 am -
Hi, In many Hadoop production environments you get gzipped files as the raw input. Usually these are Apache HTTPD logfiles. When putting these gzipped files into Hadoop you are stuck with exactly 1 ...
Niels Basjes
Mar 30, 2012 at 2:07 pm
Mar 30, 2012 at 2:07 pm -
Hi All, I am running a Map reduce program that scans HBase and takes the required data from it. Both HBase and the MR program run in the same hadoop cluster. In this case after running 950 ...
V, Sriram
Mar 26, 2012 at 11:55 pm
Mar 26, 2012 at 11:55 pm -
Hi All, I noticed something strange in my Fair Share Scheduler monitor GUI: the sum of the Fair Share values is always about 30 even when only one M/R job is running, so I don't know ...
WangRamon
Mar 21, 2012 at 1:20 am
Mar 21, 2012 at 1:20 am -
You are using 0.20.203 but the documentation you are looking at is for 0.21. The MultipleOutputs and MultipleInputs were ported to the mapreduce context objects API in release 0.21. ...
Henry Helgen
Mar 18, 2012 at 3:43 am
Mar 18, 2012 at 3:43 am -
Hi, We are trying to execute a mapper that makes random accesses while writing files. It seems that HDFS supports random seek only during read and not during write (nor does it support file modification). Is it ...
Hassen Riahi
Mar 17, 2012 at 2:09 pm
Mar 17, 2012 at 2:09 pm -
Hello, Apache MRUnit 0.8.1-incubating has been released and there is a blog post describing the major changes: https://blogs.apache.org/mrunit/entry/apache_mrunit_0_8_1 Chief among the new features ...
Brock Noland
Mar 16, 2012 at 4:39 pm
Mar 16, 2012 at 4:39 pm -
(Replying to my old email sent on 1/31/2012) https://issues.apache.org/jira/browse/MAPREDUCE-4003 was opened for this issue. Uploaded a silly patch. I hope someone can pick it up from there. Koji
Koji Noguchi
Mar 15, 2012 at 11:23 pm
Mar 15, 2012 at 11:23 pm -
Hi, Are there examples of serializing an array of type float to and from C++ pipes? From the examples, I assume float floatArrayOut[arrayBytes/sizeof(float)]; // assignment of floatArrayOut entries... char ...
Charles Earl
Mar 13, 2012 at 12:17 pm
Mar 13, 2012 at 12:17 pm -
The Apache MRUnit team is pleased to announce the release of MRUnit 0.8.1-incubating from the Apache Incubator. This is the third release of Apache MRUnit, a Java library that helps developers unit ...
Brock Noland
Mar 11, 2012 at 9:10 pm
Mar 11, 2012 at 9:10 pm -
Hi, I'd like to introduce you to Pangool <http://pangool.net/>, an easier low-level MapReduce API for Hadoop. I'm one of the developers. We just open-sourced it yesterday. Pangool is a Java, low-level ...
Pere Ferrera
Mar 6, 2012 at 10:36 am
Mar 6, 2012 at 10:36 am -
Hi, I have the following issue in Hadoop 0.20.2. When I try to use inheritance with WritableComparables, the job fails. Example: if I create a base writable called shape, public abstract class ...
Madhu phatak
Mar 5, 2012 at 5:33 am
Mar 5, 2012 at 5:33 am
Group Overview
group | mapreduce-user |
categories | hadoop |
discussions | 53 |
posts | 164 |
users | 67 |
website | hadoop.apache.org... |
irc | #hadoop |
67 users for March 2012