-
Hi, we are considering using MapReduce for a project. I am participating in an "investigation" phase where we try to determine whether we would benefit from using the MapReduce framework. A little bit about ...
Per Steffensen
Aug 26, 2011 at 11:14 am
Aug 30, 2011 at 6:42 am -
Hi, can a reducer get parameters from the mapper besides the key and value, such as files, lists, string[], etc.? Thanks -- Regards! Jun Tan
谭军
Aug 9, 2011 at 6:29 am
Aug 14, 2011 at 1:26 pm -
Using 0.20.2... Is it possible to override the context.write() method in ReduceContext? I have an entire set of Reducers that I would like to all use a specific function just before every ...
Ross Nordeen
Aug 12, 2011 at 9:06 pm
Aug 15, 2011 at 4:03 pm -
Hi, I have 2 computers, each with a dual-core CPU. The 2 computers are datanodes and another computer is the namenode. Now, I want to increase the number of datanodes but no more computers ...
谭军
Aug 8, 2011 at 8:47 am
Aug 9, 2011 at 4:12 am -
Hello all, I have a question regarding the mappers. I can see from the logs that the start time of the mapper is different from start time of logging. I am having a problem because that time ...
Iman E
Aug 5, 2011 at 6:02 pm
Aug 5, 2011 at 8:51 pm -
Hello, I am trying to learn Hadoop and am doing a project on it. I need to update some files in my project and hence wanted to use version 0.21.0. However, I am confused as to how I can compile my ...
Arko Provo Mukherjee
Aug 31, 2011 at 9:35 pm
Sep 2, 2011 at 5:36 am -
I have just started to explore Hadoop but I am stuck in a situation now. I want to run a MapReduce job in hadoop which needs to create a "setup" folder in working directory. During the execution the ...
Shrish Bajpai
Aug 2, 2011 at 12:50 am
Aug 2, 2011 at 10:39 am -
Hi, I wrote a custom InputFormat for parsing through the Enron Email corpus which is attached in the file named EmailInputFormat I have attached the code in a text file with the sample input mail ...
Varad Meru
Aug 29, 2011 at 10:01 am
Aug 30, 2011 at 7:19 am -
Hi, I got the error information below. What's wrong? What I/O exception? [email protected]:/home/cx/software/Hadoop/hadoop-0.20.2# bin/hadoop jar Retrieval.jar Retrieval protein/b.txt protein/a.txt ...
谭军
Aug 11, 2011 at 10:41 am
Aug 12, 2011 at 8:26 am -
Hi all, I'm using Avro as a serialization format and assume I have a generated specific class FOO that I use as a Mapper output format: class FOO { int a; List<BAR> barList; } where BAR is another ...
Vyacheslav Zholudev
Aug 3, 2011 at 7:19 am
Aug 4, 2011 at 9:16 pm -
Hi, I am using apache-maven-3.0.3 and I have set LD_LIBRARY_PATH=/usr/local/lib, which contains the Google protobuf library. I am getting the following error while building hadoop-yarn using mvn clean install ...
Rajesh putta
Aug 19, 2011 at 6:26 am
Aug 29, 2011 at 6:41 am -
I am working on a large application where I need to pass a big array of intwritables from mapper to reducer. Obviously I want to use combiner and I am extending ArrayWritable. Here is the signature ...
Vipul sharma
Aug 18, 2011 at 8:44 pm
Aug 19, 2011 at 5:03 am -
Hi, I'm a newbie to Hadoop and Map/Reduce applications. I have set up a cluster and am just running the example map/reduce applications which come with the Hadoop source code. I want to run some ...
M S Vishwanath Bhat
Aug 11, 2011 at 7:00 pm
Aug 17, 2011 at 4:10 am -
I understand that we can decide which task is run by which reducer in Hadoop by using a custom partitioner, but is there any way to decide which reducer runs on which machine? Suhendry Effendy
Suhendry Effendy
Aug 5, 2011 at 5:25 am
Aug 5, 2011 at 7:15 pm -
Hi all, I'm trying to create M/R tasks that will output more than one "type" of data. Ideal thing would be MultipleOutputs feature of Map Reduce, but in our current production version, CDH3 ( 0.20.2 ...
Vanja Komadinovic
Aug 1, 2011 at 10:21 pm
Aug 4, 2011 at 5:32 pm -
Hi, So I just started using capacity scheduler for M/R jobs. I have 4 task trackers each with 4 map/reduce slots. Configured a queue so that it uses 25% (4 slots) of the available slots. I was ...
Sulabh Choudhury
Aug 23, 2011 at 10:52 pm
Aug 23, 2011 at 11:12 pm -
Hi all, I am having issues using SequenceFileInputFormat to retrieve whole records I have 1 job that is used to write to a SequenceFile SequenceFileOutputFormat.setOutputPath(job, new ...
Tim Fletcher
Aug 19, 2011 at 12:32 pm
Aug 19, 2011 at 2:20 pm -
Hi all, I have been working on Hadoop jobs which write output into multiple files. In the Hadoop API I have found the class MultipleOutputs, which implements this functionality. My use case is to change ...
Dino Kečo
Aug 18, 2011 at 8:31 am
Aug 18, 2011 at 10:54 am -
Hi, I have multiple Hadoop jobs that use the avro mapred API. Only in one of the jobs do I have a visible mismatch between the number of map output records and reducer input records. Does anybody ...
Vyacheslav Zholudev
Aug 16, 2011 at 6:38 pm
Aug 16, 2011 at 7:57 pm -
Hi, I'm trying without success to execute a shell script that is inside HDFS. Is this possible? If it is, how can I do this? Thanks in advance. Carlos.
Kadu canGica Eduardo
Aug 15, 2011 at 9:51 pm
Aug 16, 2011 at 7:00 pm -
Hi, we're running an 8-node Hadoop cluster with CDH2. Recently, our monitoring tools caught warnings like this one when fsck'ing the HDFS: /tmp/hadoop-tgp/mapred/system/job_201105191458_1857/job.jar: ...
Christoph Schmitz
Aug 15, 2011 at 1:40 pm
Aug 16, 2011 at 6:39 am -
Hi, I want to define a matrix or list that all mappers share. So that all mappers can do operations on it. Can I make it? Thanks! -- Regards! Jun Tan
谭军
Aug 15, 2011 at 9:15 am
Aug 15, 2011 at 11:29 am -
Hi, I am working on a project, which requires multiple input formats and multiple output formats. Basically, I store some sales rank data to a Cassandra cluster and I get a sales rank update file ...
Jian Fang
Aug 10, 2011 at 4:09 pm
Aug 10, 2011 at 4:26 pm -
I want to calculate some statistics on a per document basis, and it seems like the only way to do this would be to emit a compound key of (key,documentname). 1) Is this the case, or is there a better ...
Jonathan Coveney
Aug 9, 2011 at 9:12 pm
Aug 10, 2011 at 12:25 am -
Hi All, I need help in setting up the Next Gen Mapreduce. Please provide links to documents/Guide if any to start setting up the Next Gen MR. Thanks and Regards, Vinayakumar B ...
Vinayakumar B
Aug 2, 2011 at 10:16 am
Aug 2, 2011 at 3:13 pm -
Hi. I'm trying to tweak heap sizes for the Hadoop daemons, i.e. namenode/datanode and jobtracker/tasktracker. I've tried setting HADOOP_NAMENODE_HEAPSIZE, HADOOP_DATANODE_HEAPSIZE, and so on in ...
Kai Ju Liu
Aug 1, 2011 at 6:11 pm
Aug 2, 2011 at 4:39 am -
Hello! I would like to know how Hadoop is computing the number of mappers when CombineFileInputFormat is used? I have read the API specification for CombineFileInputFormat ...
Florin P
Aug 12, 2011 at 6:36 am
Sep 21, 2011 at 8:00 pm -
Hi, I see in the code that while we assign a number of map tasks, we assign only one reduce task per tasktracker during the heartbeat. Is there a brief somewhere on why this design decision is made ? ...
Sudharsan Sampath
Aug 24, 2011 at 10:49 am
Aug 24, 2011 at 1:50 pm -
Hello Folks, I needed some help with using MultipleTextOutputFormat to control the output filename in MapReduce. Currently I am using it as shown below(or at http://pastebin.com/gJxkdwRd). And it ...
Decimus Phostle
Aug 8, 2011 at 4:07 pm
Aug 23, 2011 at 11:20 pm -
Hi, If the task jvm is set to be re-used with a -1 option, when does the jvms exit? completes. Is that right? Thanks Sudharsan S
Sudharsan Sampath
Aug 23, 2011 at 6:10 am
Aug 23, 2011 at 6:35 am -
Please check the defect in MAPREDUCE jira https://issues.apache.org/jira/browse/MAPREDUCE-2264 This is because the compression is enabled for map outputs and statistics are taken on compressed data ...
Vinayakumar B
Aug 19, 2011 at 8:17 am
Aug 19, 2011 at 3:05 pm -
Hi, cluster details: hbase 0.90.2. 10 machines. 1GB switch. use-case M/R job that inserts about 10 million rows to hbase in the reducer, followed by M/R that works with hdfs files. When the first job ...
Lior Schachter
Aug 14, 2011 at 4:32 pm
Aug 14, 2011 at 4:37 pm -
Hi all, Is anybody familiar with how to define a custom composite key in a format such as (Text, Text) so your context written as a key and value could be ((Text, Text) LongWritable)? Thanks, -- ...
Roger Chen
Aug 12, 2011 at 10:50 pm
Aug 12, 2011 at 11:23 pm -
Is it possible to set the MS Windows PATH environment variable for Mapper classes? I have maps that depend on JNI and I need to set the PATH so that the .dlls can be found. The PATH can't be ...
Curtis Jensen
Aug 8, 2011 at 9:05 pm
Aug 8, 2011 at 9:14 pm -
Hi, I just want 2 mappers created to do my job for there are only 2 data nodes. I think that is the most efficient. But how can I know how many mappers have been created while doing my job? Can I do ...
谭军
Aug 7, 2011 at 6:46 am
Aug 8, 2011 at 4:33 am -
Hello everyone, I checked out the branch-0.22 source code from Apache. I can compile the common and hdfs code successfully, but I got an exception when compiling branch-0.22/mapreduce, as follows. I don't ...
周俊清
Aug 2, 2011 at 2:15 am
Aug 2, 2011 at 4:12 am -
Hi all, the Hadoop MapReduce version I use is 0.20.0. When I ran a job today, something like this happened: maybe it's normal from this view, but the job kept this status for hours. It didn't ...
Zhengrui.m
Aug 30, 2011 at 9:47 am
Aug 30, 2011 at 9:47 am -
Hi, I have an already running system where I define a simple data flow (using a simple custom data flow language) and configure jobs to run against stored data. I use quartz to schedule and run these ...
Tharindu Mathew
Aug 29, 2011 at 7:47 pm
Aug 29, 2011 at 7:47 pm -
unsubscribe
Yue Wang
Aug 24, 2011 at 1:18 pm
Aug 24, 2011 at 1:18 pm -
Moving to mapreduce-user@, bcc common-user@ Why would you want to do that? Typically, you want the JT to retry the failed tasks as quickly as possible to fail the job rather than try all tasks and ...
Arun C Murthy
Aug 23, 2011 at 2:53 pm
Aug 23, 2011 at 2:53 pm -
Varad Meru
Aug 22, 2011 at 2:07 am
Aug 22, 2011 at 2:07 am -
Moving to mapreduce-user@, bcc common-user@ Not for the same user - the CS tries to get jobs done as quickly as possible and thus it won't share resources for the same user. However, you can submit ...
Arun C Murthy
Aug 17, 2011 at 6:07 pm
Aug 17, 2011 at 6:07 pm -
Hi, I have some junit tests for a MapReduce job that run fine from my IDE, but fail with the following error when using Maven (mvn test): 2011-08-17 15:08:49,191 [main] INFO ...
Cristina Ionescu
Aug 17, 2011 at 1:05 pm
Aug 17, 2011 at 1:05 pm -
Yes, just restart the TaskTracker. There's no need to restart your JobTracker, so a rolling TT reconfigure+restart should get you going. P.s. general@ is for project wide general discussions, not ...
Harsh J
Aug 16, 2011 at 5:26 am
Aug 16, 2011 at 5:26 am -
Hello, I am running gridmix2, and get an error due to missing GridMixRunner Exception in thread "main" java.lang.ClassNotFoundException: org.apache.hadoop.mapred.GridMixRunner I looked in all jars ...
Keren Ouaknine
Aug 15, 2011 at 1:49 pm
Aug 15, 2011 at 1:49 pm -
Hi, we have a Hadoop (hadoop-0.20.1) cluster of 14 nodes, and some jobs execute on this cluster daily. Recently we faced an issue in which the jobtracker loses track of all tasktrackers and as a result job ...
Ajit Ratnaparkhi
Aug 15, 2011 at 11:14 am
Aug 15, 2011 at 11:14 am -
Hi, I am using LZO to compress my intermediate map outputs. These are the settings: mapred.map.output.compression.codec = com.hadoop.compression.lzo.LzoCodec pig.tmpfilecompression.codec = lzo But I ...
Rakesh kothari
Aug 15, 2011 at 6:49 am
Aug 15, 2011 at 6:49 am -
I am using hadoop 0.20.2 on CDH. I am trying to get the filename of the file currently being processed. I will extract some information from the filename which will determine the data processing to ...
Vegar Hatlevik
Aug 12, 2011 at 9:26 am
Aug 12, 2011 at 9:26 am -
Gabriel B M Armelin
Aug 12, 2011 at 12:51 am
Aug 12, 2011 at 12:51 am -
Hi, I want to write a program to achieve secondary retrieval, but don't know how to do it. I don't know how to express myself, so the source code below may help. I don't know whether my first retrieval ...
谭军
Aug 8, 2011 at 2:59 pm
Aug 8, 2011 at 2:59 pm
Group Overview
group | mapreduce-user |
categories | hadoop |
discussions | 55 |
posts | 168 |
users | 72 |
website | hadoop.apache.org... |
irc | #hadoop |
72 users for August 2011