-
Hi, everybody. I'm running into some difficulties getting needed libraries to map/reduce tasks using the distributed cache. I'm using Hadoop 0.20.2, which from what I can tell is a hard requirement ...
John Armstrong
May 26, 2011 at 2:46 pm
Jun 1, 2011 at 8:06 pm -
Hi, I have my application installed on Tomcat and I wish to submit M/R jobs programmatically. Is there a standard way to do that? Thanks, Lior
Lior Schachter
May 18, 2011 at 2:59 pm
May 19, 2011 at 4:12 pm -
Hi, I just upgraded to hadoop-0.20.203.0, and am having a problem starting the datanode: # hadoop datanode Unrecognized option: -jvm Could not create the Java virtual machine. It looks like it has ...
Anh Nguyen
May 20, 2011 at 5:35 pm
May 20, 2011 at 11:00 pm -
Hi, I have a few input splits that are a few MB in size. I want to submit 1 GB of input to every mapper. How can I do it? Currently each mapper gets one input split that results in many small map-output ...
Mapred Learn
May 25, 2011 at 12:18 am
May 25, 2011 at 7:44 pm -
Hi all, There are lots of SequenceFiles in HDFS. How can I merge them into one SequenceFile? Thanks for your suggestions. -Lin
丛林
May 12, 2011 at 12:18 am
May 25, 2011 at 7:26 pm -
Hi, I'm running hadoop map-reduce in a cluster with 10 machines. I would like to set in the configuration that each tasktracker can run 8 map tasks simultaneously and 4 reduce tasks simultaneously. ...
Pedro Costa
May 23, 2011 at 4:46 pm
May 24, 2011 at 11:01 am -
Hi guys, I have a cluster of 16 machines running Hadoop. Now I want to run some benchmarks on this cluster with "nnbench" and "mrbench". I'm new to Hadoop and have no one to refer to. I ...
Stanley Shi
May 8, 2011 at 12:55 am
May 10, 2011 at 2:26 pm -
Hello Hadoop Users, I would like to know if anyone has ever tried splitting an input sequence file by key instead of by size. I know that this is unusual for the map reduce paradigm but I am in a ...
Vincent Xue
May 23, 2011 at 9:09 am
May 24, 2011 at 4:06 am -
Hi All, I'm trying to run the Hadoop (0.20.2) examples in Pseudo-Distributed Mode following the Hadoop user guide. After I run 'start-all.sh', it seems the namenode can't connect to the datanode. 'SSH ...
李赟
May 19, 2011 at 3:23 am
May 20, 2011 at 4:05 pm -
Hi, I'm a newbie with Hadoop/MapReduce and I have a problem. I set some variables in the run function, but when the map task runs, it can't get the values of these variables... If anyone knows the ...
Laurent Hatier
May 27, 2011 at 11:52 am
May 30, 2011 at 8:30 am -
Does anyone know how to save and retrieve an instance of a class using the Configuration class?
Michael Giannakopoulos
May 25, 2011 at 6:01 pm
May 26, 2011 at 4:11 pm -
Hello guys, I have written an application that downloads metadata from 3 groups of Flickr, and I implemented a map/reduce task so that the metadata is processed by 3 different mappers (each corresponds to ...
Michael Giannakopoulos
May 25, 2011 at 3:52 pm
May 25, 2011 at 4:20 pm -
In general, the Java interfaces say that one invocation of a combiner (technically, a Class<? extends Reducer>) can output multiple (key,value) pairs. So: What happens if one invocation of a combiner ...
Mike Spreitzer
May 23, 2011 at 6:33 pm
May 23, 2011 at 9:22 pm -
Dear all, We have a task to run a map-reduce job multiple times to do some machine learning calculation. We will first use a mapper to update the data iteratively, and then use the reducer to process ...
Stanley Xu
May 3, 2011 at 6:09 am
May 4, 2011 at 10:20 am -
any comments??? 2011/4/28 baran cakici <barancakici@gmail.com>
Baran cakici
May 2, 2011 at 11:38 am
May 4, 2011 at 9:21 am -
All, I have three questions I would appreciate if anyone could weigh in on. I apologise in advance if I sound whiny. 1. The namenode logs, when I view them from a browser, are displayed with the ...
Geoffry Roberts
May 3, 2011 at 5:22 pm
May 3, 2011 at 11:11 pm -
Hi, I was trying to create a test based on a mapreduce job in local mode, testing various partitioning issues. But curiously, whenever I switch mapreduce into local mode, I can't seem to be able to ...
Dmitriy Lyubimov
May 2, 2011 at 10:53 pm
May 3, 2011 at 12:28 am -
Thanks for your replies! I use TableOutputFormat to delete entries from an HBase table and MO for HFileOutputFormat. Until yesterday I used normal HFileOutputFormat output (not MO) and the files ...
Panayotis Antonopoulos
May 31, 2011 at 2:27 am
May 31, 2011 at 7:42 am -
All, I am mostly seeking confirmation as to my thinking on this matter. I have an MR job that I believe will force me into using a single reducer. The nature of the process is one where calculations ...
Geoffry Roberts
May 12, 2011 at 5:44 pm
May 13, 2011 at 2:34 pm -
Hi, all. I want to write lots of little files (32GB) to HDFS as an org.apache.hadoop.io.SequenceFile. But right now it is too slow: it takes about 8 hours to create this SequenceFile (6.7GB). So I wonder how to ...
丛林
May 11, 2011 at 11:49 pm
May 12, 2011 at 3:56 pm -
We have intermittently seen cases where a job will "freeze" for some as yet unknown reason, and thereby block other processes waiting for that job to complete. I'm trying to modify our job-launching ...
Adam Phelps
May 10, 2011 at 10:45 pm
May 11, 2011 at 5:28 am -
I have a basic job that is dying, I think, on one badly compressed file. Is there a way to see what file it is choking on? Via the job tracker I can find the mapper that is dying but I cannot find a ...
Jonathan Coveney
May 10, 2011 at 3:36 pm
May 10, 2011 at 5:25 pm -
All, I need each one of my reducers to have read access to a certain object or a clone thereof. I can instantiate this object at startup. How can I give my reducers a copy? -- Geoffry Roberts
Geoffry Roberts
May 6, 2011 at 5:13 pm
May 6, 2011 at 5:59 pm -
Hello, I just noticed that the files that are created using MultipleOutputs remain in the temporary folder, in the attempt sub-folders, when there is no normal output (using context.write(...)). Has ...
Panayotis Antonopoulos
May 30, 2011 at 3:33 pm
May 30, 2011 at 7:51 pm -
Does anyone know the mechanism that Hadoop uses to load the Map and Reduce classes on the remote node where the JobTracker submits the tasks? In particular, how does Hadoop retrieve the .class files? Thanks
Francesco De Luca
May 27, 2011 at 2:17 pm
May 27, 2011 at 3:17 pm -
Hi, My question is: when I run a command from an HDFS client, e.g. hadoop fs -copyFromLocal, or create a sequence file writer in Java code and append key/values to it through the Hadoop APIs, does it ...
Mapred Learn
May 18, 2011 at 12:44 am
May 27, 2011 at 5:07 am -
Is it possible to specify the slave nodes dynamically instead of specifying them statically through the configuration files? I want to build a dynamic environment in which each node can enter and exit, ...
Francesco De Luca
May 24, 2011 at 10:43 am
May 24, 2011 at 1:14 pm -
Hi, I have implemented a custom record reader to read fixed-length records. Pseudo code is as follows: class CRecordReader extends RecordReader<Text, BytesWritable> { private FileSplit fileSplit; private ...
Mapred Learn
May 20, 2011 at 12:45 am
May 23, 2011 at 9:33 pm -
Hi guys, I've just found a problem with the class TableSplit. It implements "equals", but it does not also implement hashCode, as it should. I discovered it by trying to use a HashSet of ...
Lucian Iordache
May 20, 2011 at 10:39 am
May 20, 2011 at 1:13 pm -
Hello, I have a Hadoop/HBase cluster, Cloudera CDH3u0 version, with several machines. I've created a job that does some work using the information from an HBase table. The next example *works* fine: - ...
Lucian Iordache
May 16, 2011 at 3:31 pm
May 18, 2011 at 9:00 am -
All, I am attempting to pass a string value from my driver to each one of my mappers and it is not working. I can set the value, but when I read it back it returns null. The value is not null when I ...
Geoffry Roberts
May 11, 2011 at 3:11 pm
May 11, 2011 at 4:51 pm -
Hi all, I remember there's a property for setting the number of failed tasks that can be tolerated in one job. Does anyone know the property name? -- Best Regards Jeff Zhang
Jeff Zhang
May 10, 2011 at 8:33 am
May 11, 2011 at 8:56 am -
Hi, I've written my first very simple job that does something with hbase. Now when I try to submit my jar in my cluster I get this: [nbasjes@master ~/src/catalogloader/run]$ hadoop jar ...
Niels Basjes
May 3, 2011 at 1:43 pm
May 10, 2011 at 4:11 pm -
Hi, I'm running hadoop (Cloudera release 3) in pseudo distributed mode, with the linux task controller so that jobs will run as the user who submitted them. My program (which uses hadoop cascading) ...
Jeremy
May 6, 2011 at 6:45 pm
May 6, 2011 at 7:28 pm -
Hi, I have extracted the hadoop-0.20.2, hadoop-0.20.203.0 and hadoop-0.21.0 files. In the hadoop-0.21.0 folder the hadoop-hdfs-0.21.0.jar, hadoop-mapred-0.21.0.jar and the hadoop-common-0.21.0.jar ...
Praveen Sripati
May 30, 2011 at 1:46 pm
May 30, 2011 at 1:50 pm -
Hi, We have a MapReduce program which writes data to a MySQL database using DBOutputFormat. Our program has one reducer. I understand that all the inserts happen during the close() operation of the ...
Giridhar Addepalli
May 25, 2011 at 8:58 pm
May 25, 2011 at 10:51 pm -
Hello all, How do I print the status of each job on the client, with the % complete? I am invoking the Hadoop jobs using the Java client (not the hadoop CLI) and I am not seeing the map and reduce job ...
Praveen Peddi
May 23, 2011 at 10:45 pm
May 23, 2011 at 11:35 pm -
I need to save some data in the job config as part of OutputFormat.checkOutputSpecs(), and have it propagated to map tasks. It seems that the property is saved correctly when ...
Jane Chen
May 20, 2011 at 6:25 pm
May 21, 2011 at 5:09 pm -
Hi All, I was investigating the ways to profile the hadoop code. All I found is to use ...
Shuja Rehman
May 19, 2011 at 8:57 am
May 19, 2011 at 1:13 pm -
Hi, all. How can I run an MR job through my own program instead of using the console to submit a job to a real Hadoop environment? I wrote code like this; the program works fine, but I don't think it ran in my ...
Felix.徐
May 17, 2011 at 10:47 am
May 17, 2011 at 9:39 pm -
Hi, As of now, primary namenode and secondary namenode are running on the same machine in our configuration. As both are RAM heavy processes, we want to move secondary namenode to another machine in ...
Giridhar Addepalli
May 13, 2011 at 9:14 am
May 13, 2011 at 11:43 am -
Hi, 1 - I'm looking for the map and reduce functions of the several examples of the Gridmix2 platform (Webdatasort, Webdatascan, Monsterquery, javasort and combiner) and I can't find them. Where can I ...
Pedro Costa
May 2, 2011 at 10:09 am
May 10, 2011 at 3:11 pm -
Hi, I have a job that processes raw data inside tarballs. As job input I have a text file listing the full HDFS path of the files that need to be processed, e.g.: ... /user/eric/file451.tar.gz ...
Eric
May 9, 2011 at 9:48 am
May 9, 2011 at 6:25 pm -
All, I am attempting to take a large file and split it up into a series of smaller files. I want the smaller files to be named based on values taken from the large file. I am using ...
Geoffry Roberts
May 6, 2011 at 5:56 pm
May 6, 2011 at 6:16 pm -
Dear All, Our team is trying to implement a parallelized LDA with Gibbs Sampling. We are using the algorithm mentioned by plda, http://code.google.com/p/plda/ The problem is that by the Map-Reduce ...
Stanley Xu
May 5, 2011 at 9:17 am
May 5, 2011 at 3:30 pm -
Hi guys, I asked this question earlier but did not get any response. So, posting again. Hope somebody can point to the right description: When you do hadoop fs -copyFromLocal or use API to call ...
Mapred Learn
May 31, 2011 at 11:57 pm
May 31, 2011 at 11:57 pm -
I need to track every state change in a Job and in all its tasks. In particular, I also need to track Job and Task failures. Does any API exist to do this? Something like the Publish-Subscribe paradigm? ...
Francesco De Luca
May 24, 2011 at 10:27 am
May 24, 2011 at 10:27 am -
Hi Karthik, FYI, I'm moving this thread to mapreduce-user@hadoop.apache.org (You and common-user are BCCed). My guess is that your task trackers are throwing a lot of exceptions which are getting ...
Joey Echeverria
May 23, 2011 at 12:46 pm
May 23, 2011 at 12:46 pm -
Hi Mark, FYI, I'm moving the discussion over to mapreduce-user@hadoop.apache.org since your question is specific to MapReduce. You can derive the output name from the TaskAttemptID which you can get ...
Joey Echeverria
May 23, 2011 at 12:42 pm
May 23, 2011 at 12:42 pm -
All, I have a job where I need to have but a single reducer, i.e. job.setNumReduceTasks(1); When I do this, I get no log file for the reducer; mappers yes, but reducer no. If I remove the ...
Geoffry Roberts
May 20, 2011 at 4:18 pm
May 20, 2011 at 4:18 pm
Group Overview
group | mapreduce-user |
categories | hadoop |
discussions | 65 |
posts | 225 |
users | 65 |
website | hadoop.apache.org... |
irc | #hadoop |
65 users for May 2011