38 discussions - 121 posts

  • Hi, I have a scenario for which I'd like to write an MR job in which Mappers do some work and eventually the output of all mappers needs to be combined by a single Reducer. Each Mapper outputs ...
    Shai Erera
    Jul 27, 2010 at 11:07 am
    Aug 13, 2010 at 10:35 am
  • I am trying to run the terasort example with a small input on a 4 node cluster. I just did the minimal configuration (fs.default.name, master, slaves etc.), but did not do anything specific to ...
    Chinni, Ravi
    Jul 16, 2010 at 3:56 pm
    Jul 27, 2010 at 2:19 pm
  • Hi, I am running mapreduce on 5 machines, where I have 8 cores for 3 of them, but 2 cores for 2 of them, and the 8 core machines are more powerful (faster, more mem, more disk). Currently, I am using ...
    Shaojun Zhao
    Jul 14, 2010 at 6:51 pm
    Aug 31, 2010 at 9:42 am
  • Hello, I am trying to do the following and hope to get some help on this problem: I want to run 2 consecutive jobs which both use the same input files, but the second should use additional ...
    Benjamin Hiller
    Jul 25, 2010 at 3:27 pm
    Jul 26, 2010 at 4:38 pm
  • Hi everyone, I'm implementing the PageRank algorithm on the Hadoop platform with Eclipse. I start all the necessary daemons. When I run the .java, the job starts successfully, but the progress of map and reduce ...
    Ilče Georgievski
    Jul 21, 2010 at 8:24 pm
    Jul 22, 2010 at 12:08 pm
  • To get around the small-file problem (I have thousands of 2MB log files) I wrote a class to convert all my log files into a single SequenceFile in (Text key, BytesWritable value) format. That ...
    Some Body
    Jul 8, 2010 at 1:49 pm
    Jul 9, 2010 at 6:42 pm
  • Hi, Is it possible to use hadoop and not use disk i/o, apart from the initial input? I am asking this with the assumption that disk i/o is the bottleneck in overall processing, even more than the ...
    Juber patel
    Jul 30, 2010 at 3:53 am
    Jul 30, 2010 at 4:59 pm
  • I am running a map reduce job where a few reduce tasks fail with an out of memory error. Increasing the memory is not an option. However if a retry had information that an earlier attempt failed out ...
    Steve Lewis
    Jul 13, 2010 at 5:17 pm
    Jul 14, 2010 at 2:41 am
  • Hi, - Hadoop MR uses the term speculative tasks. What are speculative tasks? - During the execution of a MR test, when we don't have splits to attribute to reduce tasks, those reduce tasks ...
    Pedro Costa
    Jul 25, 2010 at 2:04 pm
    Jul 26, 2010 at 8:34 am
  • Hi All, I had a MR job that processed 2000 small (<3MB ea.) files and it took 40 minutes on 8 nodes. Since the files are small it triggered 2000 tasks. I packed my 2000 files into a single ...
    Some Body
    Jul 13, 2010 at 11:01 am
    Jul 24, 2010 at 9:21 pm
  • Hi, when I wrote a mapreduce program I got this error. Can anyone please help me? Jul 19, 2010 5:06:31 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob INFO: Task Id : ...
    Khaled BEN BAHRI
    Jul 20, 2010 at 1:01 pm
    Jul 21, 2010 at 12:19 am
  • Hey folks, how far along is the 0.21 release? ...I just keep building from the branch myself currently. That's a bit of a pain. cheers -- Torsten
    Torsten Curdt
    Jul 15, 2010 at 8:24 am
    Jul 17, 2010 at 3:15 pm
  • Hello to all, I wonder how to specify the inputs of mapreduce. Is it necessary that the inputs be in HDFS, or is it possible to run mapreduce with inputs located on the local file system? If it's possible, how ...
    Khaled BEN BAHRI
    Jul 16, 2010 at 3:38 pm
    Jul 16, 2010 at 3:48 pm
  • Hi, my program needs total sorting as done by the TotalOrderPartitioner. I have written all my code using the new API. I see that this partitioner was ported to new API only in 0.21 RC0 that Tom ...
    Juber patel
    Jul 2, 2010 at 11:51 am
    Jul 13, 2010 at 2:26 am
  • Hi Everyone, I am having some problem with naming the output file of each reduce task with the partition number. First of all, how can I get the partition number within each reduce? Second, How am I ...
    Denim Live
    Jul 8, 2010 at 9:15 am
    Jul 8, 2010 at 6:52 pm
  • Hi folks, I want to determine the exact time it took for my mapreduce job to execute, for analysis purposes. How can I calculate it? Thanks
    Denim Live
    Jul 8, 2010 at 7:51 am
    Jul 8, 2010 at 9:08 am
  • Hello I want to get the id of each mapper and reducer task because I want to tag the output of these mappers and reducers according to the mapper and reducer id. How can I retrieve the ids of each? ...
    Denim Live
    Jul 6, 2010 at 9:45 am
    Jul 6, 2010 at 10:22 am
  • Hello everyone, I have written my custom partitioner for partitioning datasets. I want to partition two datasets using the same partitioner and then in the next mapreduce job, I want each mapper to ...
    Denim Live
    Jul 3, 2010 at 4:30 pm
    Jul 6, 2010 at 9:38 am
  • Hi all, I am doing a simple project to analyze HTTP proxy server logs with a Hadoop mapreduce approach (in Java). The log file contains logs for a week or sometimes more than that. I have following ...
    Bright D L
    Jul 31, 2010 at 8:10 pm
    Aug 1, 2010 at 2:58 pm
  • We are planning to use Hadoop to run a number of recurring jobs that involve map side joins. Rather than requiring that the joined datasets be partitioned into separate part-* files, we are ...
    Deem, Mike
    Jul 22, 2010 at 12:23 am
    Jul 22, 2010 at 5:55 pm
  • Hello :) I developed my first mapreduce program with Eclipse. When I try to execute it I get this error. I tried to solve it but failed: Jul 19, 2010 5:06:37 PM ...
    Khaled BEN BAHRI
    Jul 19, 2010 at 3:39 pm
    Jul 21, 2010 at 1:31 am
  • I am trying to develop a MR application. Due to the kind of application I am trying to develop, the mapper is a dummy (passes its input to its output) task and I am only interested in having a ...
    Chinni, Ravi
    Jul 9, 2010 at 8:07 pm
    Jul 9, 2010 at 10:28 pm
  • Hello all, As a new user of hadoop, I am having some problems with understanding some things. I am writing a program to load a file to the distributed cache and read this file in each mapper. In my ...
    Denim Live
    Jul 8, 2010 at 7:03 pm
    Jul 9, 2010 at 3:57 am
  • Hi, I want to be able to discover the 10 most popular routes through our web site that lead a visitor to register with us. I am already logging page view data but don't seem to be able to find the ...
    Tim Jones
    Jul 8, 2010 at 9:56 am
    Jul 8, 2010 at 10:06 am
  • Assume we have a medium size cluster - say 20 nodes and that the cluster is used for one job and cannot change in size. Assume we are sorting a large data set. As we increase the size of the data ...
    Steve Lewis
    Jul 1, 2010 at 5:16 am
    Jul 1, 2010 at 2:33 pm
  • Moving to mapreduce-user@, bcc general@. Please do not use the general@ list for project specific discussions. The part about 'loading data in reduce method controlled by an initialize flag variable ...
    Arun C Murthy
    Jul 21, 2010 at 10:24 pm
    Jul 21, 2010 at 10:24 pm
  • Hi, I am getting this strange error in the target "compile-mapred-classes:" when compiling map-reduce on Mac OS X. The JAVA_HOME is properly set, and if I remove <jsp-compile from compile-mapred- ...
    Asif Jan
    Jul 13, 2010 at 3:41 pm
    Jul 13, 2010 at 3:41 pm
  • Hello to all :) I'm developing a map reduce function and I use XML files as inputs to compute statistics on these files. -- Regards, Khaled BEN BAHRI
    Khaled BEN BAHRI
    Jul 13, 2010 at 12:06 pm
    Jul 13, 2010 at 12:06 pm
  • Hello to all, I'm a novice at working with mapreduce and I'm developing a mapreduce function that processes XML documents. How can I create input files and specify them to the map function? Thanks for help Best ...
    Khaled BEN BAHRI
    Jul 12, 2010 at 11:06 pm
    Jul 12, 2010 at 11:06 pm
  • We have a cluster with 4 Cloudera VMs - hadoop fs -ls / says 10/07/12 05:42:22 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8022. Already tried 0 time(s). 10/07/12 05:42:23 INFO ...
    Steve Lewis
    Jul 12, 2010 at 6:53 pm
    Jul 12, 2010 at 6:53 pm
  • Hi All I am facing a hard problem. I am running a map reduce job using streaming but it fails and it gives the following error. Caught: java.lang.OutOfMemoryError: Java heap space at ...
    Shuja Rehman
    Jul 9, 2010 at 5:16 pm
    Jul 9, 2010 at 5:16 pm
  • Today is your last chance to submit a CFP abstract for the 2010 Surge Scalability Conference. The event is taking place on Sept 30 and Oct 1, 2010 in Baltimore, MD. Surge focuses on case studies that ...
    Jason Dixon
    Jul 9, 2010 at 3:08 pm
    Jul 9, 2010 at 3:08 pm
  • I've written a mapper that relies upon hsqldb-2.0. I had a tough time determining that despite submitting my job with -libjars hsqldb-2.0.jar my code was conflicted with the v1.8 jar that comes in ...
    Andrew Rothstein
    Jul 5, 2010 at 10:02 pm
    Jul 5, 2010 at 10:02 pm
  • hi, The sql is select count(distinct (uname)) from table. For your information!
    Jake 宫
    Jul 5, 2010 at 8:10 am
    Jul 5, 2010 at 8:10 am
  • ROOM CHANGE TO 211 Hello Fellow Hadoopists, We are meeting at 7:15 pm on July 15th at the University Heights Community Center 5031 University Way NE Seattle WA 98105 Room #211 note ...
    Sean Jensen-Grey
    Jul 4, 2010 at 1:52 am
    Jul 4, 2010 at 1:52 am
  • A quick reminder that there's one week left to submit your abstract for this year's Surge Scalability Conference. The event is taking place on Sept 30 and Oct 1, 2010 in Baltimore, MD. Surge focuses ...
    Jason Dixon
    Jul 2, 2010 at 6:38 pm
    Jul 2, 2010 at 6:38 pm
  • Moving to mapreduce-user@, bcc common-user@. org.apache.hadoop.mapred.TaskTracker.MapOutputServlet
    Arun C Murthy
    Jul 1, 2010 at 9:34 pm
    Jul 1, 2010 at 9:34 pm
  • Hi Marcin, did you solve this error ? I stumbled into the same thing also i have no NFS involved... Johannes
    Johannes Zillmann
    Jul 1, 2010 at 3:03 pm
    Jul 1, 2010 at 3:03 pm
Group Overview
period: Jul 2010
group: mapreduce-user@
categories: hadoop
discussions: 38
posts: 121
users: 51
website: hadoop.apache.org...
irc: #hadoop

51 users for July 2010

  • Ted Yu: 11 posts
  • Denim Live: 8 posts
  • Chinni, Ravi: 6 posts
  • Juber patel: 6 posts
  • Shai Erera: 6 posts
  • Bmdevelopment: 5 posts
  • Khaled BEN BAHRI: 5 posts
  • Benjamin Hiller: 4 posts
  • Ferdy Galema: 4 posts
  • Khaled BEN BAHRI: 4 posts
  • Some Body: 4 posts
  • Steve Lewis: 4 posts
  • Alex Kozlov: 3 posts
  • Hemanth Yamijala: 3 posts
  • Alexandros Konstantinakis - Karmis: 2 posts
  • Alex Loddengaard: 2 posts
  • Arun C Murthy: 2 posts
  • Ilče Georgievski: 2 posts
  • James Hammerton: 2 posts
  • Jason Dixon: 2 posts