47 discussions - 180 posts

  • Hi all, We have a use case in which I start a first job, MR1, with File1.txt as its input file, and from this job call a second job, MR2, with File2.txt as input. So: MRjob1{ Map(){ MRJob2(File2.txt) } } ...
    Stuti Awasthi
    Apr 4, 2012 at 10:35 am
    Apr 5, 2012 at 7:00 am
  • Hi All, I am new to Hadoop and was trying to generate random numbers using apache commons math library. I used Netbeans to build the jar file and the manifest has path to commons-math jar as ...
    Utkarsh Gupta
    Apr 4, 2012 at 6:52 am
    Apr 4, 2012 at 5:14 pm
  • Dear All, I am porting code from the old API to the new API (Context objects) and running it on Hadoop 0.20.203. Job job_first = new Job(); job_first.setJarByClass(My.class) ...
    Arko Provo Mukherjee
    Apr 17, 2012 at 5:02 am
    Apr 18, 2012 at 12:22 am
  • Hello, I am trying to apply a custom CompressionCodec to work with MapReduce jobs, but I haven't found a way to inject it during the reading of input data, or during the write of the job results. Am ...
    Grzegorz Gunia
    Apr 11, 2012 at 7:56 am
    Apr 11, 2012 at 6:40 pm
  • Hi all, I'm writing my own map-reduce code using Eclipse with the Hadoop plug-in. I've specified input and output directories in the project properties (two folders, namely input and output). My problem is ...
    Fang Xin
    Apr 3, 2012 at 9:35 am
    Apr 4, 2012 at 9:52 am
  • Hi, I have a few jobs added to a Job controller. I need an afterJob() to be executed after the completion of a Job. For example, here I am actually overriding the Job of JobControl. I have Job2 depending ...
    Kasi subrahmanyam
    Apr 26, 2012 at 11:15 am
    May 1, 2012 at 3:07 pm
  • Is it possible to configure a YARN scheduling requirement to not run more than 1 mapper per node? It's not because of memory requirements. Who decides this kind of scheduling? ResourceManager or ...
    Radim Kolar
    Apr 26, 2012 at 5:56 pm
    May 3, 2012 at 12:59 pm
  • Hi, I am creating output files in the reducer using my own file name convention. So, using the FileSystem APIs, I am dumping data into the files. I now want to compress these files while writing so as to write ...
    Piyush Kansal
    Apr 13, 2012 at 12:45 am
    Apr 15, 2012 at 3:15 am
  • Hello all, I am a Hadoop newbie :) I'd like to ask about Hadoop's dependency on Linux. I have seen that Hadoop is usually deployed on Linux machines and Windows is supported only as a development ...
    Thanos Papaoikonomou
    Apr 11, 2012 at 1:54 pm
    Apr 14, 2012 at 10:53 pm
  • I am trying to gather the info regarding the amount of HDFS read/write for each task in a given map-reduce job. How can I do that?
    Qu Chen
    Apr 24, 2012 at 9:48 pm
    Apr 25, 2012 at 7:50 am
  • Hi, Does anybody know how to submit multiple Hadoop jobs without opening multiple terminals? One method I found is to use Job.submit() in ToolRunner.run(), but can I use a shell script to submit jobs ...
    Brisk
    Apr 22, 2012 at 4:21 pm
    Apr 22, 2012 at 10:51 pm
  • Hello Everyone, I'm relatively new to hadoop mapreduce and I'm trying to get this simple modification to the WordCount example to work. I'm using hadoop-1.0.2, and I've included both a convenient ...
    Bryan Yeung
    Apr 17, 2012 at 1:44 am
    Apr 17, 2012 at 3:55 am
  • Hi, I have a spreadsheet where each column contains values for one variable, and I need to calculate the sum, variance, etc. for each column. To my understanding, mapper and reducer work for <key, value ...
    Fang Xin
    Apr 3, 2012 at 4:41 am
    Apr 3, 2012 at 5:14 am
  • Hi, Is there a way to specify JVM reuse for YARN applications as in MRv1? Regards, Ramgopal
    Ramgopal
    Apr 9, 2012 at 11:34 am
    Apr 12, 2012 at 5:06 pm
  • i have a simple map-reduce job that i test with only 2 mappers, 2 reducers and very small input (10 lines of text). it runs fine without compression. but as soon as i turn on compression ...
    Koert Kuipers
    Apr 11, 2012 at 5:03 pm
    Apr 11, 2012 at 6:21 pm
  • Regards to all the list. There are many people that use the Hadoop Tutorial released by Yahoo at http://developer.yahoo.com/hadoop/tutorial/ ...
    Marcos Ortiz
    Apr 4, 2012 at 12:32 pm
    Apr 4, 2012 at 9:30 pm
  • Hi all, The default config mapred-default.xml has a property "mapred.temp.dir". How to use "mapred.temp.dir" ? I did grep "mapred.temp.dir" to hadoop-0.20.2 sources, but I could not get line from ...
    Satoshi Noto
    Apr 27, 2012 at 6:10 am
    Apr 27, 2012 at 11:49 am
  • Hi, After around 2 weeks a TaskTracker (TT) in our MR cluster hangs with 100% CPU consumption. Most of the time no new tasks are sent to the node. We start getting more job failures in the ...
    Vladimir Egorov
    Apr 20, 2012 at 6:47 pm
    Apr 22, 2012 at 1:06 pm
  • Hi, I'm reposting this as I've not received any reply to my earlier post on the same issue. I've read that the combiner only works if it is specified AND the sort memory buffer overflows in the ...
    Sudip Sinha
    Apr 19, 2012 at 12:36 pm
    Apr 20, 2012 at 9:24 am
  • is there a way to find out which user is running a mapred job from the JobConf? and is this usable with both unsecure mode and secure (kerberos) mode? thanks for your help! koert
    Koert Kuipers
    Apr 19, 2012 at 7:05 pm
    Apr 19, 2012 at 9:42 pm
  • Hi, I want to run a MapReduce job using the ResourceManager. If I understood correctly, I need to create an ApplicationMaster and a Client for this purpose? I was trying to read the Distributed Shell example but it looks ...
    Dominik Wiernicki
    Apr 12, 2012 at 7:18 am
    Apr 12, 2012 at 1:24 pm
  • I read about CompositeInputFormat and how it allows one to join two datasets together as long as those datasets were sorted and partitioned the same way. Ok i think i get it, but something bothers ...
    Koert Kuipers
    Apr 10, 2012 at 3:11 pm
    Apr 11, 2012 at 4:39 pm
  • Hi, I am new to Mapreduce and I have a short question: is it possible for a MapReduce job to split the lines of a file with \n and ignore \r? Basically, in the use case I am looking into, the \r has ...
    Marc Sturm
    Apr 9, 2012 at 7:02 pm
    Apr 9, 2012 at 9:03 pm
  • We are building a system with Hadoop MapReduce as the back-end distributed-computing engine. The front end is also in Java, so it would be nice to start a Hadoop job, or interact with HDFS directly ...
    GUOJUN Zhu
    Apr 5, 2012 at 10:50 pm
    Apr 6, 2012 at 1:04 pm
  • Hi, I have tried the following combinations of bootstrap actions to increase the heap size of my job but none of them seem to work: --mapred-key-value mapred.child.java.opts=-Xmx1024m ...
    Shrish bajpai
    Apr 5, 2012 at 7:57 am
    Apr 5, 2012 at 5:45 pm
  • (I just finished writing this when I noticed the similar email from Marcos bringing up similar issues with the Yahoo tutorials) I've read through the following ...
    Steven Willis
    Apr 4, 2012 at 10:00 pm
    Apr 5, 2012 at 2:43 pm
  • Which application/service runs on port 8080 in YARN by default? I need to change port.
    Radim Kolar
    Apr 4, 2012 at 5:04 pm
    Apr 5, 2012 at 5:17 am
  • Hi, I'm currently working on some simulation software that models engineering facilities. As input we have two big chunks of data, one about the design of the site and one about the climate the site ...
    Kevin Savage
    Apr 4, 2012 at 9:00 pm
    Apr 4, 2012 at 9:21 pm
  • I'm trying to open a local file with the FileSystem class. FileSystem rfs = FileSystem.get(conf); FSDataInputStream i = srcFs.open(p); but I get file not found. The path is correct, but I think that ...
    Pedro Costa
    Apr 4, 2012 at 12:14 pm
    Apr 4, 2012 at 12:28 pm
  • Hi all, of course it's sensible that the number of nodes in the cluster will influence map / reduce task capacity, but what determines the average number of tasks per node? Can the number be set manually? Any hardware ...
    Fang Xin
    Apr 3, 2012 at 8:16 am
    Apr 3, 2012 at 8:36 am
  • Hi folks, Version: Hadoop 0.20.205. My reducer can be optimized if I can get a good estimate on how many records are produced by the mappers, that is, if I can get the MAP_OUTPUT_RECORDS counter (or ...
    Jianhui Zhang
    Apr 30, 2012 at 7:02 pm
    Apr 30, 2012 at 8:11 pm
  • Hi All, I am trying to run map-reduce job using "Elastic MapReduce" but I am getting the following errors: java.lang.Throwable: Child Error at ...
    Shrish bajpai
    Apr 30, 2012 at 2:38 pm
    Apr 30, 2012 at 3:51 pm
  • Hi, I have a question about what happens to a blacklisted TaskTracker. 1 - When a TaskTracker doesn't send heartbeat messages to the JobTracker for a while, it will be blacklisted. But, this means that the ...
    Pedro Costa
    Apr 26, 2012 at 10:06 am
    Apr 26, 2012 at 10:37 am
  • Hi all, I am using the streaming library of hadoop-0.20 to be able to write map and reduce functions in Python. Is it possible to make Hadoop MapReduce not open the input files? The map function ...
    Hassen Riahi
    Apr 18, 2012 at 8:14 pm
    Apr 19, 2012 at 9:11 pm
  • Hi, I am trying to get the actual job name (e.g. WordCount, Sort, etc) inside MapTask class. How can I do that? Thanks, - Qu
    Qu Chen
    Apr 19, 2012 at 12:02 am
    Apr 19, 2012 at 4:22 am
  • Hi, I am using Partitioner and Grouper classes in my program. And let's say the data which I want to process using MapReduce varies in size: it can be just 10MB or can go up to 10GB. So, do I need to ...
    Piyush Kansal
    Apr 6, 2012 at 9:33 pm
    Apr 7, 2012 at 3:48 am
  • Hi all, Since this bug is still open https://issues.apache.org/jira/browse/MAPREDUCE-1122 , has anyone any advice or suggestions to avoid rewriting the custom input format using old API? Thanks for ...
    Hassen Riahi
    Apr 6, 2012 at 12:50 am
    Apr 6, 2012 at 1:58 pm
  • Hi Friends, In the reducer, I am dumping all the data to my customized set of files (using File I/O APIs) and thus not using the regular "context.write()". I am also creating filenames for these files ...
    Piyush Kansal
    Apr 5, 2012 at 5:52 am
    Apr 5, 2012 at 6:42 am
  • This is to announce the Berlin Buzzwords program. The Program Committee has completed reviewing all submissions and set up the schedule containing a great lineup of speakers for this year's Berlin ...
    Isabel Drost
    Apr 26, 2012 at 11:23 am
    Apr 26, 2012 at 11:23 am
  • Hi all, I am a PhD student at DERI, NUI Galway, Ireland. My main area of research is in investigating social and socio-technical relationships among developers in open source projects. For my ...
    Iqbal, Aftab
    Apr 25, 2012 at 1:46 pm
    Apr 25, 2012 at 1:46 pm
  • Hi all, We have this error (*) when trying to execute the word-part example. Here is the content of word-part.xml (**). We also made sure that the / examples/bin/wordcount-part file exists in HDFS. Any ...
    Hassen Riahi
    Apr 24, 2012 at 9:02 am
    Apr 24, 2012 at 9:02 am
  • Hello, I have a big problem and I don't know why it happens or how to resolve it. I have three jobs. Of course I have 3 mappers and 3 reducers. I send key-value pairs from the second mapper and I also print to ...
    Adriana Sbircea
    Apr 11, 2012 at 4:49 pm
    Apr 11, 2012 at 4:49 pm
  • Hi all, I'm having some weird issues running a teragen on my 8-node cluster. While running: hadoop jar hadoop-examples-1.0.1.jar teragen -Dmapred.map.tasks=280 10000000000 tera-in I have 24 GB of ...
    Ross Nordeen
    Apr 9, 2012 at 4:55 am
    Apr 9, 2012 at 4:55 am
  • Dominik, Please do not use the general@hadoop.apache.org lists for user/dev queries. It exists for project-wide discussions. I'm moving your mail to mapreduce-user@hadoop.apache.org which is more ...
    Harsh J
    Apr 5, 2012 at 2:52 pm
    Apr 5, 2012 at 2:52 pm
  • Hi Prashant, The userlogs for a job are deleted after the time specified by the "mapred.userlog.retain.hours" property defined in mapred-site.xml (default is 24 hrs). Thanks, Nitin -- Nitin Khandelwal
    Nitin Khandelwal
    Apr 5, 2012 at 10:23 am
    Apr 5, 2012 at 10:23 am
  • Hi, I've read that the combiner only works if it is specified AND the sort memory buffer overflows in the mapper ...
    Sudip Sinha
    Apr 4, 2012 at 4:52 pm
    Apr 4, 2012 at 4:52 pm
  • Hi, I have got the code for 0.22 and did the build successfully using 'ant clean compile eclipse' command. But, the ant command is downloading the dependent jar files every time. How to make ant use ...
    Praveen Sripati
    Apr 1, 2012 at 3:41 am
    Apr 1, 2012 at 3:41 am
Group Overview
group: mapreduce-user @
categories: hadoop
discussions: 47
posts: 180
users: 56
website: hadoop.apache.org...
irc: #hadoop

56 users for April 2012

  • Harsh J: 21 posts
  • Devaraj k: 9 posts
  • Bejoy KS: 8 posts
  • Ashwanth Kumar: 6 posts
  • Fang Xin: 6 posts
  • Koert Kuipers: 6 posts
  • Piyush Kansal: 6 posts
  • Utkarsh Gupta: 6 posts
  • Arko Provo Mukherjee: 5 posts
  • Arun C Murthy: 5 posts
  • Madhu phatak: 5 posts
  • Robert Evans: 5 posts
  • Stuti Awasthi: 5 posts
  • Radim Kolar: 4 posts
  • Adriana Sbircea: 3 posts
  • Brisk: 3 posts
  • Bryan Yeung: 3 posts
  • George Datskos: 3 posts
  • Grzegorz Gunia: 3 posts
  • Hassen Riahi: 3 posts