44 discussions - 128 posts

  • I'm using a SequenceFileOutputFormat. But I'd like to be able to set some SequenceFile.Metadata on the SequenceFile.Writer that's getting created. Doesn't look like there's any easy way to do that, ...
    David Rosenstrauch
    Aug 7, 2010 at 5:50 am
    Aug 10, 2010 at 2:52 am
  • Hello, I have not used Hadoop but am researching it for an analytics project. I would like to know if Hadoop supports continuous or incremental map reduce functionality. If not, are there any plans ...
    Stephen Mullins
    Aug 24, 2010 at 4:56 pm
    Sep 3, 2010 at 6:02 am
  • I'm new to Hadoop and I want to use it for my data processing. My understanding is that each Split will be processed by a mapper task, so for my application I have mapper in which I populate backend ...
    Anfernee Xu
    Aug 25, 2010 at 1:08 pm
    Aug 25, 2010 at 4:00 pm
  • Hi, Do the map output files that are produced on the map side and fetched by the reduce contain only data, or do they also contain a header? If they contain a header, what is its size? Thanks, -- ...
    Pedro Costa
    Aug 8, 2010 at 1:02 pm
    Aug 8, 2010 at 2:30 pm
  • Hi I am trying to convert Mahout xmlInputFormat to new API but this is not working. The problem which i think is that in old api we have next method which takes key and value and we can set it in the ...
    Shuja Rehman
    Aug 25, 2010 at 9:23 pm
    Aug 26, 2010 at 4:07 am
  • Someone sent this email to the commons-user list a while back, but it seems like it slipped through the cracks. We're starting to dig into some hard-core Hadoop development and just came upon this ...
    David Rosenstrauch
    Aug 4, 2010 at 3:38 pm
    Aug 4, 2010 at 9:10 pm
  • Hi, First post .... I wrote my own mapper and reducer in c++. I tried submitting the streaming jobs using the following command: path/to/hadoop jar path/to/streaming.jar -input path/to/input -output ...
    Xin Feng
    Aug 28, 2010 at 3:02 am
    Aug 28, 2010 at 3:30 pm
  • Is it possible for a M/R job to have no mapper? i.e.: job.setMapperClass(null)? Or is it required that one at least use an "identity mapper" (i.e., plain vanilla org.apache.hadoop.mapreduce.Mapper)? ...
    David Rosenstrauch
    Aug 16, 2010 at 8:25 pm
    Aug 18, 2010 at 9:31 pm
  • What's the preferred way to submit a job these days? org.apache.hadoop.mapreduce.Job.submit() ? Or org.apache.hadoop.mapred.JobClient.runJob()? Or does it even matter? (i.e., is there any difference ...
    David Rosenstrauch
    Aug 11, 2010 at 10:13 pm
    Aug 12, 2010 at 1:49 pm
  • I'm trying to write some tests with the mrunit framework, but running into a snag. It seems that the mock Context objects that are being created are always using a new, empty Configuration object. ...
    David Rosenstrauch
    Aug 10, 2010 at 8:24 pm
    Aug 12, 2010 at 2:24 am
  • Hi, I recently started developing with Hadoop using the 20.2 API. I'm looking to profile one of my jobs but I haven't been able to find any documentation about how to do this. For the earlier ...
    David Jurgens
    Aug 12, 2010 at 10:34 pm
    Aug 17, 2010 at 6:20 am
  • Anyone have any ideas how I might be able to work around https://issues.apache.org/jira/browse/MAPREDUCE-1700 ? It's quite a thorny issue! I have a M/R job that's using Avro (v1.3.3). Avro, in turn, ...
    David Rosenstrauch
    Aug 12, 2010 at 10:40 pm
    Aug 12, 2010 at 11:33 pm
  • I'm working on a M/R job which uses DBInputFormat. So I have to create my own DBWritable for this. I'm a little bit confused about how to implement this though. In the sample code in the Javadoc for ...
    David Rosenstrauch
    Aug 5, 2010 at 1:38 am
    Aug 9, 2010 at 6:09 pm
  • For those people using LZO compression: While I know there is http://github.com/kevinweil/hadoop-lzo The native stuff makes it a bit of a hurdle. Especially if you are just running on Amazon Elastic ...
    Torsten Curdt
    Aug 31, 2010 at 2:37 pm
    Aug 31, 2010 at 6:50 pm
  • I'm migrating some code from the hadoop 0.18 apis to the 0.20 apis. The Mapper/Reducer interfaces in the mapred package to extending the Mapper/Reducer classes in the mapreduce package is pretty ...
    Steve Hoffman
    Aug 23, 2010 at 4:26 pm
    Aug 23, 2010 at 4:49 pm
  • Hi, guys, I am using hadoop 0.20.2, and I am trying to run the "SecondarySort" example. The following is the "FirstGroupingComparator" class, and I just cannot figure out how ...
    Dennis
    Aug 10, 2010 at 8:01 am
    Aug 11, 2010 at 1:11 am
  • Hi, 1 - I'm trying to compare the size of 1 map output on the map and on the reduce side. So, I did some code modifications in the MR to see what's happening when map saves map outputs and the reduce ...
    Pedro Costa
    Aug 9, 2010 at 8:28 pm
    Aug 10, 2010 at 8:24 am
  • Hi, I would like to read a map output file that has the key/value pair. Can anyone give an example in java please? Thanks -- Pedro
    Pedro Costa
    Aug 8, 2010 at 6:21 pm
    Aug 9, 2010 at 12:25 am
  • Hi, I would like to debug the shuffle phase from a Reducer, but I can't because the Reducer starts as a new process. I've tried all the options that some pages says, but it doesn't work in the way ...
    Pedro Costa
    Aug 29, 2010 at 1:25 am
    Aug 29, 2010 at 3:41 am
  • Hi, 1 - I'm running the wordcount examples with one input file with size of 50Mb and with 2 reduces defined. At the end of the execution of the wordcount, the 2 reduces deals with each part of the ...
    Pedro Costa
    Aug 21, 2010 at 5:30 pm
    Aug 22, 2010 at 5:47 pm
  • Hi, This link: http://www.cloudera.com/hadoop-mrunit no longer points to MRUnit. Can someone please point out the location from where I can get it ? Does MRUnit support Hadoop 0.20.1 ? Thanks, -Rakesh
    Rakesh kothari
    Aug 20, 2010 at 6:54 pm
    Aug 20, 2010 at 7:34 pm
  • Moving to mapreduce-user@, bcc general@. There isn't a direct way. One possible option is just use the per-job job-history file which is on HDFS (See ...
    Arun C Murthy
    Aug 12, 2010 at 4:53 am
    Aug 17, 2010 at 4:30 am
  • i seem to be having problems submitting jobs from hadoop using cygwin in windows (windows 7) to a hadoop multi-node cluster (ubuntu). in windows/cygwin, i have created a user called hadoop. this ...
    Jake Vang
    Aug 15, 2010 at 7:56 pm
    Aug 16, 2010 at 4:08 am
  • Moving to mapreduce-user@, bcc common-user@. Why do you need to create a single top-level jar? Just register each of your jars and put each in the distributed cache... however you have 150 jars which ...
    Arun C Murthy
    Aug 11, 2010 at 4:59 pm
    Aug 11, 2010 at 5:10 pm
  • Hi, I am playing with netflow data on my small hadoop cluster (20 nodes) just trying things out. I am a beginner on hadoop so please be gentle with me. I am currently running map reduce jobs on text ...
    Fred smith
    Aug 6, 2010 at 6:55 am
    Aug 6, 2010 at 7:42 am
  • Hi, This is an admittedly naïve question, but I've been unable to find a comprehensive answer online. I have gone through the tutorial a few times ...
    Parimi, Nagender
    Aug 2, 2010 at 6:23 pm
    Aug 3, 2010 at 7:51 pm
  • Early Bird Registration for Surge Scalability Conference 2010 ends next Tuesday, August 31. We have a killer lineup of speakers and architects from across the Internet. Listen to experts talk about ...
    Jason Dixon
    Aug 27, 2010 at 7:33 pm
    Aug 27, 2010 at 7:33 pm
  • hi, I have some XML files with a structure like this: <document> <header>some text</header> <record>record 1</record> <record>record 2</record> .... <record>record N</record> <document> Where the info in ...
    Alejandro Montenegro
    Aug 26, 2010 at 9:44 pm
    Aug 26, 2010 at 9:44 pm
  • Hello guys, Over at http://search-hadoop.com we index MapReduce project's mailing lists, wiki, web site, source code, javadoc, jira... Would the community be interested in a patch that replaces the ...
    Alex Baranau
    Aug 25, 2010 at 4:21 pm
    Aug 25, 2010 at 4:21 pm
  • I had a job that I ran a few days ago that rolled over to the Job tracker history. Now when I go view it in the history viewer although I can see basic stats such as total # records in/out, I can no ...
    David Rosenstrauch
    Aug 23, 2010 at 3:30 pm
    Aug 23, 2010 at 3:30 pm
  • Moving to mapreduce-user@, bcc common-dev@. Please use the project specific lists. DistributedCache.purgeCache isn't a public api. You shouldn't be calling it from the task. A simple way of doing ...
    Arun C Murthy
    Aug 23, 2010 at 1:38 am
    Aug 23, 2010 at 1:38 am
  • Was reading up a bit today on configuring the settings for # task slots, namely: mapred.tasktracker.map.tasks.maximum mapred.tasktracker.reduce.tasks.maximum Was just wondering: couldn't (shouldn't?) ...
    David Rosenstrauch
    Aug 19, 2010 at 6:46 pm
    Aug 19, 2010 at 6:46 pm
  • ROOM CHANGE TO 209 (one floor up from usual) Hello Fellow Hadoopists, We are meeting at 7:15 pm on August 19th at the University Heights Community Center 5031 University Way NE Seattle WA 98105 Room ...
    Sean jensen-grey
    Aug 19, 2010 at 5:02 pm
    Aug 19, 2010 at 5:02 pm
  • Moving mapreduce specific question to mapreduce-user@hadoop.apache.org All map task related execution starts at org.apache.hadoop.mapred.MapTask. For your specific question, you can see ...
    Vinod KV
    Aug 18, 2010 at 3:50 am
    Aug 18, 2010 at 3:50 am
  • Hi! :) I am a total beginner in both Java and Hadoop Mapreduce, now my current goal is to understand how hadoop works internally when it processes an mapreduce application. Now I know two things: 1. ...
    Rita Liu
    Aug 17, 2010 at 11:22 pm
    Aug 17, 2010 at 11:22 pm
  • The random writer example produces the input for the sort example. Please use mapreduce-user instead of general for questions. -- Owen
    Owen O'Malley
    Aug 14, 2010 at 6:25 pm
    Aug 14, 2010 at 6:25 pm
  • Jothi Padmanabhan
    Aug 12, 2010 at 10:45 am
    Aug 12, 2010 at 10:45 am
  • Hi, 1 - I would like to know if a Map Task can produce more than 1 map output per execution? 2 - A Map Task can't be reused, right? When a Map Task instance produced a map outputs, this instance will ...
    Pedro Costa
    Aug 10, 2010 at 10:45 pm
    Aug 10, 2010 at 10:45 pm
  • Hi, guys, I am using hadoop 0.20.2, and I am trying to run the "SecondarySort" example. The following is the "FirstGroupingComparator" class, and I just cannot figure out how ...
    Dennis
    Aug 10, 2010 at 12:49 pm
    Aug 10, 2010 at 12:49 pm
  • Hi, 1 - I would like to compare programmatically the map output and the reduce input to see if they're equal in MR. So, I'm trying to do a hash on the output generated by the map, and on the input on ...
    Pedro Costa
    Aug 9, 2010 at 9:42 am
    Aug 9, 2010 at 9:42 am
  • Hi, How to chain two MR jobs using Hadoop C++ Pipes library? I have a C++ tool to be ported on Hadoop. I need to use two MR jobs which are to be chained such that output of the first MR job is the ...
    Sandeep Deshmukh
    Aug 6, 2010 at 6:46 pm
    Aug 6, 2010 at 6:46 pm
  • I have some job scope statistics of double type, and I want to use mechanism like Counters (Reporter). Does Hadoop MR have such kind of thing? Thanks.
    Wei Xue
    Aug 6, 2010 at 10:08 am
    Aug 6, 2010 at 10:08 am
  • Hi all, I'm trying to write a Mapper which accesses two separate SequenceFiles at the same time; the indices and corresponding rows all match up. In the previous hadoop.mapred.* package, there was a ...
    Shannon Quinn
    Aug 3, 2010 at 2:25 am
    Aug 3, 2010 at 2:25 am
  • Registration for Surge Scalability Conference 2010 is open for all attendees! We have an awesome lineup of leaders from across the various communities that support highly scalable architectures, as ...
    Jason Dixon
    Aug 2, 2010 at 4:04 pm
    Aug 2, 2010 at 4:04 pm
Group Overview
group: mapreduce-user @
categories: hadoop
discussions: 44
posts: 128
users: 50
website: hadoop.apache.org...
irc: #hadoop

50 users for August 2010

  • David Rosenstrauch: 28 posts
  • Pedro Costa: 12 posts
  • Harsh J: 7 posts
  • Dennis: 5 posts
  • Ted Yu: 5 posts
  • Aaron Kimball: 4 posts
  • Anfernee Xu: 4 posts
  • Shuja Rehman: 4 posts
  • Arun C Murthy: 3 posts
  • Josh Patterson: 3 posts
  • Ken Goodhope: 3 posts
  • Owen O'Malley: 3 posts
  • Vitaliy Semochkin: 3 posts
  • Xin Feng: 3 posts
  • David Jurgens: 2 posts
  • Hemanth Yamijala: 2 posts
  • Jason Dixon: 2 posts
  • Steve Hoffman: 2 posts
  • Torsten Curdt: 2 posts
  • Alejandro Montenegro: 1 post