28 discussions - 108 posts

  • Hi, all, quick question about using PathFilter. Is there any way to provide information from the job configuration to a PathFilter instance? In my case, I want to limit the date range of the files ...
    Kris Nuttycombe
    Apr 12, 2010 at 3:14 pm
    Apr 14, 2010 at 2:34 pm
  • Hello, I'm using Hadoop to run a memory intensive job on different input datum. The job requires the availability (in memory) of some read-only HashMap, about 4Gb in size. The same fixed HashMap is ...
    Danny Leshem
    Apr 29, 2010 at 12:59 pm
    Apr 30, 2010 at 7:46 am
  • Hi everyone, Yesterday I've started to look into Hadoop as I was trying to understand Mahout's FPGrowth algorithm. I have a few questions: Given that I have two tables containing information about ...
    Sebastian Feher
    Apr 23, 2010 at 2:18 pm
    Apr 25, 2010 at 8:48 am
  • Hello, I want to investigate the matter of running hadoop MapReduce jobs over the Internet. I don't mean in private computers, all of them in different places, rather a collection of datacenters, ...
    Altanis
    Apr 17, 2010 at 11:29 am
    Apr 20, 2010 at 9:13 pm
  • Hello I am trying to pass an object of type LexicalizedParser (which is from an imported jar file stanford-parser) from the main method to the Mapper Class. This is to load the trained model first ...
    Rishav
    Apr 20, 2010 at 11:17 am
    Apr 20, 2010 at 8:17 pm
  • Hello all, I got the time out error as mentioned below -- after 600 seconds, that attempt was killed and the attempt would be deemed a failure. I searched around about this error, and one of the ...
    Raghava Mutharaju
    Apr 8, 2010 at 5:31 pm
    Apr 18, 2010 at 8:25 am
  • Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs:///test-batchEventLog/metrics/data at ...
    Kris Nuttycombe
    Apr 6, 2010 at 7:46 pm
    Apr 11, 2010 at 2:27 pm
  • Hi, What's the best way to partition data generated from Reducer into multiple directories in Hadoop 0.20.1. I was thinking of using MultipleTextOutputFormat but that's not backward compatible ...
    Rakesh kothari
    Apr 3, 2010 at 12:32 am
    Apr 6, 2010 at 2:02 am
  • Hi, all, I'm new to Hadoop, and I'm finding myself having a hard time creating highly configurable Mapper and Reducer instances, due to the fact that Hadoop seems to require that Mapper and Reducer ...
    Kris Nuttycombe
    Apr 2, 2010 at 7:06 pm
    Apr 5, 2010 at 3:40 am
  • Hi, The reduce tasks are threads that are launched by the Reducer. The print below shows the stacktrace of one reduce task. at ...
    Psdc1978
    Apr 27, 2010 at 11:28 am
    May 1, 2010 at 5:48 pm
  • Hi, I am working on TCP communications in Hadoop and I would like to know if it is existing a way to separate easily communications between the processes of HDFS and MapReduce (without modifying ...
    Druilhe Remi
    Apr 26, 2010 at 1:24 pm
    Apr 28, 2010 at 8:33 am
  • Hi, When I run a MapReduce example, I've noticed that some temporary directories are built in /tmp directory. In my case, in the /tmp/hadoop directory it was created the following file directory ...
    Psdc1978
    Apr 5, 2010 at 10:55 am
    Apr 6, 2010 at 5:11 am
  • Hi all, I am running a series of jobs one after another. While executing the 4th job, the job fails. It fails in the reducer --- the progress percentage would be map 100%, reduce 99%. It gives out ...
    Raghava Mutharaju
    Apr 1, 2010 at 6:25 am
    Apr 3, 2010 at 4:16 am
  • Hi, I want to add a Java System property that a running reduce task should be able to read . I added it to the hadoop-env.sh script in the HADOOP_OPTS. So, like export ...
    Deepika Khera
    Apr 30, 2010 at 10:50 pm
    May 3, 2010 at 8:35 pm
  • Hi, The MapReduce tutorial specifies that InputSplit generated by the InputFormat for the job. But, the mapred.map.tasks definition is mapred.job.tracker is "local". So, is the number of map tasks ...
    Praveen Sripati
    Apr 25, 2010 at 10:59 am
    Apr 27, 2010 at 12:01 pm
  • Hi All, How do I dynamically remove a tasktracker? Do I simply kill the tasktracker process, and the namenode will detect it? Or is there a graceful way of doing it? Thanks, Harold
    Harold Lim
    Apr 14, 2010 at 7:16 pm
    Apr 20, 2010 at 1:36 am
  • Hi, I have been using the JobConf.setOutputCommitter() method to set my own OutputCommitter for a map reduce job. With hadoop v 0.20 since this class is deprecated, what will be the alternate to set ...
    Deepika Khera
    Apr 7, 2010 at 10:53 pm
    Apr 12, 2010 at 6:30 pm
  • Hi, Is there a way to get jobtracker metrics, such as job completed, etc? For HDFS, I am using jmx to get the metrics. However, when I use jmx to get metrics from the job tracker, the only bean I see ...
    Harold Lim
    Apr 9, 2010 at 1:06 am
    Apr 12, 2010 at 6:23 pm
  • Hi, I'm trying to use Distributed cache to add a jar file to the job and I'm using the following code, but it didn't work. If you have used it I would appreciate if you can post a sample Thank you ...
    Raja Thiruvathuru
    Apr 3, 2010 at 6:44 am
    Apr 3, 2010 at 10:50 pm
  • Hi All, I was looking at the jobtracker metrics and it seems to be able to give me: jobs_completed, jobs_submitted, maps_completed, maps_launched, reduces_completed, reduces_launched. I was wondering ...
    Harold Lim
    Apr 26, 2010 at 7:14 pm
    Apr 27, 2010 at 11:38 am
  • Hello folks, Possible newbie question here: from http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Counter.html : "Counters represent global counters, defined either by the ...
    George Stathis
    Apr 20, 2010 at 3:08 am
    Apr 21, 2010 at 1:18 pm
  • Hello All, I've been looking for the documentation and the API docs, however I couldn't find an OutputFormat that will go through all the reduce emits as key:value pairs by running an execute method ...
    Utku Can Topçu
    Apr 19, 2010 at 11:46 am
    Apr 20, 2010 at 8:38 pm
  • Hi, all, I'm having problems with my Mapper instances accessing the DistributedCache. A bit of background: I'm running on a single-node cluster, just trying to get my first map/reduce job ...
    Kris Nuttycombe
    Apr 15, 2010 at 6:06 pm
    Apr 16, 2010 at 5:21 am
  • Hi, I posted this last month, but I haven't got a response. Does anyone know the answer to this question? I would like to understand what's the purpose of a setup and cleanup task. During the start-up of ...
    Psdc1978
    Apr 2, 2010 at 4:00 pm
    Apr 4, 2010 at 12:05 am
  • I've got a MR job which uses the MultipleOutputs class to send an additional set of output to a different path. However, I'm not sure how to then specify this output as the input to a new job: ...
    Whitney Sorenson
    Apr 9, 2010 at 10:23 pm
    Apr 9, 2010 at 10:23 pm
  • Hello, we would like to invite everyone interested in data storage, analysis and search to join us for two days on June 7/8th in Berlin for an in-depth, technical, developer-focused conference ...
    Isabel Drost
    Apr 8, 2010 at 10:22 am
    Apr 8, 2010 at 10:22 am
  • I set up a two-node cluster and I have a large file that's distributed between these two nodes. When I submit a MapReduce job, I get the following error. I appreciate your help. Thank you Raja ...
    Raja Thiruvathuru
    Apr 6, 2010 at 11:30 pm
    Apr 6, 2010 at 11:30 pm
  • Take them out of dfs.exclude and refreshnodes again.
    Allen Wittenauer
    Apr 1, 2010 at 3:27 am
    Apr 1, 2010 at 3:27 am
Group Overview
group: mapreduce-user @
period: Apr 2010
categories: hadoop
discussions: 28
posts: 108
users: 44
website: hadoop.apache.org...
irc: #hadoop

44 users for April 2010

  • Kris Nuttycombe: 16 posts
  • Raghava Mutharaju: 7 posts
  • Jeff Zhang: 5 posts
  • Raja Thiruvathuru: 5 posts
  • Eric Sammer: 4 posts
  • Harold Lim: 4 posts
  • Psdc1978: 4 posts
  • Allen Wittenauer: 3 posts
  • David Rosenstrauch: 3 posts
  • Deepika Khera: 3 posts
  • Druilhe Remi: 3 posts
  • Rishav: 3 posts
  • Sebastian Feher: 3 posts
  • Sonal Goyal: 3 posts
  • Altanis: 2 posts
  • Arun C Murthy: 2 posts
  • Danny Leshem: 2 posts
  • George Stathis: 2 posts
  • Hemanth Yamijala: 2 posts
  • J.G.Konrad: 2 posts