77 discussions - 249 posts

  • Hi, I want to understand a basic concept in MR. If a mapper creates an instance of some class (using the 'new' operator), then the created class exists ONCE in the VM of this node. For each node. ...
    Eyal Golan
    Dec 30, 2011 at 11:13 am
    Jan 9, 2012 at 11:36 pm
  • Hi, We just ran large-scale Apache Nutch jobs in our evaluation of 20.205.0 and they all failed. Some of these jobs ran concurrently with the fair scheduler enabled. These were simple jobs ...
    Markus Jelsma
    Dec 27, 2011 at 12:48 am
    Jan 2, 2012 at 6:30 pm
  • Hi, Can anyone give the procedure for running the Distributed Shell example in Hadoop YARN, so that I can try to understand how the application master really works?
    Sri ram
    Dec 14, 2011 at 9:07 am
    Dec 16, 2011 at 2:23 am
  • Hi, We sometimes see reducers fail just when all mappers are finishing. All mappers finish roughly at the same time. The reducers only dump the following exception: java.lang.Throwable: Child Error ...
    Markus Jelsma
    Dec 26, 2011 at 7:45 pm
    Dec 27, 2011 at 11:14 am
  • Hi, I've been playing with 0.23.0, really nice stuff! I was able to set up a small test cluster (40 nodes) and launch the example jobs. I was also able to recompile old Hadoop programs with the new ...
    Avery Ching
    Dec 6, 2011 at 12:01 am
    Dec 10, 2011 at 5:22 pm
  • I've recently come across some interesting things happening within a 50-node cluster regarding the tasktrackers and task attempts. Essentially tasks are being created but they are sticking at 0.0% ...
    John Miller
    Dec 15, 2011 at 3:58 pm
    Dec 28, 2011 at 9:37 pm
  • Hi everyone, This is my first post on this list, as I am a newbie with Hadoop. I am looking on the web for some documentation and examples on how to use a DI framework with Hadoop. Basically I want to ...
    Eyal Golan
    Dec 26, 2011 at 10:13 am
    Dec 30, 2011 at 5:29 pm
  • The current hadoop implementation shuffles directly to disk and then those disk files are eventually requested by the target nodes which are responsible for doing the reduce() on the intermediate ...
    Kevin Burton
    Dec 20, 2011 at 11:56 pm
    Dec 21, 2011 at 8:33 am
  • Hi folks, I have a Hadoop 0.20.2 map-only job with thousands of input tasks; I'm using the org.apache.nutch.tools.arc.ArcInputFormat input format so each task corresponds to a single file in HDFS ...
    Mat Kelcey
    Dec 3, 2011 at 10:36 pm
    Dec 4, 2011 at 4:36 pm
  • Hi all, I am running Hadoop 0.23 on 5 nodes. I could run any YARN application or MapReduce job on this cluster before. But after I changed the ResourceManager node from node4 to node5, when I run ...
    Jingui Lee
    Dec 20, 2011 at 1:15 pm
    Dec 21, 2011 at 1:38 pm
  • Hi, I don't really understand the meaning of the sentences in "The Definitive Guide" (page 155): Tasktrackers have a fixed number of slots for map tasks and for reduce tasks: for example, a tasktracker ...
    Tan Jun
    Dec 12, 2011 at 3:19 am
    Dec 12, 2011 at 12:04 pm
  • Hi, We use Hadoop version 0.20.2. The log4j.properties file has a property *hadoop.tasklog.logsRetainHours* (mentioned as 24 hours by default), which we have set to 12. Despite this property being ...
    Sahana Bhat
    Dec 7, 2011 at 8:24 am
    Dec 8, 2011 at 3:57 am
  • Hi everyone, I want to run an MR job continuously, because I have streaming data and I try to analyze it all the time in my way (algorithm). For example, you want to solve the wordcount problem. It's the ...
    Dec 5, 2011 at 8:49 pm
    Dec 6, 2011 at 5:10 am
  • Hi, I use Yarn as resource management to deploy my run-time computing system. I follow http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html yarn-nodemanager-**.log: ...
    Bing Jiang
    Dec 29, 2011 at 8:28 am
    Jan 6, 2012 at 8:49 am
  • I know Hadoop YARN can support MapReduce jobs well, but I have not found a DAG model task. Can you give me some demonstration I missed, and point out how to build my own programming models in the ...
    Bing Jiang
    Dec 26, 2011 at 9:57 am
    Dec 27, 2011 at 8:07 am
  • Hey, We use capacity scheduler and divide our map slots among queues. For a particular kind of job, we want to schedule at most one task per task tracker. How does one do this? We are using Hadoop ...
    Nitin Khandelwal
    Dec 20, 2011 at 11:47 am
    Dec 21, 2011 at 5:03 am
  • Hi, I had some questions specifically on the Map-Reduce phase: [1] For the reduce phase, the TaskTrackers corresponding to the reduce node, poll the Job Tracker to know about maps that have completed ...
    Ann Pal
    Dec 16, 2011 at 1:33 pm
    Dec 17, 2011 at 3:13 pm
  • Hi, I want to know where information regarding completed map tasks is stored, i.e. how does a reduce task know on which tasktracker the completed map output data is available? Please let me know. ...
    Hadoop anis
    Dec 22, 2011 at 6:14 am
    Jan 4, 2012 at 12:22 pm
  • Hi, Costin: I work on HBase. I went over http://static.springsource.org/spring-hadoop/docs/current/reference/hbase.html but didn't have time to download the source code. Is there a typo: 'does more ...
    Ted Yu
    Dec 30, 2011 at 10:14 am
    Dec 30, 2011 at 11:14 am
  • Hi, The release notes for 0.22 ( http://hadoop.apache.org/common/releases.html#10+December%2C+2011%3A+release+0.22.0+available) it says By Security missing, what all features are missing? Does it ...
    Praveen Sripati
    Dec 29, 2011 at 2:41 pm
    Dec 29, 2011 at 3:40 pm
  • Hi experts, I have a query about the job.xml file in MapReduce. I set some value in mapred-site.xml and *marked as final*, say mapred.num.reduce.tasks=15. When I submit my job I explicitly specified ...
    Bejoy Ks
    Dec 9, 2011 at 6:44 am
    Dec 10, 2011 at 5:43 pm
  • Hi guys! I was trying to generate a job trace and a topology trace. I have Hadoop set up for hduser at /usr/local/hadoop and ran the wordcount program as hduser. I have the mapreduce component set up in ...
    Arun k
    Dec 8, 2011 at 6:34 am
    Dec 8, 2011 at 2:25 pm
  • Hello, Please find details below. I would like to resume the pending tasks (not sure why they went into pending state in the first place). I had a look at the logs, there were some task failures ...
    Keren Ouaknine
    Dec 6, 2011 at 4:03 pm
    Dec 6, 2011 at 5:40 pm
  • (moving to mapreduce-user@, bcc'ing common-user@) Hi Joey - You'll want to change the value on all of your servers running tasktrackers and then restart each tasktracker to reread the configuration. ...
    James Warren
    Dec 15, 2011 at 11:38 pm
    Dec 29, 2011 at 4:33 pm
  • Hello, Another newbie question. Suppose I want to use an external library (jar) in the mapper / reducer classes. (commons-lang, google's guava, etc.) In our environment, I added the jars into a ...
    Eyal Golan
    Dec 28, 2011 at 12:10 am
    Dec 28, 2011 at 11:43 am
  • Hi, We're sometimes seeing this exception if a map task already failed before due to, for example, an OOM error. Any ideas on how to address this issue? ...
    Markus Jelsma
    Dec 26, 2011 at 5:39 pm
    Dec 27, 2011 at 11:17 am
  • One key point I wanted to mention for Hadoop developers (but then check out the announcement). I implemented a version of sysstat (iostat, vmstat, etc) in Peregrine and would be more than happy to ...
    Kevin Burton
    Dec 27, 2011 at 6:31 am
    Dec 27, 2011 at 11:13 am
  • Hi, We have many different jobs running on a 0.22.0 cluster, each with its own memory consumption. Some jobs can easily be run with a large amount of *.tasks per job and others require much more ...
    Markus Jelsma
    Dec 19, 2011 at 11:04 pm
    Dec 20, 2011 at 2:19 pm
  • Hi guys! I have set up a 5-node cluster with each node in a different rack. I have hadoop-0.20.2 set up on my Eclipse Helios. So, I ran Tracebuilder using Main Class: ...
    Arun k
    Dec 16, 2011 at 6:52 am
    Dec 16, 2011 at 3:02 pm
  • Hi Friends, I want to know where the JobTracker stores task information, i.e. which task is being executed on which tasktracker, and how the JobTracker stores this information. If anyone knows this please ...
    Hadoop anis
    Dec 14, 2011 at 9:15 am
    Dec 15, 2011 at 5:40 am
  • Hi, there. I've run into an odd situation, and I'm wondering if there's a way around it; I'm trying to use Jackson for some JSON serialization in my program, and I wrote/unit-tested it to work with ...
    John Armstrong
    Dec 14, 2011 at 1:21 pm
    Dec 14, 2011 at 5:36 pm
  • Hello everyone, I'm running a small toy cluster (3 nodes) on EC2 configured as follows: * one node as JT+NN * two nodes as DN+TT I use whirr to build such cluster on demand (config file here ...
    Marco Didonna
    Dec 11, 2011 at 6:00 pm
    Dec 12, 2011 at 8:41 am
  • Moving to mapreduce-user@, bcc common-user@. Please use project specific lists. Niranjan, If you average 0.5G output per map, it's 5000 maps * 0.5G = 2.5TB over 12 reduces i.e. nearly 250G per ...
    Arun C Murthy
    Dec 9, 2011 at 6:29 pm
    Dec 11, 2011 at 1:32 am
  • I have several counters that I maintain to allow me to keep statistics on critical operations. I have my code incrementing the counters in an inner loop partly to make sure my job is not killed for ...
    Steve Lewis
    Dec 8, 2011 at 6:49 pm
    Dec 8, 2011 at 7:23 pm
  • Hi guys! In which Hadoop version can I find the adaptive scheduler of https://issues.apache.org/jira/browse/MAPREDUCE-1380 ? Can anyone tell me the difference between Dynamic scheduler and Adaptive ...
    Arun k
    Dec 7, 2011 at 2:37 pm
    Dec 7, 2011 at 5:59 pm
  • Hi All, I have just begun to use the Hadoop framework and am writing my reduce() method. I wonder if the Iterable input values (associated with the input key) are already sorted. Thanks! p.s. I am using the version ...
    James Y. Li
    Dec 6, 2011 at 12:55 am
    Dec 6, 2011 at 2:37 am
  • I have an array computed in the map stage and want its output recorded or sent to a file. The reducer can be naive and just send the output to a file. I need to get a text output written to file for this array.
    Matt Denn
    Dec 4, 2011 at 1:28 pm
    Dec 4, 2011 at 5:15 pm
  • Harsh, Sorry for creating confusion. The question is if i have a single node setup and i give Sysout statements in maptask.java and reducetask.java. {HADOOP_HOME}$ant build {HADOOP_HOME}$start all ...
    Arun k
    Dec 3, 2011 at 7:10 am
    Dec 3, 2011 at 2:30 pm
  • Bccing common-user and ccing mapred-user. Please use the correct mailing lists for your questions. You can use -Dstream.map.output.field.separator= for specifying the separator. The link below ...
    Mahadev Konar
    Dec 27, 2011 at 12:12 am
    Dec 28, 2011 at 4:59 pm
  • Hi, I had the following questions related to Yarn: [1] How does the Application Master know where the data is, to give a list to Resource Manager? Is it talking to the Name Node? [2] How does ...
    Ann Pal
    Dec 21, 2011 at 12:06 am
    Dec 21, 2011 at 12:48 am
  • Hi, When resources are allocated in Map reduce Next gen, it can be based on cpu, memory, disk and network bandwidth. Is network bandwidth the bandwidth from server to the switch (TOR) it is connected ...
    Ann Pal
    Dec 20, 2011 at 6:57 pm
    Dec 20, 2011 at 7:28 pm
  • Hi all, I got the following exception when I submitted a Hadoop streaming job to my Hadoop cluster. I wrote the mapper in the Perl language, and there is no reducer. The mapper script runs well on local ...
    Yu Yang
    Dec 19, 2011 at 3:45 am
    Dec 19, 2011 at 4:24 pm
  • Hi, Apologies for cross-posting. We're in the process of migrating data from an Apache Hadoop cluster to a 0.22.0 cluster using distcp with a hftp source and hdfs dest as described in the ...
    Markus Jelsma
    Dec 19, 2011 at 8:24 am
    Dec 19, 2011 at 12:30 pm
  • Hi, I want to read a 100MB file that is in HDFS. How should I do it? Is it with IOUtils.readFully? Can anyone give me an example? -- Thanks,
    Pedro Costa
    Dec 16, 2011 at 3:58 pm
    Dec 16, 2011 at 4:52 pm
  • I am reporting on the performance of a Hadoop task on a cluster with about 50 nodes. I would like to be able to report performance on clusters of 5, 10, 20 nodes without changing the current cluster. Is ...
    Steve Lewis
    Dec 15, 2011 at 10:03 pm
    Dec 16, 2011 at 12:09 am
  • Hi, I am trying to run a shell command from within a mapper. The shell command is of the form: * hadoop jar somjarfile arg1 arg2 ...* Can I do this type of operation from within a mapper? Also, can I ...
    Souri datta
    Dec 14, 2011 at 11:24 am
    Dec 14, 2011 at 9:09 pm
  • Hi Guys! I want to analyse the completed job counters like FILE/HDFS BYTES READ/WRITTEN along with other values like average map/reduce task run time. I see that the Jobtracker GUI has this info but I ...
    Arun k
    Dec 14, 2011 at 2:40 pm
    Dec 14, 2011 at 4:06 pm
  • Hi Hadoop users, In my company we have been using Hadoop for 2 years and we need to pause and resume map reduce jobs. I was searching on Hadoop JIRA and there are a couple of tickets which are not ...
    Dino Kečo
    Dec 13, 2011 at 12:44 am
    Dec 13, 2011 at 1:40 am
  • Hi, I am trying to form a Hadoop cluster of version 0.23 in secure mode. While starting the nodemanager I get the following error 2011-12-12 15:37:26,874 INFO ipc.HadoopYarnRPC ...
    Sri ram
    Dec 12, 2011 at 10:15 am
    Dec 12, 2011 at 5:22 pm
  • Hi guys! Can I access the job counters displayed in the web GUI from Hadoop code once the job has finished its execution? If so, how can I access the values like "average task run time" and counters ...
    Arun k
    Dec 11, 2011 at 6:42 am
    Dec 12, 2011 at 3:31 am
Group Overview
group: mapreduce-user @

66 users for December 2011

  • Harsh J: 23 posts
  • Markus Jelsma: 21 posts
  • Arun C Murthy: 20 posts
  • Arun k: 19 posts
  • Praveen Sripati: 10 posts
  • Robert Evans: 9 posts
  • Eyal Golan: 8 posts
  • Bejoy Ks: 7 posts
  • Kevin Burton: 6 posts
  • Mahadev Konar: 6 posts
  • Raghavendhra rahul: 6 posts
  • Avery Ching: 5 posts
  • John Miller: 5 posts
  • Nitin Khandelwal: 5 posts
  • Ann Pal: 4 posts
  • Costin Leau: 4 posts
  • Hadoop anis: 4 posts
  • Keren Ouaknine: 4 posts
  • Sri ram: 4 posts
  • Bing Jiang: 3 posts