83 discussions - 284 posts

  • Hi, I am trying to understand more about Hadoop Next Gen Map Reduce and had the following questions based on the following post: ...
    Ann Pal
    Jan 4, 2012 at 4:24 pm
    Jan 6, 2012 at 9:13 pm
  • Hi, I need to provide a lot of 3rd-party libraries (both Java and native), and doing that using the generic option parser (-libjars and -files arguments) is a little bit messy. I was wondering if it is ...
    Samir Eljazovic
    Jan 3, 2012 at 12:09 am
    Jan 7, 2012 at 3:54 pm
  • Hi, I've upgraded my Hadoop cluster to version 1.0.0. The upgrade process went relatively smoothly, but it rendered the cluster inoperable due to errors in the JobTracker's operation: # in job output Error ...
    Marcin Cylke
    Jan 31, 2012 at 11:22 am
    Jan 31, 2012 at 10:09 pm
  • Hi, an off-topic question: apart from Hadoop, what are the other distributed computing platforms? -- Regards, R.V.
    Real great..
    Jan 30, 2012 at 1:41 pm
    Jan 31, 2012 at 3:23 am
  • Hi, I want to save reducers' outputs like other files in Hadoop. Does the NameNode keep any information about them? How can I do this? Or can I add a new component to Hadoop like the NameNode and make ...
    Aliyeh saeedi
    Jan 29, 2012 at 6:05 am
    Jan 30, 2012 at 8:02 am
  • Hello, I'm really new to Hadoop and I was wondering if the MapReduce programming model from Hadoop is a good choice only for processing large amounts of data from a file, database, or queue? Thanks!
    Neo21 zerro
    Jan 27, 2012 at 10:58 am
    Jan 29, 2012 at 3:32 am
  • Hi, I've been trying to test HBase 0.92 (prerelease) with 0.23.1-SNAPSHOT but have run into a couple of issues. Perhaps I'm doing something wrong. What I've done:  - Checked out Hadoop branch-0.23 ...
    Andrew Purtell
    Jan 11, 2012 at 2:12 am
    Jan 14, 2012 at 7:19 pm
  • Hi,
    Raghavendhra rahul
    Jan 11, 2012 at 8:35 am
    Jan 12, 2012 at 8:20 am
  • Hi people, I wrote this code to implement per-term indexing (Ivory), like figure 4 of the paper http://www.dcs.gla.ac.uk/~richardm/papers/IPM_MapReduce.pdf but it doesn't print anything into part-00000 ...
    Luiz
    Jan 29, 2012 at 12:51 am
    Jan 31, 2012 at 4:09 pm
  • I'm compiling a list of all Hadoop ecosystem/sub projects ordered alphabetically and I need your help if I missed something. 1. Ambari 2. Avro 3. Cascading 4. Cascalog 5. Cassandra 6. Chukwa 7. ...
    Ayad Al-Qershi
    Jan 28, 2012 at 3:59 pm
    Jan 28, 2012 at 6:42 pm
  • Hi people, does anybody know where I could find an implementation of per-term inverted indexing (Ivory), like the one shown in figure 4 of the paper ...
    Luiz Antonio Falaguasta Barbosa
    Jan 25, 2012 at 6:21 pm
    Jan 26, 2012 at 11:54 am
  • Hi, On our 0.20.205.0 test cluster we sometimes see tasks failing for no clear reason. The task tracker logs show us: 2012-01-03 21:16:27,256 WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve ...
    Markus Jelsma
    Jan 3, 2012 at 9:56 pm
    Jan 23, 2012 at 2:28 pm
  • Hello, Our Hadoop cluster is set up on EC2, but our client machine, which will trigger the M/R job, is in our data center. I am trying to start an M/R job from our client machine, but I am getting this: ...
    Something Something
    Jan 15, 2012 at 8:17 am
    Jan 16, 2012 at 5:16 am
  • Can someone explain how the map reduce merge is done? As far as I can tell, it appears to pull all of the spill files into one giant file to send to the reducer. Is this correct? Even if you set ...
    Bai Shen
    Jan 12, 2012 at 4:27 pm
    Jan 13, 2012 at 1:33 pm
  • Hi all, I am trying to write an application master. Is there a way to specify node1: 10 containers, node2: 10 containers? Can we specify this kind of list using the application master?
    Raghavendhra rahul
    Jan 10, 2012 at 6:13 am
    Jan 11, 2012 at 8:59 am
  • Hi all, I am trying to write an application master. Is there a way to specify node1: 10 containers, node2: 10 containers? Can we specify this kind of list using the application master? Also I set ...
    Raghavendhra rahul
    Jan 6, 2012 at 11:45 am
    Jan 11, 2012 at 1:37 am
  • Hello, We are trying to use Hadoop-0.20.203.0rc1 for parallel computation. Below are our queries. Assume a single high-configuration machine with 8 cores and 8 GB of memory. (a) How do we know the number of map ...
    Satish Setty (HCL Financial Services)
    Jan 5, 2012 at 5:38 pm
    Jan 10, 2012 at 4:23 am
  • We would like to announce that YSmart Release 12.01 (effectively version 0.1) is available. YSmart is software that translates an SQL query to Hadoop Java programs. Compared to other existing ...
    Yin Huai
    Jan 30, 2012 at 5:41 pm
    Jan 31, 2012 at 4:38 pm
  • While building Hadoop trunk I came across the following error. Can anyone tell me what is behind this failure? main: [exec] protoc: error while loading shared libraries: libprotobuf.so.7: ...
    Rajesh putta
    Jan 23, 2012 at 7:31 pm
    Jan 28, 2012 at 1:31 am
  • Hello friends, I wrote a reduce() that receives a large dataset as text values from the map(). The purpose of the reduce() is to compute the distance between each item in the values text. When I ...
    Ahmed Abdeen Hamed
    Jan 23, 2012 at 8:30 pm
    Jan 24, 2012 at 2:50 am
  • Hi, What is the minimum size of a container in Hadoop YARN? capability.setMemory(xx);
    Raghavendhra rahul
    Jan 18, 2012 at 7:25 am
    Jan 18, 2012 at 7:58 am
  • Hello guys, 1. I have a concern with my 3-node cluster: I run the capacity scheduler with 4 queues, and one has 30% of cluster resources. The problem is that when I schedule a job, all tasks are assigned to ...
    Marek Miglinski
    Jan 10, 2012 at 4:35 pm
    Jan 12, 2012 at 9:40 am
  • Hi, I had been going through the MRv2 documentation and have the following queries 1) Let's say that an InputSplit is on Node1 and Node2. Can the ApplicationMaster ask the ResourceManager for a ...
    Praveen Sripati
    Jan 5, 2012 at 4:30 pm
    Jan 8, 2012 at 7:26 am
  • Hi, I want to send some data and messages to all nodes after I run an MR job, then begin another job. Is there any straightforward way to broadcast data under the Hadoop framework?
    Hamid Oliaei
    Jan 30, 2012 at 12:11 pm
    Aug 22, 2012 at 11:46 am
  • I have a problem that needs to be solved by an iteration of MapReduce jobs, and in each iteration I need to start by doing an equijoin between a large constant dataset and the output of the previous ...
    Mike Spreitzer
    Jan 15, 2012 at 8:21 am
    Jan 15, 2012 at 9:53 pm
  • Hi, are the following the only steps to turn on the capacity scheduler? [1] Edit conf/yarn-site.xml to include: yarn.resourcemanager.scheduler.class - ...
    Ann Pal
    Jan 10, 2012 at 8:47 pm
    Jan 11, 2012 at 10:16 pm
  • Hi, I have some questions and I would be really grateful to know the answers. As I read in the Hadoop tutorial, "the output files written by the Reducers are then left in HDFS for user use, either by ...
    Aliyeh saeedi
    Jan 1, 2012 at 2:34 pm
    Jan 9, 2012 at 3:30 pm
  • Hi, I am able to run 0.23 on a single node and am trying to set it up on a cluster, but I am getting errors. When I try to start the data nodes, I get the errors below. I have also tried adding `export ...
    Praveen Sripati
    Jan 6, 2012 at 5:52 pm
    Jan 7, 2012 at 3:39 pm
  • Hello, I have been working on profiling the performance of certain parts of Hadoop 0.20.203.0. For this reason, I have set up a simple cluster that uses one node as the Namenode/Jobtracker, and one ...
    Sven Groot
    Jan 27, 2012 at 6:25 am
    Jan 31, 2012 at 9:22 pm
  • Hi, I am learning Hadoop now. I am trying to write a customized inputformat. I found out that there are two FileInputFormat's in the package, org.apache.hadoop.mapred and ...
    GUOJUN Zhu
    Jan 30, 2012 at 6:21 pm
    Jan 30, 2012 at 6:40 pm
  • I am running a mapper job which generates a large number of output records for every input record. about 32,000,000,000 output records from about 150 mappers - each record about 200 bytes The job is ...
    Steve Lewis
    Jan 18, 2012 at 5:50 pm
    Jan 26, 2012 at 4:23 pm
  • Hi All, I am experimenting with a MapReduce program on Hadoop-0.19. This program has a single input file with 7 records (later it can have many records in multiple files) and each input is supposed to produce 11 ...
    Thamizhannal Paramasivam
    Jan 21, 2012 at 6:18 pm
    Jan 24, 2012 at 11:44 am
  • Hello friends, I am new to Apache MapReduce. I just wrote a program that processes 89 files, each of which is 10000 lines. The program runs a clustering algorithm of the contents of the 89 files. The ...
    Ahmed Abdeen Hamed
    Jan 18, 2012 at 9:30 pm
    Jan 19, 2012 at 4:31 pm
  • Hi experts, a quick question. I have quite a few MapReduce jobs running on my cluster. One job's input itself has a large number of files; I'd like to know which split was processed by each map task ...
    Bejoy Ks
    Jan 16, 2012 at 1:16 pm
    Jan 16, 2012 at 5:40 pm
  • I am learning Hadoop and had a question about writing Map Reduce jobs in newer versions of Hadoop. Is Next Generation Map Reduce a change just for our system administrator or do I (developer) already ...
    Dbadave85
    Jan 13, 2012 at 3:32 pm
    Jan 13, 2012 at 4:56 pm
  • Hi, I am experiencing slow startup times when submitting a MapReduce job (single node cluster for testing purposes) and was hoping someone could confirm that this was expected behavior. Even a job ...
    Scott Lindner
    Jan 10, 2012 at 3:36 pm
    Jan 11, 2012 at 3:51 am
  • Hi, how do I set the maximum number of containers that can execute on each node, so that only that many containers run on that node at a time?
    Raghavendhra rahul
    Jan 10, 2012 at 12:07 pm
    Jan 10, 2012 at 11:41 pm
  • Two cleanup-related questions: Can I execute context.write from the reduce/map cleanup phase? Should I expect cleanup to be killed when a task fails or is killed (speculative execution)? The idea is to ...
    Mefa Grut
    Jan 10, 2012 at 1:51 pm
    Jan 10, 2012 at 9:05 pm
  • Hi, I am going to save files written by reducers, but I wonder: when the disk space of one node is full, what will Hadoop do? Does Hadoop set the node aside? Thanks for your attention.
    Aliyeh saeedi
    Jan 10, 2012 at 10:38 am
    Jan 10, 2012 at 3:08 pm
  • Hi, I am trying to set up 0.23 on a cluster and am stuck with errors while starting the NodeManager. The slaves file is proper and I am able to do a password-less ssh from the master to the slaves. ...
    Praveen Sripati
    Jan 9, 2012 at 12:17 pm
    Jan 10, 2012 at 3:03 pm
  • Hi, We sometimes see tasks failing with the exception below. There are no network issues and the domain name resolves normally. Also, all nodes have a local DNS caching daemon running. Any idea why we ...
    Markus Jelsma
    Jan 6, 2012 at 2:23 pm
    Jan 7, 2012 at 1:01 pm
  • I tried connecting to a 0.20.205 Hadoop cluster and using the methods on JobClient to query the JobTracker status and get a list of jobs, etc., much like the JobTracker Web UI shows. Code is: ...
    Joseph McMahon
    Jan 19, 2012 at 8:16 pm
    Feb 1, 2012 at 1:33 pm
  • I have a problem at hand that seems to need "local" reducing: I have a large data input, in which each line is a data mapping, something like "name : attribute". The attributes for the same name are ...
    Jianhui Zhang
    Jan 29, 2012 at 8:08 am
    Feb 1, 2012 at 1:28 pm
  • Hi, does Hadoop treat reducers' output like other files with respect to replication and keeping their metadata in the NameNode?
    Aliyeh saeedi
    Jan 30, 2012 at 6:37 am
    Jan 30, 2012 at 6:46 am
  • Hi, How is the reduce node chosen in 0.23? What parameters determine the choice of reduce node? Does it depend on map node placement? Thanks!
    Ann Pal
    Jan 19, 2012 at 6:10 pm
    Jan 19, 2012 at 7:06 pm
  • Hi, I have a question regarding reduce functionality. A reduce function receives a key and a list of values as arguments; is there any limit on the number of elements in the value list which is received as ...
    Ajit Ratnaparkhi
    Jan 19, 2012 at 7:09 am
    Jan 19, 2012 at 7:23 am
  • I understand that normally map tasks are run close to the input files, but in my application the input file is a txt file with many lines of query params, and the mapper reads out each line, uses the ...
    Yang
    Jan 17, 2012 at 9:02 pm
    Jan 18, 2012 at 7:35 am
  • Hi, I receive a DataBag in my custom UDF and want to sort it by the first field of the Tuples it stores. The way I implemented it is: I create a List of Tuples, add all Tuples from the DataBag to the List, and then use ...
    Marek Miglinski
    Jan 17, 2012 at 4:29 pm
    Jan 17, 2012 at 6:43 pm
  • Hi, We have a job that is IO bound. The mapper aggregates the keys and the reducer has to look up the incoming keys externally. If this runs serially with 15 reducers it takes many days, so we are ...
    Markus Jelsma
    Jan 16, 2012 at 3:24 pm
    Jan 16, 2012 at 5:09 pm
  • Having looked at a few releases of Hadoop, I am surprised to find that in most of them the CompositeInputFormat class is in mapred but not mapreduce. While there is a CompositeInputFormat under ...
    Mike Spreitzer
    Jan 15, 2012 at 7:49 am
    Jan 16, 2012 at 5:20 am
Group Overview
group: mapreduce-user @
categories: hadoop
discussions: 83
posts: 284
users: 80
website: hadoop.apache.org...
irc: #hadoop

80 users for January 2012

Raghavendhra rahul: 24 posts
Harsh J: 23 posts
Arun C Murthy: 18 posts
Markus Jelsma: 14 posts
Bejoy Ks: 10 posts
Praveen Sripati: 10 posts
Aliyeh saeedi: 8 posts
Robert Evans: 8 posts
Ronald Petty: 8 posts
Ashwanth Kumar: 7 posts
Bai Shen: 7 posts
Ann Pal: 6 posts
Bing Jiang: 6 posts
Luiz Antonio Falaguasta Barbosa: 6 posts
Steve Lewis: 6 posts
Ahmed Abdeen Hamed: 5 posts
Real great..: 5 posts
Samir Eljazovic: 5 posts
Andrew Purtell: 4 posts
Eyal Golan: 4 posts