-
Hi, I am trying to understand more about Hadoop Next Gen Map Reduce and had the following questions based on the following post: ...
Ann Pal
Jan 4, 2012 at 4:24 pm
Jan 6, 2012 at 9:13 pm -
Hi, I need to provide a lot of 3rd-party libraries (both Java and native), and doing that using the generic option parser (-libjars and -files arguments) is a little bit messy. I was wondering if it is ...
Samir Eljazovic
Jan 3, 2012 at 12:09 am
Jan 7, 2012 at 3:54 pm -
Hi, I've upgraded my Hadoop cluster to version 1.0.0. The upgrade process went relatively smoothly, but it rendered the cluster inoperable due to errors in the JobTracker's operation: # in job output Error ...
Marcin Cylke
Jan 31, 2012 at 11:22 am
Jan 31, 2012 at 10:09 pm -
Hi, an off-topic question: apart from Hadoop, what are the other distributed computing platforms? -- Regards, R.V.
Real great..
Jan 30, 2012 at 1:41 pm
Jan 31, 2012 at 3:23 am -
Hi, I want to save reducers' outputs like other files in Hadoop. Does the NameNode keep any information about them? How can I do this? Or can I add a new component to Hadoop like NameNode and make ...
Aliyeh saeedi
Jan 29, 2012 at 6:05 am
Jan 30, 2012 at 8:02 am -
Hello, I'm really new to Hadoop and I was wondering if the MapReduce programming model from Hadoop is a good choice only for processing large amounts of data from a file, database, or queue. Thanks!
Neo21 zerro
Jan 27, 2012 at 10:58 am
Jan 29, 2012 at 3:32 am -
Hi, I've been trying to test HBase 0.92 (prerelease) with 0.23.1-SNAPSHOT but have run into a couple of issues. Perhaps I'm doing something wrong. What I've done:  - Checked out Hadoop branch-0.23 ...
Andrew Purtell
Jan 11, 2012 at 2:12 am
Jan 14, 2012 at 7:19 pm -
Hi,
Raghavendhra rahul
Jan 11, 2012 at 8:35 am
Jan 12, 2012 at 8:20 am -
Hi people, I wrote this code to implement per-term indexing (Ivory), like figure 4 of the paper http://www.dcs.gla.ac.uk/~richardm/papers/IPM_MapReduce.pdf but it doesn't print anything into part-00000 ...
Luiz
Jan 29, 2012 at 12:51 am
Jan 31, 2012 at 4:09 pm -
I'm compiling a list of all Hadoop ecosystem/sub projects ordered alphabetically and I need your help if I missed something. 1. Ambari 2. Avro 3. Cascading 4. Cascalog 5. Cassandra 6. Chukwa 7. ...
Ayad Al-Qershi
Jan 28, 2012 at 3:59 pm
Jan 28, 2012 at 6:42 pm -
Hi people, does anybody know where I could find an implementation of per-term inverted indexing (Ivory), like the one shown in figure 4 of the paper ...
Luiz Antonio Falaguasta Barbosa
Jan 25, 2012 at 6:21 pm
Jan 26, 2012 at 11:54 am -
Hi, On our 0.20.205.0 test cluster we sometimes see tasks failing for no clear reason. The task tracker logs show us: 2012-01-03 21:16:27,256 WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve ...
Markus Jelsma
Jan 3, 2012 at 9:56 pm
Jan 23, 2012 at 2:28 pm -
Hello, our Hadoop cluster is set up on EC2, but our client machine, which will trigger the M/R job, is in our data center. I am trying to start an M/R job from our client machine, but I am getting this: ...
Something Something
Jan 15, 2012 at 8:17 am
Jan 16, 2012 at 5:16 am -
Can someone explain how the map reduce merge is done? As far as I can tell, it appears to pull all of the spill files into one giant file to send to the reducer. Is this correct? Even if you set ...
Bai Shen
Jan 12, 2012 at 4:27 pm
Jan 13, 2012 at 1:33 pm -
Hi all, I am trying to write an application master. Is there a way to specify node1: 10 containers, node2: 10 containers? Can we specify this kind of list using the application master?
Raghavendhra rahul
Jan 10, 2012 at 6:13 am
Jan 11, 2012 at 8:59 am -
Hi all, I am trying to write an application master. Is there a way to specify node1: 10 containers, node2: 10 containers? Can we specify this kind of list using the application master? Also I set ...
Raghavendhra rahul
Jan 6, 2012 at 11:45 am
Jan 11, 2012 at 1:37 am -
Hello, we are trying to use Hadoop-0.20.203.0rc1 for parallel computation. Below are our queries. Assume a single high-configuration machine with 8 cores and 8 GB of memory. (a) How do we know number of map ...
Satish Setty (HCL Financial Services)
Jan 5, 2012 at 5:38 pm
Jan 10, 2012 at 4:23 am -
We would like to announce that YSmart Release 12.01 (effectively version 0.1) is available. YSmart is a software that translates an SQL query to Hadoop Java programs. Compared to other existing ...
Yin Huai
Jan 30, 2012 at 5:41 pm
Jan 31, 2012 at 4:38 pm -
While building Hadoop trunk I came across the following error. Can anyone tell me what is causing this failure? main: [exec] protoc: error while loading shared libraries: libprotobuf.so.7: ...
Rajesh putta
Jan 23, 2012 at 7:31 pm
Jan 28, 2012 at 1:31 am -
Hello friends, I wrote a reduce() that receives a large dataset as text values from the map(). The purpose of the reduce() is to compute the distance between each pair of items in the values text. When I ...
Ahmed Abdeen Hamed
Jan 23, 2012 at 8:30 pm
Jan 24, 2012 at 2:50 am -
Hi, what is the minimum size of a container in Hadoop YARN? capability.setmemory(xx);
Raghavendhra rahul
Jan 18, 2012 at 7:25 am
Jan 18, 2012 at 7:58 am -
Hello guys, 1. I have a concern with my 3-node cluster: I run the capacity scheduler with 4 queues, and one has 30% of cluster resources. The problem is that when I schedule a job, all tasks are assigned to ...
Marek Miglinski
Jan 10, 2012 at 4:35 pm
Jan 12, 2012 at 9:40 am -
Hi, I had been going through the MRv2 documentation and have the following queries 1) Let's say that an InputSplit is on Node1 and Node2. Can the ApplicationMaster ask the ResourceManager for a ...
Praveen Sripati
Jan 5, 2012 at 4:30 pm
Jan 8, 2012 at 7:26 am -
Hi, I want to send some data and messages to all nodes after I run an MR job, then begin another job. Is there a straightforward way to broadcast data under the Hadoop framework?
Hamid Oliaei
Jan 30, 2012 at 12:11 pm
Aug 22, 2012 at 11:46 am -
I have a problem that needs to be solved by an iteration of MapReduce jobs, and in each iteration I need to start by doing an equijoin between a large constant dataset and the output of the previous ...
Mike Spreitzer
Jan 15, 2012 at 8:21 am
Jan 15, 2012 at 9:53 pm -
Hi, are the following the only steps to turn on the capacity scheduler? [1] Edit conf/yarn-site.xml to include: yarn.resourcemanager.scheduler.class - ...
Ann Pal
Jan 10, 2012 at 8:47 pm
Jan 11, 2012 at 10:16 pm -
Hi I have some questions and I would be really grateful to know the answer. As I read in hadoop tutorial "the output files written by the Reducers are then left in HDFS for user use, either by ...
Aliyeh saeedi
Jan 1, 2012 at 2:34 pm
Jan 9, 2012 at 3:30 pm -
Hi, I am able to run 0.23 on a single node and am trying to set it up on a cluster, but I am getting errors. When I try to start the data nodes, I get the errors below. I have also tried adding `export ...
Praveen Sripati
Jan 6, 2012 at 5:52 pm
Jan 7, 2012 at 3:39 pm -
Hello, I have been working on profiling the performance of certain parts of Hadoop 0.20.203.0. For this reason, I have set up a simple cluster that uses one node as the Namenode/Jobtracker, and one ...
Sven Groot
Jan 27, 2012 at 6:25 am
Jan 31, 2012 at 9:22 pm -
Hi, I am learning Hadoop now. I am trying to write a customized inputformat. I found out that there are two FileInputFormat's in the package, org.apache.hadoop.mapred and ...
GUOJUN Zhu
Jan 30, 2012 at 6:21 pm
Jan 30, 2012 at 6:40 pm -
I am running a mapper job which generates a large number of output records for every input record. about 32,000,000,000 output records from about 150 mappers - each record about 200 bytes The job is ...
Steve Lewis
Jan 18, 2012 at 5:50 pm
Jan 26, 2012 at 4:23 pm -
Hi all, I am experimenting with a MapReduce program on Hadoop-0.19. This program has a single input file with 7 records (later it can have many records in multiple files) and each input is supposed to produce 11 ...
Thamizhannal Paramasivam
Jan 21, 2012 at 6:18 pm
Jan 24, 2012 at 11:44 am -
Hello friends, I am new to Apache MapReduce. I just wrote a program that processes 89 files, each of which is 10000 lines. The program runs a clustering algorithm of the contents of the 89 files. The ...
Ahmed Abdeen Hamed
Jan 18, 2012 at 9:30 pm
Jan 19, 2012 at 4:31 pm -
Hi Experts A quick question. I have quite a few map reduce jobs running on my cluster. One job's input itself has a large number of files, I'd like to know which split was processed by each map task ...
Bejoy Ks
Jan 16, 2012 at 1:16 pm
Jan 16, 2012 at 5:40 pm -
I am learning Hadoop and had a question about writing Map Reduce jobs in newer versions of Hadoop. Is Next Generation Map Reduce a change just for our system administrator or do I (developer) already ...
Dbadave85
Jan 13, 2012 at 3:32 pm
Jan 13, 2012 at 4:56 pm -
Hi, I am experiencing slow startup times when submitting a MapReduce job (single node cluster for testing purposes) and was hoping someone could confirm that this was expected behavior. Even a job ...
Scott Lindner
Jan 10, 2012 at 3:36 pm
Jan 11, 2012 at 3:51 am -
Hi, how do I set the maximum number of containers to be executed on each node, so that only that many containers will be running on that node at a time?
Raghavendhra rahul
Jan 10, 2012 at 12:07 pm
Jan 10, 2012 at 11:41 pm -
Two cleanup-related questions: Can I execute context.write from the reduce/map cleanup phase? Should I expect cleanup to be killed when a task fails or is killed (speculative execution)? The idea is to ...
Mefa Grut
Jan 10, 2012 at 1:51 pm
Jan 10, 2012 at 9:05 pm -
Hi, I am going to save files written by reducers, but I wonder: when the disk space of one node is full, what will Hadoop do? Does Hadoop set the node aside? Thanks for your attention.
Aliyeh saeedi
Jan 10, 2012 at 10:38 am
Jan 10, 2012 at 3:08 pm -
Hi, I am trying to set up 0.23 on a cluster and am stuck with errors while starting the NodeManager. The slaves file is correct and I am able to do a password-less ssh from the master to the slaves. ...
Praveen Sripati
Jan 9, 2012 at 12:17 pm
Jan 10, 2012 at 3:03 pm -
Hi, we sometimes see tasks failing with the exception below. There are no network issues and the domain name resolves normally. Also, all nodes have a local DNS caching daemon running. Any idea why we ...
Markus Jelsma
Jan 6, 2012 at 2:23 pm
Jan 7, 2012 at 1:01 pm -
I tried connecting to a 0.20.205 hadoop cluster and use the methods on JobClient to query the JobTracker status and get a list of jobs, etc. Much like the JobTracker Web UI shows. Code is: ...
Joseph McMahon
Jan 19, 2012 at 8:16 pm
Feb 1, 2012 at 1:33 pm -
I have a problem at hand that seems to need "local" reducing: I have a large data input, in which each line is a data mapping, something like "name : attribute". The attributes for the same name are ...
Jianhui Zhang
Jan 29, 2012 at 8:08 am
Feb 1, 2012 at 1:28 pm -
Hi, does Hadoop treat reducers' output like other files with respect to replication and keeping their metadata in the NameNode?
Aliyeh saeedi
Jan 30, 2012 at 6:37 am
Jan 30, 2012 at 6:46 am -
Hi, how is the reduce node chosen in 0.23? What parameters determine the choice of reduce node? Does it depend on map node placement? Thanks!
Ann Pal
Jan 19, 2012 at 6:10 pm
Jan 19, 2012 at 7:06 pm -
Hi, I have a question regarding reduce functionality. A reduce function receives a key and a list of values as arguments. Is there any limit on the number of elements in the value list which is received as ...
Ajit Ratnaparkhi
Jan 19, 2012 at 7:09 am
Jan 19, 2012 at 7:23 am -
I understand that normally map tasks are run close to the input files, but in my application the input file is a txt file with many lines of query params, and the mapper reads out each line, use the ...
Yang
Jan 17, 2012 at 9:02 pm
Jan 18, 2012 at 7:35 am -
Hi, I receive a DataBag in my custom UDF and want to sort it by the first field of the Tuples it stores. The way I implemented it is: I create a List of Tuples and add all Tuples from the DataBag to the List and then use ...
Marek Miglinski
Jan 17, 2012 at 4:29 pm
Jan 17, 2012 at 6:43 pm -
Hi, we have a job that is I/O bound. The mapper aggregates the keys and the reducer has to look up the incoming keys externally. If this runs serially with 15 reducers it takes many days so we are ...
Markus Jelsma
Jan 16, 2012 at 3:24 pm
Jan 16, 2012 at 5:09 pm -
Having looked at a few releases of Hadoop, I am surprised to find that in most of them the CompositeInputFormat class is in mapred but not mapreduce. While there is a CompositeInputFormat under ...
Mike Spreitzer
Jan 15, 2012 at 7:49 am
Jan 16, 2012 at 5:20 am
Group Overview
group | mapreduce-user |
categories | hadoop |
discussions | 83 |
posts | 284 |
users | 80 |
website | hadoop.apache.org... |
irc | #hadoop |
80 users for January 2012