56 discussions - 167 posts

  • Hi Friends, I have to sort a huge amount of data in the minimum possible time, probably using partitioning. The key is composed of 3 fields (partition, text and number). This is how the partition is defined: - ...
    Piyush Kansal
    Feb 20, 2012 at 2:09 am
    Feb 24, 2012 at 9:27 pm
  • Hi guys, we just delivered an optimized Hadoop. If you are interested, please refer to https://github.com/hanborq/hadoop -- Best Regards Anty Rao
    Feb 16, 2012 at 3:01 pm
    Feb 23, 2012 at 11:02 am
  • Hi All, I am using hadoop-0.19.2 and running a mapper-only job on a cluster. Its input path has 1000 files of 100-200MB. Since it is a mapper-only job, I set the number of reducers to 0. So, it is using 2 ...
    Thamizhannal Paramasivam
    Feb 16, 2012 at 6:41 am
    Feb 17, 2012 at 5:37 pm
  • Hi all, I have an important question about MapReduce. I have 2 Hadoop MapReduce jobs. Job1 has only a mapper and no reducer. Job1 started, and in its map() it is writing to a "file1" using ...
    Vamshi Krishna
    Feb 8, 2012 at 6:59 am
    Feb 11, 2012 at 1:55 pm
  • Hi, We have a job that outputs a set of files that are several hundred MB of text each. Using the comparators and such we can produce output files that are each sorted by themselves. What we want is ...
    Niels Basjes
    Feb 28, 2012 at 8:11 pm
    Mar 1, 2012 at 2:23 pm
  • Hi all, I have to implement a BillingEngine using MR jobs. My use case is like this: I will have data files of the format <TimeStamp> <Information for Billing>. Now these data files will contain ...
    Stuti Awasthi
    Feb 27, 2012 at 7:32 am
    Feb 28, 2012 at 8:32 pm
  • Hi Folks, can anyone let me know the best way to unit test a MapReduce job, particularly a job which reads from or writes to HBase? Thanks in advance -- Regards Shuja-ur-Rehman Baig ...
    شجاع الرحمن بیگ
    Feb 26, 2012 at 7:34 pm
    Feb 28, 2012 at 11:46 am
  • Hi, I am benchmarking the cluster using the Terasort package of Hadoop 0.20.2. I enabled compression for both map output (mapred.compress.map.output) and reduce output (mapred.output.compress). I ...
    Juwei Shi
    Feb 17, 2012 at 6:37 am
    Feb 17, 2012 at 3:22 pm
  • Hi. We are testing Hadoop (0.20.2-cdh3u3). I am using a customized conf directory with "--config mypath". I modified the log4j.properties file in this path, adding " ...
    Feb 27, 2012 at 4:35 pm
    Mar 16, 2012 at 8:59 pm
  • Hi, This is about a typical pattern of MapReduce jobs. There are some MapReduce jobs in which the map phase generates more records than its input, and at the reduce phase this data reduces a ...
    Ajit Ratnaparkhi
    Feb 21, 2012 at 9:40 am
    Feb 25, 2012 at 3:24 pm
  • Would anyone recommend a good book or online resource that specifically teaches the .20 API as opposed to the .19 API? It's several years old and I've really dropped the ball on this one. As a ...
    Keith Wiley
    Feb 7, 2012 at 9:50 pm
    Feb 8, 2012 at 3:51 pm
  • Hi, In MapReduce the command bin/hadoop job -history <PATH_HISTORY> only lists the first job. How can I list the history of all jobs? -- Best regards,
    Pedro Costa
    Feb 28, 2012 at 1:38 pm
    Feb 28, 2012 at 3:55 pm
  • It seems nearly impossible to use CSV files as Hadoop input. I see that there is a CsvRecordInput class, but have found virtually no examples online of how to use it...and the one example I did find ...
    Keith Wiley
    Feb 22, 2012 at 7:01 pm
    Feb 23, 2012 at 12:13 am
  • Hello All, I wrote my own partitioner and I would like to see if it's working. By printing the return of method getPartition I could see that the partitions were different, but were they really ...
    Ext-fabio Almeida
    Feb 16, 2012 at 5:50 pm
    Feb 17, 2012 at 3:24 pm
  • Hi All, Is there some way to force the owner (user name) of a Job sent to a Hadoop cluster? I'm trying to use the following code when configuring the job: JobConf job = new JobConf(); ...
    Jose Luis Soler
    Feb 16, 2012 at 5:52 pm
    Feb 17, 2012 at 7:07 am
  • I have a question about the contract in the org.apache.hadoop.mapred.OutputCollector interface. If it matters, let us say we are talking about Hadoop-1.0.0. In my map or reduce method, after it ...
    Mike Spreitzer
    Feb 12, 2012 at 8:59 pm
    Feb 13, 2012 at 5:54 am
  • Hi, I am learning Hadoop. We have some specially formatted text files for input, so we need to write a customized InputFormat, probably based on FileInputFormat. Does the FileInputFormat respect the ...
    Feb 10, 2012 at 4:57 pm
    Feb 11, 2012 at 7:20 am
  • Hi, what is a suitable version of HBase that can be tested with Hadoop YARN?
    Raghavendhra rahul
    Feb 7, 2012 at 4:25 am
    Feb 7, 2012 at 4:31 pm
  • Hey, I have a MapReduce job (transactions loader) and its main problem is the "reduce-copy" and "reduce-sort" phases, which take all IO and use all disk resources. What are the possible ways to ...
    Marek Miglinski
    Feb 6, 2012 at 4:38 pm
    Feb 7, 2012 at 12:51 pm
  • How many namenode handlers (dfs.namenode.handler.count) have you defined for your cluster? - Alex -- Alexander Lorenz http://mapredit.blogspot.com
    Alo alt
    Feb 1, 2012 at 11:54 am
    Feb 6, 2012 at 8:15 am
  • Hi 1- How does Hadoop decide where to save file blocks (I mean all files, including files written by reducers)? Could you please give me a reference link?
    Alieh Saeedi
    Feb 4, 2012 at 7:46 am
    Feb 5, 2012 at 12:49 am
  • Hi. I'm trying YARN + security but still cannot get a mapred example running. Can anyone help me take a look? My env: - 3-slave cluster on ec2. Centos 5.5 - nn, dn, rm, nm all started, with ...
    Mingjie Lai
    Feb 29, 2012 at 9:07 pm
    Mar 2, 2012 at 6:51 pm
  • Hi: I packed a Python module into "mypackage.tar.gz" and uploaded it to HDFS, then accessed the package with "-cacheArchive /app/mypackage.tar.gz#mypackage". But the Python script failed to "import ...
    Devdoer bird
    Feb 21, 2012 at 2:27 am
    Mar 2, 2012 at 4:24 am
  • Could someone give me some directions or examples of writing MapReduce jobs and unit tests to test them? Also, I need some help on how to set it up in Eclipse.
    Mohit Anchlia
    Feb 21, 2012 at 12:04 am
    Feb 21, 2012 at 5:04 am
  • Hi, everybody. I'm having some difficulties, which I've traced to not having the Accumulo libraries and configuration available in my task JVMs. The most elegant solution -- especially since I will ...
    John Armstrong
    Feb 16, 2012 at 2:49 pm
    Feb 16, 2012 at 3:35 pm
  • Hi, There are two MapReduce jobs which have the same input file, so the input file is read twice. I want the jobs to read the file only once and share it in memory. How can I ...
    Bruce Wang
    Feb 25, 2012 at 1:35 pm
    Feb 27, 2012 at 3:31 pm
  • All of my current jobs and the wordcount I used when learning are all failing with the same error: Error reading task output http:// ...
    Steve Lewis
    Feb 14, 2012 at 10:11 pm
    Feb 25, 2012 at 10:38 pm
  • Hello Mohit, "I am looking at some Hadoop tuning parameters like io.sort.mb. My question was where to look for the current setting." The default settings as well as the documentation can be found in ...
    Jie Li
    Feb 25, 2012 at 3:11 pm
    Feb 25, 2012 at 9:21 pm
  • Can I do something like this? FileInputFormat.setInputPaths(conf, new Path("hdfs://host:port/file")); The examples I am seeing all use dir/file. I haven't seen them using hdfs. If I am ...
    Mohit Anchlia
    Feb 21, 2012 at 10:37 pm
    Feb 22, 2012 at 7:06 am
  • Hi All, I want to experiment with real-time intensive data processing using a MapReduce program. Can you share a URL where I can download real-time data? Thanks, tamil
    Thamizhannal Paramasivam
    Feb 15, 2012 at 7:21 pm
    Feb 15, 2012 at 7:23 pm
  • Hi all, I have a job which reads all the rows from an HBase table and writes them to a location in DFS, i.e. /user/HSOP. HSOP is a folder which has 9 files, each having content such as 00015DEGgJ ...
    Vamshi Krishna
    Feb 14, 2012 at 2:59 pm
    Feb 14, 2012 at 3:29 pm
  • Hi, I have a Hadoop job to read data from HBase. Each data node has a region server. When the Hadoop job was close to the end, it became very slow, as you can see from the timestamp of ...
    Jian Fang
    Feb 13, 2012 at 10:49 pm
    Feb 14, 2012 at 3:20 pm
  • Dear all, I am following the book Hadoop: The Definitive Guide. However, I got stuck because I could not get the NCDC Weather data that is used by the source code in the book. Appendix C told me ...
    Bing Li
    Feb 12, 2012 at 7:15 am
    Feb 12, 2012 at 8:51 am
  • Hi all, I am starting to learn advanced Map/Reduce. However, I cannot find the class DataJoinMapperBase in my downloaded Hadoop 1.0.0 and 0.20.2. So I searched on the Web and got the following link. ...
    Bing Li
    Feb 10, 2012 at 7:40 pm
    Feb 10, 2012 at 11:49 pm
  • Dear all, When running some sample code from Hadoop in Action, I got an IOException: Job Failed. Exception in thread "main" java.io.IOException: Job failed! at ...
    Bing Li
    Feb 9, 2012 at 2:06 pm
    Feb 9, 2012 at 3:55 pm
  • Hi people, I don't know what is happening with the values I'm putting into a MapWritable. I think I have to call the write method (or readFields?) but I don't know how. Is that what I need to do to get ...
    Luiz Antonio Falaguasta Barbosa
    Feb 6, 2012 at 1:55 am
    Feb 6, 2012 at 2:53 am
  • Is there a way to signal the MapReduce framework from a mapper or reducer that I am not interested in any more input data? Currently I read the rest of the data but ignore it.
    Radim Kolar
    Feb 4, 2012 at 12:50 pm
    Feb 4, 2012 at 4:59 pm
  • Hi, This question is for mapreduce-user not hbase-user. +mapreduce-user bcc hbase-user Look at the configure method: ...
    Brock Noland
    Feb 29, 2012 at 2:24 pm
    Feb 29, 2012 at 2:24 pm
  • Hi, Some time ago I had an idea and implemented it. Normally you can only run a single gzipped input file through a single mapper and thus only on a single CPU core. What I created makes it possible ...
    Niels Basjes
    Feb 28, 2012 at 3:51 pm
    Feb 28, 2012 at 3:51 pm
  • Hi, I am trying to use MultipleOutputs in a Reducer. For that, I am trying to construct its object in Reducer.setup() as follows: public static class MOReduce extends Reducer<Text, Integer, Text, ...
    Piyush Kansal
    Feb 25, 2012 at 12:41 am
    Feb 25, 2012 at 12:41 am
  • Hello there, I got the following errors related to a disk problem. I checked the slave node which runs the task, and only half of the disk space is used. I don't understand why that happens. The ...
    Feb 22, 2012 at 8:17 pm
    Feb 22, 2012 at 8:17 pm
  • Hello, So I have a MapReduce job with two input paths: /tmp/2012/2/19/input /tmp/2012/2/20/input One of those paths might be null, but the other one is there... I need to process those inputs and get ...
    Marek Miglinski
    Feb 21, 2012 at 9:59 am
    Feb 21, 2012 at 9:59 am
  • Hi all, Just finished running a job using Hadoop and Pig 0.9.1, pulling data out of a single Cassandra 1.0.7 column family. It completed successfully, but I'm seeing this exception on a ...
    Gabriel Rosendorf
    Feb 16, 2012 at 4:39 pm
    Feb 16, 2012 at 4:39 pm
  • Hi, I am new to Hadoop and have just started to code MapReduce jobs. Can anyone provide me a link to download the latest plugin for Hadoop development in Eclipse 3.6+? I have googled a lot but all plugins I ...
    Utkarsh Gupta
    Feb 16, 2012 at 3:54 am
    Feb 16, 2012 at 3:54 am
  • Rmattsteele
    Feb 16, 2012 at 12:55 am
    Feb 16, 2012 at 12:55 am
  • Greetings!! This is Jinoj Mathew from HR Team of Narus Networks (Subsidiary of the Boeing Company), Bangalore. About Company: Narus Networks (http://www.narus.com) Narus is the 100% Subsidiary of The ...
    Feb 15, 2012 at 12:40 pm
    Feb 15, 2012 at 12:40 pm
  • Greetings!! This is Jinoj Mathew from HR Team of Narus Networks (Subsidiary of the Boeing Company), Bangalore. About Company: Narus Networks (http://www.narus.com) Narus is the 100% Subsidiary of The ...
    Feb 15, 2012 at 12:35 pm
    Feb 15, 2012 at 12:35 pm
  • Hi, I'm trying to extend the pipes interface as defined in Pipes.hh to support reading binary input data. I believe that would mean extending the getInputValue() method of context to return char ...
    Charles Earl
    Feb 14, 2012 at 4:49 pm
    Feb 14, 2012 at 4:49 pm
  • Hi, I wanted to use an HBase table as the source for my Hadoop Streaming MapReduce jobs. However, the executable script (in Python) I am writing can only read data from STDIN. I found out that I need to ...
    Prakhar Srivastava
    Feb 14, 2012 at 4:27 pm
    Feb 14, 2012 at 4:27 pm
  • Hello, I have a question regarding a problem with Hadoop. First, I'm using Hadoop version 0.20.2. I started a job yesterday for some calculations. The map and the reduce step are finished, each ...
    Matthias Zengler
    Feb 14, 2012 at 1:17 pm
    Feb 14, 2012 at 1:17 pm
Group Overview
group: mapreduce-user

65 users for February 2012

Harsh J: 20 posts; Piyush Kansal: 8 posts; Bejoy Ks: 5 posts; Bing Li: 5 posts; Brock Noland: 5 posts; Joey Echeverria: 5 posts; Schubert Zhang: 5 posts; Thamizhannal Paramasivam: 5 posts; Vamshi Krishna: 5 posts; GUOJUN Zhu: 4 posts; Jie Li: 4 posts; Keith Wiley: 4 posts; Mohit Anchlia: 4 posts; Niels Basjes: 4 posts; Robert Evans: 4 posts; Anty: 3 posts; Juwei Shi: 3 posts; Marek Miglinski: 3 posts; Mostafa Gaber: 3 posts; Steve Lewis: 3 posts