72 discussions - 174 posts

  • Hello users of hadoop, I have a task to convert large binary files from one format to another. I am wondering what is the best practice to do this. Basically, I am trying to get one mapper to work on ...
    Felix gao
    Feb 8, 2011 at 1:07 am
    Feb 10, 2011 at 5:43 am
  • Hi, Very trivial question. Which is the easiest way to install Hadoop? I mean, which distribution should I go for: Apache or Cloudera? And which is the easiest OS for Hadoop? -- Regards, R.V.
    Real great..
    Feb 23, 2011 at 3:43 am
    Feb 23, 2011 at 6:15 pm
  • Dear all, I am writing a map-reduce job today, in which I hope to use different formats for the Mapper and Combiner. I am using the Text as the format of the Mapper and MapWritable as the format of ...
    Stanley Xu
    Feb 16, 2011 at 11:02 am
    Feb 17, 2011 at 7:02 am
  • I'm outputting a small amount of secondary summary information from a map task that I want to use in the reduce phase of the job. This information is keyed on a custom input split index. Each map ...
    Feb 13, 2011 at 8:18 pm
    Feb 14, 2011 at 1:44 pm
  • When I try to compile mapred of 0.21.0 I get an error: [ivy:resolve] :: problems summary :: [ivy:resolve] :::: WARNINGS [ivy:resolve] module not found: org.apache.hadoop#hadoop-common;0.21.0 ...
    Feb 24, 2011 at 1:49 pm
    Feb 25, 2011 at 3:01 am
  • Hello, I'm testing hadoop and hbase, I can run mapreduce streaming or pipes jobs against text files on hadoop, but I have a problem when I try to run the same job against an hbase table. The table looks ...
    Ondrej Holecek
    Feb 18, 2011 at 2:06 pm
    Feb 22, 2011 at 8:07 pm
  • unsubscribe
    Dan Irish
    Feb 14, 2011 at 2:44 pm
    Feb 19, 2011 at 3:39 pm
  • I wanted to input from s3 but output to someplace else in aws with elastic mapreduce. Their docs seem to only suggest that they only read from/write to s3. Is that correct?
    Jeremy Hanna
    Feb 1, 2011 at 6:49 pm
    Feb 4, 2011 at 7:22 am
  • Hi, I have a use case to upload some terabytes of text files as sequence files on HDFS. These text files have several layouts ranging from 32 to 62 columns (metadata). What would be a good way to ...
    Mapred Learn
    Feb 17, 2011 at 9:16 pm
    Mar 8, 2011 at 12:22 am
  • Anyone know why I would be getting an error doing a filesystem.open on a file with a s3n prefix? for the input path "s3n://backlog.dev/1296648900000/" - I get the following stacktrace: ...
    Jeremy Hanna
    Feb 9, 2011 at 11:24 pm
    Feb 10, 2011 at 2:20 am
  • Hi, How do you delete a directory in Hadoop? Contents are inside it. -- Regards, R.V.
    Real great..
    Feb 28, 2011 at 6:01 am
    Feb 28, 2011 at 10:20 am
  • Hi, 1 - Are the map output files always of the type SequenceFileFormat? 2 - Does this mean that it contains a header with the following fields? # version - A byte array: 3 bytes of magic header 'SEQ', ...
    Pedro Costa
    Feb 14, 2011 at 3:22 pm
    Feb 14, 2011 at 6:17 pm
  • Hi guys, If I have a text file of 10 GB and I want to convert it to sequence file using map-reduce and make filesplits of 1 GB each so that 10 mappers work in parallel on it and convert it to ...
    Mapred Learn
    Feb 26, 2011 at 1:52 am
    Feb 28, 2011 at 5:51 pm
  • I have set up a cluster with several machines up and running. But I encounter a problem: my mapper and reducer classes do not log. The hadoop version I use is 0.20.2. The rootLogger in ...
    Thomas Anderson
    Feb 23, 2011 at 12:58 pm
    Feb 25, 2011 at 5:45 am
  • Hi, As I guess, Hadoop creates the default dfs in temp directory. I tried changing it by editing the hdfs-site.xml to: <?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> ...
    Real great..
    Feb 25, 2011 at 12:32 am
    Feb 25, 2011 at 2:50 am
  • Hi all, Hadoop JobTracker's http info server provides running/failed/completed job information on the web through jobtracker.jsp. Lines below show the logic of how the web page retrieves that information ...
    Min Zhou
    Feb 15, 2011 at 3:17 am
    Feb 16, 2011 at 9:27 pm
  • Hi all, Is it possible somehow to specify some TaskTracker-specific properties? Namely, I would like to describe in a config file for each TaskTracker some properties and to play with them at runtime. ...
    Robert Grandl
    Feb 10, 2011 at 5:43 pm
    Feb 10, 2011 at 5:59 pm
  • Hi all, I'm interested in creating a solution that leverages multiple computing nodes in an EC2 or Rackspace cloud environment in order to do massively parallelized processing in the context of ...
    Zachary Kozick
    Feb 1, 2011 at 9:21 pm
    Feb 1, 2011 at 10:01 pm
  • Dear all, I am writing a Map-Reduce task to go through a HBase table to re-calculate the entries stored in it daily. The number of entries would be hundreds of millions. I use the TableMapper as the ...
    Stanley Xu
    Feb 28, 2011 at 2:51 am
    Feb 28, 2011 at 6:20 am
  • Hi, I want to configure a map-only job where I need to read from hbase table 1, do some processing in the mapper, and then save to some other hbase table, and I do not need a reducer for it. I have ...
    Shuja Rehman
    Feb 23, 2011 at 7:55 pm
    Feb 24, 2011 at 3:53 am
  • Hi All, I have a simple question. I have an ArrayList which I am populating from a db in the main class. Now I want to use the same list in my map and reduce classes, so the question is how to access/send ...
    Shuja Rehman
    Feb 21, 2011 at 6:57 pm
    Feb 21, 2011 at 7:01 pm
  • Hi, I would like to know, depending on my problem, when should I use or not use Hadoop MapReduce? Does any list exist that advises me when to use or not to use MapReduce? Thanks, -- Pedro
    Pedro Costa
    Feb 17, 2011 at 5:36 pm
    Feb 18, 2011 at 9:49 pm
  • I use DistributedCache to add two files to the class path, example code below: String jeJarPath = "/group/aladdin/lib/je-4.1.7.jar"; DistributedCache.addFileToClassPath(new Path(jeJarPath), conf); String ...
    Lei liu
    Feb 17, 2011 at 3:51 am
    Feb 17, 2011 at 4:00 am
  • Hi, I have to upload some terabytes of data that is text files. What would be a good option to do so: i) using hadoop fs -put to copy text files directly on hdfs. ii) copying text files as sequence ...
    Mapred Learn
    Feb 17, 2011 at 12:24 am
    Feb 17, 2011 at 2:33 am
  • Hi, I have a couple of questions: 1. What is the best way to create a composed MapReduce job in the 20.2 API? Can you use JobControl, which is still located in the mapred namespace, or is it better ...
    Joachim Van den Bogaert
    Feb 16, 2011 at 8:32 am
    Feb 16, 2011 at 10:34 am
  • Hi, When compression is on, are the compressed map intermediate files transferred to the reduce side as compressed data? Thanks, -- Pedro
    Pedro Costa
    Feb 15, 2011 at 3:12 pm
    Feb 15, 2011 at 5:00 pm
  • Hello all, I have been using Hadoop on physical machines for some time now. But recently I tried to run the same hadoop jobs on the Rackspace cloud and I am not yet successful. My input file has 150M ...
    Praveen Peddi
    Feb 10, 2011 at 9:41 pm
    Feb 15, 2011 at 2:10 pm
  • Hi, I run two examples of a MR execution with the same input files and with 3 Reduce tasks defined. One example has the map-intermediate files compressed, and the other example has uncompressed ...
    Pedro Costa
    Feb 15, 2011 at 10:36 am
    Feb 15, 2011 at 10:49 am
  • Hi, 1 - How do I get the names of the map tasks that ran in the command line? 2 - How do I get the start time and the end time of a map task in the command line? -- Pedro
    Pedro Costa
    Feb 13, 2011 at 4:45 pm
    Feb 13, 2011 at 8:54 pm
  • Hi, 1 - When a Map task is taking too long to finish its process, the JT launches another Map task to process. Does this mean that the task that was replaced is killed? 2 - Does Hadoop MR allow that the ...
    Pedro Costa
    Feb 12, 2011 at 4:05 pm
    Feb 12, 2011 at 6:01 pm
  • Hi, I've two jobs and I'm trying to control them by ControlledJob. job2 depends on job1 and the job2's input is the job1's output so when i do this: cjob1 = new ControlledJob(job1, null); ...
    Feb 10, 2011 at 7:34 pm
    Feb 11, 2011 at 8:55 am
  • Hi there, I installed Hadoop on my Windows 7 machine for local development. The installation went fine. Now I was trying to compile the WordCount v1.0 example ...
    Lai Will
    Feb 10, 2011 at 10:01 pm
    Feb 10, 2011 at 10:39 pm
  • Hi, I try to do a self-join on a file using MultipleInputs on hadoop 0.21.0. A self-join is when you join a file with itself (for example, if you want to dereference the idrefs in an XML document). I ...
    Leonidas Fegaras
    Feb 10, 2011 at 7:21 pm
    Feb 10, 2011 at 10:23 pm
  • Hi, In Hadoop MR there is the property "mapred.system.dir" to set a relative directory where shared files are stored during a job run. What are these shared files? -- Pedro
    Pedro Costa
    Feb 8, 2011 at 12:05 pm
    Feb 9, 2011 at 3:27 am
  • Hi, When hadoop is running in a cluster, the output of the Reducers is saved in HDFS. Does MapReduce also have location awareness of where the data is saved? For example, we've TT1 running in Machine1, ...
    Pedro Costa
    Feb 4, 2011 at 3:07 pm
    Feb 8, 2011 at 10:10 pm
  • Hi all, I am trying to submit jobs to different queues in hadoop-0.20.2. I configured conf/mapred-site.xml: <property> <name>mapred.queue.names</name> <value>queue1, queue2</value> </property> Then I was ...
    Robert Grandl
    Feb 5, 2011 at 1:52 pm
    Feb 7, 2011 at 1:25 pm
  • Perhaps this has been covered before, but I wasn't able to dig up any info. Is there any way to run a custom "job cleanup" for a map/reduce job? I know that each map and reduce has a cleanup method, ...
    David Rosenstrauch
    Feb 3, 2011 at 4:39 pm
    Feb 3, 2011 at 6:54 pm
  • Hi, setup( ), method present in mapred api, is called once at the start of each map/reduce task. Is it the same with configure( ) method present in mapreduce api ? Thanks, Giridhar.
    Giridhar Addepalli
    Feb 2, 2011 at 9:28 am
    Feb 3, 2011 at 4:10 am
  • Hi, I'm running the wordcount example, but I would like to compress the map output. I set the following properties in the mapred-site.xml [code] <property> <name>mapred.compress.map.output</name> <value> ...
    Pedro Costa
    Feb 2, 2011 at 1:50 pm
    Feb 2, 2011 at 3:19 pm
  • Hi, I'm trying to read the map output on the reduce side with the LocalFileSystem class. The map output is on the local file system. The problem is that it throws a ChecksumException. I know ...
    Pedro Costa
    Feb 2, 2011 at 10:30 am
    Feb 2, 2011 at 11:40 am
  • Hi, Hadoop uses the compressed length and the raw length. 1 - In my example, the RT is fetching a map output that shows that it has the raw length of 14 bytes and the partLength of 10 bytes. The map ...
    Pedro Costa
    Feb 1, 2011 at 11:43 am
    Feb 1, 2011 at 12:19 pm
  • Hello all, I have a few mapreduce jobs that I am calling from a java driver. The problem I am facing is that when there is an exception in a mapred job, the exception is not propagated to the client so ...
    Praveen Peddi
    Feb 25, 2011 at 4:01 pm
    Feb 25, 2011 at 4:01 pm
  • Does a map task finish after generating the map intermediate file? Thanks, -- Pedro
    Pedro Costa
    Feb 23, 2011 at 6:39 pm
    Feb 23, 2011 at 6:39 pm
  • ----- Forwarded Message ----- From: Zhengguo 'Mike' SUN <zhengguosun@yahoo.com> To: mahout-user <user@mahout.apache.org> Cc: mahout-dev <mahout-dev@lucene.apache.org> Sent: Monday, February 21, 2011 ...
    Zhengguo 'Mike' SUN
    Feb 22, 2011 at 7:52 pm
    Feb 22, 2011 at 7:52 pm
  • Hi, I'm trying to run Gridmix2 tests (rungridmix_2), but all the tests remain in the waiting state and none of them run. I don't see any exception in the logs. Why does this happen? Thanks, -- Pedro
    Pedro Costa
    Feb 22, 2011 at 3:48 pm
    Feb 22, 2011 at 3:48 pm
  • Hello, does anyone use mapreduce and the pipes interface with hbase? I have a working C++ program, but don't know how to interpret the data I get from context.getInputValue() in the mapper function. It seems to be ...
    Ondrej Holecek
    Feb 19, 2011 at 3:30 pm
    Feb 19, 2011 at 3:30 pm
  • Please keep user questions on mapreduce-user as per http://hadoop.apache.org/mailing_lists.html . Yes. -- Owen
    Owen O'Malley
    Feb 18, 2011 at 4:59 pm
    Feb 18, 2011 at 4:59 pm
  • Hi all, Does anybody have experience in running Hadoop on PlanetLab nodes? I will be grateful for any suggestions on how to do it. Thanks, Robert
    Robert Grandl
    Feb 18, 2011 at 11:10 am
    Feb 18, 2011 at 11:10 am
  • Hi, Does anyone have experience configuring multiple hadoop instances on the same cluster? I changed the port numbers, temp directory, local storage directory but the running instances ...
    Juwei Shi
    Feb 17, 2011 at 2:40 am
    Feb 17, 2011 at 2:40 am
  • Hello Fellow Mappers and Reducers, We are meeting at 7:15 pm on February 17th at the University Heights Community Center 5031 University Way NE Seattle WA 98105 Room -- BASEMENT! The meetings are ...
    Sean jensen-grey
    Feb 17, 2011 at 2:09 am
    Feb 17, 2011 at 2:09 am
Group Overview
group: mapreduce-user

67 users for February 2011

  • Pedro Costa: 27 posts
  • Harsh J: 17 posts
  • Jeremy Hanna: 7 posts
  • Mapred Learn: 7 posts
  • Robert Grandl: 7 posts
  • Sonal Goyal: 6 posts
  • Felix gao: 5 posts
  • Koji Noguchi: 4 posts
  • Ondrej Holecek: 4 posts
  • Real great..: 4 posts
  • Stanley Xu: 4 posts
  • Praveen Peddi: 3 posts
  • Arun C Murthy: 3 posts
  • Jacques: 3 posts
  • Joan: 3 posts
  • Mahadev Konar: 3 posts
  • Shuja Rehman: 3 posts
  • Andrew Hitchcock: 2 posts
  • Benjamin Hiller: 2 posts
  • Chase Bradford: 2 posts