39 discussions - 135 posts

  • Hi all, I am trying to figure out how I can start a Hadoop job programmatically from my Java application running in an app server. I was able to run my map reduce job using hadoop command from hadoop ...
    Praveen Peddi
    Nov 22, 2010 at 9:39 pm
    Nov 25, 2010 at 9:20 am
  • Hi, I have a job where I do not need any reducers. I am using only mappers. At the moment, the output of the job is generated in files. But I want to use only the Java API to do some calculation and I want ...
    Shuja Rehman
    Nov 7, 2010 at 7:49 pm
    Nov 8, 2010 at 3:29 pm
  • Hi, I am using 0.20.2. When I try to start Hadoop, the namenode and datanode work well, but I can't submit jobs. I looked at the logs and found an error like: 2010-11-25 12:38:02,623 INFO ...
    Nov 25, 2010 at 4:53 am
    Nov 25, 2010 at 9:21 am
  • Hi all, We have set up a small cluster (13 nodes) using CDH3. We have been tuning it using TeraSort and Hive queries on our data, and the copy phase is very slow, so I'd like to ask if anyone can look ...
    Tim Robertson
    Nov 17, 2010 at 8:43 am
    Nov 18, 2010 at 1:37 pm
  • Hi, I am working on migrating a mapreduce program from using org.apache.hadoop.mapred to org.apache.hadoop.mapreduce APIs. The program currently uses ...
    Srihari Anantha Padmanabhan
    Nov 18, 2010 at 11:15 pm
    Nov 19, 2010 at 7:37 am
  • I've noticed an odd behavior with a map-reduce job I've written which is reading data out of an HBase table. After a couple days of poking at this I haven't been able to figure out the cause of the ...
    Adam Phelps
    Nov 5, 2010 at 12:57 am
    Nov 9, 2010 at 6:29 pm
  • Hi, is there a way to make MapReduce create exactly N Mappers? More specifically, if say my data can be split to 200 Mappers, and I have only 100 cores, how can I ensure only 100 Mappers will be ...
    Shai Erera
    Nov 25, 2010 at 6:36 pm
    Nov 25, 2010 at 8:33 pm
  • Not sure whether this has been posted on this mailing list, but I feel strongly that I should tell everyone here about "Yahoo Open Source Real-Time MapReduce". See http://s4.io/ for more details. And thanks again for ...
    Jeff Zhang
    Nov 9, 2010 at 9:49 am
    Dec 8, 2010 at 11:38 pm
  • Hi all, I have been trying to figure out why all mappers run on only one machine when I have a 4-node cluster. The reduce part is running fine on all 4 nodes. I am using 0.20.2. My input file is ...
    Praveen Peddi
    Nov 16, 2010 at 5:24 pm
    Nov 16, 2010 at 7:43 pm
  • Is there any way to know how many values I will see in a call to reduce without first counting through them all with the iterator? Under 0.21? 0.20? 0.19? Thanks, Anthony
    Anthony Urso
    Nov 7, 2010 at 1:38 pm
    Nov 9, 2010 at 4:28 pm
  • Hi, How can I compile and use my own hadoop? I modified some source code of hadoop-0.20.2. Then, I tried to build it with eclipse according to this tutorial ...
    Shen LI
    Nov 8, 2010 at 11:42 pm
    Nov 9, 2010 at 7:42 am
  • When a Hadoop MapReduce example is executed, a table is shown at the end with all the information about the execution, like the number of Map and Reduce tasks executed, the number ...
    Nov 30, 2010 at 6:05 pm
    Dec 1, 2010 at 8:50 am
  • Dear All, I have a requirement to move my existing program to the map-reduce framework: ---I am reading files within a directory and also subdirectories. ---Processing one file at a ...
    Bhaskar Ghosh
    Nov 17, 2010 at 2:21 pm
    Nov 19, 2010 at 8:21 pm
  • Hi, Has anyone tried deploying Hadoop across two (or more) EC2 datacenters (i.e. with slaves from more than one datacenter)? I need to do this for one of the experiments I'm doing but things do not ...
    Chamikara Jayalath
    Nov 18, 2010 at 5:17 am
    Nov 18, 2010 at 7:55 am
  • How do I set the number of tasks running on each slave (e.g., 4 tasks on each slave node)? Thanks
    Shen LI
    Nov 16, 2010 at 12:32 am
    Nov 16, 2010 at 5:56 am
  • Hi All, I have a question about map reduce. Suppose I have a set of small files (say 100), usually 8-15 MB in size, that need to be processed in a single job. For each file, there will be 1 map process and ...
    Shuja Rehman
    Nov 11, 2010 at 6:37 pm
    Nov 12, 2010 at 6:15 pm
  • Hi, I have a map/reduce job that may discover along the way, during the map phase, that continuing is pointless. I am not sure how to accomplish a job cancellation from within the job. Is there an API for ...
    Henning Blohm
    Nov 2, 2010 at 12:25 pm
    Nov 2, 2010 at 4:33 pm
  • Hi, 1 - I'm trying to run GridMix2 (rungridmix_2) in a cluster, but nothing happens. No job is created. Only this message appears: [code] GridMix results: Total num of Jobs: 0 ExecutionTime: ...
    Nov 30, 2010 at 4:05 pm
    Nov 30, 2010 at 4:13 pm
  • Moving to mapreduce-user@, bcc common-user@. Please use project specific lists. mapreduce.reduce.slowstart.completed.maps is the right knob. Which version of hadoop are you running? If it isn't ...
    Arun C Murthy
    Nov 28, 2010 at 10:04 am
    Nov 29, 2010 at 1:26 am
  • Hi, I'm trying to run the randomtextwriter example, but I got the following error: $ bin/hadoop jar hadoop-0.20.2-dev-examples.jar randomtextwriter ~/temp/ Running 10 maps. Job started: Fri Nov 26 ...
    Pedro Costa
    Nov 26, 2010 at 2:08 pm
    Nov 26, 2010 at 3:41 pm
  • Hi all, I may have found a bug in InputSampler under Hadoop 0.21.0. In the file InputSampler.java under package org.apache.hadoop.mapreduce.lib.partition, inside function getSample, a record reader is ...
    Nov 19, 2010 at 7:12 am
    Nov 23, 2010 at 8:25 pm
  • Hi, I want to write the records to different hdfs files (instead of the default part-m..0000) into different output dirs based on the key generated by the mapper. I would like to implement this using ...
    Srihari Anantha Padmanabhan
    Nov 22, 2010 at 6:58 pm
    Nov 22, 2010 at 7:06 pm
  • Hello, I wanted to know how to increase the amount of data processed by a single JVM instance. What options are needed for this, and where should I put them? Regards, Jyothish Soman
    Jyothish Soman
    Nov 9, 2010 at 9:25 am
    Nov 9, 2010 at 9:47 am
  • I am working on my cloud computing project, introducing the concept of trust for a node using some data mining techniques. I want to know where in the source code (I mean, in which MapReduce class) the ...
    Nitin reddy
    Nov 6, 2010 at 7:10 pm
    Nov 7, 2010 at 5:24 am
  • Why does Hadoop MapReduce need SSH? When is SSH used? -- Pedro
    Pedro Costa
    Nov 5, 2010 at 11:54 am
    Nov 5, 2010 at 12:02 pm
  • Hello everyone, I am curious to know if anyone has tried using map-reduce across multiple data centers. The use case I have in mind is one where the dataset is geographically distributed across ...
    Hrishikesh Gadre
    Nov 2, 2010 at 11:26 pm
    Nov 3, 2010 at 1:05 am
  • Hi all, Is there any way to change the final output file name of a job? Rather than the default file names (part-00000 ......), I want to use some other naming rules, which can be more meaningful. ...
    Nov 1, 2010 at 7:05 am
    Nov 1, 2010 at 7:20 am
  • Hi, To run gridmix2 (rungridmix_2) at ${HADOOP_HOME}/src/benchmarks/gridmix2, do I need to run the generateGridmix2data.sh script first? Thanks, Pedro
    Nov 30, 2010 at 4:53 pm
    Nov 30, 2010 at 4:53 pm
  • Moving to mapreduce-user@, bcc common-user@. Please use project specific lists. Your InputSplits are defined by your InputFormat. Take a look at the 'getSplits' method in InputFormat.java. ...
    Arun C Murthy
    Nov 28, 2010 at 5:27 am
    Nov 28, 2010 at 5:27 am
  • Hi, I'm trying to split a single file through a map-reduce job. My input is a sequence file where each entry represents a graph node together with its neighbors and I would like to split it in more ...
    Ivan Leonardi
    Nov 27, 2010 at 2:45 pm
    Nov 27, 2010 at 2:45 pm
  • Hi, I need to implement a Writable, which contains a lot of data, and unfortunately I cannot break it down to smaller pieces. The output of a Mapper is potentially a large record, which can be of any ...
    Shai Erera
    Nov 25, 2010 at 10:47 am
    Nov 25, 2010 at 10:47 am
  • Hi all, I am trying to sample the key distribution before doing a total sort. But the program failed and threw an exception. This is the stack: Exception in thread "main" ...
    Nov 18, 2010 at 12:09 pm
    Nov 18, 2010 at 12:09 pm
  • I am looking for pointers or URLs to information about the architectures of sites like YouTube, Facebook or Netflix. Obviously, I don't expect to know the nitty-gritty details, I just wanted to know the kind of ...
    Web service
    Nov 17, 2010 at 6:51 pm
    Nov 17, 2010 at 6:51 pm
  • Hi, I recently got this error quite often. Does anyone know how to fix it? Thanks, Yin
    Yin Lou
    Nov 16, 2010 at 3:39 am
    Nov 16, 2010 at 3:39 am
  • Hi, I'm using two Hadoop clusters, 0.21.0 and CDH2. A problem resembling MAPREDUCE-1868, in which the TaskTracker hangs, occurred on my clusters. Do I have to take the following measures in ...
    Shinichi YAMASHITA
    Nov 14, 2010 at 4:06 am
    Nov 14, 2010 at 4:06 am
  • Hi, I'm trying to run teragen to generate data to run the terasort example, but I get the following error. Does anyone have any clue what's happening? My mapred-site.xml is at the end. ...
    Pedro Costa
    Nov 12, 2010 at 9:42 pm
    Nov 12, 2010 at 9:42 pm
  • Hi all, I am very new to Hadoop and I have a (it seems to me it should be quite simple) question regarding MapReduce: I wanted to implement a simple Map Reduce to create an inverted index over a list ...
    Markus Pilman
    Nov 9, 2010 at 11:08 am
    Nov 9, 2010 at 11:08 am
  • Hi, I have a cluster of 4 machines and want to configure Ganglia for monitoring purposes. I have read the wiki and added the following lines to hadoop-metrics.properties on each machine. ...
    Shuja Rehman
    Nov 8, 2010 at 2:34 pm
    Nov 8, 2010 at 2:34 pm
  • I am pleased to announce the v0.0 release of Sizzle, a compiler and runtime for the Sawzall language. Sizzle targets Hadoop directly, by compiling Sawzall programs into Hadoop job jars that can be ...
    Anthony Urso
    Nov 5, 2010 at 6:34 pm
    Nov 5, 2010 at 6:34 pm
Group Overview
Group: mapreduce-user@

49 users for November 2010

  • Harsh J: 14 posts
  • Praveen Peddi: 7 posts
  • Pedro Costa: 7 posts
  • Henning Blohm: 6 posts
  • Jeff Zhang: 6 posts
  • Niels Basjes: 6 posts
  • Shuja Rehman: 6 posts
  • Adam Phelps: 5 posts
  • Srihari Anantha Padmanabhan: 5 posts
  • Tim Robertson: 5 posts
  • Li ping: 4 posts
  • Rahul patodi: 4 posts
  • Shai Erera: 4 posts
  • 祝美祺: 4 posts
  • Exception: 3 posts
  • Friso van Vollenhoven: 3 posts
  • Shrijeet Paliwal: 3 posts
  • Ted Yu: 3 posts
  • Anthony Urso: 2 posts
  • Arun C Murthy: 2 posts