Grokbase Groups Hadoop user

1,135 discussions - 4,875 posts

  • Sorry.. Moving 'hbase' mailing list to BCC 'cause this is not related to HBase. Adding 'hadoop' user group.
    Something Something
    Feb 11, 2013 at 6:25 pm
    Feb 11, 2013 at 9:00 pm
  • Hi all, Is anyone aware of any survey/paper/report showing the relationship between a replication factor and its penalty/benefit on write/read operations? BR, George -- ---------------------------
    George Kousiouris
    Feb 11, 2013 at 4:43 pm
    Feb 12, 2013 at 1:37 am
  • Hi Guys, I am new to the MapR distribution; please share your guidance. We were previously using Cloudera Manager, which has a set limitation: more than 50 nodes are not supported. Please give us ideas, we are planning to move ...
    Dhanasekaran Anbalagan
    Feb 11, 2013 at 12:46 pm
    Feb 11, 2013 at 2:54 pm
  • Hi I found that my job runs with such parameters: mapred.tasktracker.map.tasks.maximum 4 mapred.tasktracker.reduce.tasks.maximum 2 I try to change these parameters from my java code Properties ...
    Oleg Ruchovets
    Feb 11, 2013 at 11:45 am
    Feb 11, 2013 at 11:55 am
  • Are there any rules against writing results to Reducer.Context while in the cleanup() method? I've got a reducer that is downloading a few tens of millions of images from a set of URLs fed to it. To ...
    David Parks
    Feb 11, 2013 at 6:03 am
    Feb 11, 2013 at 6:44 pm
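
    A note on the cleanup() question above: the new MapReduce API does allow writing output from Reducer.cleanup(), because cleanup() receives the same Context that reduce() uses, and anything written there is simply appended to the task's output. A minimal sketch, assuming a hypothetical ImageFetchReducer whose download() helper stands in for the real image-fetching code:

      import java.io.IOException;
      import java.util.ArrayList;
      import java.util.List;

      import org.apache.hadoop.io.Text;
      import org.apache.hadoop.mapreduce.Reducer;

      // Collects failures during reduce() and emits them from cleanup().
      // cleanup(Context) receives the same Context as reduce(), so context.write()
      // is legal there; records written in cleanup() become part of the normal output.
      public class ImageFetchReducer extends Reducer<Text, Text, Text, Text> {

          private final List<String> failedUrls = new ArrayList<String>();

          @Override
          protected void reduce(Text key, Iterable<Text> urls, Context context)
                  throws IOException, InterruptedException {
              for (Text url : urls) {
                  if (!download(url.toString())) {      // hypothetical helper
                      failedUrls.add(url.toString());
                  }
              }
              context.write(key, new Text("done"));
          }

          @Override
          protected void cleanup(Context context) throws IOException, InterruptedException {
              // Emitting from cleanup() is allowed; here, one record per failed URL.
              for (String url : failedUrls) {
                  context.write(new Text("FAILED"), new Text(url));
              }
          }

          private boolean download(String url) {
              // Placeholder for the real image download logic.
              return true;
          }
      }
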
  • Hi, I am a fresher in Hadoop technologies; I want to take part in any (Hive, Pig) related projects (I used to be an Informatica developer) and start off my career. All enterprises need experienced ...
    Monkey2Code
    Feb 11, 2013 at 5:02 am
    Feb 11, 2013 at 5:19 am
  • Hi, I am trying to view HADOOP SOURCE CODE. I am using HADOOP 1.0.3. In HADOOP distribution, only jar files are there. Give me some instruction to view source code... please I have seen "contribute ...
    Dibyendu Karmakar
    Feb 11, 2013 at 3:38 am
    Feb 11, 2013 at 5:46 am
  • Hi, I have a quick question regarding RAID0 performances vs multiple dfs.data.dir entries. Let's say I have 2 x 2TB drives. I can configure them as 2 separate drives mounted on 2 folders and assignes ...
    Jean-Marc Spaggiari
    Feb 11, 2013 at 1:58 am
    Feb 11, 2013 at 4:03 pm
  • I'm a little confused about splitting and readers. The data in my application is stored in files of google protocol buffers. There are multiple protocol buffers per file. There have been a number of ...
    Christopher Piggott
    Feb 10, 2013 at 3:36 pm
    Feb 11, 2013 at 4:27 am
  • Hi all, Has anyone ever used some kind of a "generic output key" for a mapreduce job ? I have a job running multiple tasks and I want them to be able to use both Text and IntWritable as output key ...
    Amit Sela
    Feb 10, 2013 at 12:01 pm
    Feb 11, 2013 at 7:22 am
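
    On the "generic output key" question above, one commonly suggested approach is to wrap the candidate key types in a subclass of org.apache.hadoop.io.GenericWritable. A minimal sketch under that assumption (the class name MixedKey is made up for illustration); note that GenericWritable is not WritableComparable, so a job using it as a map output key also needs a sort comparator, e.g. set via job.setSortComparatorClass():

      import org.apache.hadoop.io.GenericWritable;
      import org.apache.hadoop.io.IntWritable;
      import org.apache.hadoop.io.Text;
      import org.apache.hadoop.io.Writable;

      // A key wrapper that can hold either a Text or an IntWritable.
      // GenericWritable serializes a type index next to the wrapped value, so both
      // kinds of keys can travel through the same job as MixedKey instances.
      public class MixedKey extends GenericWritable {

          @SuppressWarnings("unchecked")
          private static final Class<? extends Writable>[] TYPES =
                  new Class[] { Text.class, IntWritable.class };

          @Override
          protected Class<? extends Writable>[] getTypes() {
              return TYPES;
          }
      }

    In a mapper one would then do something like: MixedKey k = new MixedKey(); k.set(new Text("word")); context.write(k, value);
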
  • HI, Currently I am researching about options of encrypting the data in the MapReduce, as we plan to use the Amazon EMR or EC2 services for our data. I am thinking that the compression codec is good ...
    Java8964 java8964
    Feb 9, 2013 at 8:50 pm
    Feb 11, 2013 at 6:09 am
  • Hi, I want to work on release 1.0.4 source code. As per Hadoop wiki HowToContribute, I can download source code from trunk or from release 1.0.4 tag. 1. Source code from hadoop/common/trunk with ...
    Trupti Gaikwad
    Feb 9, 2013 at 2:45 pm
    Feb 9, 2013 at 3:07 pm
  • Glen Mazza
    Feb 9, 2013 at 2:11 pm
    Feb 9, 2013 at 2:11 pm
  • Hi, When I run a job it hangs because it is not able to get free memory resources for the map/reduce task containers. Total memory available: 8 GB, scheduler configured: CapacityScheduler, queue ...
    Rajeshbabu chintaguntla
    Feb 9, 2013 at 1:07 pm
    Feb 9, 2013 at 1:07 pm
  • I have a cluster of boxes with 3 reducers per node. I want to limit a particular job to only run 1 reducer per node. This job is network IO bound, gathering images from a set of webservers. My job ...
    David Parks
    Feb 9, 2013 at 3:55 am
    Feb 11, 2013 at 6:30 am
  • Hi, I am thinking to write some mapper to do conversion of mainframe files to ascii format and contribute back. And before even i do something i wanted to confirm from you guys the following - Do we ...
    Jagat Singh
    Feb 9, 2013 at 3:24 am
    Feb 11, 2013 at 6:45 pm
  • Hello All, I am confused over how MapReduce tasks select data blocks for processing user requests ? As data block replication replicates single data block over multiple datanodes, during job ...
    Mehal Patel
    Feb 9, 2013 at 12:41 am
    Feb 9, 2013 at 5:13 am
  • We have a use case that requires us to have the ability to: * delete all of a customers data as it sits in hdfs on a whims notice * Re-mapreduce all of a particular accounts data, going way back in ...
    Sean McNamara
    Feb 8, 2013 at 9:53 pm
    Feb 9, 2013 at 9:31 pm
  • Hi! I've followed the hadoop cluster tutorial on hadoop site (hadoop 1.1.1 on 64bit machines with openjdk 1.6). I've set-up 1 namenode, 1 jobtracker, and 3 slaves acting as datanode and tasktracker ...
    Ricardo Casazza
    Feb 8, 2013 at 4:58 pm
    Feb 8, 2013 at 4:58 pm
  • Hi, I'm wondering what's the best way to install FUSE with Hadoop 1.0.3? I'm trying to follow all the steps described here: http://wiki.apache.org/hadoop/MountableHDFS but it's failing on each one, ...
    Jean-Marc Spaggiari
    Feb 8, 2013 at 4:31 pm
    Feb 8, 2013 at 5:17 pm
  • Hi, I have data stored in an object that I want to pass into my Mapper. I see from Configuration that there are setters and getters for primitives, but is there a way of doing this with ...
    Peter Cogan
    Feb 8, 2013 at 3:15 pm
    Feb 8, 2013 at 7:51 pm
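
    For the thread above about passing an object into a Mapper: a common workaround is to serialize the object to a string, store it with Configuration.set(), and decode it again in Mapper.setup(). A hedged sketch, assuming the object implements java.io.Serializable and using java.util.Base64 (Java 8; older setups typically used commons-codec instead); the helper class ConfObjects and the key name "my.custom.object" are illustrative:

      import java.io.ByteArrayInputStream;
      import java.io.ByteArrayOutputStream;
      import java.io.IOException;
      import java.io.ObjectInputStream;
      import java.io.ObjectOutputStream;
      import java.io.Serializable;
      import java.util.Base64;

      import org.apache.hadoop.conf.Configuration;

      // Pushes an arbitrary Serializable object through the job Configuration
      // by Base64-encoding its serialized form under a chosen key.
      public final class ConfObjects {

          public static void put(Configuration conf, String key, Serializable obj) throws IOException {
              ByteArrayOutputStream bytes = new ByteArrayOutputStream();
              try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                  out.writeObject(obj);
              }
              conf.set(key, Base64.getEncoder().encodeToString(bytes.toByteArray()));
          }

          public static Object get(Configuration conf, String key) throws IOException, ClassNotFoundException {
              byte[] bytes = Base64.getDecoder().decode(conf.get(key));
              try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                  return in.readObject();
              }
          }
      }

    In the driver: ConfObjects.put(job.getConfiguration(), "my.custom.object", myObject); in Mapper.setup(): cast the result of ConfObjects.get(context.getConfiguration(), "my.custom.object"). For large objects the DistributedCache is usually a better fit than the Configuration.
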
  • Hi, I'm trying to install FUSE with Hadoop 1.0.3 and I'm facing some issues. I'm following the steps I have there: http://wiki.apache.org/hadoop/MountableHDFS I have extracted 1.0.3 code using svn ...
    Jean-Marc Spaggiari
    Feb 8, 2013 at 2:24 pm
    Feb 8, 2013 at 2:53 pm
  • Hello, I am new to Hadoop. Till now I have tried to run Hadoop on a single node and see a simple wordcount program running on it. As part of my project I need to add a new file system to it (like S3, ...
    Agarwal, Nikhil
    Feb 8, 2013 at 10:24 am
    Feb 8, 2013 at 10:24 am
  • Hi Guys, We are using CDH4.0.1 and monitoring the fair scheduler; it does not handle more than one lakh (100,000) mappers. Is this a bug, or does it need configuration tweaking? Please guide me, guys. My ...
    Dhanasekaran Anbalagan
    Feb 8, 2013 at 9:27 am
    Feb 8, 2013 at 9:27 am
  • Hi, I upgrade from hadoop-1.0.4 to 2.0.2-alpha, my command: start-dfs.sh -upgrade -clusterId mycluster I can upgrade successfully, but clusterId sounds like be ignored, it generated a random ...
    Azuryy Yu
    Feb 8, 2013 at 1:37 am
    Feb 8, 2013 at 1:37 am
  • Hello Hadoopers, How's your cluster behave today ?? hope they run well and strong. In the past or some bad days i saw 'Too many fetch-failure'; it was fixed by adjusting dfs.datanode.max.xcievers to ...
    Patai Sangbutsarakum
    Feb 7, 2013 at 7:33 pm
    Feb 8, 2013 at 6:58 pm
  • Hi, I am trying to do a name sorting using secondary sort. I have a working example, which I am taking as a reference. But I am getting a null pointer error in the MapTask class. I am not able to ...
    Ravi Chandran
    Feb 7, 2013 at 6:25 pm
    Feb 8, 2013 at 5:05 pm
  • Hi All, Can any one list me the mandatory system level check (ulimit,firewall,selinux...) before starting a hadoop cluster. Regards Sathish
    Sara raji
    Feb 7, 2013 at 3:07 pm
    Feb 7, 2013 at 4:07 pm
  • Is there a good reason why the OldCombinerRunner passes Reporter.NULL to the combiner instead of the actual TaskReporter? The NewCombinerRunner does use the TaskReporter when creating the context. If ...
    Jim Donofrio
    Feb 7, 2013 at 1:41 pm
    Feb 11, 2013 at 7:41 am
  • [Excerpt not recoverable: only undecoded MIME multipart headers remain; the subject references hive-0.9.0-cdh4.1.2.]
    Viral Bajaria
    Feb 7, 2013 at 11:30 am
    Feb 7, 2013 at 11:09 pm
  • Hello, I am trying to write MapReduce jobs to read data from JSON files and load it into HBase tables. Please suggest me an efficient way to do it. I am trying to do it using Spring Data Hbase ...
    Panshul Whisper
    Feb 7, 2013 at 11:22 am
    Feb 7, 2013 at 2:25 pm
  • Hi All, I could not see the Hive metastore DB in the MySQL database under the MySQL user hadoop. Example: $ mysql -u root -p $ Add hadoop user (CREATE USER 'hadoop'@'localhost' IDENTIFIED BY 'hadoop';) ...
    Samir das mohapatra
    Feb 7, 2013 at 10:47 am
    Feb 8, 2013 at 6:21 am
  • Hi I am using Hadoop 0.20.203. I have performed simple vertical scalability experiments of Hadoop with the use of Graph datasets and BFS algorithm. My experiment configuration is 20workers + Master ...
    Blah blah
    Feb 7, 2013 at 10:09 am
    Feb 7, 2013 at 10:09 am
  • Hi Guys, We are using CDH4.0.1 and running multiple jobs. Looking at the job tracker page, it shows the wrong Job Tracker start time, the same for all of the running jobs ...
    Dhanasekaran Anbalagan
    Feb 7, 2013 at 7:57 am
    Feb 7, 2013 at 7:57 am
  • Hi hadoop users, I am trying to use the streaming interface to use my python script mapper to create some files but am running into difficulties actually creating files on the hdfs. I have a python ...
    Julian Bui
    Feb 7, 2013 at 1:14 am
    Feb 7, 2013 at 3:40 pm
  • Hi, I wish to profile my mapper, so I've set the properties mapred.task.profile and mapred.task.profile.maps in mapred-site.xml. At the end of the job I'm getting a profile.out file, however I think ...
    Yaron Gonen
    Feb 6, 2013 at 9:50 pm
    Feb 7, 2013 at 4:48 am
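
    Regarding the profiling thread above: the same properties can also be set per job from the driver instead of cluster-wide in mapred-site.xml. A minimal sketch (the class name and the task ranges are examples, not taken from the thread):

      import org.apache.hadoop.conf.Configuration;

      public class ProfilingConfExample {
          public static void main(String[] args) {
              Configuration conf = new Configuration();

              // Turn on task profiling for this job only (instead of cluster-wide).
              conf.setBoolean("mapred.task.profile", true);
              conf.set("mapred.task.profile.maps", "0-1");    // profile the first two map attempts
              conf.set("mapred.task.profile.reduces", "0");   // and the first reduce attempt

              // Optional: custom hprof options; %s is replaced with the profile output file.
              // conf.set("mapred.task.profile.params",
              //     "-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s");

              // Pass 'conf' to the Job/JobConf when submitting; the profile.out files
              // typically appear alongside the logs of the profiled task attempts.
          }
      }
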
  • Is it possible to pass unmolested binary data through a map-only streaming job from the command line? I.e., is there a way to avoid extra tabs and newlines in the output? I don't need input splits or ...
    Jay Hacker
    Feb 6, 2013 at 9:30 pm
    Feb 7, 2013 at 3:20 pm
  • Hello everyone, I've been encountering the following problem for some time now and it is really slowing down my work. I would appreciate any help you guys can provide. I am using Hadoop 1.0.3. I ...
    Florin Dinu
    Feb 6, 2013 at 7:47 pm
    Feb 6, 2013 at 7:47 pm
  • Hello! I'm trying to install Hadoop 1.1.2.21 on CentOS 6.3. I've configured dfs.name.dir in the /etc/hadoop/conf/hdfs-site.xml file: <name>dfs.name.dir</name> <value>/mnt/ext/hadoop/hdfs/namenode</value> ...
    Andrey V. Romanchev
    Feb 6, 2013 at 3:07 pm
    Feb 6, 2013 at 6:07 pm
  • Hi Guys, We have moved local files to HDFS with hadoop fs -copyFromLocal, and we have found that some of the files are missing in HDFS. We want to validate source against destination. We already have the source ...
    Dhanasekaran Anbalagan
    Feb 6, 2013 at 10:27 am
    Feb 6, 2013 at 7:04 pm
  • We're trying to use HFileOutputFormat for bulk hbase loading. When using HFileOutputFormat's setOutputPath or configureIncrementalLoad, the job is unable to run. The error I see in the jobtracker ...
    Sean McNamara
    Feb 6, 2013 at 12:32 am
    Feb 6, 2013 at 9:49 pm
  • When setting up passwordless ssh on a cluster, its clear that the namenode needs to be able to ssh into task trackers to start/stop nodes and restart the cluster. What else is passwordless SSH used ...
    Jay Vyas
    Feb 5, 2013 at 11:06 pm
    Feb 5, 2013 at 11:56 pm
  • Hello everyone, I am setting up Hadoop for the first time, so please bear with me while I ask all these beginner questions :) I followed the instructions to create a hodrc, but looks like I cannot ...
    Mehmet Belgin
    Feb 5, 2013 at 9:42 pm
    Feb 8, 2013 at 3:11 pm
  • Hello, I am new to Hadoop. I am doing a project in cloud in which I have to use hadoop for Map-reduce. It is such that I am going to collect logs from 2-3 machines having different locations. The ...
    Mayur Patil
    Feb 5, 2013 at 9:32 pm
    Feb 6, 2013 at 1:05 pm
  • Hello guys, I want to learn a bit more when we need to change (increase/decrease) replication factor for better performance, and also want to learn a bit more internals about how replication factor ...
    Lin Ma
    Feb 5, 2013 at 5:29 pm
    Feb 5, 2013 at 5:34 pm
  • Hi Guys, We have configured many Heap size related thing. in Hadoop for ex. Namenode's Java Heap Size in bytes. Secondary namenode's Java Heap Size in bytes. Balancer's Java Heap Size in bytes ...
    Dhanasekaran Anbalagan
    Feb 5, 2013 at 5:05 pm
    Feb 5, 2013 at 5:05 pm
  • Hi Guys, I have configured HDFS with replication factor 3. We have 1 TB of data. How do I find out on which 3 machines a particular block is available, i.e. which machines hold the same block of data ...
    Dhanasekaran Anbalagan
    Feb 5, 2013 at 3:01 pm
    Feb 5, 2013 at 3:37 pm
  • Hi, I'm facing a problem with hadoop's secondary sort such that it is displaying the following error message. The code I have used has been used by me previously and had not given any issues for a ...
    Aseem Anand
    Feb 5, 2013 at 1:11 pm
    Feb 7, 2013 at 5:04 pm
  • Hi, I am Sharath Chandra, an undergraduate student at BITS-Pilani, India. I would like to get the following clarifications regarding cloudera hadoop distribution. I am using a CDH4 Demo VM for now ...
    Sharath Chandra Guntuku
    Feb 5, 2013 at 10:59 am
    Feb 5, 2013 at 6:12 pm
  • Lately, jobtracker in one of our production cluster fall into hang state. The load 5,10,15min is like 1 ish; with top command, jobtracker has 100% cpu all the time. So, i went ahead to try top -H -p ...
    Patai Sangbutsarakum
    Feb 4, 2013 at 11:21 pm
    Feb 7, 2013 at 4:24 am
Group Overview
group: user @
categories: hadoop
discussions: 1,135
posts: 4,875
users: 895
website: hadoop.apache.org
irc: #hadoop

Top users

Harsh J: 500 posts
Mohammad Tariq: 169 posts
Kartashov, Andy: 110 posts
Michael Segel: 107 posts
Bejoy Hadoop: 93 posts
Bertrand Dechoux: 84 posts
Visioner Sadak: 83 posts
Hemanth Yamijala: 70 posts
Ted Dunning: 65 posts
Jean-Marc Spaggiari: 54 posts
Steve Loughran: 51 posts
Nitin Pawar: 47 posts
JAX: 46 posts
David Parks: 45 posts
Vinod Kumar Vavilapalli: 43 posts
Jamal sasha: 37 posts
Andy Isaacson: 36 posts
Mohit Anchlia: 33 posts
Lin Ma: 31 posts
Mahesh Balija: 31 posts