FAQ

129 discussions - 474 posts

  • I'm building a Hadoop project using Maven. I want to add Maven dependencies to my project. What do I do? I think the answer is I add a <dependency></dependency> section to my .POM file, but I'm not ...
    W.P. McNeill
    Aug 12, 2011 at 9:20 pm
    Aug 18, 2011 at 4:54 pm
  • Good morning, I would like to store some files in the distributed cache, in order to be opened and read from the mappers. The files are produced by another Job and are sequence files. I am not sure ...
    Sofia Georgiakaki
    Aug 12, 2011 at 7:54 am
    Aug 13, 2011 at 5:37 pm
  • Hello All: I used a combination of tutorials to set up Hadoop, but most seem to be using either an old version of Hadoop or only 2 machines for the cluster, which isn't really a cluster. Does ...
    A Df
    Aug 16, 2011 at 10:03 am
    Aug 17, 2011 at 12:14 pm
  • On page 243: Per my understanding, the reducer is supposed to output the first value (the maximum) for each year. But I just don't know how it works. Suppose we have the data 1901 200 1901 300 1901 ...
    Daniel,Wu
    Aug 2, 2011 at 1:26 pm
    Aug 5, 2011 at 12:43 pm
  • Dear All: I know that in pseudo mode that there is a web interface for the NameNode and the JobTracker but where is it for the standalone operation? The Hadoop page at ...
    A Df
    Aug 10, 2011 at 12:32 pm
    Aug 11, 2011 at 10:59 am
  • Hi I am planning to start a hadoop meetup every alternate Saturday/Sunday. Idea is to discuss the latest happenings in the world of Big data, and create more awareness. Place: Delhi, India ...
    Ankit Minocha
    Aug 16, 2011 at 5:52 am
    Aug 16, 2011 at 3:55 pm
  • Hi Folks, After much poking around I am still unable to determine why I am seeing 'reduce' being called twice with the "same" key. Recall from my previous email that "sameness" is determined by ...
    Stan Rosenberg
    Aug 15, 2011 at 1:20 am
    Aug 22, 2011 at 3:54 pm
  • Hello all, We are considering using low end HP proliant machines (DL160s and DL180s) for cluster nodes. However with these machines if you want to do more than 4 hard drives then HP puts in a P410 ...
    Koert Kuipers
    Aug 11, 2011 at 9:51 pm
    Aug 14, 2011 at 11:01 pm
  • Hi all, I was interested in learning from how Hadoop implements their sort algorithm in the map/reduce framework. Could someone point me to the directory of the source code that has the ...
    Sean Hogan
    Aug 12, 2011 at 11:12 pm
    Aug 13, 2011 at 3:54 pm
  • Hi, I am trying to write a map-reduce job to convert csv files to sequencefiles, but the job fails with the following error: java.lang.RuntimeException: Error while running command to get file ...
    Xiaobo Gu
    Aug 7, 2011 at 9:49 am
    Aug 11, 2011 at 2:14 pm
  • Can anyone offer me some insight. It may have been due to me trying to run the start-all.sh script instead of starting the services. Not sure. Thanks Sean ...
    Sean wagner
    Aug 26, 2011 at 12:49 pm
    Aug 29, 2011 at 5:29 pm
  • Hi all ! How are you? My name is Avi and I have been fascinated by Apache Hadoop for the last few months. I am spending the last two weeks trying to optimize my configuration files and environment. I ...
    Avi Vaknin
    Aug 21, 2011 at 11:58 am
    Aug 22, 2011 at 4:19 pm
  • In my current project we are planning to stream data to the NameNode (20-node cluster). Data volume would be around 1 PB per day. But there are applications which can publish data at 1GBPS. Few ...
    Jagaran das
    Aug 10, 2011 at 8:12 am
    Aug 17, 2011 at 10:47 am
  • Is it recommended to install a hadoop cluster on a set of VM's that are all connected to a SAN? Thanks, Travis
    Travis Camechis
    Aug 15, 2011 at 5:45 pm
    Aug 15, 2011 at 9:01 pm
  • I am keeping a Stream Open and writing through it using a multithreaded application. The application is in a different box and I am connecting to NN remotely. I was using FileSystem and getting same ...
    Jagaran das
    Aug 6, 2011 at 7:42 pm
    Aug 13, 2011 at 7:53 pm
  • Hi, We plan to organize a developer meetup to talk about Hadoop and big data during the week of Sept 12 in Shanghai. We'll have presenters from U.S and the topic looks very interesting. Suggestions ...
    Michael Lv
    Aug 16, 2011 at 11:42 pm
    Aug 19, 2011 at 11:45 am
  • Hello All, I am new to Hadoop, and I am trying to use the GenericOptionsParser Class. In particular, I would like to use the -libjar option to specify additional jar files to include in the ...
    Aquil H. Abdullah
    Aug 1, 2011 at 4:11 pm
    Aug 1, 2011 at 7:50 pm
  • Here is a tutorial on some handy Hadoop classes - with sample source code. http://sujee.net/tech/articles/hadoop-useful-classes/ Would appreciate any feedback / suggestions. thanks all Sujee Maniyam ...
    Sujee Maniyam
    Aug 31, 2011 at 11:58 pm
    Sep 2, 2011 at 5:54 pm
  • Why hadoop should be built in JAVA? For integrity and stability, it is good for hadoop to be implemented in Java But, when it comes to speed issue, I have a question... How will it be if HADOOP is ...
    Chris Song
    Aug 16, 2011 at 12:13 pm
    Aug 22, 2011 at 4:22 am
  • Hi All, I am running the HBase distributed mode in seven node cluster with backup master. The HBase is running properly in the backup master environment. I want to run this HBase on top of the High ...
    Shanmuganathan.r
    Aug 11, 2011 at 6:19 pm
    Aug 20, 2011 at 1:36 pm
  • Hi, I have many <key,value> pairs now, and want to get all the different values for each key. Which way is efficient for this work? Such as input: <1,2> <1,3> <1,4> <1,3> <2,1> <2,2> output: <1,2/3/4> <2,1/2> ...
    Jianxin Wang
    Aug 3, 2011 at 3:51 am
    Aug 3, 2011 at 12:54 pm
  • Hi, Has anybody been able to run hadoop standalone mode on fedora 15 ? I have installed it correctly. It runs till map but gets stuck in reduce. It fails with the error "mapred.JobClient Status : ...
    Manish
    Aug 5, 2011 at 8:56 am
    Apr 26, 2012 at 12:24 pm
  • Hi All, Here is what's happening. I have implemented my own WritableComparable keys and values. Inside a reducer I am seeing 'reduce' being invoked with the "same" key _twice_. I have checked that ...
    Stan Rosenberg
    Aug 13, 2011 at 3:15 pm
    Jan 10, 2012 at 11:16 pm
  • Does map-reduce work well with binary contents in the file? This binary content is basically some CAD files and the map-reduce program needs to read these files using some proprietary tool extract values ...
    Mohit Anchlia
    Aug 31, 2011 at 3:45 pm
    Sep 2, 2011 at 3:14 pm
  • Hi All, While discussing about Hadoop backup & restore plan with my team I thought about a scenario I wanted to ask you about: I wonder what will happen following the steps below: 1. Backup the ...
    Avi Vaknin
    Aug 30, 2011 at 12:06 pm
    Aug 30, 2011 at 8:13 pm
  • Hi All, I want to install Oozie and I wonder if it is OK to install it on the name node or maybe I need to install dedicated server to it. I have a very small Hadoop cluster (4 datanodes + namenode + ...
    Avi Vaknin
    Aug 29, 2011 at 9:51 am
    Aug 29, 2011 at 2:10 pm
  • One of my colleagues has noticed this problem for a while, and now it's biting me. Jobs seem to be failing before ever really starting. It seems to be limited (so far) to running in ...
    John Armstrong
    Aug 26, 2011 at 2:51 pm
    Aug 26, 2011 at 8:19 pm
  • Hi guys, We are currently in the process of writing a paper regarding hadoop and we would like to reference any attempt to remove the single point of failure of the Namenode. We have found in various ...
    George Kousiouris
    Aug 25, 2011 at 9:45 am
    Aug 25, 2011 at 6:46 pm
  • What does HADOOP_CLASSPATH set in $HADOOP/conf/hadoop-env.sh do? This isn't clear to me from documentation and books, so I did some experimenting. Here's the conclusion I came to: the paths in ...
    W.P. McNeill
    Aug 22, 2011 at 6:01 pm
    Aug 22, 2011 at 6:49 pm
  • Hi all, I've tried to make a rack topology script. I've written it in python and it works if I call it with the following arguments: 10.2.0.1 10.2.0.11 10.2.0.11 10.2.0.12 10.2.0.21 10.2.0.26 ...
    Modemide
    Aug 19, 2011 at 4:46 pm
    Aug 21, 2011 at 11:21 pm
  • I'm running the Cassandra Brisk server with Hadoop core 20.203 on OSX, everything is local. I keep running into this problem for Hive jobs INFO 13:52:39,923 Error from ...
    Aaron morton
    Aug 15, 2011 at 2:05 am
    Aug 19, 2011 at 1:08 am
  • I'm new to setting up hadoop's scheduler and i'm trying to set up Fairscheduler on a 3-node cluster. The initial setup is fine but throughput is abysmal. Each node is configured with 16 map task ...
    Mick Semb Wever
    Aug 18, 2011 at 6:23 pm
    Aug 18, 2011 at 9:44 pm
  • Hello, If i have a failure during a job, is there a way I prevent the output folder from being deleted? Cheers Saptarshi
    Saptarshi Guha
    Aug 10, 2011 at 1:40 am
    Aug 10, 2011 at 9:14 am
  • I was following this tutorial on version 0.19.1 http://v-lad.org/Tutorials/Hadoop/23%20-%20create%20the%20project.html I however wanted to use the latest version of api 0.20.2 The original code in ...
    Garpinc
    Aug 2, 2011 at 3:41 am
    Aug 5, 2011 at 11:26 pm
  • I can use cacheFile to load .so files into the distributed cache and it works fine (the streaming executable links against the .so and runs), but I can't get it to work with -cacheArchive. It always ...
    Keith Wiley
    Aug 5, 2011 at 5:10 pm
    Aug 5, 2011 at 5:50 pm
  • Hi all, I'm running a simple mapreduce job that connects to an hbase table, reads each row, counts some co-occurrence frequencies, and writes everything out to hdfs at the end. Everything seems to be ...
    Stevens, Keith D.
    Aug 1, 2011 at 7:21 pm
    Dec 24, 2012 at 10:00 pm
  • Hi everyone, We are going to create a new Hadoop cluster in our company, and I would like some advice from you: 1. Has anyone stored all Hadoop data not on local disks but on Netapp or other ...
    Hakan İlter
    Aug 25, 2011 at 6:58 am
    Sep 1, 2011 at 9:49 am
  • Dear hadoop users, Sorry for the off-topic. We're slowly migrating our hadoop cluster to EC2, and one thing that I'm trying to explore is whether we can use alternative scheduling systems like SGE ...
    Dmitry Pushkarev
    Aug 29, 2011 at 9:57 pm
    Aug 31, 2011 at 7:03 pm
  • Hi - Is there a way I can start HDFS (the namenode) from a Java main and run unit tests against that? I need to integrate my Java/HDFS program into unit tests, and the unit test machine might not ...
    Frank Astier
    Aug 26, 2011 at 6:34 pm
    Aug 29, 2011 at 4:58 am
  • Hi All, I have a question about the avatar node setup. I configured the AvatarNode using the patch https://issues.apache.org/jira/browse/HDFS-976 Do I need to configure the NFS filer to share the FSimage ...
    Shanmuganathan.r
    Aug 26, 2011 at 8:27 am
    Aug 26, 2011 at 8:29 am
  • Hi, my job runs once every day, but it sometimes fails. I checked the log in the JobTracker. It seems to be an HDFS error? Thanks a lot! 2011-08-16 21:07:13,247 INFO org.apache.hadoop.mapred.TaskInProgress: ...
    Jianxin Wang
    Aug 17, 2011 at 2:50 am
    Aug 23, 2011 at 3:31 am
  • I updated my Hadoop cluster from 0.20.2 to the higher version 0.21.0 because of MAPREDUCE-1286, and now I have a problem running HBase on it. I saw the 0.21.0 version is marked as "unstable, unsupported, ...
    Steven zhuang
    Aug 19, 2011 at 7:40 am
    Aug 22, 2011 at 4:21 pm
  • Instead of "hd fs -put" hundreds of files of X megs, I want to do it once on a gzipped (or zipped) archive, one file, much smaller total megs. Then I want to decompress the archive on HDFS? I can't ...
    Keith Wiley
    Aug 4, 2011 at 11:29 pm
    Aug 5, 2011 at 3:22 pm
  • Is there any way I can programmatically kill or fail a task, preferably from inside a Mapper or Reducer? At any time during a map or reduce task, I have a use case where I know it won't succeed based ...
    Adam Shook
    Aug 3, 2011 at 10:38 pm
    Aug 4, 2011 at 7:15 am
  • Good evening, I would like to ask you a question regarding the use of TotalOrderPartitioner. I am working on my diploma thesis, and I need to use the TotalOrderPartitioner (with the InputSampler of ...
    Sofia Georgiakaki
    Aug 3, 2011 at 12:36 pm
    Aug 3, 2011 at 5:25 pm
  • I installed hadoop-0.20.2 in Eucalyptus VM environment. The file system is based on glusterfs, so it is a shared NAS. Though the nodes are much powerful (8 cores + 15G memory), I found the response ...
    Shi Yu
    Aug 29, 2011 at 3:34 pm
    Sep 1, 2011 at 9:51 am
  • Hadoop newbie here. I wrapped my company's entity extraction product in a Hadoop task, and give it a large file of the magnitude of 100MB. I have 4 VMs running on a 24-core CPU server, and made two ...
    Teruhiko Kurosaka
    Aug 31, 2011 at 8:49 am
    Aug 31, 2011 at 11:30 pm
  • I'm configuring a local hadoop cluster in secure mode for development/experimental purposes on Ubuntu 11.04 with the hadoop-0.20.203.0 distribution from apache mirror. I have the basic Kerberos setup ...
    Thomas Weise
    Aug 31, 2011 at 1:09 am
    Aug 31, 2011 at 8:33 pm
  • Hello, I've set up a 3-node cluster running Hadoop 0.21. Everything works fine, but in the web view of HDFS (Port 60070) I see that I have 3 LiveNodes. If I click this link, I see a listing ...
    Ralf Heyde
    Aug 30, 2011 at 4:12 pm
    Aug 30, 2011 at 5:47 pm
  • Hi, I am trying to run the following code sample: http://pages.cs.brandeis.edu/~cs147a/lab/hadoop-example/ I am on Windows XP, Cygwin and am using hadoop-0.20.2. I am getting stuck on an error while ...
    Ws_dev2001
    Aug 29, 2011 at 2:15 pm
    Aug 29, 2011 at 2:18 pm
Group Overview

group: common-user @
categories: hadoop
discussions: 129
posts: 474
users: 165
website: hadoop.apache.org...
irc: #hadoop

165 users for August 2011

  • Harsh J: 36 posts
  • A Df: 14 posts
  • W.P. McNeill: 14 posts
  • Allen Wittenauer: 13 posts
  • Jagaran das: 12 posts
  • John Armstrong: 12 posts
  • Daniel,Wu: 10 posts
  • Michel Segel: 10 posts
  • Shanmuganathan.r: 10 posts
  • אבי ווקנין: 10 posts
  • GOEKE, MATTHEW (AG/1000): 9 posts
  • Joey Echeverria: 9 posts
  • Steve Loughran: 9 posts
  • Keith Wiley: 8 posts
  • Stan Rosenberg: 8 posts
  • Kai Voigt: 7 posts
  • Shahnawaz Saifi: 7 posts
  • Jianxin Wang: 6 posts
  • Madhu phatak: 6 posts
  • Adi: 5 posts