145 discussions - 614 posts

  • Hi, I am hoping that someone can help with this issue. I have a 64-node cluster that we would like to run Hadoop on; most of the nodes are netbooted via NFS. Hadoop runs fine on nodes IF the node ...
    Nick Rathke
    Sep 24, 2009 at 6:30 am
    Sep 30, 2009 at 2:00 pm
  • Hi everyone, I am running two map-reduce programs. They were working well, but once the data grew to around 900MB (50000+ files), weird things started to happen, with messages like the one below: 'Communication problem ...
    Kunsheng Chen
    Sep 23, 2009 at 12:51 pm
    Oct 1, 2009 at 10:03 pm
  • Hi guys, Pregel was revealed on 8/11. What is your opinion of it, does anybody know how to get the presentation, and is anyone interested in implementing it? Thank you, Mark
    Mark Kerzner
    Sep 3, 2009 at 4:47 pm
    Jul 12, 2010 at 11:57 pm
  • Hey all, I'm pretty new to hadoop in general and I've been tasked with building out a datacenter cluster of hadoop servers to process logfiles. We currently use Amazon but our heavy usage is starting ...
    Sep 29, 2009 at 5:58 pm
    Oct 1, 2009 at 3:46 pm
  • Hi. Does anybody have experience with a DB that can handle large amounts of data on top of Hadoop? HBase and Hive are nice, but they also lack some features. HadoopDB seems to bring some equilibrium. However, ...
    Sep 14, 2009 at 9:04 pm
    Sep 16, 2009 at 12:44 am
  • I'm seeing this error when I try to run my job. java.io.IOException: Task process exit with nonzero status of 1. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418) Can't find anything ...
    Marc Limotte
    Sep 23, 2009 at 6:06 pm
    Oct 27, 2009 at 10:51 pm
  • Good Day Everyone, I know that Hadoop on Windows is currently considered a development system. Is there any chance of Hadoop working on Windows as a production system? Is anyone working on the ...
    Sep 17, 2009 at 1:33 am
    Sep 18, 2009 at 11:48 pm
  • We're using hadoop 0.20.0 to analyze large log files from web servers. I am looking for better HDFS support so that I don't have to copy log files from Linux File System over. Please comment. Thanks
    Ted Yu
    Sep 7, 2009 at 9:19 am
    Sep 8, 2009 at 5:24 pm
  • Has anyone done any extensive testing of what instance types on Amazon EC2 give you the most bang for the buck? Given the normal Hadoop recommendations of beefy machines, I would expect the best ...
    Kevin Peterson
    Sep 29, 2009 at 5:20 pm
    Jan 18, 2010 at 7:03 am
  • Hi, recently, we're seeing frequent STEs in our datanodes. We had previously fixed this issue by upping the handler count max.xciever (note this is misspelled in the code as well - so we're just being ...
    Florian Leibert
    Sep 24, 2009 at 8:37 pm
    Sep 25, 2009 at 1:57 am
  • What do you do with the data on a failing disk when you replace it? Our support person comes in occasionally, and often replaces several disks when he does. These are disks that have not yet failed, ...
    David B. Ritch
    Sep 11, 2009 at 1:31 am
    Sep 14, 2009 at 5:23 pm
  • We've got the namenode image being written to a second machine via NFS so we have that backed up. That said, do we still need a secondary namenode, or is it OK to have the cluster going without one? ...
    Mayuran Yogarajah
    Sep 28, 2009 at 5:26 pm
    Oct 7, 2009 at 11:30 pm
  • I work in a call center which means we have a lot of PCs sitting on agents' desks doing a whole lot of nothing in the middle of the night. It also means that we collect a lot of phone and other data, ...
    James Carroll
    Sep 29, 2009 at 7:38 am
    Sep 30, 2009 at 9:23 am
  • Hi, I'm writing a map reduce program which reads a file from HDFS and stores the contents in a static map (declared and initialized before executing map reduce). However, after executing the ...
    Rakhi Khatwani
    Sep 30, 2009 at 1:28 pm
    Oct 7, 2009 at 12:35 pm
  • Hello All, I am trying to study the performance of an EC2 cluster when it is running Hadoop, but I am not able to get Ganglia up and running. Can someone please guide me on how to use/configure Ganglia to ...
    Samprita Hegde
    Sep 9, 2009 at 1:22 am
    Sep 30, 2009 at 4:34 pm
  • Hi all, I am using Hadoop 0.18.3, and I want to stop the cluster with the command bin/stop-all.sh. But it's weird: it reports that I cannot stop the cluster. Has anyone encountered this problem before? Any ...
    Jeff Zhang
    Sep 21, 2009 at 5:19 am
    Sep 21, 2009 at 5:27 pm
  • Howdy, I'm about to install a new Hadoop cluster here and I'm wondering which Hadoop to install. Looking at http://hadoop.apache.org/common/releases.html suggests it's either 0.19.2 or 0.20.0 but ...
    Stephen mulcahy
    Sep 15, 2009 at 2:42 pm
    Sep 15, 2009 at 7:40 pm
  • Hi guys, I just want to simulate a cluster with Hadoop on my laptop, so I chose the pseudo-distributed mode. The example is running well, but now I just want to test getting data from different ...
    Huang Qian
    Sep 25, 2009 at 8:11 am
    Sep 28, 2009 at 12:45 am
  • Hi. When I specify multiple disks for DFS, does Hadoop distribute concurrent writes over the multiple disks? I mean, to prevent utilization of only a single disk? Thanks for any info on the subject.
    Stas Oskin
    Sep 13, 2009 at 10:01 am
    Sep 15, 2009 at 4:55 pm
  • Hi, I am planning on running my MapReduce app on Amazon's EC2. I had a look at the public Hadoop images in the hadoop-images bucket and there is no image for the stable 0.18.3 release. The most ...
    John Clarke
    Sep 7, 2009 at 2:09 pm
    Sep 8, 2009 at 1:25 pm
  • Hi, I have set up a 4-node (physical) Hadoop cluster. Configuration: 2GB RAM on each machine. I am currently using the sub-project Hive for firing queries on 45GB of data. I have certain queries that need ...
    Ramiya V
    Sep 2, 2009 at 5:09 am
    Sep 4, 2009 at 5:51 pm
  • Hello Everyone! Maybe a stupid question. I want to install and configure Hadoop on a cluster for our users with SGE support. We have a scratch space that is NFS mounted across each of the compute ...
    S B
    Sep 23, 2009 at 4:36 pm
    Sep 23, 2009 at 8:14 pm
  • I'm new to Hadoop, so pardon the potentially dumb question.... I've gathered, from much research, that Hadoop is not always a good choice when you need to process a whack of smaller files, which is ...
    Andrzej Jan Taramina
    Sep 14, 2009 at 1:41 pm
    Sep 21, 2009 at 4:49 pm
  • Hello! I am running a simple MR job and setting a replication factor of 2. After its execution, the output is split into files named part-00000 and so on. What I want to ask is: can't we avoid these ...
    Sugandha Naolekar
    Sep 4, 2009 at 7:47 am
    Sep 7, 2009 at 12:09 am
  • Dear All, I am using Hadoop 0.20.0. I have an application that needs to run map-reduce functions iteratively. Right now, the way I am doing this is to create a new Job for each pass of the map-reduce. That ...
    Boyu Zhang
    Sep 4, 2009 at 6:37 pm
    Sep 4, 2009 at 7:27 pm
  • Hello Everyone, We have a cluster with one namenode and three datanodes, and we got these logs when starting Hadoop 0.20. Is it normal? 2009-09-25 10:45:00,616 INFO ...
    Zheng Lv
    Sep 25, 2009 at 3:12 am
    Sep 27, 2009 at 2:53 pm
  • Hi all, I'm wondering whether it's possible to limit the number of tasks that are executed in parallel on a task tracker? I'm using the parameters mapred.tasktracker.{map|reduce}.tasks.maximum to ...
    Oliver Senn
    Sep 25, 2009 at 9:49 am
    Sep 26, 2009 at 7:03 pm
  • Hi, I'm still relatively new to Hadoop here, so bear with me. We have a few ex-SGI staff with us, and one of the tools we now use at Aconex is Performance Co-Pilot (PCP), which is an open-source ...
    Paul Smith
    Sep 25, 2009 at 8:18 am
    Sep 25, 2009 at 3:09 pm
  • Hi all, I know this was discussed a bit last week, but I just wanted to point out that the core releases page shows the 0.20.1 release: http://hadoop.apache.org/common/releases.html But the hdfs and ...
    Andy Sautins
    Sep 20, 2009 at 6:33 pm
    Sep 20, 2009 at 8:48 pm
  • Hi, after a lot of unsuccessful attempts at running the Hadoop distributed file system on my machine, I've located one possible error. Maybe you have some ideas about what's going on. Experiment: What ...
    Vincenzo Gulisano
    Sep 14, 2009 at 2:47 pm
    Sep 14, 2009 at 3:45 pm
  • Hi, I am using Hadoop 0.20.0. How can I get past the exception below? [hadoop@vh20 hadoop]$ jmap -heap 3837 Attaching to process ID 3837, please wait... ...
    Ted Yu
    Sep 4, 2009 at 7:01 pm
    Sep 10, 2009 at 11:03 am
  • Hi all, What is the best way to copy directories from HDFS to local disk in 0.19.1? Thanks, Kris.
    Kris Jirapinyo
    Sep 5, 2009 at 12:15 am
    Sep 5, 2009 at 6:11 pm
  • Greetings, It's time for another Hadoop/Lucene/Apache "Cloud" Stack meetup! This month it'll be on Wednesday, the 30th, at 6:45 pm. We should have a few interesting guests this time around -- someone ...
    Bradford Stephens
    Sep 14, 2009 at 6:35 pm
    Oct 7, 2009 at 2:11 pm
  • We would like to use the same data for Pig and Hive queries for flexibility. Has anyone done this without having 2 copies of the data? Hive seems to only want to work with CTRL-A delimited data, and ...
    Sep 30, 2009 at 2:56 pm
    Oct 5, 2009 at 10:19 pm
  • Hi there, When I have Hadoop running (version 0.20.0, Pseudo-Distributed Mode), I cannot start my own Java application. The exception complains that 'java.sql.SQLException: failed to connect to url ...
    Jianwu Wang
    Sep 29, 2009 at 11:59 pm
    Sep 30, 2009 at 10:14 pm
  • Hi all, My Hadoop cluster is running Hadoop 0.18.3, and I'd like to use Python to access HDFS. But I did not find Thrift in Hadoop 0.18.3. Does anybody know whether Hadoop 0.18.3 supports Thrift? If not, is ...
    Jeff Zhang
    Sep 29, 2009 at 5:38 am
    Sep 30, 2009 at 3:31 am
  • Hi everyone, Just a final reminder for this NSF/Google/IBM event next Monday (10/5). We've put together an exciting program with talks by Luiz André Barroso (Google), Hamid Pirahesh (IBM), and many ...
    Jimmy Lin
    Sep 29, 2009 at 2:30 am
    Sep 29, 2009 at 3:42 pm
  • Hi, Does anyone have experience running an HDFS cluster stretched over high-latency WAN connections? Any specific concerns/options/recommendations? I'm trying to set up the HDFS cluster with the nodes ...
    Touretsky, Gregory
    Sep 16, 2009 at 7:09 am
    Sep 17, 2009 at 4:55 pm
  • Hi all, I am having some trouble with distributing workload evenly to reducers. I have 25 reducers and I intentionally created 25 different Map output keys so that each output set will go to one ...
    Anh Nguyen
    Sep 16, 2009 at 7:24 am
    Sep 16, 2009 at 4:12 pm
  • I am using Hadoop 0.19.1. I am attempting to split an input into multiple directories. I don't know in advance how many directories exist. I don't know in advance what the directory depth is. I expect that ...
    Aviad sela
    Sep 9, 2009 at 4:06 pm
    Sep 16, 2009 at 3:57 am
  • All, I'm setting up my first hadoop full cluster. I did the cygwin thing and everything works. I'm having problems with the cluster. The cluster is five nodes of matched hardware running Ubuntu 8.04. ...
    Sep 9, 2009 at 9:29 pm
    Sep 10, 2009 at 4:01 pm
  • Hello all, First of all, sorry if this question sounds stupid. I am a beginner with Hadoop. I am using Hadoop on Windows, so I have installed Java in the following directory: C:\cygwin\home\HadoopAdmin\java ...
    Rajpal, Harjeet Kumar
    Sep 4, 2009 at 8:44 am
    Sep 9, 2009 at 6:35 am
  • Good Day, I have a question on the DistributedCache as follows. I have used DistributedCache to move my executable(.exe) around the (onto the local filesystems of) nodes in Hadoop and run the .exe ...
    Sep 3, 2009 at 3:20 am
    Sep 7, 2009 at 12:01 pm
  • Hi All: Does anybody know if it's possible to distcp between an old version of Hadoop (0.15.x, for example) and a modern version (0.19.2)? A quick check trying to move from an "old" grid to a "new" ...
    C G
    Sep 7, 2009 at 5:46 am
    Sep 7, 2009 at 7:25 am
  • Hi! If I distribute files using the Distributed Cache (-archives option), are they guaranteed to be unique per job, or is there a risk that if I distribute a file named A with job 1, job 2 which also ...
    Erik Forsberg
    Sep 29, 2009 at 9:56 am
    Sep 30, 2009 at 5:25 am
  • Hi. After the namenode comes online and finds all the blocks on all datanodes, I have a time-out of about 30 seconds before it accepts writes. Any idea: 1) Why is it so long? 2) How is it possible to make ...
    Stas Oskin
    Sep 29, 2009 at 3:18 pm
    Sep 29, 2009 at 10:20 pm
  • Hi. I'm trying to spread DataNode files over separate block devices, but these have lost+found directories created. DataNode initialization fails because it can't erase them (probably because it ...
    Stas Oskin
    Sep 29, 2009 at 1:34 am
    Sep 29, 2009 at 4:12 pm
  • On a cluster running 0.19.2, we have some production jobs that perform ETL tasks that open files in hdfs during the reduce task (with speculative execution in reduce stage programmatically turned ...
    Dave bayer
    Sep 29, 2009 at 12:28 am
    Sep 29, 2009 at 2:15 pm
  • Hi. I am wondering where the temp files (intermediate files) are stored. They should be located in the hadoop.tmp.dir by default, right? Why can't I find them in either the local file system and ...
    Starry SHI
    Sep 26, 2009 at 6:35 am
    Sep 28, 2009 at 5:37 pm
  • I looked in JIRA but didn't see this reported so I thought I'd see what this list thinks. We've been using SOCKS proxying to access a Hadoop cluster, generally using the setup described on the Cloudera ...
    Andy Sautins
    Sep 24, 2009 at 7:27 pm
    Sep 24, 2009 at 8:26 pm
Group Overview
group: common-user @

174 users for September 2009

  • Stas Oskin: 30 posts
  • Todd Lipcon: 30 posts
  • Steve Loughran: 23 posts
  • Ted Dunning: 23 posts
  • Brian Bockelman: 22 posts
  • Edward Capriolo: 22 posts
  • Amandeep Khurana: 21 posts
  • Amogh Vasekar: 13 posts
  • Allen Wittenauer: 12 posts
  • Jason Venner: 12 posts
  • Matt Massie: 12 posts
  • Nick Rathke: 12 posts
  • Ted Yu: 12 posts
  • Andy Sautins: 10 posts
  • Anthony Urso: 9 posts
  • Chandraprakash Bhagtani: 9 posts
  • Zjffdu: 9 posts
  • John Clarke: 8 posts
  • Mark Kerzner: 8 posts
  • Paul Smith: 7 posts