
162 discussions - 726 posts

  • Hi, is the HDFS architecture completely based on the Google File System? If it isn't, what are the differences between the two? Secondly, is the coupling between Hadoop and HDFS the same as how it is ...
    Amandeep Khurana
    Feb 15, 2009 at 7:58 pm
    Feb 27, 2009 at 1:50 pm
  • In the setInput(...) function in DBInputFormat, there are two sets of arguments that one can use. 1. public static void setInput(JobConf ...
    Amandeep Khurana
    Feb 4, 2009 at 1:50 am
    Feb 24, 2009 at 6:17 pm
  • Does the patch HADOOP-2536 support connecting to Oracle databases as well? Or is it just limited to MySQL? ...
    Amandeep Khurana
    Feb 4, 2009 at 2:17 am
    Mar 12, 2009 at 11:53 am
  • Hi! Kind of a novice question, but I need to know what Hadoop version is considered stable. I was trying to run version 0.19, and I've seen numerous stability issues with it. Maybe version 0.18 is ...
    Vadim Zaliva
    Feb 11, 2009 at 3:21 am
    Feb 16, 2009 at 12:47 pm
  • I am trying to import data from a flat file into HBase using a MapReduce job. There are close to 2 million rows. Midway into the job, it starts giving me connection problems and eventually kills ...
    Amandeep Khurana
    Feb 21, 2009 at 5:44 am
    Feb 21, 2009 at 6:56 pm
  • Hello, I would really appreciate any help I can get on this! I've suddenly run into a very strange error. When I do bin/start-all I get: hadoop$ bin/start-all.sh starting namenode, logging to ...
    Sandy
    Feb 13, 2009 at 11:28 pm
    Feb 17, 2009 at 10:56 pm
  • Hi, I am using the VM image hadoop-appliance-0.18.0.vmx and the Hadoop Eclipse plug-in. I have followed all the steps in this tutorial: http://public.yahoo.com/gogate/hadoop-tutorial/html/module3.html. ...
    Iman
    Feb 12, 2009 at 8:10 pm
    Mar 21, 2012 at 5:41 am
  • (Repost from the dev list) I noticed some really odd behavior today while reviewing the job history of some of our jobs. Our Ganglia graphs showed really long periods of inactivity across the entire ...
    Bryan Duxbury
    Feb 20, 2009 at 11:59 pm
    Mar 27, 2009 at 5:12 am
  • I have a MapReduce job that requires expensive initialization (loading of some large dictionaries before processing). I want to avoid executing this initialization more than necessary. I understand ... [see the one-time initialization sketch after this list]
    Stuart White
    Feb 28, 2009 at 2:06 pm
    Mar 8, 2009 at 5:05 am
  • Hi, I'm trying to run Hadoop version 0.19 on Ubuntu with Java build 1.6.0_11-b03. I'm getting the following error: Error occurred during initialization of VM Could not reserve enough space for object ...
    Madhuri72
    Feb 26, 2009 at 1:06 am
    Feb 26, 2009 at 6:01 am
  • Hi, could someone help me find some real figures (transfer rates) for Hadoop file transfers from the local filesystem to HDFS, S3, etc., and among storage systems (HDFS to S3, etc.)? Thanks, Wasim
    Wasim Bari
    Feb 10, 2009 at 10:10 pm
    Feb 11, 2009 at 11:19 am
  • Hi, can someone help me with the usage of counters please? I am incrementing a counter in the reduce method but I am unable to collect the counter value after the job is completed. It's something like ... [see the counter-usage sketch after this list]
    Some speed
    Feb 5, 2009 at 10:10 am
    Feb 8, 2009 at 6:34 am
  • I have set up a distributed environment on Fedora to run Hadoop. System Fedora1 is the namenode, Fedora2 is the job tracker, and Fedora3 and Fedora4 are task trackers. conf/masters contains the entries ...
    Jagadesh_Doddi
    Feb 23, 2009 at 9:45 am
    Aug 13, 2009 at 4:15 pm
  • Hi: The documentation says Hadoop distributes the data and processing across clusters of commonly available computers. But what does "commonly available computers" mean? A 1U server? Or the PC ...
    Buddha1021
    Feb 19, 2009 at 1:09 am
    Feb 24, 2009 at 12:43 am
  • I'm using Eclipse 3.3.2 and want to view my remote cluster using the Hadoop plugin. Everything shows up and I can see the map/reduce perspective but when trying to connect to a location I get: ...
    Erik Holstad
    Feb 18, 2009 at 8:16 pm
    Feb 21, 2009 at 7:40 pm
  • Hi, I am going through the tutorial on multi-node cluster setup by M. Noll... http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster) ... I am at the networking and SSH ...
    Zander1013
    Feb 15, 2009 at 3:50 am
    Feb 16, 2009 at 8:30 pm
  • I have set up Hadoop in pseudo-distributed mode with the namenode, datanode, jobtracker and tasktracker all on the same machine... I also have code which I use to write my data into Hadoop. The code of ...
    Parag Dhanuka
    Feb 24, 2009 at 7:03 am
    Feb 24, 2009 at 2:36 pm
  • Hi, let's say the smaller subset is named A. It is a relatively small collection of < 100,000 entries (could also be only 100), with nearly no payload as value. Collection B is a big collection with 10 ...
    Thibaut_
    Feb 11, 2009 at 9:39 pm
    Feb 18, 2009 at 4:45 pm
  • Is there a way to tell Hadoop not to run map and reduce tasks concurrently? I'm running into a problem where I set the JVM to Xmx768 and it seems like 2 mappers and 2 reducers are running on each machine ...
    Kris Jirapinyo
    Feb 13, 2009 at 10:30 pm
    Feb 14, 2009 at 8:31 pm
  • How do people back up their data that they keep on HDFS? We have many TB of data which we need to get backed up but are unclear on how to do this efficiently/reliably.
    Nathan Marz
    Feb 10, 2009 at 12:17 am
    Feb 12, 2009 at 9:41 am
  • All, I have a few relatively small clusters (5-20 nodes) and am having trouble keeping them loaded with my MR jobs. The primary issue is that I have different jobs that have drastically different ...
    Jonathan Gray
    Feb 3, 2009 at 7:15 pm
    Feb 4, 2009 at 9:37 pm
  • Hey guys, We have been using Hadoop to do batch processing of logs. The logs get written and stored on a NAS. Our Hadoop cluster periodically copies a batch of new logs from the NAS, via NFS into ...
    TCK
    Feb 4, 2009 at 5:52 pm
    Mar 13, 2009 at 6:02 am
  • Release 0.19.1 fixes many critical bugs in 0.19.0, including some data loss issues. The release also introduces an incompatible change by disabling the file append API (HADOOP-5224) until ...
    Nigel Daley
    Feb 24, 2009 at 10:22 pm
    Mar 3, 2009 at 7:53 pm
  • I have about 24k gz files (about 550GB total) on HDFS and have a really simple Java program to convert them into sequence files. If the script's setInputPaths takes a Path[] of all 24k files, it will ...
    Bzheng
    Feb 25, 2009 at 12:04 am
    Mar 3, 2009 at 1:51 am
  • I'm having trouble overriding the maximum number of map tasks that run on a given machine in my cluster. The default value of mapred.tasktracker.map.tasks.maximum is set to 2 in hadoop-default.xml. ...
    S D
    Feb 18, 2009 at 1:01 pm
    Feb 19, 2009 at 4:07 am
  • Hi all, I'm continually running into the "Too many open files" error on 18.3: DataXceiveServer: java.io.IOException: Too many open files at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ...
    Sean Knapp
    Feb 12, 2009 at 7:56 pm
    Feb 13, 2009 at 8:21 pm
  • Is it possible to write a MapReduce job using multiple input files? For example: File 1 has data like Name, Number; File 2 has data like Number, Address. Using these, I want to create a third file ...
    Amandeep Khurana
    Feb 6, 2009 at 9:35 am
    Feb 7, 2009 at 5:33 am
  • Hi group, I am planning to use HDFS as a reliable and distributed file system for batch operations. No plans as of now to run any MapReduce job on top of it, but in the future we will be having map ...
    Amit Chandel
    Feb 9, 2009 at 4:06 am
    Feb 10, 2009 at 2:48 am
  • Hello, I am new to Hadoop and I just installed it on Ubuntu 8.04 LTS as per the guidance of a web site. I tested it and found it working fine. I tried to copy a file but it is giving some error; please help me out ...
    Rajshekar
    Feb 5, 2009 at 6:02 am
    Feb 6, 2009 at 6:45 am
  • Hey all, when I try to copy a folder from the local file system into HDFS using the command hadoop dfs -copyFromLocal, the copy fails and it gives an error which says "Bad connection to FS". How ...
    Mithila Nagendra
    Feb 4, 2009 at 11:07 pm
    Feb 5, 2009 at 9:27 am
  • Hello all, is anyone using Hadoop for more near/almost real-time processing of log data for their systems to aggregate stats, etc.? I know that Hadoop has generally been good at off-line ...
    Ryan LeCompte
    Feb 25, 2009 at 2:00 pm
    Feb 25, 2009 at 9:18 pm
  • Part of my map/reduce process could be greatly sped up by mapping key/value pairs in batches instead of mapping them one by one. I'd like to do the following: protected abstract void ...
    Jimmy Wan
    Feb 23, 2009 at 9:18 pm
    Feb 24, 2009 at 5:25 pm
  • When I start my job from Eclipse it gets processed and the output is generated, but it never shows up in my JobTracker, which is open in my browser. Why is this happening?
    Philipp Dobrigkeit
    Feb 19, 2009 at 9:20 pm
    Feb 20, 2009 at 8:31 am
  • I have a Hadoop 0.19.0 cluster of 3 machines (storm, mystique, batman). It seemed as if problems were occurring on mystique (I was noticing errors with tasks that executed on mystique). So I decided ...
    S D
    Feb 17, 2009 at 10:14 pm
    Feb 18, 2009 at 3:06 am
  • Hi, I have written binary files to a SequenceFile, seemingly successfully, but when I read them back with the code below, after the first few reads I get the same number of bytes for the different ...
    Mark Kerzner
    Feb 6, 2009 at 5:42 am
    Feb 9, 2009 at 8:05 pm
  • Hello, I'm interested in a map-reduce flow where I output only values (no keys) in my reduce step. For example, imagine the canonical word-counting program where I'd like my output to be an unlabeled ...
    Jack Stahl
    Feb 4, 2009 at 3:50 am
    Feb 5, 2009 at 12:53 am
  • Hi, I am writing an application to copy all files from a regular PC to a SequenceFile. I can surely do this by simply recursing all directories on my PC, but I wonder if there is any way to ...
    Mark Kerzner
    Feb 2, 2009 at 5:23 am
    Feb 2, 2009 at 4:01 pm
  • Hi, this from Dr. Dobb's caught my attention: 240 CPUs for $1,700 http://www.ddj.com/focal/NVIDIA-CUDA What are your thoughts? Thank you, Mark
    Mark Kerzner
    Feb 27, 2009 at 6:57 pm
    Mar 2, 2009 at 11:20 am
  • Hi all, I have one class that extends MultipleOutputFormat, as below: public class MyMultipleTextOutputFormat<K, V> extends MultipleOutputFormat<K, V> { private TextOutputFormat<K, V> theTextOutputFormat = ...
    Ma qiang
    Feb 25, 2009 at 2:59 am
    Feb 25, 2009 at 2:20 pm
  • Hi everyone! I am using Hadoop Core (version 0.19.0), OS: Ubuntu 8.04, on a single machine (for testing purposes). Every time I shut down my computer and turn it on again, I can't access the virtual ...
    Anh Vũ Nguyễn
    Feb 24, 2009 at 4:00 am
    Feb 24, 2009 at 5:16 am
  • The Apache ZooKeeper team is proud to announce Apache ZooKeeper version 3.1.0. ZooKeeper is a high-performance coordination service for distributed applications. It exposes common services - such as ...
    Patrick Hunt
    Feb 13, 2009 at 9:51 pm
    Feb 20, 2009 at 11:33 pm
  • Hi all, I prepare my JobConf object in a Java class by calling various set APIs on the JobConf object. When I submit the JobConf object using JobClient.runJob(conf), I'm seeing the warning: "Use ...
    Sandhya E
    Feb 18, 2009 at 9:48 am
    Feb 20, 2009 at 11:25 am
  • Does anyone have an expected or experienced write speed to HDFS outside of Map/Reduce? Any recommendations on properties to tweak in hadoop-site.xml? Currently I have a multi-threaded writer where ...
    Xavier Stevens
    Feb 13, 2009 at 4:39 pm
    Feb 18, 2009 at 8:12 pm
  • Hi all, I consistently have this problem that I can run HDFS and restart it after short breaks of a few hours, but the next day I always have to reformat HDFS before the daemons begin to work. Is ...
    Mark Kerzner
    Feb 17, 2009 at 4:12 am
    Feb 17, 2009 at 2:27 pm
  • Hi, as far as I can tell I've followed the setup instructions for a Hadoop cluster to the letter, but I find that the datanodes can't connect to the namenode on port 9000 because it is only listening ...
    Michael Lynch
    Feb 13, 2009 at 12:32 pm
    Feb 16, 2009 at 6:53 am
  • In my Hadoop 0.19.0 program each map function is assigned a directory (representing a data location in my S3 datastore). The first thing each map function does is copy the particular S3 data to the ...
    S D
    Feb 14, 2009 at 10:46 pm
    Feb 16, 2009 at 5:23 am
  • Hi, we're running a Hadoop cluster on 4 nodes; our primary purpose is to provide a distributed storage solution for internal applications here at TellyTopia Inc. Our cluster consists of ...
    Deepak
    Feb 12, 2009 at 8:54 am
    Feb 16, 2009 at 4:29 am
  • Good morning everyone, I have a question about the correct setup for Hadoop. I have 14 Dell computers in a lab, each connected to the internet and each independent of the others. All run CentOS. Logins ...
    Bjday
    Feb 11, 2009 at 2:39 pm
    Feb 14, 2009 at 12:34 am
  • Hi, why is Hadoop suddenly telling me "Retrying connect to server: localhost/127.0.0.1:8020" with this configuration: <configuration> <property> <name>fs.default.name</name> <value> ...
    Mark Kerzner
    Feb 10, 2009 at 5:45 am
    Feb 12, 2009 at 3:16 pm
  • Hi, I'm new to Hadoop and I'm wondering what the recommended method is for using native libraries in mapred jobs. I've tried the following separately: 1. set LD_LIBRARY_PATH in .bashrc 2. set ...
    Mimi Sun
    Feb 10, 2009 at 7:07 pm
    Feb 12, 2009 at 6:50 am
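
A note on the one-time-initialization thread above: in the 0.19-era mapred API, the usual pattern is to override configure(JobConf) in a class extending MapReduceBase so the expensive setup runs once per task rather than once per record. The sketch below is an illustration under assumptions, not the thread's actual code; the dictionary contents and the "dictionary.path" property name are hypothetical.

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // Sketch: configure() runs once per task, so the dictionary is loaded
    // once per task JVM instead of once per map() call.
    public class DictionaryMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      // Hypothetical in-memory dictionary, filled during configure().
      private final Map<String, String> dictionary = new HashMap<String, String>();

      @Override
      public void configure(JobConf job) {
        // Expensive one-time setup; "dictionary.path" is an assumed property name.
        String dictPath = job.get("dictionary.path");
        // ... load the file at dictPath into this.dictionary ...
      }

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        // Every record reuses the dictionary loaded in configure().
        String translation = dictionary.get(value.toString());
        if (translation != null) {
          output.collect(value, new Text(translation));
        }
      }
    }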
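
A note on the counter-usage thread above: with the old mapred API, a counter incremented through the Reporter in the reducer can be read on the client from the RunningJob handle returned by JobClient.runJob(), not from inside the reduce method. The enum, class names, and the omitted job setup below are illustrative assumptions, not the original poster's code.

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.Counters;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.hadoop.mapred.RunningJob;

    public class CounterExample {

      // Custom counter; the enum name is an assumption for illustration.
      public enum MyCounters { RECORDS_SEEN }

      public static class CountingReducer extends MapReduceBase
          implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
          int sum = 0;
          while (values.hasNext()) {
            sum += values.next().get();
          }
          // Increment the custom counter inside the reduce method.
          reporter.incrCounter(MyCounters.RECORDS_SEEN, 1);
          output.collect(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(CounterExample.class);
        // ... set input/output paths, mapper, and input/output formats here ...
        conf.setReducerClass(CountingReducer.class);

        // runJob() blocks until the job finishes and returns a handle to it.
        RunningJob job = JobClient.runJob(conf);

        // The aggregated counter value is read on the client after completion.
        Counters counters = job.getCounters();
        long seen = counters.getCounter(MyCounters.RECORDS_SEEN);
        System.out.println("Records seen: " + seen);
      }
    }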
Group Overview
group: common-user @ hadoop.apache.org
categories: hadoop
discussions: 162
posts: 726
users: 168
website: hadoop.apache.org...
irc: #hadoop

168 users for February 2009

Amandeep Khurana: 51 posts
Rasit OZDAS: 44 posts
Jason hadoop: 34 posts
Mark Kerzner: 24 posts
S D: 24 posts
Brian Bockelman: 22 posts
Matei Zaharia: 20 posts
Steve Loughran: 20 posts
Tom White: 12 posts
Amareshwari Sriramadasu: 11 posts
Nathan Marz: 11 posts
Bryan Duxbury: 10 posts
Nick Cen: 9 posts
Raghu Angadi: 9 posts
Some speed: 9 posts
Zander1013: 9 posts
Parag Dhanuka: 8 posts
Sean Knapp: 8 posts
Vadim Zaliva: 8 posts
Anum Ali: 7 posts