134 discussions - 558 posts

  • Hi, I use the org.apache.hadoop.io.Text object and set its value to "測試", which is Chinese text (six bytes in UTF-8 encoding), and when I invoke its "getBytes()" method, it returns the raw bytes (11 bytes), but ...
    Oct 31, 2009 at 3:03 pm
    Oct 31, 2009 at 5:25 pm
  • Hi, Sorry for starting a new thread. I'm facing the same problem as Ian, see the quoted mail below. We used 0.15.0 on our lab cluster before. Recently we are considering upgrading to 0.20.x, and ...
    Xiance SI(司宪策)
    Oct 31, 2009 at 5:57 am
    Oct 31, 2009 at 5:57 am
  • I've run into a situation where it would be helpful to set specific configuration variables local to a data/task node. I've got a solution, but I'm curious if there is a best practice around this and ...
    Andy Sautins
    Oct 30, 2009 at 7:21 pm
    Oct 30, 2009 at 7:21 pm
  • Hi all, I found that the default value of HADOOP_VERSION is 0.17.0 in hadoop-ec2-env.sh of hadoop 0.18.3, and I can create a hadoop 0.17.0 cluster in ec2 successfully, but I cannot create hadoop 0.18.3 ...
    Jeff Zhang
    Oct 30, 2009 at 1:07 pm
    Oct 30, 2009 at 2:31 pm
  • Hi all, I have installed hadoop 0.18.3 on my own cluster with 5 machines. Now I want to install hadoop 0.20, but I do not want to uninstall hadoop 0.18.3. So what things should I modify to ...
    Jeff Zhang
    Oct 30, 2009 at 5:44 am
    Oct 30, 2009 at 6:02 am
  • Hey there, Thought you guys would be interested in a new blog article I've posted entitled "HBase vs. Cassandra: NoSQL Battle!". I hope you all enjoy it -- I can always use some feedback! ...
    Bradford Stephens
    Oct 29, 2009 at 8:49 pm
    Oct 29, 2009 at 8:49 pm
  • I am using hadoop 0.19.2, Fedora Eclipse latest version and hadoop plugin available in contrib of Hadoop 0.19.2. I can view DFS fine. Mapred perspective works ok. But program "Run as" Run on Hadoop ...
    Rajiv M
    Oct 29, 2009 at 3:12 pm
    Oct 29, 2009 at 3:12 pm
  • Hi, I am writing an M-R code using MapRunnable interface. The input format is SequenceFileInputFormat. Each Sequence-record contains a key-value pair of type <Text key,Text value (Text: ...
    Oct 29, 2009 at 12:19 pm
    Dec 22, 2009 at 3:29 pm
  • Hi There is 1 GB of rdf/owl files that I am executing on EC2. Execution throws the following exception ------------------- 08/11/19 16:08:27 WARN mapred.JobClient: Use GenericOptionsParser for ...
    Harshit Kumar
    Oct 29, 2009 at 4:30 am
    Oct 30, 2009 at 5:15 am
  • Hi Everyone, I'm experimenting with Hadoop and trying to get it running in pseudo-distributed mode as described in Appendix A of Tom White's book Hadoop The Definitive Guide. I've got the ...
    David Greer
    Oct 28, 2009 at 10:46 pm
    Oct 29, 2009 at 8:50 pm
  • In Streaming tasks, how can I output a separate file with the key as the filename, for each line of output, instead of collecting it in a big file? Thanks, Ryan
    Ryan Rosario
    Oct 28, 2009 at 8:43 pm
    Oct 28, 2009 at 8:43 pm
  • Going back to the issue: http://markmail.org/search/?q=Multinode+cluster+setup+issues#query:Multinode%20cluster%20setup%20issues+page:1+mid:qi45trv4fdfwugwf+state:results And having gone over the ...
    Hassaan Khan
    Oct 28, 2009 at 8:06 pm
    Oct 28, 2009 at 8:06 pm
  • Hi, Using Hadoop 0.20 (CDH2) I'm trying to pass some JVM options to my child tasks on the command-line, like this: $ hadoop jar streaming.jar -D mapred.reduce.tasks=0 -D ...
    Brian Vargas
    Oct 28, 2009 at 7:11 pm
    Oct 28, 2009 at 8:00 pm
  • I'm running Hadoop 0.20.1+133 (Cloudera distro) I tried setting up a multi-node Hadoop cluster and on executing the command: hadoop jar /usr/lib/hadoop/hadoop-0.20.1+133-examples.jar grep input ...
    Hassaan Khan
    Oct 28, 2009 at 3:42 pm
    Oct 28, 2009 at 3:48 pm
  • Hi All, We are facing an issue with the distribution of data in a cluster where nodes have different storage capacity. We have 4 nodes with 100G capacity and 1 node with 2TB capacity. The storage of the ...
    Vibhooti Verma
    Oct 28, 2009 at 9:25 am
    Oct 28, 2009 at 9:42 am
  • Hi, I need to number all output records consecutively, like, 1,2,3... This is no problem with one reducer, making recordId an instance variable in the Reducer class, and setting ...
    Mark Kerzner
    Oct 28, 2009 at 3:55 am
    Oct 28, 2009 at 7:30 pm
  • Dear Huy Phan and others, Thanks a lot for your efforts in customizing the WebDav server <http://github.com/huyphan/HDFS-over-Webdav> and making it work for Hadoop-0.20.1. After setting up the WebDav ...
    Zhang Bingjun (Eddy)
    Oct 27, 2009 at 10:28 am
    Oct 28, 2009 at 7:40 am
  • Hi, I have written a code to create sequence files for given text files. The program takes following input parameters: 1. Local source directory - contains all the input text files 2. Destination ...
    Oct 27, 2009 at 8:44 am
    Oct 27, 2009 at 3:19 pm
  • Hi all, I'd like to know whether the map task pushes map output to the reduce task or the reduce task pulls it from the map task. Which way does hadoop actually work? Thank you very much. Jeff zhang
    Jeff Zhang
    Oct 27, 2009 at 1:05 am
    Oct 27, 2009 at 5:17 am
  • Hi, I am new to Hadoop. I am following the tutorial on http://hadoop.apache.org/common/docs/current/quickstart.html I have downloaded the hadoop-0.20.1.tar.gz package and unpackaged it. First, I ...
    Dong Zhang
    Oct 26, 2009 at 5:48 am
    Oct 26, 2009 at 6:09 am
  • I was testing a job on a single node hadoop cluster running Hadoop 0.19. The single tasktracker has 2 reduce slots. After finishing 8 reduce tasks out of 17 total reduce tasks, the tasktracker ...
    Runping Qi
    Oct 26, 2009 at 5:29 am
    Oct 26, 2009 at 11:04 pm
  • Dear All, I am implementing a clustering algorithm in which I need to compare each line to two specific lines (they all have the same format ) and output two scores denoting the similarity between ...
    Boyu Zhang
    Oct 26, 2009 at 12:47 am
    Oct 26, 2009 at 7:40 pm
  • I am using a Python script as a mapper for a Hadoop Streaming (hadoop 0.20.0) job, with reducer NONE. My jobs keep getting killed with "task failed to respond after 600 seconds." I tried sending a ...
    Ryan Rosario
    Oct 25, 2009 at 7:01 pm
    Oct 27, 2009 at 2:48 pm
  • Hi all, I'm trying to get hadoop running on ubuntu and I get this error while trying to format. In fact it's only a warning, and the format is successful according to the trace. But could anybody tell ...
    Prabhu Hari Dhanapal
    Oct 25, 2009 at 12:05 am
    Oct 25, 2009 at 12:05 am
  • How can I make a datanode read-only? My guess would be to hack the config to set reserved disk space = current free disk space. Is there any other way? Version: 0.18.1 -Sagar
    Oct 24, 2009 at 11:41 pm
    Oct 24, 2009 at 11:41 pm
  • Hi all, I'd like to contribute to hadoop, and I'd like to get started with fixing bugs. But I found that the jira says I have no permission to work on the jira item. So how can I get the ...
    Jeff Zhang
    Oct 24, 2009 at 11:37 am
    Oct 24, 2009 at 11:44 am
  • Hello, In my application I need to reduce the original reducer output keys further. I was reading about ChainReducer and ChainMapper, but it looks like they are for: one or more mapper - reducer - 0 or ...
    Oct 22, 2009 at 11:17 pm
    Oct 23, 2009 at 5:20 pm
  • Hi! I have quite odd Hadoop behavior. I wrote a client to my app that simply is trying to talk to HDFS and do stuff. Version of Hadoop is 20.0. I still suspect CLASSPATH, but would be nice to know ...
    Bogdan M. Maryniuk
    Oct 22, 2009 at 4:24 am
    Oct 23, 2009 at 12:09 am
  • We have a 6 node cluster where two of the nodes (NN, 2NN) have half the disk space available as the remaining four. As a result these two always have more space used. We run rebalancer occasionally ...
    Mayuran Yogarajah
    Oct 21, 2009 at 11:47 pm
    Oct 21, 2009 at 11:47 pm
  • I just downloaded and installed hadoop ver 0.200.1 and cygwin 1.5.25-15 and installed them (Windows XP.) I'm having trouble with ssh. When I enter "ssh localhost" I'm prompted for a password. I can ...
    Dennis DiMaria
    Oct 21, 2009 at 9:40 pm
    Oct 22, 2009 at 3:03 pm
  • Hi all, These days I have begun looking into the hadoop source code. I want to know whether I need some knowledge of distributed computing algorithms if I want to dig deep into the source code of hadoop? Thank you. Jeff zhang
    Jeff Zhang
    Oct 21, 2009 at 3:17 pm
    Oct 22, 2009 at 3:39 am
  • Hi. I want to keep checkpoint data on several separate machines for backup, and I am deliberating between exporting these machines' disks via NFS, or actually running Secondary Name Nodes there. Can ...
    Stas Oskin
    Oct 21, 2009 at 2:45 pm
    Dec 24, 2009 at 7:16 pm
  • Hi, according to the API documentation of 0.20.1, JobConf is deprecated and we should use Configuration instead. However, all examples on the webpage still reference JobConf. Is there a good example for ...
    Oliver B. Fischer
    Oct 21, 2009 at 12:49 pm
    Oct 29, 2009 at 4:26 am
  • Hi, here I chose one machine as the namenode, one machine as the secondary namenode, and one machine as a datanode. When I start up hadoop (bin/start-all.sh), there are some errors on the secondary namenode, like ...
    Oct 21, 2009 at 9:09 am
    Oct 23, 2009 at 8:57 am
  • 2009-10-19 17:54:16,221 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=SHUFFLE, sessionId= 2009-10-19 17:54:17,632 INFO org.apache.hadoop.mapred.ReduceTask: ...
    Oct 21, 2009 at 8:56 am
    Oct 21, 2009 at 8:56 am
  • Hi, I use hadoop 0.20 with 8 nodes. There is a job that has 130 maps to run; 128 maps completed and only 2 maps failed. Their failure is acceptable in my case, but the job fails, and the other 128 maps also ...
    Oct 21, 2009 at 3:28 am
    Oct 22, 2009 at 2:07 am
  • A few weeks ago I set up the secondary namenode to run on a different machine as follows: - on the NN I put the host of the 2NN server inside the slaves file - on the 2NN I added the following to the ...
    Mayuran Yogarajah
    Oct 21, 2009 at 12:57 am
    Oct 21, 2009 at 12:57 am
  • I am trying to stream data from HDFS on a workstation outside of hadoop. I have a small method to initialize the DistributedFileSystem and I pass the IP and port of the namenode, but that fails with ...
    Stephane Brossier
    Oct 21, 2009 at 12:56 am
    Oct 21, 2009 at 5:03 am
  • Hi, I have input files that contain NO carriage returns/line feeds. Each record is a fixed length (i.e. 202 bytes). Which FileInputFormat should I be using so that each call to my Mapper receives ...
    Oct 20, 2009 at 8:53 pm
    Nov 1, 2009 at 6:44 pm
  • Hello! We have a cluster of 5 nodes and we are concentrating on the development of a DFS (Distributed File System) with the incorporation of Hadoop. Now, can I get some ideas on how I can design ...
    Sugandha Naolekar
    Oct 20, 2009 at 11:39 am
    Oct 22, 2009 at 3:56 am
  • Hi, I currently have an app written to use 0.18.3 (cloudera ec2 dist) and it is working fine. Are there any significant advantages to move to the new stable 0.20.1? The app uses a custom MapRunnable ...
    John Clarke
    Oct 20, 2009 at 8:44 am
    Oct 23, 2009 at 8:35 am
  • Hi, My Hadoop job has no reduce tasks. The Mapper creates some very simple Java POJOs which I would like to serialize as the output for the Mapper. These POJOs are to be stored in an HBase table in ...
    Oct 20, 2009 at 1:17 am
    Oct 20, 2009 at 1:17 am
  • NOTE: for amd64 architecture, libhdfs will not compile unless you edit the Makefile in src/c++/libhdfs/Makefile and set OS_ARCH=amd64 (probably the same for others too). See ...
    Oct 19, 2009 at 10:07 pm
    Oct 20, 2009 at 2:40 am
  • Hi Everybody, I'm doing a project where I have to read a large set of compressed files (gz). I'm using python and streaming to achieve my goals. However, I have a problem: there are corrupt compress ...
    Xavier Quintuna
    Oct 19, 2009 at 5:58 pm
    Oct 19, 2009 at 6:53 pm
  • 1. When I build hive-0.4.0, ivy would try to download hadoop, 0.18.3, 0.19.0 and 0.20.0. But always fail for 2. Then I modified shims/ivy.xml and shims/build.xml to remove ...
    Schubert Zhang
    Oct 19, 2009 at 5:03 pm
    Feb 15, 2010 at 10:15 pm
  • Hey all, While running the (latest as of Friday) Cloudera-created EC2 scripts, I noticed that running the terminate-cluster script kills ALL of your EC2 nodes, not just those associated with the ...
    Mark Stetzer
    Oct 19, 2009 at 3:41 pm
    Oct 19, 2009 at 4:53 pm
  • Hi, I have a cluster set up with 3 nodes, and I'm adding hostname details (in /etc/hosts) manually on each node. It seems this is not an effective approach. How is this scenario handled in big clusters? Is ...
    Oct 19, 2009 at 1:40 pm
    Oct 21, 2009 at 9:51 am
  • I am running a hadoop program to perform MapReduce work on files inside a folder. My program is basically doing Map and Reduce work, each line of any file is a pair of strings, and the result is a ...
    Kunsheng Chen
    Oct 19, 2009 at 2:57 am
    Oct 20, 2009 at 2:02 am
  • Greetings, (You're receiving this e-mail because you're on a DL or I think you'd be interested) It's time for another Hadoop/Lucene/Apache "Cloud" stack meetup! This month it'll be on Wednesday, the ...
    Bradford Stephens
    Oct 19, 2009 at 12:11 am
    Oct 27, 2009 at 11:08 pm
  • Hi, What is the preferred method to distribute the classes (in various Jars) to my Hadoop instances, that are required by my Mapper class? thanks!
    Oct 18, 2009 at 10:08 pm
    Oct 19, 2009 at 12:53 pm
Group Overview
group: common-user @

177 users for October 2009

Amandeep Khurana: 22 posts; Stas Oskin: 22 posts; Jason Venner: 20 posts; Tim robertson: 20 posts; Todd Lipcon: 19 posts; Edward Capriolo: 15 posts; Aaron Kimball: 14 posts; Amogh Vasekar: 14 posts; Mark Kerzner: 13 posts; Steve Loughran: 13 posts; Brian Bockelman: 12 posts; Bogdan M. Maryniuk: 10 posts; Sudha sadhasivam: 10 posts; Allen Wittenauer: 9 posts; Huy Phan: 9 posts; Jeff Zhang: 9 posts; Eason.Lee: 8 posts; Shwitzu: 7 posts; Usman Waheed: 7 posts; Yibo820217: 7 posts