174 discussions - 634 posts

  • We have small xml files. Currently I am planning to append these small files to one file in hdfs so that I can take advantage of splits, larger blocks and sequential IO. What I am unsure is if it's ...
    Mohit Anchlia
    Feb 21, 2012 at 5:16 pm
    Feb 22, 2012 at 3:31 am
  • Hey there, I've been running a cluster for over a year and was getting a lzo decompressing exception less than once a month. Suddenly it happens almost once per day. Any ideas what could be causing ...
    Marc Sturlese
    Feb 28, 2012 at 10:59 am
    Mar 2, 2012 at 4:22 am
  • I'm trying to run the basic example from hadoop/hadoop-1.0.0/docs/single_node_setup.html. I'm getting java.lang.OutOfMemoryError's when I run the grep example from that page. Stackoverflow suggests ...
    Tim Broberg
    Feb 7, 2012 at 9:50 am
    Nov 14, 2012 at 5:45 am
  • For some reason I am getting invocation exception and I don't see any more details other than this exception: My job is configured as: JobConf conf = new JobConf(FormMLProcessor.class); ...
    Mohit Anchlia
    Feb 27, 2012 at 11:01 pm
    Mar 1, 2012 at 2:28 am
  • It looks as if backupnode isn't supported in 1.0.0? Any chances it's in 1.0.1? Thanks -jeremy
    Jeremy Hansen
    Feb 22, 2012 at 7:41 pm
    Feb 23, 2012 at 2:46 pm
  • I'm looking to clarify the relationship between MultithreadedMapper.setNumberOfThreads(i) and mapreduce.tasktracker.map.tasks.maximum . If I set: - MultithreadedMapper.setNumberOfThreads( 4 ) - ...
    Rob Stewart
    Feb 10, 2012 at 12:26 pm
    Feb 10, 2012 at 10:39 pm
  • hi all, I'm testing hadoop and hive, and I want to use them in log analysis. Here I have a question, can I write/append log to an compressed file which is located in hdfs? Our system generate lots of ...
    Xiaobin She
    Feb 6, 2012 at 8:41 am
    Feb 7, 2012 at 6:07 pm
  • All, I am trying to use Hibernate within my reducer and it goeth not well. Has anybody ever successfully done this? I have a java package that contains my Hadoop driver, mapper, and reducer along ...
    Geoffry Roberts
    Feb 28, 2012 at 5:15 pm
    Mar 3, 2012 at 7:25 am
  • Hi guys, thought I should ask this before I use it ... will using C over Hadoop give me the usual C memory management? For example, malloc() , sizeof() ? My guess is no since this all will eventually ...
    Mark question
    Feb 29, 2012 at 6:57 pm
    Mar 1, 2012 at 11:07 pm
  • Hi, Some time ago I had an idea and implemented it. Normally you can only run a single gzipped input file through a single mapper and thus only on a single CPU core. What I created makes it possible ...
    Niels Basjes
    Feb 28, 2012 at 3:51 pm
    Mar 1, 2012 at 12:35 pm
  • Hi, I am Guruprasad from Bangalore (India). I need help in setting up hadoop platform. I am very much new to Hadoop Platform. I am following the below given articles and I was able to set up ...
    Guruprasad B
    Feb 8, 2012 at 7:02 pm
    Feb 11, 2012 at 1:31 am
  • Hi, I have a simple MR job, and I want each Mapper to get one line from my input file (which contains further instructions for lengthy processing). Each line is 100 characters long, and I tell Hadoop ...
    Mark Kerzner
    Feb 2, 2012 at 12:22 am
    Feb 3, 2012 at 4:26 am
  • Hi All, I have a cluster of 6 datanodes, all running hadoop version 0.20.2, r911707 that take a series of bzip2 compressed text files as input. I have read conflicting articles regarding whether or ...
    Daniel Baptista
    Feb 24, 2012 at 3:43 pm
    Feb 27, 2012 at 9:31 am
  • Is LZO compression supported with sequenceFile compression codec? I looked http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.CompressionType.html but it doesn't have ...
    Mohit Anchlia
    Feb 25, 2012 at 9:38 pm
    Feb 26, 2012 at 7:28 pm
  • Hi I setup hadoop with hadoop 0.20.2 I use three virtual machines on vmware, The three virtual machine could ssh with each other, ERROR rise , the tasktracker on slave 192.168.164.137 and ...
    Tgh
    Feb 23, 2012 at 9:03 am
    Feb 24, 2012 at 7:07 am
  • Hi, We have a 20 node hadoop cluster running CDH3 U2. Some of our jobs are failing with the following errors. We noticed that we are consistently hitting this error condition when the total number of ...
    Sumanth V
    Feb 17, 2012 at 2:25 am
    Feb 17, 2012 at 5:26 am
  • Dear all, I got an error when running a simple Java program on Hadoop. The program is just to merge some local files to one and put it on Hadoop. The code is as follows. ...... Configuration conf = ...
    Bing Li
    Feb 7, 2012 at 9:39 am
    Feb 8, 2012 at 1:03 pm
  • Dear Group Members, We have openings in Hadoop/Big Data from 5 - 19 years of experience from Senior Developer to Heading the Hadoop Practice across the Globe with the TOP IT company in India. Work ...
    Maheswaran
    Feb 19, 2012 at 4:45 am
    Feb 19, 2012 at 6:34 pm
  • What would be the best way to process small number of xml files? I read about Mahout xmlInputFormat, wondering what would be the best way for processing when small files are involved.
    Mohit Anchlia
    Feb 12, 2012 at 4:59 pm
    Feb 18, 2012 at 2:13 pm
  • In our original test, we mistakenly ran the HBase test with the hbase.hregion.memstore.mslab.enabled property set to false. We re-ran the test with the hbase.hregion.memstore.mslab.enabled property ...
    Doug Judd
    Feb 13, 2012 at 5:08 pm
    Feb 18, 2012 at 1:39 pm
  • Hello All, I am a beginning hadoop user. I am trying to install hadoop as part of a single-node setup. I read in the documentation that the supported platforms are GNU/Linux and Win32. I have a Mac ...
    Sriram Ganesan
    Feb 27, 2012 at 4:40 pm
    Mar 5, 2012 at 2:48 pm
  • Dear all, Today I am trying to configure hadoop-0.20.205.0 on a 4 node Cluster. When I start my cluster , all daemons got started except tasktracker, don't know why task tracker fails due to ...
    Adarsh Sharma
    Feb 21, 2012 at 10:09 am
    Feb 28, 2012 at 9:10 am
  • Hey folks, i m using hadoop 0.20.2 + r911707 , please tell me the installation and how to use snappy for compression and decompression Regards Vikas Srivastava
    Hadoop hive
    Feb 27, 2012 at 6:16 am
    Feb 27, 2012 at 7:37 am
  • Hello everyone, I run a daily job that takes files in a variety of different formats and process them using several custom InputFormats which are specified using MultipleInputs. The results get ...
    Leonardo Urbina
    Feb 8, 2012 at 5:40 pm
    Feb 9, 2012 at 8:26 pm
  • Are the 1.0 docs definitely correct at http://hadoop.apache.org/common/docs/r1.0.0? The reason that I ask is that it references setup steps that seem valid for the .2* releases, but not for 1.0. For ...
    Ian Meyers
    Feb 7, 2012 at 5:33 pm
    Feb 8, 2012 at 9:52 am
  • I have the first edition of Tom White's O'Reilly Hadoop book and I was curious about the second edition. I realize it adds new sections on some of the wrapper tools, like Hive, but as far as the core ...
    Keith Wiley
    Feb 6, 2012 at 4:36 pm
    Feb 7, 2012 at 2:29 am
  • Dear all, I am following the book, Hadoop: the Definitive Guide. However, I got stuck because I could not get the NCDC Weather data that is used by the source code in the book. The Appendix C told me ...
    Bing Li
    Feb 12, 2012 at 7:15 am
    Dec 6, 2012 at 4:09 am
  • Hi All, I am looking for examples in Java for Hadoop. I have done lots of searching but I have only found word count. Are there any other examples of the same? -- View this message in context: ...
    Vikas jain
    Feb 17, 2012 at 9:01 am
    Jul 25, 2012 at 7:04 am
  • How can I set the fair scheduler such that all jobs submitted from a particular user group go to a pool with the group name? I have setup fair scheduler and I have two users: A and B (belonging to ...
    Austin Chungath
    Feb 29, 2012 at 1:18 pm
    Mar 1, 2012 at 5:48 pm
  • What's the best way to write records to a different file? I am doing xml processing and during processing I might come across invalid xml format. Currently I have it under try catch block and writing ...
    Mohit Anchlia
    Feb 27, 2012 at 10:19 pm
    Feb 28, 2012 at 1:41 pm
  • If I want to change the block size then can I use Configuration in mapreduce job and set it when writing to the sequence file or does it need to be cluster wide setting in .xml files? Also, is there ...
    Mohit Anchlia
    Feb 26, 2012 at 1:43 am
    Feb 28, 2012 at 8:42 am
  • Hi, We are going into 24x7 production soon and we are considering whether we need vendor support or not. We use a free vendor distribution of Cluster Provisioning + Hadoop + HBase and looked at their ...
    Pavel Frolov
    Feb 23, 2012 at 6:18 pm
    Feb 25, 2012 at 3:35 pm
  • Hi I want to implement security at file level in Hadoop, essentially restricting certain data to certain users. Ex - File A can be accessed only by a user X File B can be accessed by only user X and ...
    Shreya Pal
    Feb 22, 2012 at 10:33 am
    Feb 22, 2012 at 3:07 pm
  • Say I have two Hadoop jobs, A and B, that can be run in parallel. I have another job, C, that takes the output of both A and B as input. I want to run A and B at the same time, wait until both have ...
    W.P. McNeill
    Feb 15, 2012 at 7:24 pm
    Feb 15, 2012 at 9:32 pm
  • Hi all, Sorry if it is not appropriate to send one thread to two mailing lists. I'm trying to use hadoop and hive to do some log analytics jobs. Our system generates lots of logs every day, for example, ...
    Xiaobin She
    Feb 7, 2012 at 10:04 am
    Feb 7, 2012 at 3:39 pm
  • Does anyone have idea on Why $HADOOP_PREFIX was introduced instead of $HADOOP_HOME in hadoop 0.20.205 ? I believe $HADOOP_HOME was not giving any troubles or is there a reason/new feature that ...
    Praveenesh kumar
    Feb 1, 2012 at 12:16 pm
    Feb 2, 2012 at 12:55 am
  • Hello everyone, I've been asked to prepare a small project for a client, which involves the use of machine learning algorithms, correlation and clustering, in order to analyse a big amount of ...
    Fabio Pitzolu
    Feb 3, 2012 at 5:02 pm
    Mar 30, 2012 at 8:49 am
  • I am comparing runtime of similar logic. The entire logic is exactly same but surprisingly map reduce job that I submit is 100x slow. For pig I use udf and for hadoop I use mapper only and the logic ...
    Mohit Anchlia
    Feb 29, 2012 at 12:12 am
    Feb 29, 2012 at 9:49 pm
  • Hello, Could someone please help me to understand these configuration parameters in depth. mapred.map.tasks and mapred.reduce.tasks It is mentioned that default value of these parameters is 2 and 1. ...
    Sangroya
    Feb 22, 2012 at 11:39 am
    Feb 23, 2012 at 7:46 pm
  • Hi, I am working on a project which requires a setup as follows: One master with four slaves.However, when a map only program is run, the master dynamically selects the slave to run the map. For ...
    Theta
    Feb 22, 2012 at 6:32 pm
    Feb 22, 2012 at 11:20 pm
  • How can I copy large text files using "hadoop fs" such that split occurs based on blocks + new lines instead of blocks alone? Is there a way to do this?
    Mohit Anchlia
    Feb 22, 2012 at 8:15 pm
    Feb 22, 2012 at 10:58 pm
  • Hello everyone, in order to provide our clients a custom UI for their MapReduce jobs and HDFS files, what is the best solution to create a web-based UI for Hadoop? We are not going to use Cloudera ...
    Fabio Pitzolu
    Feb 17, 2012 at 1:53 pm
    Feb 17, 2012 at 4:32 pm
  • Hi, My question is what's the difference between the following two settings: 1. mapred.task.default.maxvmem 2. mapred.child.java.opts The first one is used by the TT to monitor the memory usage of ...
    Mark question
    Feb 15, 2012 at 10:57 pm
    Feb 16, 2012 at 5:06 pm
  • Hi, I originally posted this on the dumbo forum, but it's more a general scripting hadoop issue. When testing a simple script that created some local files and then copied them to hdfs with ...
    Håvard Wahl Kongsgård
    Feb 13, 2012 at 6:39 pm
    Feb 15, 2012 at 12:13 pm
  • Dear all, while configuring our hadoop cluster I wonder whether there exists a reference document that contains information about which configuration property has to be specified in which properties ...
    Kleegrewe, Christian
    Feb 7, 2012 at 10:02 am
    Feb 13, 2012 at 9:28 am
  • Hi everyone ! I try to setup Hive on single node to configure and use before i setup on multi node. First, i install Hadoop on single node ( ...
    Lac Trung
    Feb 10, 2012 at 2:29 am
    Feb 10, 2012 at 9:32 am
  • Hi Folks, This might be a stupid question, but I'm new to Java and Hadoop, so... Anyway, if I want to check what FileSystem is currently being used at some point (i.e. evaluating ...
    Eli Finkelshteyn
    Feb 7, 2012 at 10:25 pm
    Feb 8, 2012 at 3:57 pm
  • OK, I have a working Hadoop application that I would like to integrate into an application server environment. So, the question arises: can I do this? E.g. can I create a JobClient instance inside an ...
    Andy Doddington
    Feb 8, 2012 at 7:18 pm
    Mar 8, 2012 at 11:42 am
  • Hello All, We have a hive table partitioned by date and hour(330 columns). We have 5 years worth of data for the table. Each hourly partition have around 800MB. So total 43,800 partitions with one ...
    Rk vishu
    Feb 18, 2012 at 9:39 am
    Mar 3, 2012 at 12:17 am
  • I am running in pseudo-distributed on my Mac and just upgraded from 0.20.203.0 to 1.0.0. The web interface for HDFS which was working in 0.20.203.0 is broken in 1.0.0. HDFS itself appears to work: a ...
    W.P. McNeill
    Feb 19, 2012 at 5:22 pm
    Mar 1, 2012 at 8:33 am
Group Overview
group: common-user @
categories: hadoop
discussions: 174
posts: 634
users: 180
website: hadoop.apache.org...
irc: #hadoop

180 users for February 2012

Harsh J: 57 posts; Mohit Anchlia: 44 posts; Bejoy Ks: 20 posts; Bing Li: 14 posts; Hadoop hive: 14 posts; Praveenesh kumar: 14 posts; Tim Broberg: 12 posts; Merto Mertek: 11 posts; Alo alt: 10 posts; Joey Echeverria: 10 posts; Mark Kerzner: 10 posts; Edward Capriolo: 9 posts; Madhu phatak: 9 posts; W.P. McNeill: 9 posts; Keith Wiley: 8 posts; Rohit Bakhshi: 8 posts; Daniel Baptista: 7 posts; Jinyan Xu: 7 posts; Mark question: 7 posts; Srinivas Surasani: 7 posts