Hi,

I would like to process a set of log files (say, web server access logs) from a number of different machines, so I need to get those log files from their respective machines into my central HDFS.

To achieve this:
a) Do I need to install Hadoop and start running HDFS (using start-dfs.sh) on all the machines where the log files are created, and then do a file get from the central HDFS server?
b) Is there any other way to achieve this?

Regards,
Sourav



  • Tim Wintle at Sep 12, 2008 at 11:38 pm
    a) Do I need to install Hadoop and start running HDFS (using start-dfs.sh) on all the machines where the log files are created, and then do a file get from the central HDFS server?
    I'd install Hadoop on each machine, but you don't have to start any nodes
    there. You can use the command-line tools to connect to a cluster running
    elsewhere and put data into or get data from it.

    From what I recall, this is actually better than running nodes locally:
    if you put data from a machine that is itself running a node, the blocks
    will tend to be written to that local machine.


    Tim
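
As a rough illustration of the approach described above (a client-only Hadoop install pushing files to a remote cluster), here is a minimal Python sketch that wraps `hadoop fs -put`. The namenode address, the target directory, and the helper names `build_put_command` and `upload` are hypothetical and not from the thread. It assumes the `hadoop` command is on the PATH of the log-producing machine and that the target directory already exists in HDFS. Prefixing the host name is just one way to keep files from different machines from colliding.

```python
import socket
import subprocess
from pathlib import Path


def build_put_command(local_path, namenode_uri, dest_dir, host=None):
    """Build the `hadoop fs -put` command for one local log file.

    namenode_uri and dest_dir are placeholders for your own cluster.
    The file is stored as <dest_dir>/<host>-<filename> so that logs
    from different machines do not overwrite each other.
    """
    host = host or socket.gethostname()
    name = Path(local_path).name
    dest = f"{namenode_uri.rstrip('/')}/{dest_dir.strip('/')}/{host}-{name}"
    return ["hadoop", "fs", "-put", str(local_path), dest]


def upload(local_path, namenode_uri, dest_dir):
    """Run the put command; raises CalledProcessError if it fails."""
    subprocess.run(build_put_command(local_path, namenode_uri, dest_dir), check=True)


if __name__ == "__main__":
    # Hypothetical values: replace with your own namenode and paths.
    upload("/var/log/httpd/access.log", "hdfs://namenode.example.com:9000", "/logs/web")
```

Run from cron on each web server, something like this would copy logs into the central HDFS without starting any HDFS daemons on those machines.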

Discussion Overview
group: common-user @
categories: hadoop
posted: Sep 12, '08 at 9:13p
active: Sep 12, '08 at 11:38p
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop