  • Mayur Patil at Feb 5, 2013 at 9:36 pm
    Hello,

    I am new to Hadoop. I am working on a cloud project in which I have to
    use Hadoop for MapReduce. I will be collecting logs from 2-3 machines in
    different locations. The logs are also in different formats, such as
    .rtf, .log, and .txt. Later, I have to convert them to one format and
    gather them in one location.

    Which module of Hadoop do I need to study for this implementation? Or do
    I need to study the whole framework?

    Seeking guidance,

    Thank you!

    --
    *Cheers,*
    *Mayur.*
  • Nitin Pawar at Feb 5, 2013 at 9:39 pm
    Hey Mayur,

    If you are collecting logs from multiple servers, you can use Flume for
    that.

    If the logs differ in format, you can read them with TextInputFormat and
    write them out in whatever format you want for later processing in your
    project.

    1. First, learn how to set up Hadoop.
    2. Then try writing sample Hadoop MapReduce jobs that read from a text
       file, process it, and write the results to another file.
    3. Then integrate Flume as your log collection mechanism.
    4. Once you are comfortable with the system, you can decide which paths
       to follow based on your requirements for storage, compute time,
       compute capacity, compression, etc.
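As a rough illustration of the "read text, convert to one format" step described above, here is a minimal sketch of a mapper in the style of Hadoop Streaming, where a mapper is any program that reads lines on standard input and writes records to standard output. The two input layouts, the regular expressions, and the function names `normalize` and `run_mapper` are all assumptions made up for this example; real logs will need their own patterns, and .rtf files would first have to be converted to plain text because they contain markup.

```python
import io
import re

# Hypothetical input layouts, invented for illustration only.
# Layout A: "2013-02-05 21:36:00 ERROR disk full"
PATTERN_A = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (\w+) (.*)$")
# Layout B: "[ERROR] 05/02/2013 21:36:00 - disk full"  (day/month/year)
PATTERN_B = re.compile(
    r"^\[(\w+)\] (\d{2})/(\d{2})/(\d{4}) (\d{2}:\d{2}:\d{2}) - (.*)$"
)


def normalize(line):
    """Return 'timestamp<TAB>level<TAB>message', or None if unrecognized."""
    line = line.strip()
    m = PATTERN_A.match(line)
    if m:
        date, time, level, message = m.groups()
        return "\t".join([date + "T" + time, level.upper(), message])
    m = PATTERN_B.match(line)
    if m:
        level, day, month, year, time, message = m.groups()
        date = "-".join([year, month, day])
        return "\t".join([date + "T" + time, level.upper(), message])
    return None


def run_mapper(lines, out):
    """Streaming-style mapper: one line in, at most one record out."""
    for line in lines:
        record = normalize(line)
        if record is not None:
            out.write(record + "\n")


if __name__ == "__main__":
    # Small in-memory demo; a streaming job would instead call
    # run_mapper(sys.stdin, sys.stdout).
    sample = io.StringIO(
        "2013-02-05 21:36:00 ERROR disk full\n"
        "[warn] 05/02/2013 21:39:00 - low memory\n"
    )
    result = io.StringIO()
    run_mapper(sample, result)
    print(result.getvalue(), end="")
```

Unrecognized lines are silently dropped here to keep the sketch short; in practice you would probably want to count or log them so that a new log layout does not go unnoticed.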

    --
    Nitin Pawar
  • Jagat Singh at Feb 5, 2013 at 9:43 pm
    Hi,

    Please read up on the basics of how Hadoop works first.

    Then get hands-on with MapReduce coding.

    The tool made for your use case is Flume, but don't look at it until you
    have completed the two steps above.
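For readers starting on the basics mentioned above, the following is a single-process sketch of the MapReduce data flow: map emits key/value pairs, shuffle groups them by key, and reduce collapses each group. It is only an illustration of the idea, not how Hadoop is implemented; Hadoop distributes these phases across machines. The function names and the choice of "first word of the line" as the key are assumptions for this example.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (key, 1) pair per non-empty line; key is the first word."""
    for line in lines:
        words = line.split()
        if words:
            yield words[0], 1


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_first_word(lines):
    """Run the three phases in order on an iterable of text lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    logs = ["ERROR disk full", "INFO started", "ERROR timeout"]
    print(count_by_first_word(logs))
```

Once this flow is familiar, the map and reduce functions are the parts you would write yourself in a real job, while the framework handles the shuffle.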

    Good luck, and keep us posted.

    Regards,

    Jagat Singh

    -----------
    Sent from Mobile, short and crisp.
  • Mayur Patil at Feb 6, 2013 at 11:27 am
    Thanks to you both. You solved my problem so easily. I have one more
    question. For reference, I have:

    1. Hadoop: The Definitive Guide
    2. Hadoop in Action

    Are these sufficient, or do I need more material to study your
    suggested implementation?

    --
    Cheers,
    Mayur

  • Nitin Pawar at Feb 6, 2013 at 1:05 pm
    That's more than sufficient.

    --
    Nitin Pawar

Discussion Overview
group: hdfs-user @
categories: hadoop
posted: Feb 5, '13 at 9:32p
active: Feb 6, '13 at 1:05p
posts: 6
users: 3
website: hadoop.apache.org...
irc: #hadoop
