FAQ
Writing a Java program that uses the API is basically equivalent to
installing a Hadoop client and writing a Python script to manipulate HDFS
and fire off a MR job. It's up to you to decide how much you like Java :).
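For what it's worth, a minimal sketch of that approach using the 0.20-era FileSystem and JobClient APIs might look like the following. The host names, ports, and paths are placeholders, and you would still need to set your own mapper and reducer classes; this won't run without a reachable cluster and the hadoop-core jar on the classpath:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class RemoteHdfsJob {
    public static void main(String[] args) throws Exception {
        // Point the client at the remote NameNode (host and port are placeholders).
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://machine.Y:9000");

        FileSystem fs = FileSystem.get(conf);

        // Copy local data from machine.X into HDFS on machine.Y.
        fs.copyFromLocalFile(new Path("/local/data/actions.log"),
                             new Path("/user/shravan/input/actions.log"));

        // Configure and submit a MapReduce job against that data.
        JobConf job = new JobConf(conf, RemoteHdfsJob.class);
        job.set("mapred.job.tracker", "machine.Y:9001"); // assumed JobTracker address
        FileInputFormat.setInputPaths(job, new Path("/user/shravan/input"));
        FileOutputFormat.setOutputPath(job, new Path("/user/shravan/output"));
        // ... set mapper/reducer classes here ...
        JobClient.runJob(job); // blocks until the job completes

        // Pull the results back to machine.X.
        fs.copyToLocalFile(new Path("/user/shravan/output/part-00000"),
                           new Path("/local/results/part-00000"));
    }
}
```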

Alex
On Thu, Jul 9, 2009 at 2:27 AM, Shravan Mahankali wrote:

Hi Group,

I have data to be analyzed and I would like to dump this data to Hadoop
from machine.X, whereas Hadoop is running on machine.Y. After dumping this
data to Hadoop, I would like to initiate a job, get this data analyzed, and
get the output information back to machine.X.

I would like to do all this programmatically, and I am going through the
Hadoop API for this same purpose. I remember Alex suggesting the other day
that I install Hadoop on machine.X, but I was not sure why to do that?

I could simply write a Java program that includes the hadoop-core jar; I was
planning to use "FsUrlStreamHandlerFactory" to connect to Hadoop on
machine.Y, then use "org.apache.hadoop.fs.shell" to copy data to the Hadoop
machine, initiate the job, and get the results.
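(For reference, a sketch of the FsUrlStreamHandlerFactory route mentioned above. Note that it only teaches java.net.URL to *read* hdfs:// URLs, so it does not cover copying data into the cluster; the host, port, and path are placeholders, and a reachable NameNode is assumed:)

```java
import java.io.InputStream;
import java.net.URL;
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class HdfsUrlCat {
    static {
        // Register the hdfs:// handler; this may be called at most once per JVM.
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    public static void main(String[] args) throws Exception {
        InputStream in = null;
        try {
            // Read a file from the remote HDFS instance (address is a placeholder).
            in = new URL("hdfs://machine.Y:9000/user/shravan/input/actions.log")
                    .openStream();
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}
```

For writing data into HDFS, the FileSystem API (e.g. FileSystem.copyFromLocalFile) is the usual route.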

Please advise.

Thank You,
Shravan Kumar. M
Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
-----------------------------
This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system
administrator - netopshelpdesk@catalytic.com

-----Original Message-----
From: Shravan Mahankali
Sent: Thursday, July 09, 2009 10:35 AM
To: common-user@hadoop.apache.org
Cc: 'Alex Loddengaard'
Subject: RE: how to use hadoop in real life?

Thanks for the information Ted.

Regards,
Shravan Kumar. M
Catalytic Software Ltd. [SEI-CMMI Level 5 Company]

-----Original Message-----
From: Ted Dunning
Sent: Wednesday, July 08, 2009 10:48 PM
To: common-user@hadoop.apache.org; shravan.mahankali@catalytic.com
Cc: Alex Loddengaard
Subject: Re: how to use hadoop in real life?

In general, Hadoop is simpler than you might imagine.

Yes, you need to create directories to store data. This is much lighter
weight than creating a table in SQL.

But the key question is volume. Hadoop makes some things easier, and Pig
queries are generally easier to write than SQL (for programmers ... not for
those raised on SQL), but, overall, map-reduce programs really are more work
to write than SQL queries until you get to really large-scale problems.

If your database has fewer than 10 million rows or so, I would recommend
that you consider doing all analysis in SQL augmented by procedural
languages. Only as your data goes beyond 100 million to a billion rows do
the clear advantages of the map-reduce formulation become apparent.

On Tue, Jul 7, 2009 at 11:35 PM, Shravan Mahankali <
shravan.mahankali@catalytic.com> wrote:
Use Case: We have a web app where users perform some actions; we have to
track these actions and various parameters related to the action initiator,
and we currently store this information in the database. But our manager
has suggested evaluating Hadoop for this scenario. However, I am not clear
whether every time I run a job in Hadoop I have to create a directory, or
how I can track that later to read the data analyzed by Hadoop. Even if I
drop user-action information into Hadoop, I have to put this information
into our database so that it knows the trend and responds to various
requests accordingly.

Discussion Overview
group: common-user @ hadoop
posted: Jul 6, '09 at 12:26p
active: Jul 10, '09 at 5:46a
posts: 13
users: 5
website: hadoop.apache.org...
irc: #hadoop
