Well, for one, Hive uses Derby by default as its metastore.
What makes you think that the Hadoop project is using Derby?

Also, this question seems to belong on common-dev@ (Cc'ed) rather than general@.

On Wed, Aug 10, 2011 at 11:07AM, Saravana Kumar wrote:
Thanks for the explanation, but I need some clarity as well.

Do you mean to say that all the information required to run a map/reduce job is
effectively stored in Derby? Does that mean Hadoop (not the ecosystem) uses Derby?
On Tue, Aug 9, 2011 at 5:52 PM, Michael Segel wrote:


First a little history...
Derby started out long ago as Cloudscape. Cloudscape was bought by
Informix. Informix was bought by IBM. IBM didn't understand Cloudscape and
decided to open source the project under the Apache License. Hence Derby was born.

Derby is an excellent lightweight, 100% Java database. So when you have a
Java framework, using Derby makes a lot of sense. Derby is used to persist
some environment information, and I believe it's used in some of the
unit testing.

Where Derby has been replaced by MySQL, it is because someone wanted a multi-user
database and was more comfortable with MySQL than with
Derby. (Hint: Derby can be started as an embedded single-user database, or
as a multi-user database, by changing its invocation at startup. ;-)
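
To make the hint concrete, here is a minimal sketch of how the two modes are usually selected through the JDBC URL alone. The class name DerbyUrls, the database name "metastore_db", and the host "localhost" are illustrative assumptions, not anything taken from Hadoop or Hive; 1527 is the Derby Network Server's default port.

```java
// Sketch only: builds the JDBC URLs that select between Derby's two modes.
// DerbyUrls, "metastore_db" and "localhost" are hypothetical names for illustration.
public class DerbyUrls {

    /** Embedded mode: the engine runs inside the calling JVM (single-JVM access). */
    public static String embedded(String dbName) {
        return "jdbc:derby:" + dbName + ";create=true";
    }

    /** Client/server mode: many JVMs connect to a separately started Network Server. */
    public static String networkClient(String host, int port, String dbName) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbName + ";create=true";
    }

    public static void main(String[] args) {
        System.out.println(embedded("metastore_db"));
        System.out.println(networkClient("localhost", 1527, "metastore_db"));
    }
}
```

Either string would be passed to java.sql.DriverManager.getConnection(url), with the matching Derby jar (embedded engine or network client) on the classpath. The application's SQL stays the same in both modes; only the URL and the way Derby is started differ.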

So I would guess the initial reason to go with Derby was that it's released
under the Apache License and there were no licensing issues. ;-)

Date: Tue, 9 Aug 2011 15:17:35 +0530
Subject: Derby with Hadoop --Why?
From: saravana.hadoop@gmail.com
To: general@hadoop.apache.org


What is the significance of Derby in the Hadoop project?
Why are people using Derby along with Hadoop?

Saravana Kumar.J

  • Alejandro Abdelnur at Aug 10, 2011 at 3:08 pm
    [CCed general@]


    What you are describing is a MapReduce application scenario, where the DB is
    handled from your MR code; nothing special is needed from the Hadoop side.


    On Wed, Aug 10, 2011 at 5:34 AM, Segel, Mike wrote:

    It's been far too many years since I was handed my diploma and kicked off
    campus. :-)
    IMHO it's a bit esoteric for a classroom homework assignment. Maybe an
    interview question used to stump most of the candidates?

    The funny thing is that on the walk to work, I started to wonder whether it
    would make sense for a certain subset of m/r problems to use Derby as an in-memory
    DB or a local lightweight DB.
    Ok and before you say WTF, I'm talking about a subset of problems where
    depending on the input to the Mapper.map() method, you may want to do a
    quick look up against a database which contains lookup data that is static
    and is indexed.

    Ok, so maybe I shouldn't read my e-mails before heading off to work... :-)


    -----Original Message-----
    From: Ted Dunning
    Sent: Wednesday, August 10, 2011 12:55 AM
    To: general@hadoop.apache.org
    Subject: Re: Derby with Hadoop --Why?

    No. He meant nothing of the kind. The other explanations expanded on

    This sounds like homework. If so, I would recommend a bit of reading
    before asking.

    On Tue, Aug 9, 2011 at 10:37 PM, Saravana Kumar


Discussion Overview
group: common-dev @
posted: Aug 10, '11 at 5:48a
active: Aug 10, '11 at 3:08p