FAQ
Hey Cloudera genius guys .

I read this

Via Cloudera, Hadoop is currently used by most of the giants in the
space including Google, Yahoo, Facebook (we wrote about Facebook’s use
of Cloudera here), Amazon, AOL, Baidu and more.

On.
http://www.techcrunch.com/2009/10/01/hadoop-clusters-get-a-monitoring-client-with-cloudera-desktop/

if this is true can you guys help us beat Y G and F.

Is it true that Google uses hadoop?
Is it true that the above-mentioned giants use Hadoop via Cloudera?

Thanks,
Stan S


  • Jean-Daniel Cryans at Oct 2, 2009 at 11:36 pm
    Stan,

    First, this is not the Cloudera mailing list and this is not a dev question.

    Also, AFAIK, Google uses Hadoop only to interface with people outside
    since MapReduce works the same way.
    I think this article is wrong in saying that Google, Yahoo! and
    Facebook are using Hadoop via Cloudera and I'm 99% sure of that. They
    all have enough expertise to not be dependent on a support contract
    and Y! even has its own distro of Hadoop (tho not supported the way
    Cloudera's is). Maybe Leena Rao thought that Cloudera were the only
    ones developing Hadoop and took the biggest names out of the PoweredBy
    page.

    J-D
  • Ted Dunning at Oct 2, 2009 at 11:37 pm

    On Fri, Oct 2, 2009 at 4:02 PM, Smith Stan wrote:

    > if this is true can you guys help us beat Y G and F.
    What do you mean beat Yahoo, Google and Facebook?

    > Is it true that Google uses hadoop?

    Yes. Mostly for educational purposes, not internal production.

    > Is it true that the above-mentioned giants use Hadoop via Cloudera?
    Yahoo sponsored most of the writing of Hadoop and does not use Cloudera's
    distribution.

    Facebook sponsored the writing of Hive and probably still runs their own
    version of Hadoop.

    Why do you care if they use Cloudera's distribution?

    --
    Ted Dunning, CTO
    DeepDyve
  • Stefan Groschupf at Oct 2, 2009 at 11:59 pm
    Hi Ted,

    I'm sure Stan meant that in a satirical way.
    The TechCrunch article gives the impression that Hadoop was developed by the
    Cloudera boys and that all the big companies, including Y!, use their
    distribution.

    :-)

    Stefan


    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Hadoop training and consulting
    http://www.scaleunlimited.com
    http://www.101tec.com
  • Steve Loughran at Oct 5, 2009 at 9:40 am

    Smith Stan wrote:
    > Hey Cloudera genius guys .
    Sorry, not cloudera. I speak for myself.
    > I read this
    >
    > Via Cloudera, Hadoop is currently used by most of the giants in the
    > space including Google, Yahoo, Facebook (we wrote about Facebook’s use
    > of Cloudera here), Amazon, AOL, Baidu and more.
    I would be doubtful that any on that list use the Cloudera distro,
    because once you manage a cluster to the extent that you create your own
    RPMs for PXE-preboot and kickstart install, you know what you are doing
    and will be worrying more about the power budget of your datacentre (as
    measured in megawatts), and whether your off-site replication plan is
    copying data to other facilities on different earthquake fault lines,
    than about how hadoop-site.xml works.
    This is not much different from saying these companies all use TCP/IP,
    HTTP, MySQL and Linux, therefore a Linux server running Apache and
    mysqld will help you to beat them.

    Hadoop is a tool for very large datasets; it works best if you can group
    and scan them independently.

    * If you do not know what you are doing, it will not help
    * If you do not have a sufficiently large dataset, it is not worth the
    effort
    * If you haven't outgrown an RDBMS, stick with the database
    * Cloudera are offering to help with running/using Hadoop, but they
    aren't going to code your datamining algorithms for you.

    see also: http://teddziuba.com/2008/04/im-going-to-scale-my-foot-up-y.html

    -Steve
  • Amr Awadallah at Oct 7, 2009 at 9:18 am
    As other folks said, if you need to communicate with Cloudera then
    please use info ait cloudera d0t com, this is not the right forum for that.

    That said, some blog reporters make mistakes like this all the time,
    despite all of our efforts to properly educate them about the space.
    We'll reach out and ask that they post a correction to the last
    paragraph but I can't promise that it will happen. I want to make it
    clear on this public forum that Cloudera's intention is certainly *not*
    to belittle the contribution of Yahoo to Apache Hadoop (or Facebook for
    that matter); we all know that without their backing Hadoop probably
    wouldn't be as successful as it is today. Finally, all this press is
    good for Hadoop, and it will, hopefully, lead to more companies using it
    which will only serve to strengthen the platform and grow it even more.

    -- amr


Discussion Overview
group: common-user @
category: hadoop
posted: Oct 2, '09 at 11:03p
active: Oct 7, '09 at 9:18a
posts: 6
users: 6
website: hadoop.apache.org...
irc: #hadoop
