Our ultimate goal is basically to replicate the gigablast.com search engine.
They claim to have fewer than 500 servers holding 10 billion pages that are
indexed, spidered and updated on a routine basis. I am looking at
indexing 500 million pages per node, with a total of 20 nodes.
Each node will have 2 quad-core processors, 4 TB of disk (RAID 5) and 32 GB of
RAM. I believe this can be done, but how many searches per second do you
think would be realistic in this setup? We are ultimately aiming for
roughly 25 searches per second spread out over the 20 nodes. I could
really use some advice on this one.

Thanks,
D. Segel
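
A rough back-of-the-envelope check of the figures in the post above may help frame the question. The sketch below uses only the numbers quoted there; the decimal units, treating the 4 TB as usable space, and the idea that every query fans out to all 20 nodes are assumptions made for illustration, not anything stated in the thread. The function name `capacity_budget` is hypothetical.

```python
# Back-of-the-envelope budget using only the figures quoted in the post.
# Assumptions (not from the post): decimal units (1 TB = 10**12 bytes),
# the 4 TB is usable space after RAID 5, and every query is sent to all
# 20 nodes because each node holds a different slice of the pages.

NODES = 20
PAGES_PER_NODE = 500_000_000
DISK_BYTES_PER_NODE = 4 * 10**12
RAM_BYTES_PER_NODE = 32 * 10**9
CORES_PER_NODE = 2 * 4          # two quad-core processors
TARGET_QPS = 25


def capacity_budget():
    """Return the per-page and per-query budgets implied by the post."""
    return {
        "total_pages": NODES * PAGES_PER_NODE,
        "disk_bytes_per_page": DISK_BYTES_PER_NODE // PAGES_PER_NODE,
        "ram_bytes_per_page": RAM_BYTES_PER_NODE // PAGES_PER_NODE,
        # With full fan-out each node sees every query, so a node has
        # CORES_PER_NODE core-seconds per second to share among TARGET_QPS.
        "core_ms_per_query_per_node": CORES_PER_NODE * 1000 / TARGET_QPS,
    }


if __name__ == "__main__":
    for name, value in capacity_budget().items():
        print(f"{name}: {value:,}")
```

Under these assumptions each page gets about 8,000 bytes of disk (for index plus any stored copy) and 64 bytes of RAM, and each node has about 320 core-milliseconds to search its 500 million pages per query.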


  • Otis Gospodnetic at Jun 5, 2008 at 4:11 pm
    Dan,

    You may want to ask on Solr, Lucene, or Nutch lists. However, I can tell you already that these numbers look a little...overly optimistic :)


    Otis
    --
    Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

    ----- Original Message ----
    From: Dan Segel <dansegel@gmail.com>
    To: core-user@hadoop.apache.org
    Sent: Thursday, June 5, 2008 9:12:31 AM
    Subject: Gigablast.com search engine, 10billion pages!!!

