FAQ
Hello,

All my statistics are finally being calculated and the results
processed. I have a 1-node cluster, mainly taking 3 aggregate logs
from my Apache logs.

How far will this setup go? I have another machine ready to be hooked
up, and I wonder whether it is worth adding it now to make a
2-node cluster.

The first node has 8 GB of RAM and a quad-core 3.0 GHz CPU. The second
computer is much noisier and uses more electricity. It has 8 GB of
RAM and dual dual-core Opterons running at 2.0 GHz.

Best Regards,
C.B.

  • Ajo Fod at Feb 14, 2011 at 4:33 pm
    Yes, I've often wondered about asymmetric configurations. Is there a
    mechanism to make the partitioning of map/reduce jobs aware of differences
    in processor speed, so that less work is allocated to the slower
    processors?

    To try to answer the question here: I have not had much experience with
    multi-node clusters, but I'd start by checking whether all 4 cores are being
    used ... especially in the part of the process that takes the longest
    (Amdahl's law) ... you can only get a speedup if that is already happening.
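
As a rough illustration of the Amdahl's law point above, here is a minimal Python sketch. The function name `amdahl_speedup` and the example parallel fractions are assumptions chosen for illustration, not measurements from this thread. The model also treats all cores as equal, so it ignores the 3.0 GHz versus 2.0 GHz difference and any HDFS or network overhead.

```python
def amdahl_speedup(parallel_fraction: float, workers: float) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the
    run time can be spread across `workers` equal workers."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers <= 0:
        raise ValueError("workers must be positive")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical parallel fractions; measure your own job to get a real one.
    for fraction in (0.5, 0.9, 0.99):
        one_node = amdahl_speedup(fraction, 4)   # 4 cores on the first node
        two_nodes = amdahl_speedup(fraction, 8)  # 8 cores if the second node is added
        gain = two_nodes / one_node
        print(f"parallel={fraction:.2f}: 4 cores {one_node:.2f}x, "
              f"8 cores {two_nodes:.2f}x, gain from second node {gain:.2f}x")
```

If the job spends much of its time in a part that does not parallelize, doubling the core count buys little, which is why it is worth checking core usage on the single node first.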

    Here are a few other questions I go through:

    Does the process take very long? At the very least the task should take
    longer than twice the time it takes you to switch on and boot up
    the other computer ... rebalance HDFS and then run the job and switch off
    the computer ... plus all the time invested in figuring out how to use and
    maintain the multi-node configuration.
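
The rule of thumb above can be written down as a small check. This is only a sketch: the function names, the example minutes, and the assumed speedup are all hypothetical, and the second function adds a simple net-time estimate that is not part of the original advice.

```python
def passes_rule_of_thumb(job_minutes: float, overhead_minutes: float,
                         factor: float = 2.0) -> bool:
    """True if the job runs longer than `factor` times the overhead of
    bringing the second node up (boot, HDFS rebalance, shutdown)."""
    return job_minutes > factor * overhead_minutes


def net_minutes_saved(job_minutes: float, overhead_minutes: float,
                      speedup: float) -> float:
    """Wall-clock minutes saved per run, assuming the second node gives
    `speedup` and costs `overhead_minutes` each time it is used."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return job_minutes - (job_minutes / speedup + overhead_minutes)


if __name__ == "__main__":
    # Hypothetical numbers: a 60-minute job, 20 minutes of overhead,
    # and an assumed 1.5x speedup from the second node.
    print(passes_rule_of_thumb(60, 20))
    print(net_minutes_saved(60, 20, 1.5))
```

A job that only just passes the rule of thumb can still save no time at all once the overhead is counted, so the assumed speedup matters as much as the job length.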

    How often do you need to run the job? ... if it is only once a day ... and
    it can be run in the background or while the processor is not busy, perhaps
    you can schedule it on your PC for when you are taking a break.

    Are you developing code? ... If so, it is perhaps more efficient to run on
    one computer and test with a small chunk of data.

    So, in summary, I'd use multiple computers as a last resort ... multi-core
    is good enough for me most of the time.

    Thanks,
    -Ajo.
    On Sun, Feb 13, 2011 at 4:58 PM, Cam Bazz wrote:

Discussion Overview
group: user @
categories: hive, hadoop
posted: Feb 14, '11 at 12:58a
active: Feb 14, '11 at 4:33p
posts: 2
users: 2
website: hive.apache.org
