Grokbase Groups HBase user May 2011
Hi ,

We are running a production cluster (10 machines):
1) HBase version 0.90.2
2) 2 tables
3) We create ~15 regions per day (region size 250 MB)


I want to ask about major compaction best practices:

1) Should we run it automatically or manually?
2) How often should it run?
3) Where can I read about the compaction process, and does it have potential
problems that I should know about?

Best Regards
Oleg.


  • Doug Meil at May 16, 2011 at 12:56 pm
    For starters, take a look at this...

    http://hbase.apache.org/book.html#perf.configurations



    -----Original Message-----
    From: Oleg Ruchovets
    Sent: Monday, May 16, 2011 6:42 AM
    To: user@hbase.apache.org
    Subject: major compaction best practice

  • Stack at May 16, 2011 at 6:11 pm

    On Mon, May 16, 2011 at 3:42 AM, Oleg Ruchovets wrote:
    > I want to ask about major compaction best practices:
    >
    > 1) Should we run it automatically or manually?
    Major compaction runs once a day by default. It has a tendency to
    start just when you do not want it to run. So, change the
    configuration or run the compaction manually yourself during periods
    of low request load (running it manually is a pretty common practice).
    > 2) How often should it run?
    Depends (sorry).

    Major compactions clean up data that has been deleted or
    aged/versioned out of consideration. This cleanup can help improve
    performance if there are obsolete versions/deletes that Scans and
    Gets would otherwise have to skip continuously.

    Study your system. See what effect a major compaction has, if any, on
    your setup.
    > 3) Where can I read about the compaction process, and does it have potential
    > problems that I should know about?

    What Doug said,
    St.Ack
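As a rough illustration of running the compaction manually at a quiet time, the sketch below wraps the `major_compact` shell command in a small script that a scheduler such as cron could call. It assumes `hbase` is on the PATH; the script path and schedule in the comments are hypothetical, and the function names are made up for this example.

```shell
#!/bin/sh
# Sketch: request a major compaction from a scheduled job.
# Assumptions: `hbase` is on PATH; table names are passed as arguments.

# Build the HBase shell command for one table
# (the same command you would type interactively).
major_compact_cmd() {
    printf "major_compact '%s'\n" "$1"
}

# Pipe the command into a non-interactive HBase shell session.
# The shell returns once the request is submitted; it does not
# wait for the compaction itself to finish.
request_major_compaction() {
    major_compact_cmd "$1" | hbase shell
}

# Example cron entry (hypothetical path and schedule), daily at 03:00:
#   0 3 * * * /usr/local/bin/major_compact.sh MYTABLE
for table in "$@"; do
    request_major_compaction "$table"
done
```

This only makes sense once the periodic major compaction has been disabled or rescheduled in the configuration; otherwise the default daily run would still kick in on its own.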
  • Oleg Ruchovets at May 19, 2011 at 1:47 pm
    Hi,
    I turned off automatic major compaction:

    <property>
    <name>hbase.hregion.majorcompaction</name>
    <value>0</value>
    <description>The time (in milliseconds) between 'major' compactions of
    all HStoreFiles in a region. Default: 1 day.
    </description>
    </property>

    and ran this from the HBase shell:

    hbase(main):004:0> major_compact 'MYTABLE'
    0 row(s) in 0.1760 seconds

    -- What does the response "0 row(s) in 0.1760 seconds" mean? Does it mean
    that the major compaction is scheduled and will run asynchronously?
    -- How can I see how the major compaction is progressing (log files or
    something else)?

    Thanks.
    Oleg.
  • Stack at May 19, 2011 at 3:02 pm

    On Thu, May 19, 2011 at 6:47 AM, Oleg Ruchovets wrote:
    > -- How can I see how the major compaction is progressing (log files or
    > something else)?
    Currently yes, the only way to see the state of a compaction is by
    viewing the logs (I added HBASE-3900 to expose it in the UI and shell).
    St.Ack
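Since the logs are the place to look, a minimal sketch of pulling compaction-related lines out of a region server log might look like the following. The log path is an assumption that varies by installation, and the helper name is made up here; the filter is a plain case-insensitive match on the word "compaction" and does not rely on any particular message format.

```shell
#!/bin/sh
# Sketch: list compaction-related lines from a region server log.
# Assumption: the caller supplies the log file path; no specific
# log message format is assumed.

compaction_lines() {
    grep -i 'compaction' "$1"
}

# Usage (hypothetical log path):
#   compaction_lines /var/log/hbase/hbase-regionserver.log | tail -n 20
```

Running this on each region server after issuing `major_compact` gives a quick view of whether compactions have started or finished there.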
  • Lars George at May 19, 2011 at 7:25 pm
    You can also check the compactionQueue on all RegionServers through
    the metrics or JMX.
