Grokbase Groups HBase dev August 2011
Hey devs,

I'd like your opinion on the new way HBA.flush works. It used to
contact the master, which issued the flush calls to every RS, where
they were all queued. Now HBA calls every RS once for every region
(so if you have 2000k regions in a table, it's that many RPCs) and the
flushing is done inline, meaning that in situations like mine the call
has now been running for more than an hour.

While it's nice to be able to tell whether everything is truly flushed,
the current function doesn't give any feedback on its progress, so you
don't even know how far along it is.

(oh my flush is done, took 1h25min)

So what do people think? Should we have both a flush and a flushAsync
command, like we have for creating tables? The latter would ideally
queue all the flushes instead of doing them inline, which would also
require a new public HRS method.

Also we could optimize how it works right now by adding some
parallelization while keeping the current guarantees.

J-D
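
To make the parallelization idea concrete, here is a minimal client-side sketch. It assumes a hypothetical RegionFlusher interface standing in for the per-region flush RPC; none of these names come from HBaseAdmin or HRegionServer. flushAsync() submits one task per region to a thread pool and reports progress as each one finishes, and flush() keeps the blocking guarantee by waiting on the same future. Note that this only illustrates a non-blocking call on the client; the server-side queueing suggested above would still need the new HRS method.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

class ParallelFlushSketch {
    /** Hypothetical stand-in for the per-region flush RPC (not an HBase API). */
    interface RegionFlusher {
        void flush(String regionName);
    }

    /** Starts all flushes on the pool and returns immediately with a future. */
    static CompletableFuture<Integer> flushAsync(List<String> regions,
                                                 RegionFlusher flusher,
                                                 ExecutorService pool,
                                                 IntConsumer onProgress) {
        AtomicInteger done = new AtomicInteger();
        CompletableFuture<?>[] tasks = regions.stream()
            .map(region -> CompletableFuture.runAsync(() -> {
                flusher.flush(region);
                // Report how many regions have been flushed so far.
                onProgress.accept(done.incrementAndGet());
            }, pool))
            .toArray(CompletableFuture[]::new);
        // Completes only once every region has been flushed (or one has failed).
        return CompletableFuture.allOf(tasks).thenApply(ignored -> done.get());
    }

    /** Blocking variant: returns only after every region has been flushed. */
    static int flush(List<String> regions, RegionFlusher flusher,
                     ExecutorService pool, IntConsumer onProgress) {
        return flushAsync(regions, flusher, pool, onProgress).join();
    }

    public static void main(String[] args) {
        List<String> regions = List.of("region-a", "region-b", "region-c", "region-d");
        Set<String> flushed = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            int total = flush(regions, flushed::add, pool,
                count -> System.out.println("flushed " + count + "/" + regions.size()));
            System.out.println("done, regions flushed: " + total);
        } finally {
            pool.shutdown();
        }
    }
}
```

If any single flush throws, join() rethrows it wrapped in a CompletionException, so the blocking call still fails loudly rather than reporting a partial flush as success.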


  • Ted Yu at Aug 11, 2011 at 6:51 pm
    I think we can
    1. introduce flushRegions() on the region server
    2. batch the HRegionInfos per server in HBA.flush() and call the new API above

    An asynchronous flushAsync() may be useful as well.
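
The batching in step 2 could look roughly like the sketch below. RegionLocation and RegionServer are hypothetical stand-ins invented for illustration (they are not HRegionInfo or the real region server interface), and flushRegions() here only mirrors the name proposed above. The point is that the number of RPCs drops from one per region to one per server.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FlushBatchingSketch {
    /** Hypothetical pairing of a region with the server that hosts it. */
    static final class RegionLocation {
        final String regionName;
        final String serverName;

        RegionLocation(String regionName, String serverName) {
            this.regionName = regionName;
            this.serverName = serverName;
        }
    }

    /** Hypothetical stand-in for the proposed batched region-server call. */
    interface RegionServer {
        void flushRegions(List<String> regionNames);
    }

    /** Groups region names by hosting server, preserving first-seen order. */
    static Map<String, List<String>> groupByServer(List<RegionLocation> locations) {
        Map<String, List<String>> byServer = new LinkedHashMap<>();
        for (RegionLocation loc : locations) {
            byServer.computeIfAbsent(loc.serverName, k -> new ArrayList<>())
                    .add(loc.regionName);
        }
        return byServer;
    }

    /** Issues one batched call per server and returns the number of calls made. */
    static int flushBatched(List<RegionLocation> locations,
                            Map<String, RegionServer> servers) {
        int calls = 0;
        for (Map.Entry<String, List<String>> entry : groupByServer(locations).entrySet()) {
            servers.get(entry.getKey()).flushRegions(entry.getValue());
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<RegionLocation> locations = List.of(
            new RegionLocation("region-a", "rs1"),
            new RegionLocation("region-b", "rs2"),
            new RegionLocation("region-c", "rs1"));
        // Each fake server just logs the batch it receives.
        Map<String, RegionServer> servers = new LinkedHashMap<>();
        servers.put("rs1", names -> System.out.println("rs1 flushing " + names));
        servers.put("rs2", names -> System.out.println("rs2 flushing " + names));
        System.out.println("calls made: " + flushBatched(locations, servers));
    }
}
```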
  • Jean-Daniel Cryans at Aug 12, 2011 at 5:43 pm
    Thanks Ted, I created https://issues.apache.org/jira/browse/HBASE-4198

    J-D

Discussion Overview
group: dev @
categories: hbase, hadoop
posted: Aug 11, '11 at 6:30p
active: Aug 12, '11 at 5:43p
posts: 3
users: 2
website: hbase.apache.org

2 users in discussion

Jean-Daniel Cryans: 2 posts Ted Yu: 1 post
