Grokbase Groups HBase dev June 2011
Hi:
We found that the HLog syncs to disk on every write. While one thread
executes "doWrite(info, logKey, edit);", the other threads wait for
"updateLock" in HLog.java.
Why not let the other threads add their edits to a list while they wait?
When it is time to sync, the whole list is synced to disk in one call. I
think this would decrease the number of IO calls.

So maybe we could keep two lists of edits. Each thread writes its edit to
"waledits" and then waits for "updateLock". Once a thread gets "updateLock",
it copies "waledits" to "flushedits" and flushes "flushedits" to disk.

In my test, this increased write speed by 40%.

See HLog.patch for the details.
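
A minimal sketch of the two-list idea described above. It is not the contents of HLog.patch, which did not come through on the list. Only the names waledits, flushedits, and updateLock come from the post; the class GroupCommitLog, the Sink interface, the pendingLock object, the appended/synced counters, and the use of String as the edit type are hypothetical stand-ins chosen for illustration. Error handling for a failed sync is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical group-commit log: writers queue edits, one writer flushes the batch.
class GroupCommitLog {
    // Hypothetical stand-in for "write this batch to the log file and sync it".
    interface Sink {
        void writeAndSync(List<String> batch);
    }

    private final Object pendingLock = new Object();              // guards walEdits and appended
    private final ReentrantLock updateLock = new ReentrantLock(); // serializes flushes
    private final Sink sink;

    private List<String> walEdits = new ArrayList<>(); // edits queued but not yet flushed
    private long appended = 0; // sequence number of the last queued edit
    private long synced = 0;   // sequence number of the last flushed edit (guarded by updateLock)

    GroupCommitLog(Sink sink) {
        this.sink = sink;
    }

    // Queues the edit and returns only after it has been passed to writeAndSync.
    void append(String edit) {
        long mySeq;
        synchronized (pendingLock) {
            walEdits.add(edit);
            mySeq = ++appended;
        }

        updateLock.lock();
        try {
            if (synced >= mySeq) {
                return; // another thread's flush already covered this edit
            }
            List<String> flushEdits;
            long upTo;
            synchronized (pendingLock) {
                // Swap lists so new writers can keep queueing during the flush.
                flushEdits = walEdits;
                walEdits = new ArrayList<>();
                upTo = appended;
            }
            sink.writeAndSync(flushEdits); // one IO call for the whole batch
            synced = upTo;
        } finally {
            updateLock.unlock();
        }
    }
}
```

Swapping the list reference under a short lock, rather than copying element by element, is one way to keep the time spent blocking other writers small. Each call still returns only after its own edit has been synced, so the durability guarantee is the same as syncing per edit; what changes is how many edits share each sync.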

  • Joey Echeverria at Jun 29, 2011 at 1:08 pm
    Hey Mingjian,

    This sounds like a good idea. Your patch didn't make it through, though. Would you mind either filing a JIRA and uploading your patch there, or at least posting it to something like pastebin so we can take a look?

    -Joey


    On Jun 29, 2011, at 3:27, Mingjian Deng wrote:

    Hi:
    We found that the HLog syncs to disk on every write. While one thread executes "doWrite(info, logKey, edit);", the other threads wait for "updateLock" in HLog.java.
    Why not let the other threads add their edits to a list while they wait? When it is time to sync, the whole list is synced to disk in one call. I think this would decrease the number of IO calls.

    So maybe we could keep two lists of edits. Each thread writes its edit to "waledits" and then waits for "updateLock". Once a thread gets "updateLock", it copies "waledits" to "flushedits" and flushes "flushedits" to disk.

    In my test, this increased write speed by 40%.

    See HLog.patch for the details.
  • Dhruba Borthakur at Jun 29, 2011 at 1:14 pm
    We have implemented this idea, and it definitely increases HLog performance
    by quite a large margin. The one drawback is that writes to the HLog (from
    the HDFS perspective) become more "batchy", and writes to an HDFS file
    consume quite a bit of CPU. So I have observed that this change increases
    overall system throughput, but individual transaction latency suffers
    slightly.

    -dhruba

    --
    Connect to me at http://www.facebook.com/dhruba
  • Mingjian Deng at Jun 29, 2011 at 2:39 pm
    Yes, it only increases TPS and is less helpful for latency.
    I modified my code and regenerated the patch against trunk:
    https://issues.apache.org/jira/browse/HBASE-4044

