Dhruba Borthakur
at Jun 29, 2011 at 1:14 pm
We have implemented this idea, and it definitely increases HLog performance
by quite a large margin. The one drawback is that writes to HLog (from the HDFS
perspective) become more "batchy", and writes to an HDFS file consume quite a
bit of CPU. So I have observed that this change increases overall system
throughput, but individual transaction latency suffers slightly.
-dhruba
On Wed, Jun 29, 2011 at 6:08 AM, Joey Echeverria wrote:
Hey Mingjian,
This sounds like a good idea. Your patch didn't make it through. Would you
mind either filing a JIRA and uploading your patch there, or at least posting
it to something like pastebin so we can take a look?
-Joey
On Jun 29, 2011, at 3:27, Mingjian Deng wrote:
Hi:
We found that the HLog syncs to disk on every write. While one thread executes
"doWrite(info, logKey, edit);", the others wait for "updateLock" in
HLog.java.
Why not have the other threads add their edits to a list and wait? When it is
time to sync, the whole list is synced to disk at once. I think this will
reduce the number of IO calls.
So maybe we could keep two lists of edits. Each thread writes to
"waledits" and waits for "updateLock". Once a thread gets "updateLock", it
copies "waledits" to "flushedits" and flushes "flushedits" to disk in a
single call.
In my test, it can increase write speed by 40%.
See HLog.patch for details.
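Since the patch did not come through with the message, here is a minimal sketch of the two-list idea as described above. It is not the contents of HLog.patch and does not use HBase's real HLog API: the class GroupCommitLog, the fields pendingEdits and flushLock, and the in-memory "disk" list are all hypothetical names chosen for illustration, and the sequence-number check is an assumption about how a thread would learn that another thread has already flushed its edit.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a write-ahead log; NOT HBase's HLog. */
class GroupCommitLog {
    private final Object appendLock = new Object(); // guards pendingEdits and appended
    private final Object flushLock = new Object();  // plays the role of "updateLock"

    private List<String> pendingEdits = new ArrayList<>(); // the "waledits" list
    private long appended = 0;  // sequence number of the last queued edit
    private long synced = 0;    // highest sequence number flushed (guarded by flushLock)
    private int syncCalls = 0;  // number of simulated disk syncs (guarded by flushLock)
    private final List<String> disk = new ArrayList<>(); // simulated durable storage

    /** Queues the edit and returns only once it has been flushed. */
    public void append(String edit) {
        long mySeq;
        synchronized (appendLock) {
            pendingEdits.add(edit);
            mySeq = ++appended;
        }
        synchronized (flushLock) {
            if (synced >= mySeq) {
                return; // another thread already flushed a batch containing our edit
            }
            List<String> flushEdits; // the "flushedits" list
            long upTo;
            synchronized (appendLock) {
                // Take the whole pending list and leave a fresh one for new writers.
                flushEdits = pendingEdits;
                pendingEdits = new ArrayList<>();
                upTo = appended;
            }
            syncToDisk(flushEdits); // one sync covers every queued edit
            synced = upTo;
        }
    }

    // Assumption: a real log would write to HDFS and sync here.
    private void syncToDisk(List<String> batch) {
        disk.addAll(batch);
        syncCalls++;
    }

    public int syncCalls() {
        synchronized (flushLock) { return syncCalls; }
    }

    public int durableCount() {
        synchronized (flushLock) { return disk.size(); }
    }
}
```

With a single writer, every append still costs one sync. Batching only appears when several threads queue edits while another thread holds flushLock, which fits the reported effect of higher throughput under concurrent load.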
--
Connect to me at
http://www.facebook.com/dhruba