Hey guys,
In HBASE-4365 I ran multiple uploads; the latest one I reported was a
5TB import on 14 RS that took 18h with Stack's patch. One thing we can
see there is that, apart from some splitting, there's a lot of
compacting going on. Stack was wondering exactly how much that IO
costs us, so we devised a test where we could upload 5TB with 0
compactions. Here are the results:
The table was pre-split with 14 regions, 1 per region server.
hbase.hstore.compactionThreshold=100
hbase.hstore.blockingStoreFiles=110
hbase.regionserver.maxlogs=64 (the block size is 128MB)
hfile.block.cache.size=0.05
hbase.regionserver.global.memstore.lowerLimit=0.40
hbase.regionserver.global.memstore.upperLimit=0.74
export HBASE_REGIONSERVER_OPTS="$HBASE_JMX_BASE -Xmx14G -XX:CMSInitiatingOccupancyFraction=75 -XX:NewSize=256m -XX:MaxNewSize=256m"
The table had:
MAX_FILESIZE => '549755813888', MEMSTORE_FLUSHSIZE => '549755813888'
Basically what I'm trying to do is to never block and almost always be
flushing. You'll probably notice the big difference between the lower
and upper barriers and think "le hell?" It's because flushing takes so
long that you need enough room to take on more data while it's
happening (and overall we are able to flush faster than we take on
writes).
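To put numbers on that gap, here is a rough back-of-the-envelope sketch. It assumes -Xmx14G means 14 GiB and that the block cache and both memstore limits are plain fractions of the max heap, as the HBase docs describe them; the variable names are just mine for illustration.

```python
# Rough heap budget for one region server, under the assumptions above.
GIB = 1024 ** 3
heap_bytes = 14 * GIB  # from -Xmx14G, assumed to mean 14 GiB

LOWER = "hbase.regionserver.global.memstore.lowerLimit"
UPPER = "hbase.regionserver.global.memstore.upperLimit"
settings = {
    "hfile.block.cache.size": 0.05,
    LOWER: 0.40,
    UPPER: 0.74,
}

# Size of each budget in GiB.
sizes_gib = {name: frac * heap_bytes / GIB for name, frac in settings.items()}

# Room left to absorb writes between the lower and upper barriers.
headroom_gib = sizes_gib[UPPER] - sizes_gib[LOWER]

# The table-level MAX_FILESIZE / MEMSTORE_FLUSHSIZE value, in GiB.
table_limit_gib = 549755813888 / GIB

for name, size in sizes_gib.items():
    print(f"{name}: {size:.2f} GiB")
print(f"headroom between barriers: {headroom_gib:.2f} GiB")
print(f"table limits: {table_limit_gib:.0f} GiB")
```

So the barriers sit at roughly 5.6 and 10.4 GiB, which leaves a bit under 5 GiB per server to keep taking writes while flushes are running. The table-level limits work out to 512 GiB, far beyond anything a single region reaches in this test.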
The test reports the following:
Wall time: 34984.083 s
Aggregate Throughput: 156893.07 queries/s
Aggregate Throughput: 160030935.29 bytes/s
That's about 9.7h, so nearly 2x faster than the 18h run where we wait
for compactions and splits. Not too bad, but I'm pretty sure we can do
better:
- The QPS was very uneven: when it's flushing it seems to take a big
toll and queries drop to ~100k/s, while the rest of the time it's
more like 200k/s. Need to figure out what's going on there and if it's
really just caused by flush-related IO.
- The logs were rolling every 6 seconds and since this takes a global
write lock, I can see how we could be slowing down a lot across 14
machines.
- The load was a bit uneven: I miscalculated my split points and the
last region always had 2-3k more queries per second.
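For anyone who wants to sanity-check the reported figures, here is a small sketch of the arithmetic. It only uses the numbers from this mail (wall time, aggregate throughput, 14 RS, the earlier 18h run); the even per-server split is an assumption for illustration, since the real load was not perfectly even.

```python
# Figures reported by the test.
wall_time_s = 34984.083
queries_per_s = 156893.07
bytes_per_s = 160030935.29

# Context from this mail.
region_servers = 14
previous_run_hours = 18.0

hours = wall_time_s / 3600                 # about 9.7 hours
total_tb = bytes_per_s * wall_time_s / 1e12  # about 5.6 TB (decimal)
bytes_per_query = bytes_per_s / queries_per_s  # about 1020 bytes
speedup = previous_run_hours / hours       # about 1.85x

# Average per-server share, assuming a perfectly even split.
per_rs_mb_s = bytes_per_s / region_servers / 1e6
per_rs_qps = queries_per_s / region_servers

print(f"wall time: {hours:.2f} h, speedup vs 18h: {speedup:.2f}x")
print(f"total uploaded: {total_tb:.2f} TB, {bytes_per_query:.0f} bytes/query")
print(f"per RS: {per_rs_mb_s:.2f} MB/s, {per_rs_qps:.0f} queries/s")
```

Against an even share of roughly 11k queries/s per server, the extra 2-3k/s on the last region is a noticeable skew, so fixing the split points should be worth it.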
Stay tuned for more.
J-D