So, we were loading into HBase from a single client that uses a queue: 1 thread pushes items onto the queue while 20 threads drain it and write to HBase. The queue kept reaching its fill level even with 20 threads emptying it, which implies the writes to HBase were what made the queue fill up. Checking CPU and iostat disk utilization on the client, we did not really see the client as a bottleneck. That fights every bone in my body, so I ran the client on two nodes, each processing half of the items the single client was processing to begin with, and sure enough the speed doubled, so that node was a bottleneck after all.
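To make the queue behaviour above measurable, here is a minimal sketch (in Python, not our actual client) of the same shape: one producer, a bounded queue, and a pool of worker threads, with timers around the two places a thread can wait. Everything in it is an assumption for illustration: `run_pipeline`, `write_fn`, and the parameter names are hypothetical, and `write_fn` is only a stand-in for whatever call does the real HBase write.

```python
import queue
import threading
import time


def run_pipeline(items, write_fn, num_workers=20, max_queue=100):
    """Push items through a bounded queue and time where threads wait.

    write_fn is a hypothetical stand-in for the real write call.
    """
    q = queue.Queue(maxsize=max_queue)
    lock = threading.Lock()
    stats = {"written": 0, "write_seconds": 0.0}
    sentinel = object()  # tells a worker to stop

    def worker():
        while True:
            item = q.get()
            if item is sentinel:
                return
            start = time.perf_counter()
            write_fn(item)
            elapsed = time.perf_counter() - start
            with lock:
                stats["written"] += 1
                stats["write_seconds"] += elapsed

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    blocked = 0.0
    for item in items:
        start = time.perf_counter()
        q.put(item)  # blocks while the queue is at its fill level
        blocked += time.perf_counter() - start

    for _ in threads:
        q.put(sentinel)
    for t in threads:
        t.join()

    stats["producer_blocked_seconds"] = blocked
    return stats
```

If `producer_blocked_seconds` is large, the workers are not keeping up. Dividing `write_seconds` by `written` gives the average time per write as seen from the client, which can be compared between the one-node and two-node runs. This only shows where the threads wait; it does not by itself say whether the delay is on the client or on the server side.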
I keep thinking maybe the bus was at 100% utilization, or something else that is not measurable. The disk was not really showing anything, whereas when we run Hadoop map/reduce against Sybase we see the disk utilization go to 100%, which makes the bottleneck easy to spot. Does anyone have any experience with monitoring this, and know good stats to capture?
We were looking at "iostat -x 5 25" on Linux to see how CPU and disk were doing, and we were also measuring the network, which is near zero right now, and monitoring memory. Does anyone else have some good commands that make it easier to spot a bottleneck?