We've been running several topologies on Storm successfully for a while now.
Recently, as message volume has increased, we've begun to notice
higher-than-expected CPU burn from the workers.
Originally I thought it was due to code within our topology. However, even
when I ack tuples immediately at the first bolt in our topology (and don't
propagate them throughout the topology), the high CPU usage remains. This
leads me to believe that a lot of CPU is being burned somewhere in our Kafka
Spout or in the Storm library itself.
We are using the Kafka Spout with Storm 0.8.2 running on the 1.6.41 Oracle
JVM. Our topology config looks like:
num_workers => 3
num_ackers => 3
TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE => 16384
TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE => 16384
TOPOLOGY_RECEIVER_BUFFER_SIZE => 8
TOPOLOGY_TRANSFER_BUFFER_SIZE => 32
spout_pending => 300000
ZeroMQ is 2.1.7
jzmq is from Nathan's fork
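
For reference, the settings above can be written out against Storm's string config keys. This is only a minimal sketch: it assumes that num_workers, num_ackers and spout_pending correspond to topology.workers, topology.acker.executors and topology.max.spout.pending, and the class name TopologyConfSketch is made up for illustration. It uses a plain map so it runs without Storm on the classpath; in Storm 0.8.x, backtype.storm.Config is itself a HashMap, so the same keys could be put on a Config instance.

```java
import java.util.HashMap;
import java.util.Map;

public class TopologyConfSketch {
    // Builds the settings from this post as a plain key/value map.
    // The mapping of num_workers, num_ackers and spout_pending to these
    // keys is an assumption, not something confirmed by the post.
    public static Map<String, Object> build() {
        Map<String, Object> conf = new HashMap<String, Object>();
        conf.put("topology.workers", 3);                 // num_workers (assumed mapping)
        conf.put("topology.acker.executors", 3);         // num_ackers (assumed mapping)
        conf.put("topology.executor.receive.buffer.size", 16384);
        conf.put("topology.executor.send.buffer.size", 16384);
        conf.put("topology.receiver.buffer.size", 8);
        conf.put("topology.transfer.buffer.size", 32);
        conf.put("topology.max.spout.pending", 300000);  // spout_pending (assumed mapping)
        return conf;
    }

    public static void main(String[] args) {
        // Print each setting so it can be compared with the list above.
        for (Map.Entry<String, Object> e : build().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```
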
Pushing about 5,000 msgs/sec at an average message size of 150 bytes,
through a single Kafka partition (on a single host), we consume almost an
entire c1.xlarge worth of CPU when the topology is spread across three
c1.xlarges. The busiest node runs at over 50% CPU.
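
To put that load in perspective, here is a rough back-of-the-envelope calculation of the raw payload rate from the figures above. It is only a sketch: it counts message bytes alone and ignores any Kafka or tuple serialization overhead, and the class name LoadEstimate is made up for illustration.

```java
public class LoadEstimate {
    // Raw payload rate in bytes per second: message rate times average size.
    public static long bytesPerSecond(long msgsPerSec, long avgMessageBytes) {
        return msgsPerSec * avgMessageBytes;
    }

    public static void main(String[] args) {
        long bps = bytesPerSecond(5000, 150); // 750000 bytes/sec
        // Also show the rate in MiB/sec for easier comparison.
        System.out.printf("%d bytes/sec (%.2f MiB/sec)%n",
                bps, bps / (1024.0 * 1024.0));
    }
}
```

At well under 1 MiB/sec of payload, the CPU cost looks like per-message overhead rather than data volume.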
Connecting VisualVM to the worker consuming the most CPU and running the
CPU sampler shows that we spend the most CPU time in:
org.apache.zookeeper.ClientCnxn$SendThread.run()
and
org.zeromq.ZMQ$Socket.recv[native]()
I don't have full profiling results though, only the sampling output.
Any idea where we could be burning CPU? Is this level of CPU usage to be
expected in a Kafka Spout configuration? Are any of our configuration
variables likely to be making it worse? The buffer sizes were taken from the
presentation Nathan gave at Ooyala for a high-throughput topology.
The CPU usage does seem to scale upwards with our volume, so I'm trying to
identify the bottlenecks so that we can scale this further.
Thanks!
Mike
--
Mike Heffner <[email protected]>
Librato, Inc.