So I believe that is the reverse call stack showing where the calls to
ReentrantLock.unlock() originate from. When I continue following the call
stack down, it fans out to the various locations where we emit() or ack()
tuples across our topology. I believe those are all the locations that lead
to work being added to the DisruptorQueue?
This is a snapshot of the actual hotspots as detected by YourKit:
[image: Inline image 1]
I also tried switching to the SleepingWaitStrategy, but that led to a 2-3x
overall increase in CPU usage, mainly spinning in the waitFor() methods.
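For what it's worth, here is my rough mental model of the two wait
strategies, written out as a simplified sketch. This is not the actual LMAX
Disruptor or Storm code (the class and method names below are made up for
illustration), but it matches what I think the profiles are showing: the
blocking flavor takes and releases a ReentrantLock on every publish to
signal the consumer, while the sleeping flavor spins/yields on the consumer
side in waitFor().

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantLock;

// Simplified illustration only -- not the LMAX Disruptor implementation.
public class WaitStrategySketch {

    // "Blocking" style: every publish takes the lock and signals the consumer.
    // With one lock/unlock per emitted or acked tuple, ReentrantLock.unlock()
    // (and the signalling underneath it) shows up prominently in a profiler.
    static class BlockingStyle {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition notEmpty = lock.newCondition();
        private volatile long published = -1;

        void publish(long sequence) {
            published = sequence;
            lock.lock();
            try {
                notEmpty.signalAll();   // wake the consumer for this one event
            } finally {
                lock.unlock();          // <-- the hotspot in the profile
            }
        }

        long waitFor(long sequence) throws InterruptedException {
            lock.lock();
            try {
                while (published < sequence) {
                    notEmpty.await();   // consumer sleeps until signalled
                }
            } finally {
                lock.unlock();
            }
            return published;
        }
    }

    // "Sleeping" style: the producer just writes; the consumer spins, then
    // yields, then parks briefly. No lock traffic on publish, but the consumer
    // burns CPU in its waitFor() loop when the queue is shallow -- consistent
    // with the 2-3x CPU increase I saw after switching strategies.
    static class SleepingStyle {
        private volatile long published = -1;

        void publish(long sequence) {
            published = sequence;       // no lock, no signal
        }

        long waitFor(long sequence) {
            int retries = 200;
            while (published < sequence) {
                if (retries > 100)      { --retries; }                      // busy-spin
                else if (retries > 0)   { --retries; Thread.yield(); }      // yield
                else                    { LockSupport.parkNanos(1000L); }   // brief sleep
            }
            return published;
        }
    }
}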
Cheers,
Mike
On Sun, Mar 17, 2013 at 12:36 AM, Nathan Marz wrote:
The actual CPU usage is happening in the bottom 2 functions in that
screenshot. You should expand the tree more to see what's actually using
the CPU.
--
Twitter: @nathanmarz
http://nathanmarz.com
On Sat, Mar 16, 2013 at 9:50 AM, Mike Heffner wrote:
Nathan,
I connected YourKit to the topology and profiling shows that about 49% of
CPU time is spent in java.util.concurrent.locks.ReentrantLock.unlock():
[image: Inline image 1]
Is this typical of a topology running at several thousand tuples/second?
Anything that could be done to reduce usage here?
Is it correct to assume that Trident would reduce this overhead by
batching multiple messages into a single tuple, thereby reducing
inter-thread messaging/locking?
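My intuition for why batching should help, sketched against a plain JDK
BlockingQueue rather than anything Trident actually does: handing off N
tuples one at a time costs N lock round-trips and signals, while handing
them off in batches costs roughly N / batchSize. (The class below is purely
illustrative.)

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy illustration of per-tuple vs. batched hand-off between threads.
// Not Trident's implementation -- just the locking arithmetic.
public class BatchingSketch {

    // One queue operation (lock/unlock + signal) per tuple.
    static void perTuple(BlockingQueue<int[]> queue, int n) throws InterruptedException {
        for (int i = 0; i < n; i++) {
            queue.put(new int[] { i });          // n lock round-trips
        }
    }

    // One queue operation per batch of tuples: a fraction of the lock traffic.
    static void batched(BlockingQueue<List<int[]>> queue, int n, int batchSize)
            throws InterruptedException {
        List<int[]> batch = new ArrayList<>(batchSize);
        for (int i = 0; i < n; i++) {
            batch.add(new int[] { i });
            if (batch.size() == batchSize) {
                queue.put(batch);                // n / batchSize lock round-trips
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            queue.put(batch);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // No consumer here; this just counts queue operations for 5,000 tuples.
        perTuple(new ArrayBlockingQueue<>(16384), 5000);
        batched(new ArrayBlockingQueue<>(256), 5000, 100);
    }
}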
Thanks,
Mike
--
Mike Heffner <[email protected]>
Librato, Inc.
On Wed, Mar 6, 2013 at 9:50 PM, Nathan Marz wrote:
The zmq recv function there is a blocking operation, so I don't think
you're measuring CPU there. I've found YourKit helpful for profiling. Also
be sure to check the capacity/execute latency statistics in 0.8.2.
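A quick way to see what I mean (a toy example, nothing to do with Storm): a
thread parked in a blocking native read burns no CPU, but the JVM still
reports it as RUNNABLE, which is what sampling profilers key off of.

import java.net.ServerSocket;
import java.net.Socket;

// Toy example: a thread blocked in a native socket read uses no CPU,
// yet the JVM reports it as RUNNABLE, so samplers can over-attribute
// time to blocking calls like zmq recv.
public class BlockedNativeRead {
    public static void main(String[] args) throws Exception {
        final ServerSocket server = new ServerSocket(0);
        Thread reader = new Thread(new Runnable() {
            public void run() {
                try {
                    Socket s = new Socket("localhost", server.getLocalPort());
                    s.getInputStream().read();   // blocks in native code forever
                } catch (Exception ignored) {
                }
            }
        }, "blocked-reader");
        reader.setDaemon(true);
        reader.start();

        server.accept();          // accept the connection but never send bytes
        Thread.sleep(500);        // give the reader time to block in read()
        // Typically prints RUNNABLE even though the thread is idle in the kernel.
        System.out.println(reader.getName() + ": " + reader.getState());
    }
}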
--
Twitter: @nathanmarz
http://nathanmarz.com
On Wed, Mar 6, 2013 at 2:19 PM, Mike Heffner wrote:
Hi,
We've been running several topologies on Storm successfully for a while
now. Recently, as message volume has increased, we've begun to notice a
higher-than-expected CPU burn from the workers.
Originally I thought it was due to code within our topology; however, even
when I ack tuples immediately at the first bolt (and don't propagate them
through the rest of the topology), the high CPU usage remains. This leads me
to believe that the heavy CPU burn is somewhere in our Kafka spout or in the
Storm library itself.
We are using the Kafka spout with Storm 0.8.2 running on the Oracle
1.6.0_41 JVM. Our topology config looks like the following (a rough sketch
of how we set this in code follows the list):
num_workers => 3
num_ackers => 3
TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE => 16384
TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE => 16384
TOPOLOGY_RECEIVER_BUFFER_SIZE => 8
TOPOLOGY_TRANSFER_BUFFER_SIZE => 32
spout_pending => 300000
ZeroMQ is 2.1.7
jzmq is from nathan's fork
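In code we build this roughly as follows. This is a paraphrase from memory
against the backtype.storm.Config API in 0.8.2, not a copy of our actual
topology builder, so treat the constant names as my best recollection:

import backtype.storm.Config;

// Roughly how we build the topology config (values as listed above).
public class TopologyConfigSketch {
    public static Config build() {
        Config conf = new Config();
        conf.setNumWorkers(3);
        conf.setNumAckers(3);
        conf.setMaxSpoutPending(300000);   // spout_pending above

        // Queue/buffer sizes for the intra-worker and transfer queues, per
        // the high-throughput settings from Nathan's Ooyala talk.
        conf.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
        conf.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);
        conf.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 8);
        conf.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 32);
        return conf;
    }
}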
Pushing about 5,000 msgs/sec at an average message size of 150 bytes
through a single Kafka partition (on a single host), we consume almost an
entire c1.xlarge worth of CPU when the topology is spread across three
c1.xlarges. The highest CPU usage across the nodes is over 50%.
Connecting VisualVM to the worker consuming the most CPU and running
the CPU sampling shows that we spend the most CPU time in:
org.apache.zookeeper.ClientCnxn$SendThread.run()
and
org.zeromq.ZMQ$Socket.recv[native]()
I don't have full profiling results though, only this CPU sampling.
Any idea where we could be burning CPU? Is this level of CPU usage to be
expected in a Kafka spout configuration? Are any of our configuration
variables likely to be making the usage worse? The buffer sizes were taken
from the presentation Nathan gave at Ooyala for a high-throughput topology.
The CPU usage does seem to scale upwards with our volume, so I'm trying to
identify the bottlenecks so that we can scale this further.
Thanks!
Mike
--
Mike Heffner <[email protected]>
Librato, Inc.