Is it preferable to consolidate topology logic into fewer bolts in order to
keep the overall topology simpler, or is there no penalty associated with
having a topology which contains dozens (or hundreds) of bolts? All of the
performance benchmarks which I've read about are run against trivially
simple topologies -- a spout plus 1 or 2 bolts.
I saw the section on the wiki about "Guaranteeing message processing", and
it seems that the state tracked for a tuple tree is a fixed size regardless
of how many tuples are in the tree, so this is not a concern.
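
To make that point concrete, here is a rough sketch of the XOR bookkeeping as I understand it from the wiki: one 64-bit value is kept per spout tuple, every tuple id is XORed in once when the tuple is created and once when it is acked, and the tree is complete when the value returns to zero. This is a toy model, not Storm's actual acker code; the class name AckTracker and its methods are made up for illustration, and real tuple ids are random 64-bit values rather than the small constants used here.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of XOR-based tuple-tree tracking (hypothetical, not Storm's acker). */
public class AckTracker {
    // One 64-bit value per spout tuple: root id -> running XOR of tuple ids.
    private final Map<Long, Long> pending = new HashMap<>();

    /** Called when the spout emits a root tuple with the given tuple id. */
    public void init(long rootId, long tupleId) {
        pending.put(rootId, tupleId);
    }

    /**
     * Called when a bolt acks a tuple. XORs in the acked tuple id plus the ids
     * of any new tuples anchored to it. Returns true once the tree is complete.
     */
    public boolean ack(long rootId, long ackedId, long... emittedIds) {
        Long current = pending.get(rootId);
        if (current == null) {
            return false; // unknown root, or tree already completed
        }
        long val = current ^ ackedId;
        for (long id : emittedIds) {
            val ^= id;
        }
        if (val == 0L) {
            pending.remove(rootId); // every created tuple has been acked
            return true;
        }
        pending.put(rootId, val);
        return false;
    }

    /** Number of tuple trees still being tracked. */
    public int pendingCount() {
        return pending.size();
    }
}
```

The state per tree stays a single long no matter how many bolts the tuple passes through, which is why I don't think tree size is the issue here.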
However, it seems that the other overhead associated with having more bolts
could be significant:
- more zmq activity, since there is a send and receive operation required
for each bolt that a message visits
- more input and output queues required, since each task maintains its own
In our testing, the performance of the topology drops off when there are
many tasks per worker (> 25), and I am wondering whether we should focus on
making our topology simpler by consolidating bolts.
(current config: 30 bolts, 300 tasks, 300 executors, and 64 workers on 4
Has anyone had a similar experience?