On Monday, March 11, 2013 at 4:58:00 AM UTC+8, Nathan Marz wrote:
I recommend using the "name" function to name portions of your stream so
that the UI shows you what bolts correspond to what sections.

Trident packs operations into as few bolts as possible. In addition, it
*never* repartitions your stream unless you've done an operation that
explicitly involves a repartitioning (e.g. shuffle, groupBy, partitionBy,
global aggregation, etc). This property of Trident ensures that you can
control the ordering/semi-ordering of how things are processed. So in this
case, everything before the groupBy has to have the same parallelism or
else Trident would have to repartition the stream. And since you didn't say
you wanted the stream repartitioned, it can't do that. To give the spout a
different parallelism from the each() operations that follow it, introduce
a repartitioning operation, like so:
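
The example that followed is not preserved in this archive. As a rough sketch of the idea only: the class name and the "split" label below are hypothetical, the package names assume Storm 0.8.x (backtype.storm, storm.trident), and the FixedBatchSpout, Split, and MemoryMapState testing helpers are stand-ins for a real spout, function, and state. A shuffle() between the spout and the first each() lets the two sides carry different parallelism hints:

```java
import backtype.storm.generated.StormTopology;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
import storm.trident.TridentTopology;
import storm.trident.operation.builtin.Count;
import storm.trident.testing.FixedBatchSpout;
import storm.trident.testing.MemoryMapState;
import storm.trident.testing.Split;

// Hypothetical class name; illustrative only, not the code from the original reply.
public class ParallelismSketch {

    public static StormTopology build() {
        // Stand-in spout that emits a few fixed sentences in batches of up to 3.
        FixedBatchSpout spout = new FixedBatchSpout(
                new Fields("sentence"), 3,
                new Values("the cow jumped over the moon"),
                new Values("four score and seven years ago"));

        TridentTopology topology = new TridentTopology();
        topology.newStream("myStream", spout)
                // Hint for everything before the shuffle, i.e. the spout.
                .parallelismHint(1)
                // Explicit repartitioning: Trident may now change parallelism.
                .shuffle()
                // Label the operations that follow so they are easier to find in the UI.
                .name("split")
                .each(new Fields("sentence"), new Split(), new Fields("word"))
                // Hint for the operations between the shuffle and the groupBy.
                .parallelismHint(3)
                // groupBy is itself a repartitioning operation.
                .groupBy(new Fields("word"))
                .persistentAggregate(new MemoryMapState.Factory(),
                        new Count(), new Fields("count"));

        return topology.build();
    }
}
```

Because each hint reaches back only as far as the previous repartitioning, the 1 covers the spout alone and the 3 covers the each() after the shuffle. Without the shuffle(), both hints would land on the same unpartitioned section of the stream.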


On Sat, Mar 9, 2013 at 11:17 AM, P. Taylor Goetz <ptg...@gmail.com> wrote:
(Storm version 0.8.1)

I'm in the process of performance tuning some topologies we recently
converted over to Trident, and I'm having trouble getting the resulting
bolts parallelized the way I want.

In Storm, it's pretty straightforward. For example, if I set a bolt in a
topology like so:

builder.setBolt(SPLIT_BOLT_ID, splitBolt,

Then the "splitBolt" will be assigned 3 tasks in the topology.

With Trident, however, it's not as clear (at least not to me), since you
set parallelism on the Stream class. We have a Trident topology that's not
unlike the one depicted here:


So looking at the first spout and bolt in that diagram (upper left), if I
wanted to assign the spout a parallelism hint of 1, and the first bolt a
parallelism of 3, I would think I would do something like the following:

Stream stream = topology.newStream("myStream", spout);
stream =

But I'm not seeing the results I'm expecting. I've tried moving the
"parallelismHint()" calls around within the topology definition, and am
completely baffled by how it plays out when deployed to a cluster. I'm
using storm-ui to determine how each resulting bolt got parallelized (which
may be the problem). In some cases attempting to set the parallelism of a
bolt actually altered the parallelism of the spout.

I'm assuming (perhaps wrongly) that if a Trident topology compiles down
to 5 bolts, they will be numbered ("bolt0" through "bolt4")
consistently between topology submissions -- i.e. if a topology is
submitted/killed multiple times, can I safely assume that "bolt0" always
represents the same bolt?

Am I missing something simple? I can't share the actual topology code,
but could put together a simple example if that would help.

Thanks in advance,

- Taylor

You received this message because you are subscribed to the Google Groups
"storm-user" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to storm-user+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Twitter: @nathanmarz

Discussion Overview
group: storm-user
posted: Mar 9, '13 at 7:17p
active: Dec 10, '13 at 9:13a


