Is there any way to preserve tuple order across multiple streams?
For some reason I assumed tuples would be received in the order they were
emitted, but I don't think that is always the case across multiple streams.

Basically what I have is:
1) Bolt A emits a listId and count to Bolt C on a count stream.
2) Bolt C saves this count in a map to track completion.
3) Bolt A emits a tuple per record to Bolt B on a classify stream with the listId.
4) Bolt B emits a completion message to Bolt C on a completion stream with the
same listId.
5) When the expected count equals the completed count, Bolt C emits the
results.
Bolt C uses a fieldsGrouping on listId for both the count and classify streams.

This seemed to work fine when Bolt A and Bolt B were using
localOrShuffleGrouping. I changed this to shuffleGrouping, and it now
appears that Bolt C can receive completion messages from Bolt B
before it ever receives the expected count from Bolt A.

Is there a better way to do this?

Thanks!

Kurt

(currently using Storm 0.7.2)

On Monday, August 27, 2012 10:22:03 PM UTC-6, nathanmarz wrote:

In a fields grouping, all tuples with the same set of fields for the
grouping will go to the same task. This is true even if you're doing a
fields grouping on multiple streams, e.g.:

.setBolt(...)
.fieldsGrouping("source1", new Fields("a"))
.fieldsGrouping("source2", new Fields("b"))

Fields grouping computes hash(values of the grouping fields) % (number of consumer tasks).
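The hash-then-modulo rule above can be sketched in plain Java. This is only an illustration of the idea, not Storm's actual code: the class name FieldsGroupingSketch, the method chooseTask, and the use of List.hashCode() as the hash function are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: mimics "hash(grouping field values) % (# consumer tasks)".
// Storm's real hash function may differ; List.hashCode() is an assumption.
final class FieldsGroupingSketch {

    // Returns a task index in [0, numConsumerTasks) for the given grouping values.
    static int chooseTask(List<Object> groupingValues, int numConsumerTasks) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(groupingValues.hashCode(), numConsumerTasks);
    }

    public static void main(String[] args) {
        int tasks = 4;
        // The same value arriving on two different streams...
        int fromSource1 = chooseTask(Arrays.<Object>asList("list-42"), tasks);
        int fromSource2 = chooseTask(Arrays.<Object>asList("list-42"), tasks);
        // ...hashes to the same consumer task index.
        System.out.println(fromSource1 == fromSource2); // prints "true"
    }
}
```

Because the task index depends only on the field values and the number of consumer tasks, the stream a tuple arrives on does not affect which task receives it.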



On Mon, Aug 27, 2012 at 3:52 AM, kirti singh <mail2k...@gmail.com> wrote:
Hi,
Based on a particular field, we can partition the stream using the fields
grouping option.
I just want to verify whether all streams partitioned by a particular field go
to the same bolt.
Say I have a spout emitting two fields: name and id.
I want to check whether tuples with the same id go to the same bolt for
processing.
Is there any way apart from fields grouping to make this scenario work?

Thanks In Advance
Kirti


--
Twitter: @nathanmarz
http://nathanmarz.com


  • Nathan Marz at Jan 5, 2013 at 6:53 am
    Tuples sent between two tasks retain order, regardless of the stream. Bolts
    are inherently parallel, so any concept of order between bolts doesn't make
    any sense.
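
Since the count message (from Bolt A) and the completion messages (from Bolt B) reach Bolt C from different tasks, one option is to make Bolt C's bookkeeping indifferent to arrival order. Below is a minimal sketch of that idea in plain Java with no Storm dependency. The class CompletionTracker and its methods onCount and onCompletion are hypothetical names introduced for this example, and it assumes each completion message is delivered exactly once (replayed tuples would be double-counted).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for Bolt C: tracks per-listId progress without assuming
// that the count message arrives before the completion messages.
final class CompletionTracker {

    private static final class Progress {
        Integer expected; // null until the count message arrives
        int completed;    // completion messages seen so far
    }

    private final Map<String, Progress> byListId = new HashMap<>();

    // Record the expected count; returns true if the list is now complete.
    boolean onCount(String listId, int expected) {
        Progress p = byListId.computeIfAbsent(listId, k -> new Progress());
        p.expected = expected;
        return finishIfDone(listId, p);
    }

    // Record one completion message; returns true if the list is now complete.
    boolean onCompletion(String listId) {
        Progress p = byListId.computeIfAbsent(listId, k -> new Progress());
        p.completed++;
        return finishIfDone(listId, p);
    }

    // A list is complete once the count is known and enough completions arrived.
    private boolean finishIfDone(String listId, Progress p) {
        if (p.expected != null && p.completed >= p.expected) {
            byListId.remove(listId); // free the entry once results can be emitted
            return true;
        }
        return false;
    }
}
```

In a bolt, execute() could dispatch on the tuple's source stream id, call onCount or onCompletion accordingly, and emit results whenever either call returns true. Completions that arrive early are simply buffered as a running count until the expected count shows up.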
    In reply to Kurt Harriger, Thu, Jan 3, 2013 at 4:15 PM.

    --
    Twitter: @nathanmarz
    http://nathanmarz.com

Discussion Overview
group: storm-user
posted: Jan 5, '13 at 6:37a
active: Jan 5, '13 at 6:53a
posts: 2
users: 2
website: storm-project.net
irc: #storm-user
