Ryan LeCompte |
at Feb 6, 2013 at 3:57 am
You usually solve this problem by having your traffic first hit a persistent queue such as Kafka or Kestrel. Your spout then consumes data from the queue. While Storm is down, data queues up in Kafka. When Storm comes back up, the spout just keeps consuming from the last offset it had reached, so nothing that arrived during the outage is lost.
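The idea can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: `DurableLog`, `OffsetTrackingConsumer`, and `simulate_outage` are hypothetical names invented for illustration, not Kafka or Storm APIs, and the in-memory list merely stands in for a persistent queue. The 30,000 figure is taken from the question.

```python
class DurableLog:
    """Hypothetical stand-in for a persistent queue (e.g. one Kafka partition)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        # Producers keep appending whether or not any consumer is running.
        self._records.append(record)

    def read_from(self, offset, max_records=100):
        # Return up to max_records records starting at the given offset.
        return self._records[offset:offset + max_records]


class OffsetTrackingConsumer:
    """Hypothetical stand-in for a spout that remembers how far it has read."""

    def __init__(self, log):
        self.log = log
        self.committed_offset = 0  # assumption: a real system stores this durably
        self.click_count = 0

    def poll(self):
        # Read the next batch, process it, then advance the offset.
        batch = self.log.read_from(self.committed_offset)
        self.click_count += len(batch)
        self.committed_offset += len(batch)
        return len(batch)

    def drain(self):
        # Keep polling until the consumer has caught up with the log.
        while self.poll():
            pass


def simulate_outage():
    log = DurableLog()
    consumer = OffsetTrackingConsumer(log)

    # Normal operation: clicks arrive and are consumed.
    for i in range(1000):
        log.append({"click_id": i})
    consumer.drain()

    # Outage: the consumer is down, but clicks still land in the log.
    for i in range(1000, 31000):
        log.append({"click_id": i})

    # Recovery: the consumer resumes from its last offset and catches up.
    consumer.drain()
    return consumer.click_count, consumer.committed_offset


if __name__ == "__main__":
    print(simulate_outage())
```

In this sketch the offset lives in memory only. For the approach to survive a real restart, the offset would have to be stored somewhere durable, and the queue would need to retain data for longer than the outage lasts.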
Ryan
On Feb 5, 2013, at 8:59 AM, Artem Kalinchuk wrote:
Since Twitter calculates its clicks/tweets metrics in real time using Storm, what would happen if Storm went offline for about 2 hours during the day? Would the data be out of sync? Let's say there were 30,000 clicks during those 2 hours; the metrics at the end of the day would then be missing 30,000 clicks from their calculations. How does Twitter solve this problem?
--
You received this message because you are subscribed to the Google Groups "storm-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to storm-user+unsubscribe@googlegroups.com.
For more options, visit
https://groups.google.com/groups/opt_out.