We are seeing the same problem with Storm 0.8.2-wip20 and ZooKeeper
3.3.3 on the classpath.

Julien
On Tuesday, December 4, 2012 4:20:16 AM UTC-5, Philippe Guillebert wrote:

Hi list,

We've got the same problem here with vanilla Storm 0.8.1: sometimes the
supervisor dies with the error "stormconf.ser does not exist".

I see you are using zookeeper 3.3.6, as we do.

Somebody apparently fixed this issue earlier by switching to ZooKeeper 3.3.3
(the same version that is included in Storm):
https://groups.google.com/d/msg/storm-user/DcX79-qoSck/gH8s_j8rSYoJ

Nathan, would it make sense to revert to 3.3.3?


Philippe


On Dec 3, 2012 at 18:59, "Dane Hammer" <dane.m...@gmail.com> wrote:
I cherry-picked the commit that fixed this [1], applying it on top of the
0.8.1 release. Would you expect that to work? I'm still seeing this issue.

[1]
https://github.com/nathanmarz/storm/commit/54fc737a207cffa8d5013e58ed0a2778cd4ec170
On Monday, October 1, 2012 4:10:26 PM UTC-5, nathanmarz wrote:

Looks like this was a race condition in the supervisor. I pushed a fix
to master that will be available in the next release. You can fix the
supervisor in your current version by clearing out its local dir.
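
As a rough illustration of "clearing out its local dir", here is a minimal Python sketch. It assumes the supervisor's local state is the `supervisor` directory under the Storm local dir (as the path `/var/data/storm/supervisor/...` in the stack trace below suggests) and that the supervisor process has been stopped first. The function name and the timestamped backup naming are hypothetical. It renames the directory instead of deleting it, so the old state can still be inspected afterwards.

```python
import shutil
import time
from pathlib import Path


def move_supervisor_state_aside(local_dir, now=None):
    """Rename <local_dir>/supervisor to a timestamped backup directory.

    Returns the backup path, or None if there is no supervisor directory.
    Assumption: the supervisor daemon is NOT running while this is called.
    """
    supervisor_dir = Path(local_dir) / "supervisor"
    if not supervisor_dir.is_dir():
        return None

    # Hypothetical naming scheme for the backup, e.g. supervisor.bak-20121001-104308
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    backup = supervisor_dir.with_name(f"supervisor.bak-{stamp}")
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")

    # Only the supervisor subdirectory is touched; sibling directories
    # (for example Nimbus state on a shared node) are left alone.
    shutil.move(str(supervisor_dir), str(backup))
    return backup
```

With the layout from this thread it would be called as `move_supervisor_state_aside("/var/data/storm")` before restarting the supervisor. Whether other directories also need clearing is not stated in the thread.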
On Mon, Oct 1, 2012 at 11:54 AM, phlp wrote:

I am seeing the same problem that Marc Sturlese reported on Aug 20.

2012-10-01 10:43:08,531 ERROR [backtype.storm.event] Error when processing event (Thread-3:)
java.io.FileNotFoundException: File '/var/data/storm/supervisor/stormdist/ISM_TerminalIE-3-1348609873/stormconf.ser' does not exist
    at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:137)
    at org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1135)
    at backtype.storm.config$read_supervisor_storm_conf.invoke(config.clj:138)
    at backtype.storm.daemon.supervisor$fn__4052.invoke(supervisor.clj:394)
    at clojure.lang.MultiFn.invoke(MultiFn.java:177)
    at backtype.storm.daemon.supervisor$sync_processes$iter__3946__3950$fn__3951.invoke(supervisor.clj:247)
    at clojure.lang.LazySeq.sval(LazySeq.java:42)
    at clojure.lang.LazySeq.seq(LazySeq.java:60)
    at clojure.lang.RT.seq(RT.java:473)
    at clojure.core$seq.invoke(core.clj:133)
    at clojure.core$dorun.invoke(core.clj:2725)
    at clojure.core$doall.invoke(core.clj:2741)
    at backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:235)
    at clojure.lang.AFn.applyToHelper(AFn.java:161)
    at clojure.lang.AFn.applyTo(AFn.java:151)
    at clojure.core$apply.invoke(core.clj:603)
    at clojure.core$partial$fn__4070.doInvoke(core.clj:2343)
    at clojure.lang.RestFn.invoke(RestFn.java:397)
    at backtype.storm.event$event_manager$fn__2173.invoke(event.clj:24)
    at clojure.lang.AFn.run(AFn.java:24)
    at java.lang.Thread.run(Thread.java:619)
2012-10-01 10:43:08,547 INFO [backtype.storm.util] Halting process: ("Error when processing an event") (Thread-3:)

ISM_TerminalIE-3-1348609873 is a prior instance of the topology - the
current instance is ISM_TerminalIE-7-1348610442 and is active in the
cluster. Unlike Marc's case, the problem does not resolve itself and the
supervisor keeps failing on startup.

The configuration is a 2-node cluster with Nimbus, the UI and the failing
supervisor on one system (.210.138) and another supervisor on another
system (.210.139).

The problem is on a QA test system and I don't know when or how the
problem began -- I was reviewing the system for a separate issue when I
noticed this problem. I do know that the system ran out of disk space
during the evening prior to my discovering the problem, but I don't know if
that's at all related.

I'm running on CentOS 5.5 with Storm 0.7.4 and ZooKeeper 3.3.6.

I've restarted the ZooKeeper nodes and Nimbus, with no change.

I've attached a log of the supervisor startup plus the config files for
ZooKeeper and Storm on both systems.

The problem seems to occur with the following log entry. I can't find
any reference to ISM_TerminalIE-3-1348609873 in the storm data directory.

2012-10-01 10:43:08,462 DEBUG [backtype.storm.daemon.supervisor] Assigned tasks: {6702 #backtype.storm.daemon.supervisor.LocalAssignment{:storm-id "ISM_TerminalIE-3-1348609873",
:task-ids (32 64 96 128 1 33 65 97 129 2 34 66 98 130 3 35 67 99 131 4 36
68 100 132 5 37 69 101 133 6 38 70 102 134 7 39 71 103 135 8 40 72 104 136
9 41 73 105 137 10 42 74 106 138 11 43 75 107 139 12 44 76 108 140 13 45 77
109 141 14 46 78 110 142 15 47 79 111 143 16 48 80 112 144 17 49 81 113 145
18 50 82 114 146 19 51 83 115 147 20 52 84 116 148 21 53 85 117 149 22 54
86 118 150 23 55 87 119 151 24 56 88 120 152 25 57 89 121 26 58 90 122 27
59 91 123 28 60 92 124 29 61 93 125 30 62 94 126 31 63 95 127)}}
(Thread-3:)
2012-10-01 10:43:08,462 DEBUG [backtype.storm.daemon.supervisor] Allocated: {"56a8313a-3ac2-46bc-a78a-a37f04bb32a6" [:disallowed nil]} (Thread-3:)
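
The symptom described here (a storm-id in "Assigned tasks" with no matching `stormconf.ser` on disk) can be checked by hand. Below is a minimal Python sketch, assuming the directory layout visible in the stack trace, `<supervisor dir>/stormdist/<storm-id>/stormconf.ser`. The function name is hypothetical, and the storm-ids have to be copied manually from the supervisor log.

```python
from pathlib import Path


def missing_storm_conf(supervisor_dir, storm_ids):
    """Return the storm-ids (in input order) that have no stormconf.ser.

    Layout assumed from the stack trace in this thread:
    <supervisor_dir>/stormdist/<storm-id>/stormconf.ser
    """
    stormdist = Path(supervisor_dir) / "stormdist"
    return [
        storm_id
        for storm_id in storm_ids
        # A missing topology directory counts as missing, too.
        if not (stormdist / storm_id / "stormconf.ser").is_file()
    ]
```

For the case above, the call would be `missing_storm_conf("/var/data/storm/supervisor", ["ISM_TerminalIE-3-1348609873"])`. A non-empty result means the supervisor has been assigned a topology whose files it does not have locally, which matches the FileNotFoundException shown earlier.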

--
Twitter: @nathanmarz
http://nathanmarz.com

Discussion Overview
group: storm-user
posted: Jan 8, '13 at 2:33p
active: Jan 8, '13 at 2:33p
posts: 1
users: 1
website: storm-project.net
irc: #storm-user

1 user in discussion

Julien Letrouit: 1 post
