Hi,

On 0.22.0 we sometimes see a shuffle phase get stuck, yet the framework
does not kill it for lack of progress. The reducer's tasktracker log
keeps filling up with the same two exceptions all night long:


2011-12-20 06:25:03,711 WARN org.mortbay.log: Committed before 410 getMapOutputs(attempt_201112191334_0039_m_000270_0,attempt_201112191334_0039_m_000264_0,attempt_201112191334_0039_m_000233_0,attempt_201112191334_0039_m_000266_0,attempt_201112191334_0039_m_000231_0,attempt_201112191334_0039_m_000228_0,attempt_201112191334_0039_m_000234_0,attempt_201112191334_0039_m_000309_0,attempt_201112191334_0039_m_000265_0,attempt_201112191334_0039_m_000271_0,attempt_201112191334_0039_m_000268_0,6) failed
2011-12-20 06:25:03,711 ERROR org.mortbay.log: /mapOutput
java.lang.IllegalStateException: Committed
at org.mortbay.jetty.Response.resetBuffer(Response.java:1023)
at org.mortbay.jetty.Response.sendError(Response.java:240)
at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:3683)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:874)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
2011-12-20 06:25:03,711 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutputs(attempt_201112191334_0039_m_000264_0,attempt_201112191334_0039_m_000270_0,attempt_201112191334_0039_m_000233_0,attempt_201112191334_0039_m_000266_0,attempt_201112191334_0039_m_000234_0,attempt_201112191334_0039_m_000228_0,attempt_201112191334_0039_m_000231_0,attempt_201112191334_0039_m_000309_0,attempt_201112191334_0039_m_000271_0,attempt_201112191334_0039_m_000265_0,attempt_201112191334_0039_m_000268_0,6) failed
java.io.IOException: error on sending map attempt_201112191334_0039_m_000264_0 to reduce 6
at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.sendMapFile(TaskTracker.java:3815)
at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:3675)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:874)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.mortbay.jetty.EofException
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791)
at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:569)
at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1012)
at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:651)
at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:580)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.sendMapFile(TaskTracker.java:3785)
... 22 more
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:72)
at sun.nio.ch.IOUtil.write(IOUtil.java:43)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:170)
at org.mortbay.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:221)
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:725)
... 28 more


Any thoughts? Each node is configured with a limit of 16k open files.
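
To check whether the failures concentrate on a single reducer or on a few map outputs, something along these lines could tally the IOException messages. This is only a rough sketch: the regular expression is derived from the message text in the log above, the function name `summarize_send_errors` is made up for illustration, and it assumes each message sits on one line in the actual tasktracker log file (the copy above was wrapped by the mail client).

```python
import re
import sys
from collections import Counter

# Pattern derived from the "error on sending map ... to reduce N" message
# in the pasted log; adjust it if the real log line differs.
SEND_ERROR = re.compile(r"error on sending map (attempt_\w+)\s+to reduce (\d+)")


def summarize_send_errors(lines):
    """Count send failures per (map attempt, reduce partition) pair."""
    counts = Counter()
    for line in lines:
        match = SEND_ERROR.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is passed on the command line; no default is assumed.
    with open(sys.argv[1], errors="replace") as log:
        ranked = summarize_send_errors(log).most_common(20)
    for (attempt, reduce_id), n in ranked:
        print(f"{n:6d}  {attempt} -> reduce {reduce_id}")
```

If one reduce partition dominates the counts, that would point at the reducer side repeatedly dropping its connection rather than at a particular map output.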

group: mapreduce-user
category: hadoop
posted: Dec 20, '11 at 7:18a
posts: 1 (Markus Jelsma)