FAQ
We have a test cluster running 0.8 that is not behaving properly. It is almost continuously spewing the following exception into its log:


2013-03-07 23:44:17,532 ERROR kafka.network.Processor: Closing socket for /10.10.2.123 because of error
java.io.IOException: Resource temporarily unavailable
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
at kafka.log.FileMessageSet.writeTo(FileMessageSet.scala:133)
at kafka.api.PartitionDataSend.writeTo(FetchResponse.scala:73)
at kafka.network.MultiSend.writeTo(Transmission.scala:94)
at kafka.network.Send$class.writeCompletely(Transmission.scala:75)
at kafka.network.MultiSend.writeCompletely(Transmission.scala:87)
at kafka.api.TopicDataSend.writeTo(FetchResponse.scala:128)
at kafka.network.MultiSend.writeTo(Transmission.scala:94)
at kafka.network.Send$class.writeCompletely(Transmission.scala:75)
at kafka.network.MultiSend.writeCompletely(Transmission.scala:87)
at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:223)
at kafka.network.Processor.write(SocketServer.scala:318)
at kafka.network.Processor.run(SocketServer.scala:211)
at java.lang.Thread.run(Thread.java:619)

And our consumer is reporting the following:


2013-03-07 23:46:09,736 INFO kafka.consumer.SimpleConsumer: Reconnect due to socket error:
java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
at kafka.utils.Utils$.read(Utils.scala:373)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:67)
at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:124)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:122)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:161)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:161)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:161)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:160)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:160)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:160)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:159)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:93)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)
2013-03-07 23:46:09,740 INFO kafka.consumer.ConsumerFetcherManager: [ConsumerFetcherManager-1362697806347] removing fetcher on topic VTFull-enriched, partition 0

We have several other environments running the same code without error.

This is a CentOS server issuing these log errors.

We have both Ubuntu and CentOS environments working.

The only thing that seems weird is that this environment was brought up with replication factor 2 even though there is only one broker (we have both 1- and 2-broker clusters working fine). We have since purged all data and ZooKeeper nodes and restarted the cluster with clean data, and this problem is still happening.

We have one process writing data into the VTFull-enriched topic and have 14,000 messages in that topic (only one partition).

The consumer is trying to read from message 0 and is hitting this EOFException right away. The app is not reading any messages at all.

Any ideas on what to do?

Thanks,
Bob Jervis

  • Neha Narkhede at Mar 8, 2013 at 12:34 am
    Bob,

    It seems you are probably reaching the limit of open files on that box. In
    Kafka 0.8, we keep the file handles for all segment files open until they
    are garbage collected. Depending on the size of your cluster, this number
    can be pretty big, a few tens of thousands or so.

    Thanks,
    Neha
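    If the descriptor limit is the culprit, the broker host's soft "open
    files" limit will sit close to the number of descriptors the process
    holds. A minimal check on Linux (a sketch: it inspects the current
    shell via $$, so substitute the broker's PID in practice):

```shell
#!/bin/sh
# Soft limit on open files for processes started from this shell.
ulimit -n

# Limits of an already-running process (here: this shell, via $$);
# for the broker, replace $$ with the broker's PID.
grep "open files" /proc/$$/limits

# Number of descriptors the process currently holds.
ls /proc/$$/fd | wc -l
```

    Raising the limit (e.g. `ulimit -n 65536` in the broker's start script,
    or via /etc/security/limits.conf) is the usual remedy.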


  • Jun Rao at Mar 8, 2013 at 6:13 am
    Could this be caused by this bug
    (http://bugs.sun.com/view_bug.do?bug_id=5103988) in Java 5?
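    A quick way to rule that out is to confirm which JVM the broker host is
    running, since the bug report above was filed against Java 5 (1.5.x):

```shell
# Print the JVM version on the broker host; a 1.6+ (Java 6 or later)
# version would rule out a Java 5-specific bug.
java -version 2>&1 | head -n 1
```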

    Thanks,

    Jun


Discussion Overview
group: users
categories: kafka
posted: Mar 7, 2013 at 11:58pm
active: Mar 8, 2013 at 6:13am
posts: 3
users: 3
website: kafka.apache.org
irc: #kafka
