FAQ
I think there is a race condition in the net/http package. The race is (in
transport.go) between the persistConn's readLoop and the client (whoever is
reading the response body). The following sequence of events will result in
this race condition:

- Thread A (persistConn's readLoop)
- Thread B (cilent)


1. Thread B reads the response of bodyEOFSignal till EOF, and enters the
if conditional on line 738 and calls "fn".
2. The call to fn publishes to the waitForBodyRead on line 603, which
causes Thread A to receive on line 627 and hence it continues to line 563.
3. lastBody.Close() is called by Thread A and we reach line 751. At this
point, es.fn is still not nil (because Thread B has not reached line 740).
So, we call fn again, which publishes to the "waitForBodyRead" channel
again. However, this blocks since there is no one waiting on that channel
anymore.


Here is my proposed fix (set to nil before calling fn):


--- .../go1.0.3/src/pkg/net/http/transport.go
@@ -14,8 +14,9 @@
panic("http: unexpected bodyEOFSignal Read after Close;
see issue 1725")
}
if err == io.EOF && es.fn != nil {
- es.fn()
+ callback := es.fn
es.fn = nil
+ callback()
}
return
}
@@ -27,8 +28,9 @@
es.isClosed = true
err = es.body.Close()
if err == nil && es.fn != nil {
- es.fn()
+ callback := es.fn
es.fn = nil
+ callback()
}
return
}

Alternatively, we could use locks around checking if es.fn is nil.

--

Search Discussions

  • Ancientlore at Oct 24, 2012 at 1:29 am
    I noticed a similar issue where I had hundreds of stuck goroutines at:

    goroutine 105 [chan receive]:
    net/http.(*persistConn).readLoop(0xf8404d5660, 0x0)
    c:/Go/src/pkg/net/http/transport.go:539 +0x271
    created by net/http.(*Transport).getConn
    c:/Go/src/pkg/net/http/transport.go:382 +0x5e2

    I am not sure if it's the same issue but maybe it is. I was trying to
    reduce it to a simple example that would reproduce the issue, but when I
    started yanking out code the problem went away. For me it only happened on
    Windows.

    Mike
    On Tuesday, October 23, 2012 5:26:41 PM UTC-4, duke....@gmail.com wrote:

    I think there is a race condition in the net/http package. The race is (in
    transport.go) between the persistConn's readLoop and the client (whoever is
    reading the response body). The following sequence of events will result in
    this race condition:

    - Thread A (persistConn's readLoop)
    - Thread B (cilent)


    1. Thread B reads the response of bodyEOFSignal till EOF, and enters
    the if conditional on line 738 and calls "fn".
    2. The call to fn publishes to the waitForBodyRead on line 603, which
    causes Thread A to receive on line 627 and hence it continues to line 563.
    3. lastBody.Close() is called by Thread A and we reach line 751. At
    this point, es.fn is still not nil (because Thread B has not reached line
    740). So, we call fn again, which publishes to the "waitForBodyRead"
    channel again. However, this blocks since there is no one waiting on that
    channel anymore.


    Here is my proposed fix (set to nil before calling fn):


    --- .../go1.0.3/src/pkg/net/http/transport.go
    @@ -14,8 +14,9 @@
    panic("http: unexpected bodyEOFSignal Read after Close;
    see issue 1725")
    }
    if err == io.EOF && es.fn != nil {
    - es.fn()
    + callback := es.fn
    es.fn = nil
    + callback()
    }
    return
    }
    @@ -27,8 +28,9 @@
    es.isClosed = true
    err = es.body.Close()
    if err == nil && es.fn != nil {
    - es.fn()
    + callback := es.fn
    es.fn = nil
    + callback()
    }
    return
    }

    Alternatively, we could use locks around checking if es.fn is nil.
    --
  • Dmitry Vyukov at Oct 24, 2012 at 8:38 am
    Perhaps it's a consequence of
    http://code.google.com/p/go/issues/detail?id=3861

    On Wednesday, October 24, 2012 5:29:15 AM UTC+4, ancientlore wrote:

    I noticed a similar issue where I had hundreds of stuck goroutines at:

    goroutine 105 [chan receive]:
    net/http.(*persistConn).readLoop(0xf8404d5660, 0x0)
    c:/Go/src/pkg/net/http/transport.go:539 +0x271
    created by net/http.(*Transport).getConn
    c:/Go/src/pkg/net/http/transport.go:382 +0x5e2

    I am not sure if it's the same issue but maybe it is. I was trying to
    reduce it to a simple example that would reproduce the issue, but when I
    started yanking out code the problem went away. For me it only happened on
    Windows.

    Mike
    On Tuesday, October 23, 2012 5:26:41 PM UTC-4, duke....@gmail.com wrote:

    I think there is a race condition in the net/http package. The race is
    (in transport.go) between the persistConn's readLoop and the client
    (whoever is reading the response body). The following sequence of events
    will result in this race condition:

    - Thread A (persistConn's readLoop)
    - Thread B (cilent)


    1. Thread B reads the response of bodyEOFSignal till EOF, and enters
    the if conditional on line 738 and calls "fn".
    2. The call to fn publishes to the waitForBodyRead on line 603, which
    causes Thread A to receive on line 627 and hence it continues to line 563.
    3. lastBody.Close() is called by Thread A and we reach line 751. At
    this point, es.fn is still not nil (because Thread B has not reached line
    740). So, we call fn again, which publishes to the "waitForBodyRead"
    channel again. However, this blocks since there is no one waiting on that
    channel anymore.


    Here is my proposed fix (set to nil before calling fn):


    --- .../go1.0.3/src/pkg/net/http/transport.go
    @@ -14,8 +14,9 @@
    panic("http: unexpected bodyEOFSignal Read after Close;
    see issue 1725")
    }
    if err == io.EOF && es.fn != nil {
    - es.fn()
    + callback := es.fn
    es.fn = nil
    + callback()
    }
    return
    }
    @@ -27,8 +28,9 @@
    es.isClosed = true
    err = es.body.Close()
    if err == nil && es.fn != nil {
    - es.fn()
    + callback := es.fn
    es.fn = nil
    + callback()
    }
    return
    }

    Alternatively, we could use locks around checking if es.fn is nil.
    --
  • Kar at Oct 30, 2012 at 3:03 am
    Same issue here. As you can see, i've had thousands of failed goroutines
    (go 1.0.3) on one of my production server.

    goroutine 3997 [chan receive]:
    net/http.(*persistConn).readLoop(0xf84426a3f0, 0x0)
    /home/kar/go/src/pkg/net/http/transport.go:627 +0x556
    created by net/http.(*Transport).getConn
    /home/kar/go/src/pkg/net/http/transport.go:399 +0x60f

    --
  • Kevin Gillette at Oct 30, 2012 at 10:09 am
    I think it would need to be:

    --- .../go1.0.3/src/pkg/net/http/transport.go
    @@ -14,8 +14,9 @@
    panic("http: unexpected bodyEOFSignal Read after Close;
    see issue 1725")
    }
    + callback := es.fn
    if err == io.EOF && callback != nil {
    - es.fn()
    es.fn = nil
    + callback()
    }
    return
    }
    @@ -27,8 +28,9 @@
    es.isClosed = true
    err = es.body.Close()
    + callback := es.fn
    if err == nil && es.fn != nil {
    - es.fn()
    es.fn = nil
    + callback()
    }
    return
    }

    If you set callback inside the conditional block, it's possible that
    something else could have set es.fn to nil between the nil check and the
    callback assignment, in which case callback would have had the value nil,
    and a panic would have occurred when trying to call the function.
    On Tuesday, October 23, 2012 3:26:41 PM UTC-6, duke....@gmail.com wrote:

    I think there is a race condition in the net/http package. The race is (in
    transport.go) between the persistConn's readLoop and the client (whoever is
    reading the response body). The following sequence of events will result in
    this race condition:

    - Thread A (persistConn's readLoop)
    - Thread B (cilent)


    1. Thread B reads the response of bodyEOFSignal till EOF, and enters
    the if conditional on line 738 and calls "fn".
    2. The call to fn publishes to the waitForBodyRead on line 603, which
    causes Thread A to receive on line 627 and hence it continues to line 563.
    3. lastBody.Close() is called by Thread A and we reach line 751. At
    this point, es.fn is still not nil (because Thread B has not reached line
    740). So, we call fn again, which publishes to the "waitForBodyRead"
    channel again. However, this blocks since there is no one waiting on that
    channel anymore.


    Here is my proposed fix (set to nil before calling fn):


    --- .../go1.0.3/src/pkg/net/http/transport.go
    @@ -14,8 +14,9 @@
    panic("http: unexpected bodyEOFSignal Read after Close;
    see issue 1725")
    }
    if err == io.EOF && es.fn != nil {
    - es.fn()
    + callback := es.fn
    es.fn = nil
    + callback()
    }
    return
    }
    @@ -27,8 +28,9 @@
    es.isClosed = true
    err = es.body.Close()
    if err == nil && es.fn != nil {
    - es.fn()
    + callback := es.fn
    es.fn = nil
    + callback()
    }
    return
    }

    Alternatively, we could use locks around checking if es.fn is nil.
    --
  • Kevin Gillette at Oct 30, 2012 at 10:12 am
    Though I think without a lock or explicit synchronization, in any of these
    cases the callback could be run twice.

    --
  • Duke Gemini at Oct 30, 2012 at 8:54 pm
    As mentioned I have not been running on the latest (bleeding edge) revision
    of go, but rather go1.0.3

    This commit<http://code.google.com/p/go/source/detail?r=81f45da9df0d366ea7c592e8697f8b82c1546fd2&path=/src/pkg/net/http/transport.go>(made July 11th 2012) seems to fix the problem I originally mentioned.

    Dave, your commit<https://code.google.com/p/go/source/detail?r=cb38e940c38a> (made
    October 11th 2012) also may have been causing issues though it did not
    cause us an immediate problem.

    What is the release schedule for those commits? Will they be included in
    go1.0.4?
    On Tuesday, 30 October 2012 05:12:49 UTC-5, Kevin Gillette wrote:

    Though I think without a lock or explicit synchronization, in any of these
    cases the callback could be run twice.
    --
  • Akale2408 at Dec 10, 2012 at 8:38 pm
    I've run into the same issue when using lot's of threads. How quickly can
    we expect the fix to be included in a major release?
    On Tuesday, October 30, 2012 2:52:52 PM UTC-5, duke....@gmail.com wrote:

    As mentioned I have not been running on the latest (bleeding edge)
    revision of go, but rather go1.0.3

    This commit<http://code.google.com/p/go/source/detail?r=81f45da9df0d366ea7c592e8697f8b82c1546fd2&path=/src/pkg/net/http/transport.go>(made July 11th 2012) seems to fix the problem I originally mentioned.

    Dave, your commit<https://code.google.com/p/go/source/detail?r=cb38e940c38a> (made
    October 11th 2012) also may have been causing issues though it did not
    cause us an immediate problem.

    What is the release schedule for those commits? Will they be included in
    go1.0.4?
    On Tuesday, 30 October 2012 05:12:49 UTC-5, Kevin Gillette wrote:

    Though I think without a lock or explicit synchronization, in any of
    these cases the callback could be run twice.
    --
  • Bryanturley at Dec 10, 2012 at 8:49 pm

    On Monday, December 10, 2012 2:30:34 PM UTC-6, akal...@gmail.com wrote:
    I've run into the same issue when using lot's of threads. How quickly can
    we expect the fix to be included in a major release?
    I think in october people were targetting jan/feb but that was 2 months ago
    so who knows now?

    http://swtch.com/~rsc/go11.html doesn't seem to be falling at a very fast
    rate.

    --
  • Dave Cheney at Oct 30, 2012 at 10:18 am
    Possibly related, https://code.google.com/p/go/source/detail?r=cb38e940c38a

    On Tue, Oct 30, 2012 at 9:09 PM, Kevin Gillette
    wrote:
    I think it would need to be:

    --- .../go1.0.3/src/pkg/net/http/transport.go
    @@ -14,8 +14,9 @@
    panic("http: unexpected bodyEOFSignal Read after Close; see
    issue 1725")
    }
    + callback := es.fn
    if err == io.EOF && callback != nil {
    - es.fn()

    es.fn = nil
    + callback()
    }
    return
    }
    @@ -27,8 +28,9 @@
    es.isClosed = true
    err = es.body.Close()
    + callback := es.fn

    if err == nil && es.fn != nil {
    - es.fn()
    es.fn = nil
    + callback()
    }
    return
    }


    If you set callback inside the conditional block, it's possible that
    something else could have set es.fn to nil between the nil check and the
    callback assignment, in which case callback would have had the value nil,
    and a panic would have occurred when trying to call the function.
    On Tuesday, October 23, 2012 3:26:41 PM UTC-6, duke....@gmail.com wrote:

    I think there is a race condition in the net/http package. The race is (in
    transport.go) between the persistConn's readLoop and the client (whoever is
    reading the response body). The following sequence of events will result in
    this race condition:

    Thread A (persistConn's readLoop)
    Thread B (cilent)

    Thread B reads the response of bodyEOFSignal till EOF, and enters the if
    conditional on line 738 and calls "fn".
    The call to fn publishes to the waitForBodyRead on line 603, which causes
    Thread A to receive on line 627 and hence it continues to line 563.
    lastBody.Close() is called by Thread A and we reach line 751. At this
    point, es.fn is still not nil (because Thread B has not reached line 740).
    So, we call fn again, which publishes to the "waitForBodyRead" channel
    again. However, this blocks since there is no one waiting on that channel
    anymore.


    Here is my proposed fix (set to nil before calling fn):


    --- .../go1.0.3/src/pkg/net/http/transport.go
    @@ -14,8 +14,9 @@
    panic("http: unexpected bodyEOFSignal Read after Close;
    see issue 1725")
    }
    if err == io.EOF && es.fn != nil {
    - es.fn()
    + callback := es.fn
    es.fn = nil
    + callback()
    }
    return
    }
    @@ -27,8 +28,9 @@
    es.isClosed = true
    err = es.body.Close()
    if err == nil && es.fn != nil {
    - es.fn()
    + callback := es.fn
    es.fn = nil
    + callback()
    }
    return
    }

    Alternatively, we could use locks around checking if es.fn is nil.
    --
    --

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupgolang-nuts @
categoriesgo
postedOct 23, '12 at 9:29p
activeDec 10, '12 at 8:49p
posts10
users8
websitegolang.org

People

Translate

site design / logo © 2022 Grokbase