One of our production services went down occasionally during the past few
days. After digging into the goroutines' stacks, I'm afraid it's related to
the DNS resolution process. Briefly, all DNS lookups are wrapped in a
singleflight call, and if one lookup fails to return in time, all other
requests for the same host will wait forever.
Here is a partial output:

     500 @ 0x4103d5 0x5ab606 0x5a03a1 0x5a0b69 0x591416 0x5a3b64 0x594db2 0x5907de 0x590abb 0x58f08a 0x573605 0x57382f 0x5736f4 0x477bd4 0x47812b 0x47ca61 0x41fb60
#▸ 0x5ab606▸ net._C2func_getaddrinfo+0x36▸ ▸ net/_obj/_cgo_defun.c:52
#▸ 0x5a03a1▸ net.cgoLookupIPCNAME+0x1e1▸ ▸ /usr/local/go/src/pkg/net/cgo_unix.go:96
#▸ 0x5a0b69▸ net.cgoLookupIP+0x69▸ ▸ ▸ /usr/local/go/src/pkg/net/cgo_unix.go:148
#▸ 0x591416▸ net.lookupIP+0x66▸ ▸ ▸ /usr/local/go/src/pkg/net/lookup_unix.go:64
#▸ 0x5a3b64▸ net.func·024+0x54▸ ▸ ▸ /usr/local/go/src/pkg/net/lookup.go:41
#▸ 0x594db2▸ net.(*singleflight).Do+0x232▸ ▸ /usr/local/go/src/pkg/net/singleflight.go:45
#▸ 0x5907de▸ net.lookupIPMerge+0xae▸ ▸ ▸ /usr/local/go/src/pkg/net/lookup.go:42
#▸ 0x590abb▸ net.lookupIPDeadline+0x12b▸ ▸ /usr/local/go/src/pkg/net/lookup.go:57
#▸ 0x58f08a▸ net.resolveInternetAddr+0x49a▸ ▸ /usr/local/go/src/pkg/net/ipsock.go:285
#▸ 0x573605▸ net.resolveAddr+0x385▸ ▸ ▸ /usr/local/go/src/pkg/net/dial.go:110
#▸ 0x57382f▸ net.(*Dialer).Dial+0xff▸▸ ▸ /usr/local/go/src/pkg/net/dial.go:159
#▸ 0x5736f4▸ net.Dial+0xa4▸ ▸ ▸ ▸ /usr/local/go/src/pkg/net/dial.go:144
#▸ 0x477bd4▸ net/http.(*Transport).dial+0xd4▸▸ /usr/local/go/src/pkg/net/http/transport.go:444
#▸ 0x47812b▸ net/http.(*Transport).dialConn+0x9b▸ /usr/local/go/src/pkg/net/http/transport.go:496
#▸ 0x47ca61▸ net/http.func·018+0x41▸ ▸ ▸ /usr/local/go/src/pkg/net/http/transport.go:472

60457 @ 0x41f8c9 0x41f94b 0x43462e 0x434870 0x503afb 0x594ca7 0x5907de 0x590abb 0x58f08a 0x573605 0x57382f 0x5736f4 0x477bd4 0x47812b 0x47ca61 0x41fb60
#▸ 0x434870▸ sync.runtime_Semacquire+0x30▸ ▸ /usr/local/go/src/pkg/runtime/sema.goc:199
#▸ 0x503afb▸ sync.(*WaitGroup).Wait+0x14b▸ ▸ /usr/local/go/src/pkg/sync/waitgroup.go:129
#▸ 0x594ca7▸ net.(*singleflight).Do+0x127▸ ▸ /usr/local/go/src/pkg/net/singleflight.go:37
#▸ 0x5907de▸ net.lookupIPMerge+0xae▸ ▸ ▸ /usr/local/go/src/pkg/net/lookup.go:42
#▸ 0x590abb▸ net.lookupIPDeadline+0x12b▸ ▸ /usr/local/go/src/pkg/net/lookup.go:57
#▸ 0x58f08a▸ net.resolveInternetAddr+0x49a▸ ▸ /usr/local/go/src/pkg/net/ipsock.go:285
#▸ 0x573605▸ net.resolveAddr+0x385▸ ▸ ▸ /usr/local/go/src/pkg/net/dial.go:110
#▸ 0x57382f▸ net.(*Dialer).Dial+0xff▸▸ ▸ /usr/local/go/src/pkg/net/dial.go:159
#▸ 0x5736f4▸ net.Dial+0xa4▸ ▸ ▸ ▸ /usr/local/go/src/pkg/net/dial.go:144
#▸ 0x477bd4▸ net/http.(*Transport).dial+0xd4▸▸ /usr/local/go/src/pkg/net/http/transport.go:444
#▸ 0x47812b▸ net/http.(*Transport).dialConn+0x9b▸ /usr/local/go/src/pkg/net/http/transport.go:496
#▸ 0x47ca61▸ net/http.func·018+0x41▸ ▸ ▸ /usr/local/go/src/pkg/net/http/transport.go:472

Most of these goroutines have been waiting for more than 3 hours.
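
To make the pattern in the two stacks concrete, here is a minimal sketch of a singleflight-style group. It is a simplified illustration, not the net package's actual code: the `group` and `call` types, the `stuckFollowers` helper, the host name example.com and the address 192.0.2.1 are all made up for the example. One "leader" runs the lookup while every other caller for the same key parks in `WaitGroup.Wait`, so a leader that never returns leaves all of them waiting.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// call is one in-flight invocation for a key.
type call struct {
	wg  sync.WaitGroup
	val string
}

// group deduplicates concurrent calls that share a key.
type group struct {
	mu sync.Mutex
	m  map[string]*call
}

// Do runs fn once per key at a time. Concurrent callers for the same key
// block on the WaitGroup until the first caller's fn returns.
func (g *group) Do(key string, fn func() string) string {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		g.mu.Unlock()
		c.wg.Wait() // followers park here, with no timeout of their own
		return c.val
	}
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val = fn() // the leader; if this never returns, nobody is released
	c.wg.Done()

	g.mu.Lock()
	delete(g.m, key)
	g.mu.Unlock()
	return c.val
}

// stuckFollowers starts one leader whose lookup blocks until it is released,
// then n followers for the same key, and reports how many followers are
// still waiting after d.
func stuckFollowers(n int, d time.Duration) int {
	var g group
	release := make(chan struct{})
	started := make(chan struct{})
	go g.Do("example.com", func() string {
		close(started)
		<-release // stands in for a getaddrinfo call that hangs
		return "192.0.2.1"
	})
	<-started // the leader is now registered in the map

	done := make(chan struct{}, n)
	for i := 0; i < n; i++ {
		go func() {
			g.Do("example.com", func() string { return "unused" })
			done <- struct{}{}
		}()
	}
	time.Sleep(d)
	stuck := n - len(done)
	close(release) // let the leader finish so the goroutines can exit
	return stuck
}

func main() {
	fmt.Println("followers still waiting:", stuckFollowers(5, 100*time.Millisecond))
}
```

In this sketch the followers have no way to give up on their own, which matches the shape of the second stack: many goroutines in `sync.(*WaitGroup).Wait` behind a few in `getaddrinfo`.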

I've gone through the closed issues and found no similar reports, so I'm
not sure if this is a bug. It seems that setting a timeout on the transport
used by the http client can fix this, but I think giving DNS lookups a small
default timeout, say 5 seconds, is a better option than setting an overall
timeout on the http client's transport.
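
For reference, the two kinds of timeout being compared could be set up roughly as follows. This is a sketch written against a current Go release, where `http.Transport` has a `DialContext` field; on go1.3 the corresponding field would be `Transport.Dial`, set from the dialer's `Dial` method. The helper names `newTransport` and `newClient` and the 5s/30s values are only illustrative. As far as I understand, the dialer's timeout bounds the whole dial, name lookup included, but this sketch does not establish whether it would have avoided the hang on go1.3.3.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newTransport returns a transport whose dials are bounded by dialTimeout.
// The bound applies to each new connection attempt, not to whole requests.
func newTransport(dialTimeout time.Duration) *http.Transport {
	d := &net.Dialer{Timeout: dialTimeout}
	return &http.Transport{DialContext: d.DialContext}
}

// newClient combines a per-dial timeout with an overall per-request timeout.
// Client.Timeout covers the entire exchange, including reading the body.
func newClient(dialTimeout, overall time.Duration) *http.Client {
	return &http.Client{
		Transport: newTransport(dialTimeout),
		Timeout:   overall,
	}
}

func main() {
	c := newClient(5*time.Second, 30*time.Second)
	fmt.Println("overall request timeout:", c.Timeout)
}
```

The dial timeout is the narrower of the two: it limits connection setup without capping how long a slow but healthy response may take.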

go version
go version go1.3.3 linux/amd64


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

  • Ian Lance Taylor at Apr 1, 2015 at 3:36 pm

    On Tue, Mar 31, 2015 at 9:33 PM, wrote:
    One of our production services went down occasionally during the past few
    days. After digging into the goroutines' stacks, I'm afraid it's related to
    the DNS resolution process. Briefly, all DNS lookups are wrapped in a
    singleflight call, and if one lookup fails to return in time, all other
    requests for the same host will wait forever.

    This looks like http://golang.org/issue/8602 , which was fixed in the
    1.4 release.

    Ian

  • You fu at Apr 6, 2015 at 1:51 am
    How do I get the Go stack?
  • Ian Lance Taylor at Apr 6, 2015 at 4:08 pm

    On Sun, Apr 5, 2015 at 6:51 PM, you fu wrote:
    How do I get the Go stack?

    Are you looking for http://golang.org/pkg/runtime/#Stack ?

    Ian
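
A minimal sketch of using `runtime.Stack` to capture every goroutine's stack from inside a program follows. The helper name `allStacks` and the starting buffer size are arbitrary choices for the example. Passing `true` as the second argument asks for all goroutines rather than just the calling one.

```go
package main

import (
	"fmt"
	"runtime"
)

// allStacks returns the stack traces of every goroutine in the process.
// runtime.Stack truncates its output to the buffer it is given, so the
// buffer is doubled until the whole dump fits.
func allStacks() string {
	buf := make([]byte, 64<<10)
	for {
		n := runtime.Stack(buf, true) // true: include all goroutines
		if n < len(buf) {
			return string(buf[:n])
		}
		buf = make([]byte, 2*len(buf))
	}
}

func main() {
	fmt.Print(allStacks())
}
```

The output is the plain per-goroutine trace format, which differs from the aggregated "count @ addresses" form shown in the original post.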

  • Wen-Liang Xiao at Apr 7, 2015 at 2:57 am
    The stack was generated with the pprof tool
    (https://golang.org/pkg/net/http/pprof/), using the command:
    curl http://localhost:6060/debug/pprof/goroutine?debug=1
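
For that curl command to work, the service has to expose the pprof handlers. A minimal sketch, assuming the handlers are served from `http.DefaultServeMux` on localhost:6060 as in the URL above; the helper names `startPprof` and `goroutineDump` are made up for the example, and the second one simply does from Go what the curl command does.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/ handlers on http.DefaultServeMux
)

// startPprof serves http.DefaultServeMux, which now includes the pprof
// handlers, on addr. It returns the listener so the caller can read the
// actual address or close it.
func startPprof(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	go http.Serve(ln, nil) // nil handler means http.DefaultServeMux
	return ln, nil
}

// goroutineDump fetches the aggregated goroutine profile in text form.
func goroutineDump(hostport string) (string, error) {
	url := fmt.Sprintf("http://%s/debug/pprof/goroutine?debug=1", hostport)
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	ln, err := startPprof("localhost:6060")
	if err != nil {
		panic(err)
	}
	dump, err := goroutineDump(ln.Addr().String())
	if err != nil {
		panic(err)
	}
	fmt.Print(dump)
}
```

A real service would keep the listener open for its whole lifetime instead of exiting after one dump, and would normally bind it to localhost or another non-public address.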

Discussion Overview
group: golang-nuts
categories: go
posted: Apr 1, '15 at 3:35p
active: Apr 7, '15 at 2:57a
posts: 5
users: 3
website: golang.org
