Hi Golang Dev,

I'm creating this thread to get some community feedback on how the Go
builtin resolver should work going forward. The motivation for this post
is issue 6579, and conversations with Mikio about how best to handle it
(if at all). In short, the issue says that if resolv.conf contains two
dead nameservers followed by a live nameserver, every lookup will take a
long time. This is true: we use the common default of two tries with a
5 second timeout, so each lookup takes 20 seconds (2 dead servers x 2
tries x 5 seconds) in the above scenario. On one hand you could say
"Well, fix your resolv.conf"; on the other, we could make the lookup
logic smarter.

Mikio's first resolution was to send every query to all nameservers and
use the first valid response. This adds no real time to standard queries,
and in the complainant's scenario it runs as fast as if the two dead
entries were not there. However, I am reluctant to agree with this
approach because it adds unexpected overhead. For the vast majority of
programs it won't matter much, as the overhead is relatively small. But
for someone writing a program that does millions of lookups, or even
someone doing one lookup while watching network activity closely, it is a
lot of unexpected overhead: every message is now sent three times rather
than once. In summary, I consider it 'unexpected behavior/resource usage.'

Here are some of my thoughts on solutions. I'd be interested to hear
others.

1 - Do nothing. The user should be responsible for having a stable
resolv.conf.
Pro - Expected behavior, current behavior
Con - Down nameservers will kill performance

2 - Use Mikio's first resolution - send all queries to all ns, and use the
fastest response.
Pro - Fastest solution
Con - Most network overhead

3 - Use Mikio's second resolution - basically option 2 with a built-in
delay (300ms, can be more/less). If the first nameserver hasn't answered
in 300ms, send to the second, and so on. It still stops after receiving
the first response. Initially I liked the idea (and even suggested it),
but realized that many queries can take this long, especially ones that
follow long chains or hit slow nameservers.
Pro - Dramatically speeds up queries when resolver(s) are down
Con - Not the fastest, and has extra overhead in allocs and network

4 - Manage the order of nameservers. In essence, each lookup would call a
function to get the list of nameservers. If the first one works, do
nothing. If it does not, call a function that reorders the list
(cfg.servers) to put the working nameserver first, so that subsequent
lookups hit the working server first. This would need a
mutex/channel/similar logic to prevent races.
Pro - Intelligently sorts ns order, no extra network overhead
Con - First lookup(s) are slow when the first nameserver(s) are
down/invalid. Order in resolv.conf is not strictly followed.

Of course, other ideas are very welcome too. The reason I lean towards
proposing 4 is that the rotate option is currently parsed but doesn't
appear to be used. Implementing rotate would be pretty easy once 4 is in
place, since lookups would no longer access cfg.servers directly but
would be given a slice by a function, which can order it as needed before
returning (either 'rotating', or with the working ns first in line).

Look forward to hearing your thoughts.

Thanks,
Alex

--

---
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


  • Minux at Mar 2, 2014 at 10:52 pm

    On Sun, Mar 2, 2014 at 5:46 PM, Alex Skinner wrote:

    I'm creating this thread to get some community feedback on how the Go
    builtin resolver should work, going forward. The motivation for this post
    specifically is issue 6579, and conversations with Mikio with how to best
    handle it(if at all). In short, the issue says that if resolv.conf
    contains two dead nameservers, then a live nameserver, every lookup will
    take a long time. This is true - we are using the general default of two
    tries, 5 second timeout. So, each lookup would take 20 seconds in the
    above scenario. On one hand you could say "Well, fix your resolv.conf", or
    on the other, let's make the lookup logic smarter.

    What does glibc do in this scenario? Why not follow established
    implementations rather than invent our own strategy?

  • Andrew Gerrand at Mar 2, 2014 at 10:54 pm
    Don't we use glibc's resolver on most systems anyway? Or is that just OS X?


  • Minux at Mar 2, 2014 at 10:56 pm

    On Sun, Mar 2, 2014 at 5:54 PM, Andrew Gerrand wrote:

    Don't we use glibc's resolver on most systems anyway? Or is that just OS X?

    Issue 6579 is about the net package with the netgo build tag enabled.
  • Alex Skinner at Mar 2, 2014 at 11:18 pm
    Glibc tries the nameservers in order specified.

    From the resolv.conf man page:

        If there are multiple servers, the resolver library queries them
        in the order listed. If no nameserver entries are present, the
        default is to use the name server on the local machine. (The
        algorithm used is to try a name server, and if the query times
        out, try the next, until out of name servers, then repeat trying
        all the name servers until a maximum number of retries are made.)


    The netgo version isn't compliant with glibc's for a number of reasons,
    namely: no EDNS, no rotate logic, and a retry algorithm that retries each
    server up to the maximum before moving on, rather than trying each server
    once per retry round as described above. Just fixing that would perhaps
    halve the resolution time in the issue.
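
The difference between the two retry orders can be sketched as follows. The function names are hypothetical and the code only models the order of attempts, not real network queries; it assumes the scenario from the issue (two dead nameservers, then a live one, two attempts).

```go
package main

import "fmt"

// perServerOrder models the order described for netgo in this thread:
// each server is tried `attempts` times before moving on to the next.
func perServerOrder(servers []string, attempts int) []string {
	var order []string
	for _, s := range servers {
		for i := 0; i < attempts; i++ {
			order = append(order, s)
		}
	}
	return order
}

// roundOrder models the order from the resolv.conf man page: every server
// is tried once per round, for `attempts` rounds.
func roundOrder(servers []string, attempts int) []string {
	var order []string
	for i := 0; i < attempts; i++ {
		order = append(order, servers...)
	}
	return order
}

// triesBefore counts the attempts made before target is first reached.
func triesBefore(order []string, target string) int {
	for i, s := range order {
		if s == target {
			return i
		}
	}
	return len(order)
}

func main() {
	servers := []string{"dead1", "dead2", "live"}
	fmt.Println(triesBefore(perServerOrder(servers, 2), "live")) // 4 timeouts first
	fmt.Println(triesBefore(roundOrder(servers, 2), "live"))     // 2 timeouts first
}
```

With a 5 second timeout that is 20 seconds versus 10 seconds before the live server is asked, which matches the "halve" estimate above.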

    If the consensus is to try to follow glibc's lead, we should probably fix
    the things listed above, some of which are very easy changes. The rotate
    change is a little more complicated, I think, so I thought I'd bring it
    up just in case.

    Thanks,
    Alex

  • Andrew Gerrand at Mar 2, 2014 at 11:27 pm
    I'm in favor of just doing what glibc does (the easy stuff first) so that
    we don't surprise or confuse people. Then, if their resolv.conf file needs
    fixing, Go programs resolve names just as slowly as any other process on
    the system.



  • Dave Cheney at Mar 2, 2014 at 11:43 pm
    I think we should continue to do option one, do nothing.

    If there are duff name servers in resolv.conf, this will affect more than just a single Go program.
  • Minux at Mar 3, 2014 at 3:49 am

    On Sun, Mar 2, 2014 at 6:18 PM, Alex Skinner wrote:


    The netgo version isn't compliant with glibc's for a number of reasons,
    namely - no edns, no rotate logic, and our retry algorithm retries each
    server until the max, rather than trying each server per retry as stated
    above. Perhaps just fixing that would halve the issue's resolution time.

    Yeah, this seems worthwhile and simple enough to do.
  • Aram Hăvărneanu at Mar 3, 2014 at 12:08 pm
    I don't understand why we need to implement special logic to deal with
    broken user files.

    --
    Aram Hăvărneanu

  • Michael Jones at Mar 3, 2014 at 12:44 pm
    Presumably the user has "good" files (viable DNS resolvers in general
    priority order) and the happenstance of the moment ruins that as those
    sites become unreachable. The question is if the standard facility should
    be smart and agile in such cases.



    --
    Michael T. Jones | Chief Technology Advocate | mtj@google.com | +1 650-335-5765

  • Aram Hăvărneanu at Mar 3, 2014 at 12:48 pm

    Presumably the user has "good" files (viable DNS resolvers in general
    priority order) and the happenstance of the moment ruins that as those sites
    become unreachable. The question is if the standard facility should be smart
    and agile in such cases.

    When your files become outdated, your monitoring service notices (you
    have one, right?) and your configuration management software (you have
    one, right?) takes care of the problem. Go being "smart" can actually
    hide real problems, and you'll notice much later.

    --
    Aram Hăvărneanu

  • Michael Jones at Mar 3, 2014 at 1:18 pm
    No dispute. Our dialog is precisely the point. It's not really about "bad"
    files so much as who or what changes based on circumstances.



    --
    Michael T. Jones | Chief Technology Advocate | mtj@google.com | +1 650-335-5765

  • R W Johnstone at Mar 3, 2014 at 2:06 pm
    This makes sense only if you assume that Go will only ever be used to write
    servers running in a production environment. If the standard library is
    going to be generally applicable, then it should be more resilient.
    Further, none of the listed suggestions would interfere with a monitoring
    service.

    On Monday, 3 March 2014 07:48:17 UTC-5, Aram Hăvărneanu wrote:

    Presumably the user has "good" files (viable DNS resolvers in general
    priority order) and the happenstance of the moment ruins that as those sites
    become unreachable. The question is if the standard facility should be smart
    and agile in such cases.
    When your files become outdated, your monitoring service notices (you
    have one, right?) and your configuration management software (you have
    one, right?) takes care of the problem. Go being "smart" can actually
    hide real problems, and you'll notice much later.

    --
    Aram Hăvărneanu
  • Alex Skinner at Mar 3, 2014 at 4:17 pm
    In general I agree with most of the sentiments above. However, we cannot
    treat the Go builtin resolver and libc's as equals, so it gets a little
    more complicated.

    As an example, my reply to the statement quoted below would be that
    changes to resolv.conf are picked up by the system resolver immediately.
    Go, however, only reads the file once, on the first lookup, so any later
    changes to resolv.conf do not take effect. As there is no public function
    to reload it (and perhaps there shouldn't be), picking up a change would
    require restarting any Go applications using netgo.

    Just food for thought.

    Thanks,
    Alex
    On Monday, March 3, 2014 7:48:17 AM UTC-5, Aram Hăvărneanu wrote:

    Presumably the user has "good" files (viable DNS resolvers in general
    priority order) and the happenstance of the moment ruins that as those sites
    become unreachable. The question is if the standard facility should be smart
    and agile in such cases.
    When your files become outdated, your monitoring service notices (you
    have one, right?) and your configuration management software (you have
    one, right?) takes care of the problem. Go being "smart" can actually
    hide real problems, and you'll notice much later.

    --
    Aram Hăvărneanu
  • Russ Cox at Mar 3, 2014 at 4:54 pm
    I would suggest just fixing the retry logic to cycle through the servers
    once per retry (then it will be like glibc) and call this fixed. But it
    doesn't seem pressing enough to bother for Go 1.3.

    Russ

  • Mikio Hara at Mar 3, 2014 at 5:20 pm

    On Tue, Mar 4, 2014 at 1:54 AM, Russ Cox wrote:

    I would suggest just fixing the retry logic to cycle through the servers
    once per retry (then it will be like glibc) and call this fixed. But it
    doesn't seem pressing enough to bother for Go 1.3.

    It's probably time to have a go.net/dns package.

  • Oleku Konko at Mar 15, 2014 at 3:23 am
    Good Idea .. +1
  • Josh Bleecher Snyder at Mar 3, 2014 at 6:19 pm

    As an example, my reply to this statement would point out that changes to
    resolv.conf are picked up by the system resolver immediately. However, Go
    only reads the file once, on the first lookup, so any changes made to
    resolv.conf do not take effect.

    https://code.google.com/p/go/issues/detail?id=6670

    FWIW, I started a fix (https://codereview.appspot.com/20090043/) but
    am not super happy with it and was planning to just wait for
    os/fsnotify to land before reworking it.

    -josh

  • Alex Skinner at Mar 13, 2014 at 3:58 am
    Thanks Josh, please keep me in the loop on it if you can remember.

    As I finally found some free time, I spent a bit of it tonight going
    through the native Go DNS logic and found a few areas that were off: the
    aforementioned attempts logic, DNS queries being sent too many times, and
    queries not being issued concurrently.

    I know it's late, and I'm not sure how it will be viewed since this is a
    "redo" of a previous CL. There are a lot of changes that I'd appreciate
    advice/review on from those familiar with the code. This change speeds up
    every single native Go DNS address query, some more drastically than
    others. Hopefully I didn't do something silly, as they are pretty big
    speedups.

    I'll wait to hear opinions before mailing for official review, as I don't
    want to clutter things up with a late submission if there's no need, and
    it was stated above that this can probably wait until after 1.3. Also,
    please feel free to email me privately for further
    discussion/ideas/critiques before I submit for review.

    It can be viewed here - https://codereview.appspot.com/75180043/

    Thank you,
    Alex
  • Josh Bleecher Snyder at Mar 15, 2014 at 12:24 am
    Thanks Josh, please keep me in the loop on it if you can remember.

    Sure thing. Be sure to star issue 6670. And if you get impatient, feel
    free to improve my CL and mail it after Go 1.3 is out; I won't mind.
    (The urgency of this fix is low for us, as we found an easy workaround
    -- just ensure that /etc/resolv.conf starts with 8.8.8.8 and 8.8.4.4
    at all times, so even if we miss a change or ten, we always have
    functional DNS.)

    I'll wait to hear opinions before mailing for official review, as I don't
    want to clutter up/bother with a late submission if there's no need, and
    it's been stated it can probably wait until after 1.3 above. Also, please
    feel free to email me privately for further discussion/ideas/critiques
    before I submit for review.

    Since the CL fixes an issue marked Release-Go1.3, I think that it can
    (and should) be mailed and reviewed soon. The outcome of that
    discussion might be to delay the issue/CL until Go 1.4 or ask for an
    alternative solution, but those are both ok outcomes.

    -josh

  • Alex Skinner at Mar 15, 2014 at 2:28 am
    Thanks, starred for tracking.

    I agree with your sentiment, so I went ahead and mailed the CL. Anyone
    interested, please review/comment. If it's too big of a change for 1.3,
    you'll get no pushback from me.

    Thanks all,
    Alex


  • Mikio Hara at Mar 16, 2014 at 8:32 am
    Hi Alex,

    Sorry for being late.
    On Thu, Mar 13, 2014 at 12:58 PM, Alex Skinner wrote:

    I'll wait to hear opinions before mailing for official review, as I don't
    want to clutter up/bother with a late submission if there's no need, and
    it's been stated it can probably wait until after 1.3 above. Also, please
    feel free to email me privately for further discussion/ideas/critiques
    before I submit for review.
    I'd prefer revisiting the DNS client issues in Go 1.4, because I just
    skimmed the issue tracker and found the following builtin DNS client
    issues:
    - issue 6340, no deadline support,
    - issue 6579, no care of bad entries in /etc/resolv.conf,
    - issue 6670, never follows /etc/resolv.conf changes.

    Just a guess: both 6340 and 6579 might require something like a) a
    transport protocol-independent DNS transport, b) a query-type agnostic
    querier or roundTripper, c) some logic to control queriers, d) some
    control stuff for discovery-dial pipelines, and 6670 requires
    file-system event stuff.

    I mean, I'd prefer to smash those issues together. Does this sound
    reasonable to you?

  • Alex Skinner at Mar 16, 2014 at 8:50 am
    Hi Mikio,

    Thanks as always for your insight. I agree that for Go 1.4 we need a big
    overhaul, including the items you've mentioned, and perhaps even
    splitting DNS into its own subpackage as you previously suggested, though
    that will be hard due to the dependency on dial (unless, of course, it can
    be easily redefined). The net package as it stands is really, really big
    and nearly unwieldy.

    That said, I do hope we can land this particular change, submitted for your
    review, in 1.3, and I'll explain why. As the benchmark shows, it's much
    faster, with similar or lower alloc overhead. After reviewing the current
    implementation, I found some bugs of sorts (or perhaps they are RFC
    compliant, and the RFC is wonky). For example, in the case of no such host,
    the query is sent through retry. In my testing at least, it would send the
    query to all servers twice before returning. Compounding this, the lookup
    logic currently runs every non-rooted no-such-host lookup twice. In some
    circumstances, I found the same query sent to the same server 4
    times.
    One of the biggest speedups in every case comes from the fact that, in our
    current version, host lookups do the A and AAAA queries one after another.
    Just by making these concurrent, a single host lookup is sped up 50%. So if
    nothing else, that's one change I think we can land.
    You are much more intimate with the code than I am, so I will ultimately
    trust your judgment. But please review some of the changes in the CL
    submitted, and even if dialed back a little, consider them for 1.3. I have
    no problem if you scrap/change some parts and resubmit it as your own, as
    you had a lot of other niceties in your previously submitted versions.

    Thanks much,
    Alex

  • Mikio Hara at Mar 3, 2014 at 3:46 am

    On Mon, Mar 3, 2014 at 7:46 AM, Alex Skinner wrote:

    3 - Use Mikio's second resolution - basically, like step 2 with a delay
    built in(300ms, can be more/less).
    Not mine, yours. ;)

    Once you get a consensus, feel free to take over issue 6579 and take
    whatever code fragments you need from CL 14441059. Thanks.

  • Steven Hartland at Mar 3, 2014 at 5:49 pm
    ----- Original Message -----
    From: "Alex Skinner" <alex@lx.lc>
    To: <golang-dev@googlegroups.com>
    Cc: "Mikio Hara" <mikioh.mikioh@gmail.com>
    Sent: Sunday, March 02, 2014 10:46 PM
    Subject: [golang-dev] Thoughts on Go DNS handling

    Hi Golang Dev,

    I'm creating this thread to get some community feedback on how the Go
    builtin resolver should work, going forward. The motivation for this post
    specifically is issue 6579, and conversations with Mikio with how to best
    handle it(if at all). In short, the issue says that if resolv.conf
    contains two dead nameservers, then a live nameserver, every lookup will
    take a long time. This is true - we are using the general default of two
    tries, 5 second timeout. So, each lookup would take 20 seconds in the
    above scenario. On one hand you could say "Well, fix your resolv.conf", or
    on the other, let's make the lookup logic smarter.

    Mikio's first resolution to this was to send all queries to all
    nameservers, and use the first valid response. This adds no real time to
    standard queries, and in the above complainant's scenario, it runs as fast
    as the two dead entries not being there. However, I am reluctant to agree
    with this approach as it adds unexpected overhead. To the vast majority of
    programs, it's just not going to matter much as it's relatively small
    overhead. To someone writing a program that does millions of lookups, or
    even to someone doing one lookup who is watching network activity closely,
    this adds a lot of unexpected overhead as now every message is being sent
    three times rather than once. In summary, I consider it 'unexpected
    behavior/resource usage.'

    So, here are some of my thoughts on solutions, please - I'd be interested
    to hear others.

    1 - Do nothing. The user should be responsible for having a stable
    resolv.conf.
    Pro - Expected behavior, current behavior
    Con - Down nameservers will kill performance

    2 - Use Mikio's first resolution - send all queries to all ns, and use the
    fastest response.
    Pro - Fastest solution
    Con - Most network overhead

    3 - Use Mikio's second resolution - basically, like step 2 with a delay
    built in(300ms, can be more/less). So if first hasn't answered in 300ms,
    send the 2nd, and so on. Still stops after receiving first response.
    Initially I liked the idea(and even suggested it), but realized that many
    queries can take this long - especially in long chains or that hit slow
    nameservers.
    Pro - Dramatically speeds up queries when resolver(s) down
    Con - Neither fastest, and has extra overhead in allocs and network

    4 - Manage the order of nameservers. In essence, each lookup would call a
    function to get the list of nameservers. If the first one works, do
    nothing. If it does not, call a function to set the order of nameservers
    to put the working nameserver as first in list(cfg.servers), so that
    subsequent calls would hit the working server first. This would need to
    use a mutex/channel/similar logic to prevent races.
    Pro - Intelligently sort ns order, no extra network overhead
    Con - First lookup(s) slow when first nameserver(s) are down/invalid.
    Order in resolv.conf is not strictly followed.

    Of course, other ideas are much welcome too. The reason for my
    thoughts/proposal of 4, is that currently, the rotate option is being
    collected but doesn't appear to be used. Implementing rotate would be
    pretty easy if 4 is implemented, since lookups no longer accesses
    cfg.servers directly, but are given a slice from a function, which can
    order as needed before returning(either 'rotating', or with working ns
    first in line).

    Look forward to hearing your thoughts.
    I'd suggest letting users choose what is best for their use case by
    providing options to configure the resolver.

    I believe the current implementation only reads resolv.conf (or its
    equivalent) on the first request. Allowing this and other options to be
    configured at runtime would provide a much more flexible solution that
    could be tuned for a much wider range of use cases, instead of those that
    require the high performance of solution #2 having to implement
    custom code.

         Regards
         Steve


Discussion Overview
group: golang-dev
categories: go
posted: Mar 2, '14 at 10:46p
active: Mar 16, '14 at 8:50a
posts: 25
users: 12
website: golang.org
