Why is it taking so long for a new CL to appear in build.golang.org?

The dashboard is missing this CL and everything after it:

changeset: 21160:6de4384c7541
user: Russ Cox <rsc@golang.org>
date: Fri Sep 12 07:46:11 2014 -0400
summary: runtime: stop scanning stack frames/args conservatively

That was 1.5 hours ago.

I was seeing a similar delay last night. It does not seem to be stopped,
just VERY slow.

Russ

--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


  • Brad Fitzpatrick at Sep 12, 2014 at 6:04 pm
    I wonder which machine(s) are running in the -commit mode. I want to move
    that functionality to be part of the coordinator instead and remove it from
    the 'builder' binary.

  • Russ Cox at Sep 12, 2014 at 8:55 pm

    On Fri, Sep 12, 2014 at 2:04 PM, Brad Fitzpatrick wrote:

    > I wonder which machine(s) are running in the -commit mode. I want to move
    > that functionality to be part of the coordinator instead and remove it from
    > the 'builder' binary.

    I started a builder in -commit mode myself, and still nothing on the
    dashboard. Something is broken.

    Russ

  • Russ Cox at Sep 12, 2014 at 8:56 pm
    The weird thing is that the 'broke the build' emails are still going out,
    so it seems like the dashboard is still running. It is just not updating
    the front page. Bad HTML cache or something?

    Russ

  • Brad Fitzpatrick at Sep 13, 2014 at 1:26 am
    Yes, this is very bizarre. It now has some but not all of the commits. I
    feel like we've seen this before and last time Andrew's conclusion was
    something like "huh that's weird" but it wasn't resolved.

    Andrew, could you look into this? We now have repeatable failures on the
    dashboard whose origin is obscured by missing commits. That's not a good
    place to be during release stabilization.

  • Dave Cheney at Sep 13, 2014 at 2:00 am
    Something very odd is going on here: the records for d0331dfbc053 were
    on the dashboard an hour ago, and now they have disappeared.
  • Brad Fitzpatrick at Sep 13, 2014 at 2:15 am
    That rings a bell. I have seen that behavior in the past too.

  • Dmitry Vyukov at Sep 13, 2014 at 2:22 am
    I've seen the disappeared commit as well.

  • Russ Cox at Sep 13, 2014 at 6:07 pm
    If this isn't solved soon I think we should turn off commit access for
    everyone. We can't work without the dashboard.

    Russ

  • Brad Fitzpatrick at Sep 13, 2014 at 9:18 pm
    Here's what's going on:

    The main page of http://build.golang.org/ queries all Commits, ordered by
    the Commit's "Num":

    // dashCommits gets a slice of the latest Commits to the current dashboard.
    // If page > 0 it paginates by commitsPerPage.
    func dashCommits(c appengine.Context, pkg *Package, page int) ([]*Commit, error) {
            q := datastore.NewQuery("Commit").
                    Ancestor(pkg.Key(c)).
                    Order("-Num").
                    Limit(commitsPerPage).
                    Offset(page * commitsPerPage)
            var commits []*Commit
            _, err := q.GetAll(c, &commits)
            return commits, err
    }

    But that "Num" field is a totally sketchy design:

    type Commit struct {
            PackagePath string // (empty for main repo commits)
            Hash        string
            ParentHash  string
            Num         int // Internal monotonic counter unique to this package.
            ....

    And is incremented (without a lock!?) on a field of the Package:

    // get the next commit number
    p, err := GetPackage(c, com.PackagePath)
    if err != nil {
            return fmt.Errorf("GetPackage: %v", err)
    }
    com.Num = p.NextNum
    p.NextNum++

    // A Package describes a package that is listed on the dashboard.
    type Package struct {
            Kind    string // "subrepo", "external", or empty for the main Go tree
            Name    string
            Path    string // (empty for the main Go tree)
            NextNum int    // Num of the next head Commit
    }

    And sure enough, this "Num" field is 0 in the datastore for missing commits.
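
To illustrate why a zero "Num" would hide a commit from the front page, here is a minimal in-memory sketch. It is not the dashboard's code: the trimmed Commit type, the firstPage helper, and the sample hashes and numbers are all hypothetical, and a plain sort stands in for the datastore's Order("-Num").Limit(...) query.

```go
package main

import (
	"fmt"
	"sort"
)

// Commit is a hypothetical, trimmed-down stand-in for the dashboard's
// Commit entity; only the fields needed for the illustration are kept.
type Commit struct {
	Hash string
	Num  int
}

// firstPage imitates Order("-Num").Limit(perPage) in memory: it sorts a copy
// of commits by Num, descending, and keeps the first perPage entries.
func firstPage(commits []Commit, perPage int) []Commit {
	sorted := append([]Commit(nil), commits...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Num > sorted[j].Num
	})
	if len(sorted) > perPage {
		sorted = sorted[:perPage]
	}
	return sorted
}

func main() {
	commits := []Commit{
		{Hash: "older-1", Num: 101},
		{Hash: "older-2", Num: 102},
		{Hash: "older-3", Num: 103},
		{Hash: "newest", Num: 0}, // recorded, but with a zero Num
	}
	// The zero-Num commit sorts after every numbered commit, so it never
	// reaches the first page even though it is the most recent one.
	for _, c := range firstPage(commits, 3) {
		fmt.Println(c.Hash, c.Num)
	}
}
```

Under these assumptions the newest commit is still stored, which would be consistent with the 'broke the build' emails continuing to go out while the front page looks stale.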

    I think memcache misses are also involved in how this goes wrong.
    The caching strategy looks complicated and fragile.

    I think this whole design is gross.

    My proposal:

    We have a linear history in our hg, so let's just use hg's numeric IDs for
    "Num":

    changeset: *21167*:d0331dfbc053
    tag: tip
    user: Robert Griesemer <gri@golang.org>
    date: Fri Sep 12 16:35:40 2014 -0700
    summary: cmd/8g: remove unused variable (fix build)

    changeset: *21166*:1b2719823e56
    user: Josh Bleecher Snyder <josharian@gmail.com>
    date: Fri Sep 12 16:16:09 2014 -0700
    summary: runtime: test iteration order of sparse maps

    But we'll need the builders to report that number in, since App Engine
    can't hg update/log itself.

    Currently the builders do this, but that's increasingly going away in the
    new builder design, so let's just rip -commit mode out of the builders now.
    We can make the handler just ignore POSTs from existing builders running
    old code without the new hg-changeset-num field. Then we run a hg poller on
    the coordinator that sends that number in as "hg-changeset-num".

    Then we remove the racy "NextNum" from the Package entity in App Engine.

    For now I'll try to do some fixes by hand and write a little ad-hoc fix
    tool or something.

  • Brad Fitzpatrick at Sep 13, 2014 at 9:36 pm
    It's not as terrible as I feared: the addCommit handler does at least run
    in a transaction and try to update everything atomically.

    Still debugging.

  • Andrew Gerrand at Sep 15, 2014 at 12:39 am
    It doesn't just try to update everything atomically. It must be doing that.
    The only place addCommit is called is from within a transaction, and the
    Package entity should be the root of the entity group inside which each
    Commit lives.
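
The guarantee described here can be sketched without App Engine. In the hypothetical model below, a mutex stands in for the datastore transaction over the Package's entity group, and the store type, its fields, and the starting value of 1 are assumptions made for illustration, not the dashboard's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical in-memory stand-in for one entity group: the
// Package's NextNum counter plus the Commits recorded under it.
type store struct {
	mu      sync.Mutex
	nextNum int
	nums    map[string]int // commit hash -> assigned Num
}

// addCommit assigns the next number to hash. Holding mu across the whole
// read-increment-write sequence plays the role of the transaction: no two
// callers can observe the same nextNum.
func (s *store) addCommit(hash string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	n := s.nextNum
	s.nextNum++
	s.nums[hash] = n
	return n
}

func main() {
	// Starting at 1 is an assumption, chosen so that a zero Num stands out.
	s := &store{nextNum: 1, nums: make(map[string]int)}

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.addCommit(fmt.Sprintf("commit-%d", i))
		}(i)
	}
	wg.Wait()

	// Count the distinct numbers handed out to the 100 concurrent callers.
	distinct := make(map[int]bool)
	for _, n := range s.nums {
		distinct[n] = true
	}
	fmt.Println(len(distinct), s.nextNum)
}
```

If the real handler behaves like this model, concurrent addCommit calls alone should not produce duplicate or zero numbers, which is why the zero "Num" values are puzzling.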

    I don't see the code path that could lead to a zero Num field. Obviously
    there is one, but I don't see it.

    Brad, I'd like to chat with you when you're available to find out where
    you're at. It's not clear from the mail I have read.

    Andrew
  • Brad Fitzpatrick at Sep 15, 2014 at 12:06 pm

    On Sun, Sep 14, 2014 at 8:38 PM, Andrew Gerrand wrote:

    > It doesn't just try to update everything atomically. It must be doing
    > that. The only place addCommit is called is from within a transaction, and
    > the Package entity should be the root of the entity group inside which each
    > Commit lives.

    Yeah, I understand the code now. I didn't initially when I wrote earlier
    emails.

    > I don't see the code path that could lead to a zero Num field. Obviously
    > there is one, but I don't see it.

    Likewise.

    > Brad, I'd like to chat with you when you're available to find out where
    > you're at. It's not clear from the mail I have read.

    Sounds good.


Discussion Overview
group: golang-dev
categories: go
posted: Sep 12, '14 at 4:15p
active: Sep 15, '14 at 12:06p
posts: 13
users: 5
website: golang.org
