On Mon, Oct 27, 2014 at 2:57 PM, Dmitry Vyukov wrote:
> That commit was done when the database had been updated several times
> manually and was in a corrupted state. So the commit could well have
> detected the results of those updates.
> It's still a mystery to me what the root cause was and what I was not
> paying attention to...
I think it has to do with the order of the commit logs arriving. In normal
usage if you have commit A then commit B, where B broke the build, then the
result for A comes in first, and then the result for B. The insertion of
the result for B checks that A was okay and since B is not, it sends mail
about B breaking the build. This works.
If you commit A and B back to back, then it is fairly likely that B will
run first, because the builders run newest thing missing first. Then when B
comes in broken, the code records that fact but doesn't send mail, because
it doesn't know about A yet. When A comes in working, then it checks
whether B was broken, finds that it was, and sends mail saying that B is
broken. It is this send mail operation that I believe is passed an
incomplete Commit record for B (with Num and many other fields set to their
zero values). The mail sender updates FailNotificationSent=true in the
record and writes it back into the datastore, blowing away the real record
for B.
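
To make the suspected failure concrete, here is a minimal sketch of the overwrite. It is not the dashboard code: the trimmed-down Commit struct, the in-memory store, and the notifyUnsafe and notifySafe functions are all hypothetical names, and the sketch assumes the notification path is handed a record with only its key set. It shows why writing that record back whole loses the other fields, and how reloading the stored record first would avoid it.

```go
package main

import "fmt"

// Commit is a hypothetical, trimmed-down stand-in for the dashboard's
// commit record; the real record has many more fields.
type Commit struct {
	Hash                 string
	Num                  int
	Desc                 string
	FailNotificationSent bool
}

// store is an in-memory stand-in for the datastore, keyed by commit hash.
type store map[string]Commit

// notifyUnsafe mimics the suspected bug: it sets the flag on whatever
// record it was handed and writes that record back whole. If the caller
// passed a partial record, the zero-valued fields replace the stored ones.
func notifyUnsafe(s store, c Commit) {
	c.FailNotificationSent = true
	s[c.Hash] = c
}

// notifySafe reloads the full record by key, changes only the one field,
// and writes the full record back.
func notifySafe(s store, c Commit) {
	full, ok := s[c.Hash]
	if !ok {
		return // nothing stored under this key; do not create a stub record
	}
	full.FailNotificationSent = true
	s[c.Hash] = full
}

// newStore returns a store holding one complete record for commit "b".
func newStore() store {
	return store{"b": {Hash: "b", Num: 42, Desc: "commit B"}}
}

func main() {
	// Only the key is set, as in the suspected call path.
	partial := Commit{Hash: "b"}

	s1 := newStore()
	notifyUnsafe(s1, partial)
	fmt.Printf("unsafe: %+v\n", s1["b"])

	s2 := newStore()
	notifySafe(s2, partial)
	fmt.Printf("safe:   %+v\n", s2["b"])
}
```

In the unsafe variant the stored record for B ends up with Num and Desc at their zero values, which is the kind of damage described above. Whether the real code fails in exactly this way is still unverified.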
This theory matches the failures I observed: the database always got stuck
when I committed 3 or 4 CLs back to back. I do this fairly often, because I
send a bunch of CLs in the same client in different directories and then I
come back to them and fix all the comments and run all.bash and submit them
together.
I have not chased down exactly what is wrong in the code, but if you want
to do so, that's where I would start.
Russ