Vadim Mikheev <vadim4o@email.com> writes:
>> -- Judging from the commit timestamps surrounding prior
>> -- checkpoints, checkpoints were happening every five
>> -- minutes approximately on the 5-minute mark, so
> You can't count on this: postmaster runs checkpoint
> "maker" in 5 minutes *after* prev checkpoint was created,
> not from the moment "maker" started. And checkpoint can
> take *minutes*.

Good point, although with so little going on (this is the *whole*
relevant section of the log), that seems unlikely.
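
To make the timing argument concrete, here is a minimal sketch of the
behaviour Vadim describes, where each wait begins only once the previous
checkpoint has finished. It is a toy model, not the postmaster's code:
the function name, the 300-second interval, and the durations are all
assumed for illustration.

```python
# Illustrative model only: INTERVAL and the durations are assumed values,
# not taken from the log or from the postmaster source.
INTERVAL = 300  # seconds between the end of one checkpoint and the next start


def checkpoint_start_times(durations, interval=INTERVAL):
    """Return checkpoint start times (in seconds) when each wait begins
    only after the previous checkpoint has completed."""
    starts = []
    now = 0
    for d in durations:
        now += interval      # wait a full interval after the previous finish
        starts.append(now)   # this checkpoint starts here
        now += d             # and takes d seconds to complete
    return starts


quick = checkpoint_start_times([0, 0, 0])    # idle system: near-instant checkpoints
slow = checkpoint_start_times([0, 120, 0])   # one checkpoint takes two minutes
```

With near-zero durations the starts stay on the 5-minute mark; a single
slow checkpoint shifts every later start. On a system this idle, the
first case is the plausible one.
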
>> -- here. But it's worse than that: check the commit
>> -- timestamps and the xid numbers before and after the
>> -- discontinuity. Did time go backwards here?
> Commit timestamps are created *before* XLogInsert call,
> which can suspend backend for some time (in multi-user
> env). Random xid-s are also ok, generally.

Hmm ... maybe. Though again, this installation doesn't seem to have
been busy enough to cause a commit to be delayed for very long.

What I realized after posting that analysis is that the last checkpoint
record has SUI 30 whereas the earlier ones have SUI 29 ... so there was
a system restart in there somewhere. That still leaves me wondering
about the discontinuity and broken back-link, but it may account for
the "missing" checkpoint records --- perhaps they weren't generated
because the system wasn't up the entire interval.

>> -- What's even nastier (and the immediate cause of
>> -- Scott's inability to restart) is that the pg_control
>> -- file's checkPoint pointer points to 0/005AF9F0, which
>> -- is *not* the location of this checkpoint, but of
>> -- the record after it.
> Well, well. Checkpoint position is taken from
> MyLastRecord - I wonder how could this internal var
> take "invalid" data from concurrent backend.

I have not been able to figure that one out either.
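
For reference, a position written like 0/005AF9F0 is a pair (log file
id, byte offset), so asking whether pg_control points at a record or
past it is a tuple comparison. A small sketch, assuming both halves are
hexadecimal; parse_xlog_pos is a hypothetical helper, and the checkpoint
location used below is made up, since only 0/005AF9F0 appears in the
analysis.

```python
def parse_xlog_pos(text):
    """Parse 'xlogid/xrecoff' (both assumed hexadecimal) into a tuple of ints."""
    xlogid, xrecoff = text.split("/")
    return int(xlogid, 16), int(xrecoff, 16)


control_ptr = parse_xlog_pos("0/005AF9F0")     # pg_control's checkPoint, per the analysis
checkpoint_rec = parse_xlog_pos("0/005AF9B0")  # made-up checkpoint location, illustration only

# Tuples compare by log file id first, then by offset within the file.
points_past_checkpoint = control_ptr > checkpoint_rec
```
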
> Ok, we're leaving Krasnoyarsk in 8 hrs and should
> arrive SF Feb 5 ~ 10pm.

Have a safe trip!

regards, tom lane