Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-12-11 10:07:19 -0500, Tom Lane wrote:
>> Do you remember offhand where the failures are?
> No, but they are easy enough to reproduce. Out of 10 runs, I've attached
> the one with the most failures and checked that it seems to contain all
> the failures from other runs. All of them probably could be fixed by
> moving things around, but I am not sure how maintainable that approach
> is :/

Thanks for doing the legwork.  These all seem to be cases where the
planner decided against doing an index-only scan on tenk1, which is
presumably because its relallvisible fraction is too low. But these are
later in the test series than the "vacuum analyze tenk1" that's currently
present in create_index, and most of them are even later than the
database-wide VACUUM in sanity_check. So those vacuums are failing to
mark the table as all-visible, even though it's not changed since the COPY
test. This seems odd. Do you know why your slave server is holding back
the xmin horizon so much?

After looking at this, I conclude that moving the vacuums earlier would
probably make things worse, not better, because the critical interval seems
to be from the "COPY TO tenk1" command to the vacuum command. So the idea
of putting vacuums into the COPY test is a bad one, and I'll proceed with
the patch I posted yesterday for moving the ANALYZE steps around. I think
fixing what you're seeing is going to be a different issue.

regards, tom lane
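
For anyone trying to reproduce this, a rough way to check the
"relallvisible fraction" mentioned above is to compare relallvisible
against relpages in pg_class for tenk1 after the vacuum has run. The
sketch below is only an illustration: the helper name and the sample
numbers are made up, and the ratio is an approximation of what the
planner looks at, not a copy of its costing code. Only the pg_class
columns relpages and relallvisible are real.

```python
# Hypothetical helper for inspecting a table's all-visible fraction.
# Not part of the PostgreSQL regression suite; names here are illustrative.

# Query to run by hand (e.g. in psql) on the server showing the failures.
# relpages and relallvisible are real pg_class columns.
VISIBILITY_QUERY = """
SELECT relpages, relallvisible
FROM pg_class
WHERE relname = 'tenk1';
"""


def all_visible_fraction(relallvisible: int, relpages: int) -> float:
    """Return the fraction of heap pages marked all-visible, clamped to [0, 1].

    An empty or never-vacuumed table (relpages <= 0) is reported as 0.0.
    """
    if relpages <= 0:
        return 0.0
    return min(1.0, max(0.0, relallvisible / relpages))


if __name__ == "__main__":
    # Made-up numbers: a vacuum that could not mark any page all-visible,
    # versus one that marked every page.
    print(all_visible_fraction(0, 100))
    print(all_visible_fraction(100, 100))
```

If the query reports relallvisible at or near zero right after the
vacuum, that would be consistent with the vacuum being unable to mark
pages all-visible because of a held-back xmin horizon.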