[ reviving a thread that's been idle for a while ]
I wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
>> Huh, idiacanthus failed showing vacuum_count 0, in select_parallel.
>> So ... the VACUUM command somehow skipped those tables?
> No, because the reltuples counts are correct. I think what we're
> looking at there is the stats collector dropping a packet that
> told it about vacuum activity.
> I'm surprised that we saw such a failure so quickly. I'd always
> figured that the collector mechanism, while it's designed to be
> unreliable, is only a little bit unreliable. Maybe it's more
> than a little bit.

So that data-collection patch has been in place for nearly 2 months
(since 2019-05-21), and in that time we've seen a grand total of
no repeats of the original problem, as far as I've seen. That's
fairly annoying considering we'd had four repeats in the month
prior to putting the patch in, but such is life.

In the meantime, we've had *lots* of buildfarm failures in the
added pg_stat_all_tables query, which indicate that indeed the
stats collector mechanism isn't terribly reliable. But that
doesn't directly prove anything about the original problem,
since the planner doesn't look at stats collector data.

Anyway, I'm now starting to feel that these failures are more
of a pain than they're worth, especially since there's not much
reason to hope that the original problem will recur soon.

What I propose to do is remove the pg_stat_all_tables query
but keep the relpages/reltuples query. That should fix the
buildfarm instability, but we can still hope to get at least
some insight if the original problem ever does recur.

regards, tom lane