On Sat, May 19, 2018 at 02:39:26PM -0400, Tom Lane wrote:
> Hm, so was the timeout error happening every time through on that table,
> or just occasionally, or did you provoke it somehow? I'm wondering how
> your 9s timeout relates to the expected completion time.
I did not knowingly provoke it :)
Note that in my script's non-artificial failure this morning, the VACUUM FULL
of pg_statistic DIDN'T time out, but the one on the relation before it
(pg_attrdef) DID. I guess the logs I sent earlier were incomplete.
I don't know if it times out every time, but I'm thinking the timeout is
implicated. Still, I don't see how a timeout on a previous command can cause an
error in a future session, for a non-"shared" relation.
However, I see this happened (after a few hours) on one server where I was
looping WITHOUT timeout. So hopefully they have the same root cause and
timeout will be a good way to help trigger it.
postgres.pg_statistic...
ERROR: missing chunk number 0 for toast value 615791167 in pg_toast_2619
Sat May 19 17:18:03 EDT 2018
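A minimal sketch of such a loop in Python, for anyone trying to reproduce this.
It is illustrative only and is not the script that produced the output above.
The function names are hypothetical, and passing statement_timeout through the
PGOPTIONS environment variable is just one assumed way of setting it per
session. With timeout=None it loops without a timeout.

```python
import os
import subprocess
import time


def vacuum_full_command(db, table, timeout=None):
    """Return (argv, env) for one psql run of VACUUM FULL on `table`.

    Hypothetical helper: names and layout are assumptions for illustration.
    """
    # -X skips ~/.psqlrc so local settings do not affect the run.
    argv = ["psql", "-X", "-d", db, "-c", f"VACUUM FULL {table}"]
    env = dict(os.environ)
    if timeout is not None:
        # libpq forwards PGOPTIONS to the backend, so this sets
        # statement_timeout for the whole session.
        env["PGOPTIONS"] = f"-c statement_timeout={timeout}"
    return argv, env


def loop_until_error(db, table, timeout=None, pause=1.0):
    """Repeat VACUUM FULL until psql reports a 'missing chunk number' error."""
    while True:
        argv, env = vacuum_full_command(db, table, timeout)
        result = subprocess.run(argv, env=env, capture_output=True, text=True)
        if "missing chunk number" in result.stderr:
            return result.stderr.strip()
        # Statement timeouts and successful runs are both ignored; try again.
        time.sleep(pause)
```

For example, loop_until_error("postgres", "pg_catalog.pg_statistic", timeout="9s")
would keep retrying with a 9s timeout until the toast error shows up.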
I should have sent the output from my script:
<<Sat May 19 07:48:51 MDT 2018: starting db=ts(analyze parents and un-analyzed tables)
...
DELETE 11185
Sat May 19 07:49:15 MDT 2018: ts: VACUUM FULL pg_catalog|pg_statistic|table|postgres|845 MB|...
ERROR: canceling statement due to statement timeout
Sat May 19 07:49:25 MDT 2018: ts: VACUUM FULL pg_catalog|pg_attrdef|table|postgres|305 MB|...
ERROR: canceling statement due to statement timeout
Sat May 19 07:49:36 MDT 2018: ts: VACUUM FULL pg_catalog|pg_constraint|table|postgres|14 MB|...
Sat May 19 07:49:37 MDT 2018: ts: VACUUM FULL pg_catalog|pg_constraint|table|postgres|14 MB|...done
<<Sat May 19 07:49:37 MDT 2018: starting db=postgres(analyze parents and un-analyzed tables)
DELETE 0
Sat May 19 07:49:38 MDT 2018: postgres: VACUUM FULL pg_catalog|pg_statistic|table|postgres|3344 kB|...
ERROR: missing chunk number 0 for toast value 730125403 in pg_toast_2619
BTW I just grepped logs for this error. I see it's happened at some point at
fifteen of our customers going back to Nov 2, 2016, shortly after I implemented
VACUUM FULL of pg_statistic (but not other tables).
I hadn't noticed most of the errors because it seems to fix itself, at least
sometimes.
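For anyone wanting to check their own logs for the same error, here is a rough
Python sketch of that kind of search. It is not the grep I ran; the regex
assumes the message format shown in the errors above, and the function names
are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
# ERROR: missing chunk number 0 for toast value 615791167 in pg_toast_2619
MISSING_CHUNK = re.compile(
    r"ERROR:\s+missing chunk number (\d+) for toast value (\d+) in (pg_toast_\d+)"
)


def find_missing_chunks(lines):
    """Yield (chunk_number, toast_value, toast_relation) for each matching line."""
    for line in lines:
        m = MISSING_CHUNK.search(line)
        if m:
            yield int(m.group(1)), int(m.group(2)), m.group(3)


def count_by_relation(lines):
    """Count matching errors per toast relation."""
    return Counter(rel for _, _, rel in find_missing_chunks(lines))
```

Passing an open log file (or any iterable of lines) to count_by_relation gives
a per-toast-table tally, which makes it easy to see whether the error is
confined to pg_toast_2619.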
Justin