Thread: vacuum analyze corrupts db with larger tuples (< 8k)
ok, here is what I have found out on 6.5.3, Linux 2.2.10:

DROP TABLE buf;
CREATE TABLE buf (s varchar(5000));    -- type TEXT is the same prob
COPY buf FROM stdin;
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
[... almost 5k ...]
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[ 10 times ]
\.

# vacuumdb --analyze test
ERROR:  Tuple is too big: size 9604
vacuumdb: database vacuum failed on test.

- this is repeatable
- in this test the db isn't corrupted, but in our production db :(
- Tom, you made a remark that you found a bug in access/hio.c.
  Does the bug also hit here?

Dirk
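[For readers wanting to reproduce this without retyping the data: the COPY input above is ten rows of roughly 4800 'x' characters each. A small shell sketch (the file name buf.copy is my invention; feed it to COPY with psql's \copy or redirect it to stdin) that regenerates equivalent test data:]

```shell
# Build one row of 4800 'x' characters, then repeat it ten times.
row=$(printf 'x%.0s' $(seq 1 4800))
for i in $(seq 1 10); do
    echo "$row"
done > buf.copy

# Sanity check: 10 rows * 4800 chars + 10 newlines = 48010 bytes.
wc -c < buf.copy
```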
Dirk Lutzebaeck writes:

 > - Tom, you made a remark that you found a bug in access/hio.c.
 >   Does the bug also hit here?

I applied this patch and the corruption is gone, but vacuumdb --analyze
still complains about the tuple being too large.

Dirk
Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
> ok, here is what I have found out on 6.5.3, Linux 2.2.10:
> [ make table with a bunch of almost-5K varchar fields ]
> # vacuumdb --analyze test
> ERROR:  Tuple is too big: size 9604
> vacuumdb: database vacuum failed on test.

Ohhh ... I know what's going on.  The oversize tuple is the one that
VACUUM is attempting to store in pg_statistic, containing the min and
max values for your varchar column.  In this example, both the min and
max are just shy of 5K characters, so the pg_statistic tuple is too
big to fit on a page.

I had already patched this in current sources, by the expedient of not
trying to store a pg_statistic tuple at all if it's too big.  (Then
you don't get stats for that particular column, but the stats probably
wouldn't be useful anyway.)

I suppose I should make up a back-patch for REL6_5 with this fix.

			regards, tom lane
I wrote:
> Ohhh ... I know what's going on.  The oversize tuple is the one that
> VACUUM is attempting to store in pg_statistic, containing the min and
> max values for your varchar column.  In this example, both the min and
> max are just shy of 5K characters, so the pg_statistic tuple is too
> big to fit on a page.

I have applied the attached patch to the REL6_5 branch.  (There is
already similar code in the current branch.)  I'm afraid it won't
help you recover from an already-corrupted pg_statistic table, though.

			regards, tom lane

*** src/backend/commands/vacuum.c.orig	Wed Aug 25 08:01:45 1999
--- src/backend/commands/vacuum.c	Tue Jan  4 12:15:17 2000
***************
*** 2405,2414 ****
  			stup = heap_formtuple(sd->rd_att, values, nulls);
  
  			/* ----------------
! 			 *	insert the tuple in the relation and get the tuple's oid.
  			 * ----------------
  			 */
! 			heap_insert(sd, stup);
  			pfree(DatumGetPointer(values[3]));
  			pfree(DatumGetPointer(values[4]));
  			pfree(stup);
--- 2405,2425 ----
  			stup = heap_formtuple(sd->rd_att, values, nulls);
  
  			/* ----------------
! 			 *	Watch out for oversize tuple, which can happen if
! 			 *	both of the saved data values are long.
! 			 *	Our fallback strategy is just to not store the
! 			 *	pg_statistic tuple at all in that case.  (We could
! 			 *	replace the values by NULLs and still store the
! 			 *	numeric stats, but presently selfuncs.c couldn't
! 			 *	do anything useful with that case anyway.)
  			 * ----------------
  			 */
! 			if (MAXALIGN(stup->t_len) <= MaxTupleSize)
! 			{
! 				/* OK to store tuple */
! 				heap_insert(sd, stup);
! 			}
  			pfree(DatumGetPointer(values[3]));
  			pfree(DatumGetPointer(values[4]));
  			pfree(stup);
> Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
> > ok, here is what I have found out on 6.5.3, Linux 2.2.10:
> > [ make table with a bunch of almost-5K varchar fields ]
> > # vacuumdb --analyze test
> > ERROR:  Tuple is too big: size 9604
> > vacuumdb: database vacuum failed on test.
> 
> Ohhh ... I know what's going on.  The oversize tuple is the one that
> VACUUM is attempting to store in pg_statistic, containing the min and
> max values for your varchar column.  In this example, both the min and
> max are just shy of 5K characters, so the pg_statistic tuple is too
> big to fit on a page.
> 
> I had already patched this in current sources, by the expedient of not
> trying to store a pg_statistic tuple at all if it's too big.  (Then
> you don't get stats for that particular column, but the stats probably
> wouldn't be useful anyway.)
> 
> I suppose I should make up a back-patch for REL6_5 with this fix.

Oh, good, we know the cause.  Seems we should wait for 7.0 for this.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Tom Lane writes:

 > I have applied the attached patch to the REL6_5 branch.  (There is
 > already similar code in the current branch.)  I'm afraid it won't
 > help you recover from an already-corrupted pg_statistic table, though.

How can I recover from this?  Simply delete the entries in pg_statistic?
Is this possible?

Dirk
Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
> Tom Lane writes:
>> I have applied the attached patch to the REL6_5 branch.  (There is
>> already similar code in the current branch.)  I'm afraid it won't
>> help you recover from an already-corrupted pg_statistic table, though.

> How can I recover from this?  Simply delete the entries in pg_statistic?
> Is this possible?

It's worth a try, anyway: just DELETE FROM pg_statistic and see what
happens... but, depending on how badly pg_statistic is messed up, that
might itself crash...

			regards, tom lane
> Tom Lane writes:
> 
> > I have applied the attached patch to the REL6_5 branch.  (There is
> > already similar code in the current branch.)  I'm afraid it won't
> > help you recover from an already-corrupted pg_statistic table, though.
> 
> How can I recover from this?  Simply delete the entries in pg_statistic?
> Is this possible?

The only fix I know is pg_upgrade.  You may have to enable the script.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026