Re: [GENERAL] Strange problem with create table as select * from table; - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [GENERAL] Strange problem with create table as select * from table;
Date
Msg-id 20657.1320455042@sss.pgh.pa.us
Responses Re: [GENERAL] Strange problem with create table as select * from table;  (Martijn van Oosterhout <kleptog@svana.org>)
Re: [GENERAL] Strange problem with create table as select * from table;  (hubert depesz lubaczewski <depesz@depesz.com>)
List pgsql-hackers
I wrote:
> A different line of thought is that there's something about these
> specific source rows, and only these rows, that makes them vulnerable to
> corruption during INSERT/SELECT.  Do they by any chance contain any
> values that are unusual elsewhere in your table?  One thing I'm
> wondering about right now is the nulls bitmap --- so do these rows have
> nulls (or not-nulls) in any place that's unusual elsewhere?

Hah ... I have a theory.

I will bet that you recently added some column(s) to the source table
using ALTER TABLE ADD COLUMN and no default value, so that the added
columns were nulls and no table rewrite happened.  And that these
troublesome rows predate that addition, but contained no nulls before
that.  And that they are the only rows that, in addition to the above
conditions, contain data fields wide enough to require out-of-line
toasting.

These conditions together are enough to break the assumption in
toast_insert_or_update that the old and new tuples must have the same
value of t_hoff.  But it can only happen when the source tuple is an
original on-disk tuple, which explains why only INSERT ... SELECT *
causes the problem, not any variants that require projection of a new
column set.  When it does happen, toast_insert_or_update correctly
computes the required size of the new tuple ... but then it tells
heap_fill_tuple to fill the data part at offset olddata->t_hoff, which
is wrong (too small) and so the nulls bitmap that heap_fill_tuple
concurrently constructs will overwrite the first few data bytes.  In
your example, the table contains 49 columns so the nulls bitmap requires
7 bytes, just enough to overwrite the first 6 data bytes as observed.
(In fact, given the values we see being filled in, I can confidently say
that you have two added-since-creation null columns, no more, no less.)
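
The arithmetic behind those numbers can be sketched in a few lines of C.
This is only an illustration, not the backend code: it assumes a 23-byte
fixed tuple header, 8-byte maximum alignment and no OID column, and the
helper names (bitmap_len, max_align, header_len, clobbered_bytes) are
made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for a typical 64-bit build; not taken from the message. */
#define FIXED_HEADER_BYTES 23   /* bytes of tuple header before the nulls bitmap */
#define MAX_ALIGNMENT 8         /* data must start on a multiple of this */

/* Bytes needed for a nulls bitmap with one bit per column. */
static size_t bitmap_len(int natts)
{
    assert(natts >= 0);
    return ((size_t) natts + 7) / 8;
}

/* Round len up to the next multiple of MAX_ALIGNMENT. */
static size_t max_align(size_t len)
{
    return (len + MAX_ALIGNMENT - 1) & ~((size_t) (MAX_ALIGNMENT - 1));
}

/* Header length (the quantity stored in t_hoff) with or without a bitmap. */
static size_t header_len(int natts, int has_nulls)
{
    size_t len = FIXED_HEADER_BYTES;

    if (has_nulls)
        len += bitmap_len(natts);
    return max_align(len);
}

/* Data bytes overwritten if the bitmap is written but data starts at data_start. */
static size_t clobbered_bytes(int natts, size_t data_start)
{
    size_t bitmap_end = FIXED_HEADER_BYTES + bitmap_len(natts);

    return bitmap_end > data_start ? bitmap_end - data_start : 0;
}
```

Under those assumptions, header_len(49, 0) is 24 and header_len(49, 1) is
32, so a data area placed at the old offset 24 loses clobbered_bytes(49, 24)
= 6 bytes to the 7-byte bitmap, while one placed at 32 loses none.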

I can reproduce the problem with the attached test case (using the
regression database).  With asserts enabled, the
        Assert(new_len == olddata->t_hoff);
fails.  With asserts off, the copied rows contain corrupt data.

This is trivial to fix, now that we know there's a problem --- the
function is only using that assumption to save itself a couple lines
of code.  Penny wise, pound foolish :-(
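
To make the effect of the wrong offset concrete, here is a toy model on a
plain byte buffer.  It is a simplified sketch, not the real
heap_fill_tuple: fill_tuple and damaged_bytes are hypothetical names, the
23-byte fixed header and the offsets 24 and 32 are assumptions for a
typical 64-bit build, and the real function builds the bitmap while it
copies the data rather than afterwards (the end state for these bytes is
the same).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    FIXED_HEADER_BYTES = 23,    /* assumed header size before the bitmap */
    NATTS = 49,                 /* columns in the example table */
    NULL_ATTS = 2,              /* trailing columns that are null */
    OLD_HOFF = 24,              /* header length without a nulls bitmap */
    NEW_HOFF = 32,              /* header length with a 7-byte bitmap */
    DATA_BYTES = 8              /* first two int4 columns */
};

/* Toy fill step: copy data at data_off, then build the nulls bitmap. */
static void fill_tuple(unsigned char *tuple, size_t data_off,
                       const unsigned char *data)
{
    unsigned char *bits = tuple + FIXED_HEADER_BYTES;
    int att;

    memcpy(tuple + data_off, data, DATA_BYTES);

    /* One bit per column, set when the column is NOT null. */
    memset(bits, 0, (NATTS + 7) / 8);
    for (att = 0; att < NATTS - NULL_ATTS; att++)
        bits[att / 8] |= (unsigned char) (1u << (att % 8));
}

/* Count how many of the copied data bytes were changed by the bitmap. */
static size_t damaged_bytes(size_t data_off)
{
    unsigned char tuple[64];
    const unsigned char data[DATA_BYTES] = {1, 2, 3, 4, 5, 6, 7, 8};
    size_t i, damaged = 0;

    assert(data_off + DATA_BYTES <= sizeof tuple);
    memset(tuple, 0, sizeof tuple);
    fill_tuple(tuple, data_off, data);

    for (i = 0; i < DATA_BYTES; i++)
        if (tuple[data_off + i] != data[i])
            damaged++;
    return damaged;
}
```

In this model damaged_bytes(OLD_HOFF) is 6 and damaged_bytes(NEW_HOFF) is
0, which is the whole point of the fix: the data offset has to come from
the new tuple's header length, not the old one.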

            regards, tom lane


drop table if exists wide;

create table wide as
select
ten as firstc,
unique1 as unique1_1,
unique2 as unique2_1,
two as two_1,
four as four_1,
ten as ten_1,
twenty as twenty_1,
hundred as hundred_1,
thousand as thousand_1,
twothousand as twothousand_1,
fivethous as fivethous_1,
tenthous as tenthous_1,
odd as odd_1,
even as even_1,
stringu1 as stringu1_1,
stringu2 as stringu2_1,
string4 as string4_1,
unique1 as unique1_2,
unique2 as unique2_2,
two as two_2,
four as four_2,
ten as ten_2,
twenty as twenty_2,
hundred as hundred_2,
thousand as thousand_2,
twothousand as twothousand_2,
fivethous as fivethous_2,
tenthous as tenthous_2,
odd as odd_2,
even as even_2,
stringu1 as stringu1_2,
stringu2 as stringu2_2,
string4 as string4_2,
unique1 as unique1_3,
unique2 as unique2_3,
two as two_3,
four as four_3,
ten as ten_3,
twenty as twenty_3,
hundred as hundred_3,
thousand as thousand_3,
twothousand as twothousand_3,
fivethous as fivethous_3,
tenthous as tenthous_3,
odd as odd_3,
even as even_3,
repeat('xyzzyxydlkadlkndvlelfzzy', 20000) as widec
from onek limit 10;

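-- Add two columns with no default: no table rewrite happens, so the
-- existing on-disk tuples keep headers without a nulls bitmap.
alter table wide add column nullc1 int;
alter table wide add column nullc2 int;

drop table if exists widec;

-- Copy the original on-disk tuples without projecting a new column set.
create table widec as select * from wide;

-- Corruption, if present, shows up in the first few data bytes of each row.
select firstc, to_hex(unique1_1), unique2_1, to_hex(unique1_2) from widec;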
