On Fri, Nov 04, 2011 at 09:04:02PM -0400, Tom Lane wrote:
> that. And that they are the only rows that, in addition to the above
> conditions, contain data fields wide enough to require out-of-line
> toasting.
I checked the lengths of the text/varchar columns in the database.
There are 16 such columns in the table.
The full report of lengths is at
http://www.depesz.com/various/lengths.report.gz
It was obtained using:
select length( "first_text_column" ) as length_1, count(*) from etsy_v2.receipts group by 1 order by 1;
and so on for every text column; at the end I also made a summary of
the sum of lengths.
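
For anyone who wants to repeat this on their own table, here is a rough
sketch of how the per-column queries could be generated. The column names
are placeholders (hypothetical, not the real ones), and the coalesce() in
the summary query is an assumption so that a NULL column does not turn the
whole sum into NULL; the report above may have been built differently.

```python
def length_report_queries(table, columns):
    """Build one length-histogram query per column, plus a sum-of-lengths query."""
    queries = []
    for i, col in enumerate(columns, start=1):
        # Same shape as the query shown above, one per text column.
        queries.append(
            f'select length( "{col}" ) as length_{i}, count(*) '
            f"from {table} group by 1 order by 1;"
        )
    # Summary query: total length over all listed columns.
    # coalesce() is an assumption, so NULL columns count as length 0.
    total = " + ".join(f'coalesce( length( "{col}" ), 0 )' for col in columns)
    queries.append(
        f"select {total} as length_sum, count(*) "
        f"from {table} group by 1 order by 1;"
    )
    return queries


if __name__ == "__main__":
    # Placeholder column names, for illustration only.
    for q in length_report_queries(
        "etsy_v2.receipts", ["first_text_column", "second_text_column"]
    ):
        print(q)
```

The output can be fed to psql as a script; restricting it to the damaged
rows would need an extra WHERE clause that is not shown here.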
There is also:
http://www.depesz.com/various/lengths2.report.gz
which has the same summary, but only for the damaged rows.
As you can see, the lengths of the columns are not really special: they fall
somewhere in the middle of all the other rows. The summed length is also not
special in any way.
> These conditions together are enough to break the assumption in
> toast_insert_or_update that the old and new tuples must have the same
> value of t_hoff. But it can only happen when the source tuple is an
> original on-disk tuple, which explains why only INSERT ... SELECT *
> causes the problem, not any variants that require projection of a new
> column set. When it does happen, toast_insert_or_update correctly
> computes the required size of the new tuple ... but then it tells
> heap_fill_tuple to fill the data part at offset olddata->t_hoff, which
> is wrong (too small) and so the nulls bitmap that heap_fill_tuple
> concurrently constructs will overwrite the first few data bytes. In
> your example, the table contains 49 columns so the nulls bitmap requires
> 7 bytes, just enough to overwrite the first 6 data bytes as observed.
> (In fact, given the values we see being filled in, I can confidently say
> that you have two added-since-creation null columns, no more, no less.)
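
To check the arithmetic in the paragraph above, here is a small sketch of
the offsets involved. It rests on assumptions that are not stated in this
mail: a 23-byte fixed tuple header before the nulls bitmap, 8-byte
MAXALIGN, no OID column, and an on-disk source tuple that has no nulls
bitmap at all. The function names are made up for illustration and are not
PostgreSQL's.

```python
FIXED_HEADER = 23  # assumed bytes in the tuple header before the nulls bitmap
MAXALIGN = 8       # assumed alignment on a 64-bit build


def bitmap_len(natts):
    """Bytes needed for a nulls bitmap with one bit per attribute."""
    return (natts + 7) // 8


def maxalign(n):
    """Round n up to the next multiple of MAXALIGN."""
    return (n + MAXALIGN - 1) // MAXALIGN * MAXALIGN


def t_hoff(natts, has_nulls):
    """Offset at which user data starts, for a tuple with natts attributes."""
    size = FIXED_HEADER + (bitmap_len(natts) if has_nulls else 0)
    return maxalign(size)


def clobbered_data_bytes(natts, old_hoff):
    """Data bytes overwritten when the data is filled at old_hoff but a
    bitmap for natts attributes is written right after the fixed header."""
    bitmap_end = FIXED_HEADER + bitmap_len(natts)
    return max(0, bitmap_end - old_hoff)


old_hoff = t_hoff(47, has_nulls=False)  # source tuple, no bitmap: 24
new_hoff = t_hoff(49, has_nulls=True)   # new tuple with 7-byte bitmap: 32
print(bitmap_len(49), old_hoff, new_hoff, clobbered_data_bytes(49, old_hoff))
# prints: 7 24 32 6
```

Under those assumptions the bitmap occupies bytes 23..29 while the data is
filled starting at byte 24, which gives the 6 overwritten data bytes.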
>
> I can reproduce the problem with the attached test case (using the
> regression database). With asserts enabled, the
> Assert(new_len == olddata->t_hoff);
> fails. With asserts off, corrupt data.
How can I create the onek table for the test? Is it a standard table from
something?
> This is trivial to fix, now that we know there's a problem --- the
> function is only using that assumption to save itself a couple lines
> of code. Penny wise, pound foolish :-(
Any chance of getting the fix in patch format so we could test it on
this system?
Best regards,
depesz
--
The best thing about modern society is how easy it is to avoid contact with it.
http://depesz.com/