Thread: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns

BUG #16369: Segmentation Faults and Data Corruption with Generated Columns

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16369
Logged by:          Cameron Ezell
Email address:      cameron.ezell@clearcapital.com
PostgreSQL version: 12.2
Operating system:   CentOS 8, Red Hat 8, Mac OS X 10.14.6
Description:

It seems that there are a few bugs that are throwing segmentation faults and
causing data corruption. These issues keep appearing for our team when using
tables that contain generated columns introduced in PostgreSQL 12. This has
been tested on 12.2 & 12.1 on CentOS 8 as well as 12.1 on MacOS 10.14.6:

select version();
-- PostgreSQL 12.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3
20140911 (Red Hat 4.8.3-9), 64-bit

CREATE SCHEMA if not exists test;commit;

CREATE TABLE if not exists test.bug_report
(
    id bigint generated by default as identity,
    hostname varchar,
    hostname_short varchar GENERATED ALWAYS AS (split_part(hostname, '.',
1)) STORED,
    device text,
    mount text,
    used_space_bytes bigint,
    used_space_gb numeric GENERATED ALWAYS AS (ROUND(used_space_bytes /
1073741824.0,2)) STORED,
    avail_space_bytes bigint,
    avail_space_gb numeric GENERATED ALWAYS AS (ROUND(avail_space_bytes /
1073741824.0,2)) STORED,
    inserted_dts timestamp with time zone NOT NULL DEFAULT
clock_timestamp(),
    inserted_by text NOT NULL DEFAULT session_user
);commit;

-- No problems on the following insert
INSERT INTO test.bug_report(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('123456789', 'devtmpfs', '/dev', 0,
6047076131313);
select * from test.bug_report;commit;

-- On CentOS 8, this bug is triggered with a hostname with 10+ characters.
On MacOS 10.14.6, 19+ characters.
INSERT INTO test.bug_report(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('12345678901234567890', 'devtmpfs', '/dev', 0,
6047076131313);commit;
-- This should immediately crash the postgres service

-- Inserting some strings below that character threshold will insert just
fine, but a select statement on the table will now throw an error
INSERT INTO test.bug_report(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('abc', 'devtmpfs', '/dev', 0,
6047076131313);commit;
select * from test.bug_report;commit;
--ERROR:  XX000: invalid memory alloc request size 18446744073709551613
--LOCATION:  palloc, mcxt.c:934


-- Once the "hostname_short" column no longer references any other column, I
am unable to reproduce this error
CREATE TABLE if not exists test.bug_report2
(
    id bigint generated by default as identity,
    hostname varchar,
    hostname_short varchar GENERATED ALWAYS AS ('static_string') STORED,
    device text,
    mount text,
    used_space_bytes bigint,
    used_space_gb numeric GENERATED ALWAYS AS (ROUND(used_space_bytes /
1073741824.0,2)) STORED,
    avail_space_bytes bigint,
    avail_space_gb numeric GENERATED ALWAYS AS (ROUND(avail_space_bytes /
1073741824.0,2)) STORED,
    inserted_dts timestamp with time zone NOT NULL DEFAULT
clock_timestamp(),
    inserted_by text NOT NULL DEFAULT session_user
);commit;

INSERT INTO test.bug_report2(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('12345678901234567890', 'devtmpfs', '/dev', 0,
6047076131313);commit;
select * from test.bug_report2;commit;

INSERT INTO test.bug_report2(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('abc', 'devtmpfs', '/dev', 0,
6047076131313);commit;
select * from test.bug_report2;commit;

-- simply referencing another column in the generated column will cause a
crash
CREATE TABLE if not exists test.bug_report3
(
    id bigint generated by default as identity,
    hostname varchar,
    hostname_short varchar GENERATED ALWAYS AS (hostname) STORED,
    device text,
    mount text,
    used_space_bytes bigint,
    used_space_gb numeric GENERATED ALWAYS AS (ROUND(used_space_bytes /
1073741824.0,2)) STORED,
    avail_space_bytes bigint,
    avail_space_gb numeric GENERATED ALWAYS AS (ROUND(avail_space_bytes /
1073741824.0,2)) STORED,
    inserted_dts timestamp with time zone NOT NULL DEFAULT
clock_timestamp(),
    inserted_by text NOT NULL DEFAULT session_user
);commit;

-- immediate crash
INSERT INTO test.bug_report3(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('12345678901234567890', 'devtmpfs', '/dev', 0,
6047076131313);commit;

-- no crash on insert
INSERT INTO test.bug_report3(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('abc', 'devtmpfs', '/dev', 0,
6047076131313);commit;
-- error thrown on select
select * from test.bug_report3;commit;
--ERROR:  XX000: invalid memory alloc request size 18446744073709551613
--LOCATION:  palloc, mcxt.c:934


On Thu, 16 Apr 2020 at 08:53, PG Bug reporting form
<noreply@postgresql.org> wrote:
> -- On CentOS 8, this bug is triggered with a hostname with 10+ characters.
> On MacOS 10.14.6, 19+ characters.
> INSERT INTO test.bug_report(hostname, device, mount, used_space_bytes,
> avail_space_bytes) VALUES ('12345678901234567890', 'devtmpfs', '/dev', 0,
> 6047076131313);commit;
> -- This should immediately crash the postgres service
 --LOCATION:  palloc, mcxt.c:934

Thanks for the report. This is certainly a bug.

I slightly simplified the test case to become:

drop table if exists crash;

CREATE TABLE crash (
    id bigint generated by default as identity,
    hostname varchar,
    hostname_short varchar GENERATED ALWAYS AS (hostname) STORED,
    device text,
    mount text,
    used_space_bytes bigint,
    used_space_gb bigint GENERATED ALWAYS AS (used_space_bytes) STORED,
    avail_space_bytes bigint,
    avail_space_gb bigint GENERATED ALWAYS AS (avail_space_bytes) STORED,
    inserted_dts timestamp with time zone NOT NULL DEFAULT clock_timestamp(),
    inserted_by text NOT NULL DEFAULT session_user
);


INSERT INTO crash (hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('12345678901234567', 'devtmpfs', '/dev', 0,
6047076131313); -- no crash
INSERT INTO crash (hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('123456789012345678', 'devtmpfs', '/dev',
0, 6047076131313); -- crash

The crash occurs during ExecComputeStoredGenerated() when calculating
the hostname_short column.  When we call ExecEvalExpr() for that
column, the eval function just returns the value of the Datum in the
"hostname" column, which is stored in "slot". This happens to be a
pointer into the tuple. This results in a crash because, later in that
function, we do ExecClearTuple(slot), which frees the memory for that
tuple.

I suppose it normally works because generally the calculation will be
using some SQL function which will return some calculated value which
is allocated, but in this case, we just return the pointer to the
memory in the tuple slot.

I'd say the fix should be to just datumCopy() the result of the
calculation. i.e the attached.

David

Attachment
David Rowley <dgrowleyml@gmail.com> writes:
> I'd say the fix should be to just datumCopy() the result of the
> calculation. i.e the attached.

That looks like it'll fail on a null result.

            regards, tom lane



On Thu, 16 Apr 2020 at 11:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > I'd say the fix should be to just datumCopy() the result of the
> > calculation. i.e the attached.
>
> That looks like it'll fail on a null result.

Thanks.  I'll have another go then.

Attachment

Re: BUG #16369: Segmentation Faults and Data Corruption withGenerated Columns

From
Michael Paquier
Date:
On Thu, Apr 16, 2020 at 12:40:24PM +1200, David Rowley wrote:
> On Thu, 16 Apr 2020 at 11:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That looks like it'll fail on a null result.
>
> Thanks.  I'll have another go then.

Shouldn't you have more regression test cases here?  I guess one for
the non-NULL case, and one for the NULL case?
--
Michael

Attachment

Re: BUG #16369: Segmentation Faults and Data Corruption withGenerated Columns

From
Peter Eisentraut
Date:
On 2020-04-16 07:30, Michael Paquier wrote:
> On Thu, Apr 16, 2020 at 12:40:24PM +1200, David Rowley wrote:
>> On Thu, 16 Apr 2020 at 11:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> That looks like it'll fail on a null result.
>>
>> Thanks.  I'll have another go then.
> 
> Shouldn't you have more regression test cases here?  I guess one for
> the non-NULL case, and one for the NULL case?

I've tried creating a smaller test case, but it's difficult.  I haven't 
fully understood the reason why it sometimes crashes and sometimes not. 
The smallest I could come up with so far is something like this:

CREATE TABLE crash (
     hostname varchar,
     hostname_short varchar GENERATED ALWAYS AS (hostname) STORED,
     device text,
     mount text,
     used_space_bytes bigint,
     avail_space_bytes bigint
);
INSERT INTO crash (hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES (repeat('1234567890', 100), 'devtmpfs', 
'/dev', 0, 6047076131313);

Would this be easier to reproduce with one of the memory-cleaning 
#defines enabled?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Fri, 17 Apr 2020 at 21:04, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
>
> I've tried creating a smaller test case, but it's difficult.  I haven't
> fully understood the reason why it sometimes crashes and sometimes not.
>
> The smallest I could come up with so far is something like this:
>
> CREATE TABLE crash (
>      hostname varchar,
>      hostname_short varchar GENERATED ALWAYS AS (hostname) STORED,
>      device text,
>      mount text,
>      used_space_bytes bigint,
>      avail_space_bytes bigint
> );
> INSERT INTO crash (hostname, device, mount, used_space_bytes,
> avail_space_bytes) VALUES (repeat('1234567890', 100), 'devtmpfs',
> '/dev', 0, 6047076131313);
>
> Would this be easier to reproduce with one of the memory-cleaning
> #defines enabled?

I've done a bit more work on this and have something close. The case I
came up with for it was:

+CREATE TABLE gtest_varlena (a varchar, b varchar GENERATED ALWAYS AS
(a) STORED);
+INSERT INTO gtest_varlena (a) VALUES('01234567890123456789');
+INSERT INTO gtest_varlena (a) VALUES(NULL);
+SELECT * FROM gtest_varlena ORDER BY a;
+          a           |          b
+----------------------+----------------------
+ 01234567890123456789 | 01234567890123456789
+                      |
+(2 rows)
+
+DROP TABLE gtest_varlena;

With that, on an unpatched server, there's no crash, but there's
clearly junk in the b column.



Re: BUG #16369: Segmentation Faults and Data Corruption withGenerated Columns

From
Terry Schmitt
Date:


I've tried creating a smaller test case, but it's difficult.  I haven't
fully understood the reason why it sometimes crashes and sometimes not.
The smallest I could come up with so far is something like this:


For context, I work with Cameron, who submitted this bug.
I totally agree with Peter on this. We tried to simplify the test case on our end, but the symptoms would change in odd ways. Changing the column order would change the symptoms slightly and at times the columns with DEFAULT values would be corrupt. It also seemed like string text based columns would behave differently than integers.
On Thu, 16 Apr 2020 at 17:30, Michael Paquier <michael@paquier.xyz> wrote:
> Shouldn't you have more regression test cases here?  I guess one for
> the non-NULL case, and one for the NULL case?

Yeah. I've just pushed a fix that included a fairly simple regression test.

David



Re: BUG #16369: Segmentation Faults and Data Corruption withGenerated Columns

From
Michael Paquier
Date:
On Sat, Apr 18, 2020 at 02:13:17PM +1200, David Rowley wrote:
> On Thu, 16 Apr 2020 at 17:30, Michael Paquier <michael@paquier.xyz> wrote:
> > Shouldn't you have more regression test cases here?  I guess one for
> > the non-NULL case, and one for the NULL case?
>
> Yeah. I've just pushed a fix that included a fairly simple regression test.

Thanks for the commit, David!  I was wondering if it would have been
better to add a "\pset null", but that's a nit and the result looks
fine anyway.
--
Michael

Attachment