Thread: BUG #15684: Server crash on DROP partitioned table

BUG #15684: Server crash on DROP partitioned table

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15684
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 11.2
Operating system:   Ubuntu 18.04
Description:

The following query:
create table at_partitioned (a int, b text) partition by range (a);
create table at_part_1 partition of at_partitioned for values from (0) to (1000);
create table at_part_2 partition of at_partitioned for values from (1000) to (2000);
create index on at_partitioned (b);
alter table at_partitioned alter column b type numeric using b::numeric;
alter table at_partitioned alter column b type numeric using b::numeric;
drop table at_partitioned cascade;

crashes the server (on REL_11_2 and REL_11_STABLE) with the following error messages:
psql:query.sql:7: WARNING:  AbortTransaction while in COMMIT state
psql:query.sql:7: ERROR:  SMgrRelation hashtable corrupted
PANIC:  cannot abort transaction 575, it was already committed
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
psql:query.sql:7: connection to server was lost

and the following stack trace:
Core was generated by `postgres: law regression [local] DROP TABLE'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f715f347801 in __GI_abort () at abort.c:79
#2  0x00005624ec109351 in errfinish (dummy=dummy@entry=0) at elog.c:555
#3  0x00005624ec10b1c6 in elog_finish (elevel=elevel@entry=22, fmt=fmt@entry=0x5624ec195190 "cannot abort transaction %u, it was already committed") at elog.c:1376
#4  0x00005624ebd63826 in RecordTransactionAbort (isSubXact=isSubXact@entry=false) at xact.c:1580
#5  0x00005624ebd63942 in AbortTransaction () at xact.c:2602
#6  0x00005624ebd64385 in AbortCurrentTransaction () at xact.c:3144
#7  0x00005624ebfeba10 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x5624ece2d748, dbname=<optimized out>, username=0x5624ecdfeab8 "law") at postgres.c:3968
#8  0x00005624ebcc133d in BackendRun (port=0x5624ece25e40) at postmaster.c:4361
#9  BackendStartup (port=0x5624ece25e40) at postmaster.c:4033
#10 ServerLoop () at postmaster.c:1706
#11 0x00005624ebf6a668 in PostmasterMain (argc=3, argv=0x5624ecdfc9f0) at postmaster.c:1379
#12 0x00005624ebcc2f69 in main (argc=3, argv=0x5624ecdfc9f0) at main.c:228


Re: BUG #15684: Server crash on DROP partitioned table

From
Julien Rouhaud
Date:
Hi,

On Sun, Mar 10, 2019 at 7:55 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following query:
> create table at_partitioned (a int, b text) partition by range (a);
> create table at_part_1 partition of at_partitioned for values from (0) to (1000);
> create table at_part_2 partition of at_partitioned for values from (1000) to (2000);
> create index on at_partitioned (b);
> alter table at_partitioned alter column b type numeric using b::numeric;
> alter table at_partitioned alter column b type numeric using b::numeric;
> drop table at_partitioned cascade;
>
> crashes the server (on REL_11_2 and REL_11_STABLE) with the following error messages:
> psql:query.sql:7: WARNING:  AbortTransaction while in COMMIT state
> psql:query.sql:7: ERROR:  SMgrRelation hashtable corrupted
> PANIC:  cannot abort transaction 575, it was already committed
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> psql:query.sql:7: connection to server was lost
>
> and the following stack trace:
> [...]

It seems to be the same bug as described in bug #15672
(https://www.postgresql.org/message-id/15672-b9fa7db32698269f@postgresql.org).
I'm Cc-ing Amit just in case.


Re: BUG #15684: Server crash on DROP partitioned table

From
Alexander Lakhin
Date:
Hello Julien,
10.03.2019 22:04, Julien Rouhaud wrote:
> Hi,
>
> On Sun, Mar 10, 2019 at 7:55 PM PG Bug reporting form
> <noreply@postgresql.org> wrote:
>> The following query:
>> create table at_partitioned (a int, b text) partition by range (a);
>> create table at_part_1 partition of at_partitioned for values from (0) to (1000);
>> create table at_part_2 partition of at_partitioned for values from (1000) to (2000);
>> create index on at_partitioned (b);
>> alter table at_partitioned alter column b type numeric using b::numeric;
>> alter table at_partitioned alter column b type numeric using b::numeric;
>> drop table at_partitioned cascade;
>>
> It seems to be the same bug as described in bug #15672
> (https://www.postgresql.org/message-id/15672-b9fa7db32698269f@postgresql.org).
> I'm Cc-ing Amit just in case.
Yes, after applying the patch posted in that thread, the crash no longer
reproduces.
Thanks for the tip!

Best regards,
Alexander


Re: BUG #15684: Server crash on DROP partitioned table

From
Amit Langote
Date:
On 2019/03/11 4:15, Alexander Lakhin wrote:
> 10.03.2019 22:04, Julien Rouhaud wrote:
>> Hi,
>>
>> On Sun, Mar 10, 2019 at 7:55 PM PG Bug reporting form
>> <noreply@postgresql.org> wrote:
>>> The following query:
>>> create table at_partitioned (a int, b text) partition by range (a);
>>> create table at_part_1 partition of at_partitioned for values from (0) to (1000);
>>> create table at_part_2 partition of at_partitioned for values from (1000) to (2000);
>>> create index on at_partitioned (b);
>>> alter table at_partitioned alter column b type numeric using b::numeric;
>>> alter table at_partitioned alter column b type numeric using b::numeric;
>>> drop table at_partitioned cascade;
>>>
>> It seems to be the same bug as described in bug #15672
>> (https://www.postgresql.org/message-id/15672-b9fa7db32698269f@postgresql.org).
>> I'm Cc-ing Amit just in case.

Thanks Julien.

To summarize, the problem is that partition/child indexes, when recreated
due to ALTER COLUMN, all get the same relfilenode, which amounts to a
corrupted catalog state.
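
One way to look for this state is to check pg_class for relfilenode values
that appear more than once. The query below is only a diagnostic sketch and
is not taken from the patch. It assumes it is run in the affected database
after the ALTER COLUMN and before the DROP, and the column aliases
(n_relations, relations) are made up for illustration. Entries with
relfilenode = 0 are skipped because that value does not name a file directly
(it is used, for example, for mapped catalogs).

```sql
-- Diagnostic sketch: list relfilenode values shared by more than one
-- relation in the current database. Grouping includes reltablespace because
-- a relfilenode only has to be unique within a tablespace and database.
SELECT reltablespace,
       relfilenode,
       count(*) AS n_relations,                              -- hypothetical alias
       array_agg(oid::regclass ORDER BY oid) AS relations    -- hypothetical alias
FROM pg_class
WHERE relfilenode <> 0          -- 0 does not identify an on-disk file here
GROUP BY reltablespace, relfilenode
HAVING count(*) > 1
ORDER BY reltablespace, relfilenode;
```

On an unaffected database this would normally be expected to return no rows.
With the problem described above, the recreated partition indexes should
show up together under a single relfilenode.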

> Yes, after applying the patch posted in that thread, the crash no longer
> reproduces.
> Thanks for the tip!

Unfortunately, the patch I posted there is still very sketchy.  I will try
to revise it this week and will be waiting for comments.

Thanks,
Amit