BUG #16378: Invalid memory access on interrupting CLUSTER after CREATE TEMP TABLE - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #16378: Invalid memory access on interrupting CLUSTER after CREATE TEMP TABLE
Date
Msg-id 16378-7a70ca41b3ec2009@postgresql.org
Whole thread Raw
Responses Re: BUG #16378: Invalid memory access on interrupting CLUSTER after CREATE TEMP TABLE
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      16378
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 12.2
Operating system:   Ubuntu 18.04
Description:

The following script:
valgrind --leak-check=no --time-stamp=yes --read-var-info=yes  \
 --gen-suppressions=all  --suppressions=src/tools/valgrind.supp \
 --trace-children=yes \
pg_ctl -w -t 30 -D "$PGDB" -l postmaster.log start

echo "
CREATE TEMP TABLE temp_table(i int);

CREATE TABLE clstr(id serial primary key, a int);
INSERT INTO clstr(a) SELECT g.i % 42 FROM generate_series(1, 133000) g(i);
CREATE INDEX clstr_a ON clstr(a);
CLUSTER clstr USING clstr_a;
" | time psql postgres >>psql.log 2>&1 &

sleep 28

echo "
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE query LIKE
'CLUSTER%'
" | psql postgres

leads to a cassert-enabled server crash with the following messages in the
log (for the master branch):
2020-04-19 09:31:16.351 MSK [28410] STATEMENT:  INSERT INTO clstr(a) SELECT
g.i % 42 FROM generate_series(1, 133000) g(i);
2020-04-19 09:31:21.187 MSK [28410] FATAL:  terminating connection due to
administrator command
2020-04-19 09:31:21.187 MSK [28410] STATEMENT:  CLUSTER clstr USING
clstr_a;
==00:00:00:36.289 28410== Invalid read of size 4
==00:00:00:36.289 28410==    at 0x42AFED: list_member_oid (list.c:678)
==00:00:00:36.289 28410==    by 0x2B8FCD: ReindexIsProcessingIndex
(index.c:3852)
==00:00:00:36.289 28410==    by 0x237D42: systable_beginscan (genam.c:367)
==00:00:00:36.289 28410==    by 0x2AD82A: findDependentObjects
(dependency.c:560)
==00:00:00:36.289 28410==    by 0x2AEBB1: performDeletion
(dependency.c:340)
==00:00:00:36.289 28410==    by 0x2B9833: RemoveTempRelations
(namespace.c:4232)
==00:00:00:36.289 28410==    by 0x2B9C86: RemoveTempRelationsCallback
(namespace.c:4251)
==00:00:00:36.289 28410==    by 0x532D2B: shmem_exit (ipc.c:239)
==00:00:00:36.289 28410==    by 0x532E28: proc_exit_prepare (ipc.c:194)
==00:00:00:36.289 28410==    by 0x532EAD: proc_exit (ipc.c:107)
==00:00:00:36.289 28410==    by 0x6892D6: errfinish (elog.c:578)
==00:00:00:36.289 28410==    by 0x55BA87: ProcessInterrupts
(postgres.c:3076)
==00:00:00:36.289 28410==  Address 0x1232af10 is 0 bytes inside a block of
size 48 free'd
==00:00:00:36.289 28410==    at 0x6B66A9: pfree (mcxt.c:1061)
==00:00:00:36.289 28410==    by 0x42A4A6: list_free_private (list.c:1364)
==00:00:00:36.289 28410==    by 0x42B6EB: list_free (list.c:1378)
==00:00:00:36.289 28410==    by 0x67AD26: RelationGetIndexAttrBitmap
(relcache.c:5107)
==00:00:00:36.289 28410==    by 0x21D2E6: heap_update (heapam.c:2973)
==00:00:00:36.289 28410==    by 0x21EDED: simple_heap_update
(heapam.c:3901)
==00:00:00:36.289 28410==    by 0x2B97A0: CatalogTupleUpdate
(indexing.c:230)
==00:00:00:36.289 28410==    by 0x32F8C0: copy_table_data (cluster.c:927)
==00:00:00:36.289 28410==    by 0x330E67: rebuild_relation (cluster.c:580)
==00:00:00:36.289 28410==    by 0x330FB0: cluster_rel (cluster.c:403)
==00:00:00:36.289 28410==    by 0x331271: cluster (cluster.c:173)
==00:00:00:36.289 28410==    by 0x563105: standard_ProcessUtility
(utility.c:819)
==00:00:00:36.289 28410==  Block was alloc'd at
==00:00:00:36.289 28410==    at 0x6B6063: palloc (mcxt.c:974)
==00:00:00:36.289 28410==    by 0x42A43C: new_list (list.c:134)
==00:00:00:36.289 28410==    by 0x42BC71: list_copy (list.c:1410)
==00:00:00:36.289 28410==    by 0x67996C: RelationGetIndexList
(relcache.c:4521)
==00:00:00:36.289 28410==    by 0x67AC49: RelationGetIndexAttrBitmap
(relcache.c:5101)
==00:00:00:36.289 28410==    by 0x21D2E6: heap_update (heapam.c:2973)
==00:00:00:36.289 28410==    by 0x21EDED: simple_heap_update
(heapam.c:3901)
==00:00:00:36.289 28410==    by 0x2B97A0: CatalogTupleUpdate
(indexing.c:230)
==00:00:00:36.289 28410==    by 0x32F8C0: copy_table_data (cluster.c:927)
==00:00:00:36.289 28410==    by 0x330E67: rebuild_relation (cluster.c:580)
==00:00:00:36.289 28410==    by 0x330FB0: cluster_rel (cluster.c:403)
==00:00:00:36.289 28410==    by 0x331271: cluster (cluster.c:173)
==00:00:00:36.289 28410== 
{
   <insert_a_suppression_name_here>
   Memcheck:Addr4
   fun:list_member_oid
   fun:ReindexIsProcessingIndex
   fun:systable_beginscan
   fun:findDependentObjects
   fun:performDeletion
   fun:RemoveTempRelations
   fun:RemoveTempRelationsCallback
   fun:shmem_exit
   fun:proc_exit_prepare
   fun:proc_exit
   fun:errfinish
   fun:ProcessInterrupts
}
TRAP: FailedAssertion("IsOidList(list)", File: "list.c", Line: 678)
==00:00:00:36.291 28481== 
==00:00:00:36.291 28481== HEAP SUMMARY:
==00:00:00:36.291 28481==     in use at exit: 1,302,653 bytes in 440
blocks
==00:00:00:36.291 28481==   total heap usage: 9,446 allocs, 312 frees,
9,487,792 bytes allocated
==00:00:00:36.291 28481== 
==00:00:00:36.291 28481== For a detailed leak analysis, rerun with:
--leak-check=full
==00:00:00:36.291 28481== 
==00:00:00:36.291 28481== For counts of detected and suppressed errors,
rerun with: -v
==00:00:00:36.291 28481== ERROR SUMMARY: 0 errors from 0 contexts
(suppressed: 7 from 3)
postgres: law postgres [local]
CLUSTER(ExceptionalCondition+0x7c)[0x685026]
postgres: law postgres [local] CLUSTER(list_member_oid+0x46)[0x42b023]
postgres: law postgres [local]
CLUSTER(ReindexIsProcessingIndex+0x1a)[0x2b8fce]
postgres: law postgres [local] CLUSTER(systable_beginscan+0x37)[0x237d43]
postgres: law postgres [local] CLUSTER(+0x1a582b)[0x2ad82b]
postgres: law postgres [local] CLUSTER(performDeletion+0x74)[0x2aebb2]
postgres: law postgres [local] CLUSTER(+0x1b1834)[0x2b9834]
postgres: law postgres [local] CLUSTER(+0x1b1c87)[0x2b9c87]
postgres: law postgres [local] CLUSTER(shmem_exit+0x68)[0x532d2c]
postgres: law postgres [local] CLUSTER(+0x42ae29)[0x532e29]
postgres: law postgres [local] CLUSTER(proc_exit+0x8)[0x532eae]
postgres: law postgres [local] CLUSTER(errfinish+0x3d7)[0x6892d7]
postgres: law postgres [local] CLUSTER(ProcessInterrupts+0x35b)[0x55ba88]
postgres: law postgres [local] CLUSTER(+0x120394)[0x228394]
postgres: law postgres [local] CLUSTER(+0x147603)[0x24f603]
postgres: law postgres [local] CLUSTER(btbuild+0x7b)[0x251053]
postgres: law postgres [local] CLUSTER(index_build+0x14a)[0x2b71e6]
postgres: law postgres [local] CLUSTER(reindex_index+0x3c7)[0x2b7ad6]
postgres: law postgres [local] CLUSTER(reindex_relation+0x1b9)[0x2b91ac]
postgres: law postgres [local] CLUSTER(finish_heap_swap+0xe0)[0x330b84]
postgres: law postgres [local] CLUSTER(+0x228e95)[0x330e95]
postgres: law postgres [local] CLUSTER(cluster_rel+0xe5)[0x330fb1]
postgres: law postgres [local] CLUSTER(cluster+0x9d)[0x331272]
postgres: law postgres [local]
CLUSTER(standard_ProcessUtility+0x638)[0x563106]
postgres: law postgres [local] CLUSTER(ProcessUtility+0xcd)[0x5636ad]
postgres: law postgres [local] CLUSTER(+0x457bfb)[0x55fbfb]
postgres: law postgres [local] CLUSTER(+0x458862)[0x560862]
postgres: law postgres [local] CLUSTER(PortalRun+0x2b8)[0x56153d]
postgres: law postgres [local] CLUSTER(+0x455821)[0x55d821]
postgres: law postgres [local] CLUSTER(PostgresMain+0x8b8)[0x55f852]
postgres: law postgres [local] CLUSTER(+0x3c474d)[0x4cc74d]
postgres: law postgres [local] CLUSTER(+0x3c77e6)[0x4cf7e6]
postgres: law postgres [local] CLUSTER(+0x3c7aea)[0x4cfaea]
postgres: law postgres [local] CLUSTER(PostmasterMain+0x109b)[0x4d0e63]
postgres: law postgres [local] CLUSTER(main+0x20c)[0x4194d6]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x6285b97]
postgres: law postgres [local] CLUSTER(_start+0x2a)[0x1c0b6a]

The required sleep delay may vary (25-30 seconds delay works for me), but
I've managed to reproduce it reliably on REL_10_STABLE..master.


pgsql-bugs by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: BUG #16377: select stuck when use order by and limit 1 iforder-by-field has one index and result has no records
Next
From: David Kubecka
Date:
Subject: Re: Possibly misleading documentation of Template Patterns forDate/Time Formatting