Thread: BUG #18903: TRAP: failed Assert("false") in file: "tuplesortvariants.c"

BUG #18903: TRAP: failed Assert("false") in file: "tuplesortvariants.c"

From: PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18903
Logged by:          Nikita Kalinin
Email address:      n.kalinin@postgrespro.ru
PostgreSQL version: 17.4
Operating system:   ubuntu 22.04
Description:

Hello. I might be doing something strange, but I’d like to understand why
this all ends with a SIGABRT.
After building PostgreSQL like this:
./configure --enable-tap-tests --enable-debug --with-openssl --enable-cassert --prefix=/tmp/pg && make -j8
And running these two files:
1.sql:
VACUUM FULL pg_am;
VACUUM FULL pg_amop;
BEGIN;
CLUSTER pg_class USING pg_class_oid_index;
VACUUM FULL pg_proc;
2.sql:
CLUSTER pg_class USING pg_class_oid_index;
ROLLBACK;
BEGIN;
CLUSTER pg_class USING pg_class_oid_index;
CLUSTER pg_class USING pg_class_oid_index;
VACUUM FULL pg_class;
VACUUM FULL pg_class;
VACUUM FULL pg_class;
VACUUM FULL pg_class;
VACUUM FULL pg_class;
VACUUM FULL pg_class;
VACUUM FULL pg_class;
CLUSTER pg_class USING pg_class_oid_index;
COMMIT;
VACUUM FULL pg_class;
REINDEX TABLE pg_class;
REINDEX INDEX pg_class_oid_index;
REINDEX INDEX pg_class_tblspc_relfilenode_index;
Like this:
for i in $(seq 5); do
( psql -f ~/1.sql &> 1.log ) &
( psql -f ~/2.sql &> 2.log ) &
wait
done
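A parameterized sketch of the loop above, in case the iteration count needs
adjusting. The function names run_concurrently and has_trap, the PSQL
variable, and the log path passed to has_trap are assumptions made for
illustration and are not part of the original report; has_trap only greps a
server log for the TRAP line.

```shell
# Sketch only: run two SQL scripts concurrently N times, then check a
# server log for an assertion failure. Names and paths are hypothetical.
PSQL=${PSQL:-psql}

run_concurrently() {
    # $1 = number of iterations, $2 = first SQL file, $3 = second SQL file
    for _i in $(seq "$1"); do
        ( "$PSQL" -f "$2" > 1.log 2>&1 ) &
        ( "$PSQL" -f "$3" > 2.log 2>&1 ) &
        wait    # let both sessions finish before the next iteration
    done
}

has_trap() {
    # $1 = server log file; succeeds if an assertion failure was logged
    grep -q 'TRAP: failed Assert' "$1"
}
```

For example, run_concurrently 5 ~/1.sql ~/2.sql followed by has_trap on the
server log (its location depends on the installation) would mirror the run
described here.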
It might be necessary to adjust the number of runs, but in my case, it
consistently reproduces with 5 iterations.
I get the following:
2025-04-25 05:05:23.320 UTC [125476] CONTEXT:  while checking uniqueness of tuple (10,37) in relation "pg_class"
2025-04-25 05:05:23.320 UTC [125476] STATEMENT:  REINDEX INDEX pg_class_oid_index;
TRAP: failed Assert("false"), File: "tuplesortvariants.c", Line: 1701, PID: 125476
postgres: test postgres [local] REINDEX(ExceptionalCondition+0x71)[0x55ea5afb63c1]
postgres: test postgres [local] REINDEX(+0x6f60c6)[0x55ea5aff70c6]
postgres: test postgres [local] REINDEX(+0x6eeee9)[0x55ea5afefee9]
postgres: test postgres [local] REINDEX(+0x6eee53)[0x55ea5afefe53]
postgres: test postgres [local] REINDEX(+0x6eee53)[0x55ea5afefe53]
postgres: test postgres [local] REINDEX(+0x6eee78)[0x55ea5afefe78]
postgres: test postgres [local] REINDEX(+0x6eee78)[0x55ea5afefe78]
postgres: test postgres [local] REINDEX(tuplesort_performsort+0x478)[0x55ea5aff4248]
postgres: test postgres [local] REINDEX(btbuild+0xadf)[0x55ea5ab0e53f]
postgres: test postgres [local] REINDEX(index_build+0x161)[0x55ea5ab79731]
postgres: test postgres [local] REINDEX(reindex_index+0x24f)[0x55ea5ab7c11f]
postgres: test postgres [local] REINDEX(ExecReindex+0x73f)[0x55ea5ac14ccf]
postgres: test postgres [local] REINDEX(+0x569735)[0x55ea5ae6a735]
postgres: test postgres [local] REINDEX(standard_ProcessUtility+0x341)[0x55ea5ae69351]
postgres: test postgres [local] REINDEX(+0x566af1)[0x55ea5ae67af1]
postgres: test postgres [local] REINDEX(+0x566c2d)[0x55ea5ae67c2d]
postgres: test postgres [local] REINDEX(PortalRun+0x27b)[0x55ea5ae682ab]
postgres: test postgres [local] REINDEX(+0x562bb3)[0x55ea5ae63bb3]
postgres: test postgres [local] REINDEX(PostgresMain+0x1934)[0x55ea5ae65954]
postgres: test postgres [local] REINDEX(BackendMain+0x53)[0x55ea5ae5fa83]
postgres: test postgres [local] REINDEX(postmaster_child_launch+0x102)[0x55ea5adaf122]
postgres: test postgres [local] REINDEX(+0x4b1eb7)[0x55ea5adb2eb7]
postgres: test postgres [local] REINDEX(PostmasterMain+0xd1a)[0x55ea5adb491a]
postgres: test postgres [local] REINDEX(main+0x1d0)[0x55ea5aa842e0]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7efcc98cfd90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7efcc98cfe40]
postgres: test postgres [local] REINDEX(_start+0x25)[0x55ea5aa84665]
coredump:
#0  0x00007efcc993c9fc in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007efcc98e8476 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007efcc98ce7f3 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x000055ea5afb63e0 in ExceptionalCondition (conditionName=conditionName@entry=0x55ea5b142462 "false", fileName=fileName@entry=0x55ea5b202067 "tuplesortvariants.c", lineNumber=lineNumber@entry=1701) at assert.c:66
#4  0x000055ea5aff70c6 in comparetup_index_btree_tiebreak (a=<optimized out>, b=<optimized out>, state=<optimized out>) at tuplesortvariants.c:1701
#5  0x000055ea5afefee9 in qsort_tuple (data=data@entry=0x55ea79a33188, n=n@entry=2, compare=compare@entry=0x55ea5aff72f0 <comparetup_index_btree>, arg=arg@entry=0x55ea79a21700) at ../../../../src/include/lib/sort_template.h:316
#6  0x000055ea5afefe53 in qsort_tuple (data=data@entry=0x55ea79a32e58, n=<optimized out>, n@entry=41, compare=compare@entry=0x55ea5aff72f0 <comparetup_index_btree>, arg=arg@entry=0x55ea79a21700) at ../../../../src/include/lib/sort_template.h:391
#7  0x000055ea5afefe53 in qsort_tuple (data=data@entry=0x55ea79a32e58, n=<optimized out>, compare=compare@entry=0x55ea5aff72f0 <comparetup_index_btree>, arg=arg@entry=0x55ea79a21700) at ../../../../src/include/lib/sort_template.h:391
#8  0x000055ea5afefe78 in qsort_tuple (data=data@entry=0x55ea79a324e0, n=<optimized out>, compare=compare@entry=0x55ea5aff72f0 <comparetup_index_btree>, arg=arg@entry=0x55ea79a21700) at ../../../../src/include/lib/sort_template.h:405
#9  0x000055ea5afefe78 in qsort_tuple (data=<optimized out>, n=<optimized out>, compare=0x55ea5aff72f0 <comparetup_index_btree>, arg=arg@entry=0x55ea79a21700) at ../../../../src/include/lib/sort_template.h:405
#10 0x000055ea5aff2f3b in tuplesort_sort_memtuples (state=state@entry=0x55ea79a21700) at tuplesort.c:2721
#11 0x000055ea5aff4248 in tuplesort_performsort (state=0x55ea79a21700) at tuplesort.c:1382
#12 0x000055ea5ab0e53f in _bt_leafbuild (btspool2=<optimized out>, btspool=0x55ea79951188) at nbtsort.c:553
#13 btbuild (heap=<optimized out>, index=<optimized out>, indexInfo=0x55ea799ef4d8) at nbtsort.c:330
#14 0x000055ea5ab79731 in index_build (heapRelation=0x7efcbe16dba8, indexRelation=0x7efcbdfd3a08, indexInfo=0x55ea799ef4d8, isreindex=<optimized out>, parallel=<optimized out>) at index.c:3078
#15 0x000055ea5ab7c11f in reindex_index (stmt=stmt@entry=0x55ea79929a38, indexId=indexId@entry=3455, skip_constraint_checks=skip_constraint_checks@entry=false, persistence=persistence@entry=112 'p', params=params@entry=0x7ffe689a890c) at index.c:3814
#16 0x000055ea5ac14ccf in ReindexIndex (isTopLevel=true, params=0x7ffe689a8904, stmt=0x55ea79929a38) at indexcmds.c:2929
#17 ExecReindex (pstate=pstate@entry=0x55ea798ab300, stmt=stmt@entry=0x55ea79929a38, isTopLevel=isTopLevel@entry=true) at indexcmds.c:2852
#18 0x000055ea5ae6a735 in ProcessUtilitySlow (pstate=0x55ea798ab300, pstmt=0x55ea79929ae8, queryString=0x55ea79928f90 "REINDEX INDEX pg_class_tblspc_relfilenode_index;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, qc=0x7ffe689a8f40, dest=<optimized out>) at utility.c:1570
#19 0x000055ea5ae69351 in standard_ProcessUtility (pstmt=0x55ea79929ae8, queryString=0x55ea79928f90 "REINDEX INDEX pg_class_tblspc_relfilenode_index;", readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55ea79929ea8, qc=0x7ffe689a8f40) at utility.c:1070
#20 0x000055ea5ae67af1 in PortalRunUtility (portal=portal@entry=0x55ea799a8a20, pstmt=pstmt@entry=0x55ea79929ae8, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x55ea79929ea8, qc=qc@entry=0x7ffe689a8f40) at pquery.c:1185
#21 0x000055ea5ae67c2d in PortalRunMulti (portal=portal@entry=0x55ea799a8a20, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x55ea79929ea8, altdest=altdest@entry=0x55ea79929ea8, qc=qc@entry=0x7ffe689a8f40) at pquery.c:1349
#22 0x000055ea5ae682ab in PortalRun (portal=portal@entry=0x55ea799a8a20, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, dest=dest@entry=0x55ea79929ea8, altdest=altdest@entry=0x55ea79929ea8, qc=qc@entry=0x7ffe689a8f40) at pquery.c:820
#23 0x000055ea5ae63bb3 in exec_simple_query (query_string=0x55ea79928f90 "REINDEX INDEX pg_class_tblspc_relfilenode_index;") at postgres.c:1274
#24 0x000055ea5ae65954 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4771
#25 0x000055ea5ae5fa83 in BackendMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at backend_startup.c:124
#26 0x000055ea5adaf122 in postmaster_child_launch (child_type=<optimized out>, child_slot=1, startup_data=startup_data@entry=0x7ffe689a93f0, startup_data_len=startup_data_len@entry=24, client_sock=client_sock@entry=0x7ffe689a9410) at launch_backend.c:290
#27 0x000055ea5adb2eb7 in BackendStartup (client_sock=0x7ffe689a9410) at postmaster.c:3580
#28 ServerLoop () at postmaster.c:1702
#29 0x000055ea5adb491a in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x55ea79890bd0) at postmaster.c:1400
#30 0x000055ea5aa842e0 in main (argc=3, argv=0x55ea79890bd0) at main.c:227
This behavior can be reproduced on the master branch and, for example, on
REL_13_STABLE.


On Fri, Apr 25, 2025 at 11:54 AM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference:      18903
> Logged by:          Nikita Kalinin
> Email address:      n.kalinin@postgrespro.ru
> PostgreSQL version: 17.4
> Operating system:   ubuntu 22.04
> Description:
>
> Hello. I might be doing something strange, but I’d like to understand why
> this all ends with a SIGABRT.
> After building PostgreSQL like this:
> ./configure --enable-tap-tests --enable-debug --with-openssl
> --enable-cassert --prefix=/tmp/pg && make -j8
> And running these two files:


I tried the steps you provided but could not reproduce the failure.  While
testing, I noticed that many of the steps [1] in your test fail because
VACUUM cannot run inside a transaction block. Is that part of reproducing
the issue?  I am looking through the call stack. Meanwhile, it would help
the analysis if you could recheck the steps you provided.

[1]
BEGIN;
CLUSTER pg_class USING pg_class_oid_index;
CLUSTER pg_class USING pg_class_oid_index;
VACUUM FULL pg_class;
VACUUM FULL pg_class;
VACUUM FULL pg_class;
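
To see which statements are affected, a script can be scanned for VACUUM
statements that fall between BEGIN and the next COMMIT, ROLLBACK, or END.
This is only a rough sketch: the function name vacuum_in_txn is made up for
illustration, and it assumes one statement per line, no comments, and no
START TRANSACTION. In psql such a statement fails with "ERROR:  VACUUM
cannot run inside a transaction block", and the statements after it are
rejected as well until the block ends.

```shell
# Sketch only: print VACUUM statements found inside a BEGIN ... COMMIT /
# ROLLBACK / END block. Assumes one SQL statement per line.
vacuum_in_txn() {
    # $1 = SQL script to scan
    awk '
        { stmt = toupper($0) }                              # case-insensitive match
        stmt ~ /^[ \t]*BEGIN/                 { in_txn = 1; next }
        stmt ~ /^[ \t]*(COMMIT|ROLLBACK|END)/ { in_txn = 0; next }
        in_txn && stmt ~ /^[ \t]*VACUUM/      { print FILENAME ":" FNR ": " $0 }
    ' "$1"
}
```

Run against 2.sql from the report, this lists the VACUUM FULL pg_class
lines between BEGIN and COMMIT, but not the one after COMMIT.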

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: BUG #18903: TRAP: failed Assert("false") in file: "tuplesortvariants.c"

From: Никита Калинин
Date:
Strange. I rechecked my reproduction on several machines and got the same result.
It seems this issue was already raised in BUG #18490, but the reproduction itself differs from mine.

> On 25 Apr 2025, at 13:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Apr 25, 2025 at 11:54 AM PG Bug reporting form
> <noreply@postgresql.org> wrote:
>>
>> The following bug has been logged on the website:
>>
>> Bug reference:      18903
>> Logged by:          Nikita Kalinin
>> Email address:      n.kalinin@postgrespro.ru
>> PostgreSQL version: 17.4
>> Operating system:   ubuntu 22.04
>> Description:
>>
>> Hello. I might be doing something strange, but I’d like to understand why
>> this all ends with a SIGABRT.
>> After building PostgreSQL like this:
>> ./configure --enable-tap-tests --enable-debug --with-openssl
>> --enable-cassert --prefix=/tmp/pg && make -j8
>> And running these two files:
>
>
> I tried the steps you provided but could not reproduce the failure.  While
> testing, I noticed that many of the steps [1] in your test fail because
> VACUUM cannot run inside a transaction block. Is that part of reproducing
> the issue?  I am looking through the call stack. Meanwhile, it would help
> the analysis if you could recheck the steps you provided.
>
> [1]
> BEGIN;
> CLUSTER pg_class USING pg_class_oid_index;
> CLUSTER pg_class USING pg_class_oid_index;
> VACUUM FULL pg_class;
> VACUUM FULL pg_class;
> VACUUM FULL pg_class;
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com




On Fri, Apr 25, 2025 at 12:30 PM Никита Калинин
<n.kalinin@postgrespro.ru> wrote:
>
> Strange. I rechecked my reproduction on several machines and got the same result.
> It seems this issue was already raised in BUG #18490, but the reproduction itself differs from mine.
>
Okay, so as Peter said, this is a known issue, so we might not need to
dig any further here.  Thanks.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com