Re: pgsql: Specialize tuplesort routines for different kinds of abbreviated - Mailing list pgsql-committers

From Andres Freund
Subject Re: pgsql: Specialize tuplesort routines for different kinds of abbreviated
Msg-id 20220402201557.thanbsxcql5lk6pc@alap3.anarazel.de
In response to Re: pgsql: Specialize tuplesort routines for different kinds of abbreviated  (John Naylor <john.naylor@enterprisedb.com>)
List pgsql-committers
Hi,

On 2022-04-02 16:06:40 +0700, John Naylor wrote:
> So far, kestrel and tamandua don't like this, and they both use
> "-fsanitize=undefined,alignment". I'll try to reproduce locally in a
> bit.

Just hit this in my development environment.

FWIW, I found it hard to work with ubsan without applying 0002 from
https://postgr.es/m/20220323173537.ll7klrglnp4gn2um%40alap3.anarazel.de
I plan to polish that to be able to commit it, but won't have cycles before
the CF ends.

With that applied,
CFLAGS="-fsanitize=alignment,undefined,address -fno-sanitize-recover=all"
and
export ASAN_OPTIONS="detect_leaks=0:disable_coredump=0:print_stacktrace=1:abort_on_error=1"
export UBSAN_OPTIONS="disable_coredump=0:print_stacktrace=1:abort_on_error=1"

I get a backtrace to investigate. Unfortunately abort_on_error=1 also removes
the nicer error message :(. So I ran both.
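
As an aside, the same runtime options can be compiled into a binary instead of being exported in the environment. The following is a minimal sketch, assuming the `__asan_default_options` and `__ubsan_default_options` hooks that the sanitizer runtimes look up at startup. The option strings are copied from the settings above. Nothing here is claimed to match what the 0002 patch does.

```c
/*
 * Sketch only: compiled-in defaults for the sanitizer runtimes.
 * The runtimes call these hooks, if present, before parsing
 * ASAN_OPTIONS / UBSAN_OPTIONS, so the environment can still override them.
 */
const char *__asan_default_options(void);
const char *__ubsan_default_options(void);

const char *
__asan_default_options(void)
{
    return "detect_leaks=0:disable_coredump=0:print_stacktrace=1:abort_on_error=1";
}

const char *
__ubsan_default_options(void)
{
    return "disable_coredump=0:print_stacktrace=1:abort_on_error=1";
}
```

In a build without sanitizers these are just two unused functions, so the sketch is harmless to leave in place.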

It triggers reliably on
CLUSTER clstr_expression USING clstr_expression_minus_a;
for me.

/home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:723:23: runtime error: load of value 150, which is not a valid value for type '_Bool'
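
UBSan emits this report when a `_Bool` object is read while its storage holds something other than 0 or 1. A small sketch of how such a byte can be inspected without performing the invalid load follows. The helper names are hypothetical and only serve the example, and it assumes `sizeof(bool) == 1`, which is common but not guaranteed by the C standard.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the byte backing *flag is 0 or 1.  The byte is copied out
 * with memcpy into an unsigned char, so the object is never read as a bool
 * and no invalid _Bool value is loaded.
 */
static bool
bool_storage_is_valid(const bool *flag)
{
    unsigned char raw;

    memcpy(&raw, flag, sizeof(raw));
    return raw <= 1;
}

/*
 * Hypothetical helper for the example: place an arbitrary byte into the
 * storage of a bool, standing in for memory that was never initialized.
 */
static void
poke_bool_storage(bool *flag, unsigned char raw)
{
    memcpy(flag, &raw, sizeof(raw));
}
```

With these, a flag poked with 150 or 70 (the two values seen in this report and in the backtrace) is classified as invalid, while 0 and 1 pass.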
 

#0  __ubsan::__ubsan_handle_load_invalid_value_abort (Data=0x7ff65e8fada0, Val=70)
    at ../../../../src/libsanitizer/ubsan/ubsan_handlers.cpp:548
#1  0x00007ff65a649093 in qsort_tuple_int32_compare (a=0x62b000fc0240, b=0x62b000fc0258, state=0x625000066a20)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:723
#2  0x00007ff65a64bc7b in qsort_tuple_int32 (data=0x62b000fc0240, n=133, arg=0x625000066a20)
    at /home/andres/src/postgresql/src/include/lib/sort_template.h:313
#3  0x00007ff65a672d59 in tuplesort_sort_memtuples (state=0x625000066a20)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:3613
#4  0x00007ff65a6617ec in tuplesort_performsort (state=0x625000066a20)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:2154
#5  0x00007ff6586965dd in heapam_relation_copy_for_cluster (OldHeap=0x7ff64d893308, NewHeap=0x7ff64d896fc0,
    OldIndex=0x7ff64d89c860, use_sort=true, OldestXmin=23706, xid_cutoff=0x7ffd4650a480,
    multi_cutoff=0x7ffd4650a490, num_tuples=0x7ffd4650a4a0, tups_vacuumed=0x7ffd4650a4c0,
    tups_recently_dead=0x7ffd4650a4e0)
    at /home/andres/src/postgresql/src/backend/access/heap/heapam_handler.c:955
#6  0x00007ff658d6c4b7 in table_relation_copy_for_cluster (OldTable=0x7ff64d893308, NewTable=0x7ff64d896fc0,
    OldIndex=0x7ff64d89c860, use_sort=true, OldestXmin=23706, xid_cutoff=0x7ffd4650a480,
    multi_cutoff=0x7ffd4650a490, num_tuples=0x7ffd4650a4a0, tups_vacuumed=0x7ffd4650a4c0,
    tups_recently_dead=0x7ffd4650a4e0)
    at /home/andres/src/postgresql/src/include/access/tableam.h:1658
#7  0x00007ff658d728aa in copy_table_data (OIDNewHeap=73728, OIDOldHeap=47052, OIDOldIndex=47078, verbose=false,
    pSwapToastByContent=0x7ffd4650a670, pFreezeXid=0x7ffd4650a680, pCutoffMulti=0x7ffd4650a690)
    at /home/andres/src/postgresql/src/backend/commands/cluster.c:913
#8  0x00007ff658d6ff9e in rebuild_relation (OldHeap=0x7ff64d893308, indexOid=47078, verbose=false)
    at /home/andres/src/postgresql/src/backend/commands/cluster.c:606
#9  0x00007ff658d6e622 in cluster_rel (tableOid=47052, indexOid=47078, params=0x7ffd4650a7b0)
    at /home/andres/src/postgresql/src/backend/commands/cluster.c:427
#10 0x00007ff658d6d774 in cluster (pstate=0x619000001aa0, stmt=0x625000005d00, isTopLevel=true)
    at /home/andres/src/postgresql/src/backend/commands/cluster.c:195
#11 0x00007ff659d425b1 in standard_ProcessUtility (pstmt=0x625000006010,
    queryString=0x625000005220 "CLUSTER clstr_expression USING clstr_expression_minus_a;", readOnlyTree=false,
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x6250000060e0, qc=0x7ffd4650ae40)
    at /home/andres/src/postgresql/src/backend/tcop/utility.c:862
#12 0x00007ff659d40901 in ProcessUtility (pstmt=0x625000006010,
    queryString=0x625000005220 "CLUSTER clstr_expression USING clstr_expression_minus_a;",
    readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x6250000060e0,
    qc=0x7ffd4650ae40)

(rr) p/x *b
$4 = {tuple = 0x62500006ba78, datum1 = 0x7ff659cc7e9e, isnull1 = 0x46, srctape = 0x7ff6}
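
A note on reading that output: `a` and `b` in frame #1 are 0x18 = 24 bytes apart, which fits adjacent elements of an array of four-field structs on a 64-bit platform. The sketch below mirrors the four fields printed above to make the layout concrete. `SortTupleSketch`, the `Datum` stand-in, and `addr_stride` are illustrative names, not the real PostgreSQL definitions, and the offsets assume a typical LP64 ABI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;        /* stand-in for PostgreSQL's Datum */

/* Illustrative mirror of the fields shown by "p/x *b"; not the real header. */
typedef struct SortTupleSketch
{
    void       *tuple;          /* offset 0 on LP64 */
    Datum       datum1;         /* offset 8 */
    bool        isnull1;        /* offset 16, followed by 3 padding bytes */
    int         srctape;        /* offset 20 */
} SortTupleSketch;

/* Distance in bytes between two addresses given as integers. */
static size_t
addr_stride(uint64_t lo, uint64_t hi)
{
    return (size_t) (hi - lo);
}
```

Under that layout, `isnull1` and `srctape` share the last 8 bytes of the element. The printed values 0x46 and 0x7ff6 would be consistent with, though they do not prove, a stale pointer-sized value occupying those bytes.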

There's definitely something borked; it looks like this is ending up with bogus
pointers? Using rr to set a watchpoint on isnull1 and continuing backward, I
see the memory being written to with the following stack:

(rr) watch -l b->isnull1
Hardware watchpoint 3: -location b->isnull1

(rr) reverse-continue
Continuing.

Hardware watchpoint 3: -location b->isnull1

Old value = 70
New value = 190
0x00007ff65a65fbe4 in puttuple_common (state=0x625000066a20, tuple=0x7ffd4650a150)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:2000
2000                state->memtuples[state->memtupcount++] = *tuple;
(rr) bt
#0  0x00007ff65a65fbe4 in puttuple_common (state=0x625000066a20, tuple=0x7ffd4650a150)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:2000
#1  0x00007ff65a65a478 in tuplesort_putheaptuple (state=0x625000066a20, tup=0x625000071d28)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:1810
#2  0x00007ff658696300 in heapam_relation_copy_for_cluster (OldHeap=0x7ff64d893308, NewHeap=0x7ff64d896fc0,
    OldIndex=0x7ff64d89c860, use_sort=true, OldestXmin=23706, xid_cutoff=0x7ffd4650a480,
    multi_cutoff=0x7ffd4650a490, num_tuples=0x7ffd4650a4a0, tups_vacuumed=0x7ffd4650a4c0,

(note new/old are kind of switched around due to continuing in reverse)

and then

Hardware watchpoint 3: -location b->isnull1

Old value = 190
New value = false
__memset_evex_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:143
143    ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
(rr) bt
#0  __memset_evex_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:143
#1  0x00007ff654d2c3c4 in __asan::Allocator::Allocate (this=this@entry=0x7ff654e3ce00 <__asan::instance>,
    size=<optimized out>, size@entry=24640, alignment=<optimized out>, alignment@entry=8,
    stack=stack@entry=0x7ffd465092c0, alloc_type=alloc_type@entry=__asan::FROM_MALLOC,
    can_fill=can_fill@entry=true)
    at ../../../../src/libsanitizer/asan/asan_allocator.cpp:598
#2  0x00007ff654d28a9b in __asan::asan_malloc (size=size@entry=24640, stack=stack@entry=0x7ffd465092c0)
    at ../../../../src/libsanitizer/asan/asan_allocator.cpp:964
#3  0x00007ff654db998f in __interceptor_malloc (size=24640)
    at ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:70
#4  0x00007ff65a5c4f37 in AllocSetAlloc (context=0x625000066900, size=24576)
    at /home/andres/src/postgresql/src/backend/utils/mmgr/aset.c:740
#5  0x00007ff65a60e385 in palloc (size=24576)
    at /home/andres/src/postgresql/src/backend/utils/mmgr/mcxt.c:1082
#6  0x00007ff65a6513dc in tuplesort_begin_batch (state=0x625000066a20)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:963
#7  0x00007ff65a6501d9 in tuplesort_begin_common (workMem=65536, coordinate=0x0, randomAccess=false)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:878
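
Read in forward order, the two watchpoint hits amount to this: the allocation is first filled with 190 (0xBE) by the allocator's memset, and then the struct copy in puttuple_common overwrites isnull1 with 70 (0x46), i.e. with whatever the source tuple held. The sketch below replays that sequence. `MiniTuple`, `replay_writes`, and the 0x46 fill that stands in for stale stack contents are illustrative assumptions. That the source's isnull1 was simply never assigned fits the trace but is not proven by it, and the sketch says nothing about which caller is responsible.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative four-field tuple; not the real PostgreSQL struct. */
typedef struct MiniTuple
{
    void       *tuple;
    uintptr_t   datum1;
    bool        isnull1;
    int         srctape;
} MiniTuple;

/* Read the byte backing isnull1 without loading it as a bool. */
static unsigned char
isnull1_raw(const MiniTuple *t)
{
    unsigned char raw;

    memcpy(&raw, &t->isnull1, sizeof(raw));
    return raw;
}

/*
 * Replay the two writes in forward order and return the byte that ends up
 * in the slot's isnull1.  *after_fill receives the byte seen after step 1.
 * If set_isnull1 is true, the source tuple assigns datum1 and isnull1
 * before the copy, as a well-behaved caller would.
 */
static unsigned char
replay_writes(bool set_isnull1, unsigned char *after_fill)
{
    MiniTuple   slots[1];
    MiniTuple   src;

    /* Step 1: allocator fill, as in the memset under Allocator::Allocate. */
    memset(slots, 0xBE, sizeof(slots));
    *after_fill = isnull1_raw(&slots[0]);

    /* Stand-in for stale stack contents in the source tuple (assumption). */
    memset(&src, 0x46, sizeof(src));
    src.tuple = NULL;
    src.srctape = 0;
    if (set_isnull1)
    {
        src.datum1 = 0;
        src.isnull1 = false;
    }

    /*
     * Step 2: the copy.  The original line is a struct assignment; memcpy
     * is used here so the example stays byte-exact under any optimizer.
     */
    memcpy(&slots[0], &src, sizeof(src));
    return isnull1_raw(&slots[0]);
}
```

The copy itself is not what the sanitizer flags. The report only fires later, when the comparator reads the field as a bool.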


Greetings,

Andres Freund


