Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker - Mailing list pgsql-bugs

From Tomas Vondra
Subject Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker
Msg-id 20191121103940.gpadc7xmssl63sad@development
In response to BUG #16129: Segfault in tts_virtual_materialize in logical replication worker  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker
List pgsql-bugs
On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
>The following bug has been logged on the website:
>
>Bug reference:      16129
>Logged by:          Ondrej Jirman
>Email address:      ienieghapheoghaiwida@xff.cz
>PostgreSQL version: 12.1
>Operating system:   Arch Linux
>Description:
>
>Hello,
>
>I've upgraded my main PostgreSQL cluster from 11.5 to 12.1 via pg_dumpall
>method and after a while I started getting segfault in logical replication
>worker.
>
>My setup is fairly vanilla, non-default options:
>
>shared_buffers = 256MB
>work_mem = 512MB
>temp_buffers = 64MB
>maintenance_work_mem = 4GB
>effective_cache_size = 16GB
>max_logical_replication_workers = 30
>max_replication_slots = 30
>max_worker_processes = 30
>wal_level = logical
>
>I have several databases that I subscribe to from this database cluster
>using logical replication.
>
>Replication of one of my databases (running on ARMv7 machine) started
>segfaulting on the subscriber side (x86_64) like this:
>
>#0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
>/usr/lib/libc.so.6
>#1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
>__dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
>#2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
>#3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
>(slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
>shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624

Hmmm, so it's failing on this memcpy() in tts_virtual_materialize:

     else
     {
         Size        data_length = 0;

         data = (char *) att_align_nominal(data, att->attalign);
         data_length = att_addlength_datum(data_length, att->attlen, val);

         memcpy(data, DatumGetPointer(val), data_length);

         slot->tts_values[natt] = PointerGetDatum(data);
         data += data_length;
     }
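
The length passed to memcpy() above is not stored separately for variable-length attributes; it is derived from the bytes that val points at. The following is a minimal sketch of that idea in plain C, under loud assumptions: model_datum_length is a hypothetical stand-in for att_addlength_datum(), and it treats the first four bytes as a plain 32-bit length, whereas the real PostgreSQL macros also handle short and external varlena headers and pack flag bits into the header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, simplified model of how a datum's length is determined.
 * Not PostgreSQL code: the real varlena header encoding is more involved.
 *
 *   attlen > 0   fixed-width type, length is attlen itself
 *   attlen == -1 variable-length type, length is read from the header
 *                bytes at ptr (modelled here as a plain uint32_t)
 *   otherwise    treated as a NUL-terminated C string (attlen == -2)
 */
static size_t
model_datum_length(int attlen, const void *ptr)
{
    if (attlen > 0)
        return (size_t) attlen;

    if (attlen == -1)
    {
        uint32_t    header;

        /* The length comes from memory at ptr, so a bad ptr gives a bad length. */
        memcpy(&header, ptr, sizeof(header));
        return (size_t) header;
    }

    /* C string: length includes the terminating NUL byte. */
    return strlen((const char *) ptr) + 1;
}
```

In this model, a pointer into unrelated memory yields whatever value those four bytes happen to encode, which would be consistent with an implausible length such as the __len=620701425 seen in frame #1. That is only one possible explanation; a valid val combined with a wrong att->attlen could produce a similar symptom.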

The question is which of the pointers is bogus. You already seem to
have a core file, so can you inspect the variables in frame #2? I think
the output of these would be especially interesting to see:

  p *slot
  p natt
  p val
  p *att

Also, what does the replicated schema look like? Can we see the table
definitions?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 
