Re: logical decoding vs. VACUUM FULL / CLUSTER on table with TOAST-eddata - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: logical decoding vs. VACUUM FULL / CLUSTER on table with TOAST-eddata
Date
Msg-id 9d8e9a45-11e6-740a-b387-83c1c0829263@2ndquadrant.com
Whole thread Raw
In response to Re: logical decoding vs. VACUUM FULL / CLUSTER on table with TOAST-ed data  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: logical decoding vs. VACUUM FULL / CLUSTER on table with TOAST-eddata
List pgsql-hackers
On 11/19/18 10:28 AM, Masahiko Sawada wrote:
> On Mon, Nov 19, 2018 at 6:52 AM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>>
>> Hi,
>>
>> It seems we have pretty annoying problem with logical decoding when
>> performing VACUUM FULL / CLUSTER on a table with toast-ed data.
>>
>> The trouble is that the rewritten heap is WAL-logged using XLOG/FPI
>> records, the TOAST data is logged as regular INSERT records. XLOG/FPI is
>> ignored in logical decoding, and so reorderbuffer never gets those
>> records. But we do decode the TOAST data, and reorderbuffer stashes them
>> in toast_hash hash table. Which gets reset only when handling a row from
>> the main heap, and that never arrives. So we end up stashing all the
>> TOAST data in memory :-(
>>
>> So do VACUUM FULL (or CLUSTER) on a sufficiently large table, and you're
>> likely to break any logical replication connection. And it does not
>> matter if you replicate this particular table.
>>
>> Luckily enough, this can leverage some of the pieces introduced by
>> e9edc1ba which was meant to deal with rewrites of system tables, and in
>> raw_heap_insert it added this:
>>
>>      /*
>>       * The new relfilenode's relcache entrye doesn't have the necessary
>>       * information to determine whether a relation should emit data for
>>       * logical decoding.  Force it to off if necessary.
>>       */
>>      if (!RelationIsLogicallyLogged(state->rs_old_rel))
>>          options |= HEAP_INSERT_NO_LOGICAL;
>>
>> As raw_heap_insert is used only for heap rewrites, we can simply remove
>> the if condition and use the HEAP_INSERT_NO_LOGICAL flag for all TOAST
>> data logged from here.
>>
> 
> This fix seems fine to me.
> 
>> This does fix the issue, because we still decode the TOAST changes but
>> there are no data and so
>>
>>      if (change->data.tp.newtuple != NULL)
>>      {
>>          dlist_delete(&change->node);
>>          ReorderBufferToastAppendChunk(rb, txn, relation,
>>                                        change);
>>      }
>>
>> ends up not stashing the change in the hash table.
> 
> With the below change you proposed can we remove the above condition
> because toast-insertion changes being processed by the reorder buffer
> always have a new tuple? If a toast-insertion record doesn't have a
> new tuple it has already ignored when decoding.
> 

Good point. I think you're right the reorderbuffer part may be 
simplified as you propose.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-hackers by date:

Previous
From: Daniel Westermann
Date:
Subject: Re: zheap: a new storage format for PostgreSQL
Next
From: Amit Kapila
Date:
Subject: Re: zheap: a new storage format for PostgreSQL