Re: Skip collecting decoded changes of already-aborted transactions - Mailing list pgsql-hackers

From: Masahiko Sawada
Subject: Re: Skip collecting decoded changes of already-aborted transactions
Date:
Msg-id: CAD21AoD38hZHiMGeXbruNomJGSdoZi4hWxPtYJ3fubhvWva3mg@mail.gmail.com
In response to: Re: Skip collecting decoded changes of already-aborted transactions (vignesh C <vignesh21@gmail.com>)
List: pgsql-hackers
On Tue, Nov 26, 2024 at 10:01 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, 26 Nov 2024 at 02:58, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Nov 18, 2024 at 11:12 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > >
> > > Few comments:
> >
> > Thank you for reviewing the patch!
> >
> > > 1) Should we have the Assert inside ReorderBufferTruncateTXNIfAborted
> > > instead of having it at multiple callers, ReorderBufferResetTXN also
> > > has the Assert inside the function after truncate of the transaction:
> > > @@ -3672,6 +3758,14 @@ ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
> > >                         Assert(txn->total_size > 0);
> > >                         Assert(rb->size >= txn->total_size);
> > >
> > > +                       /* skip the transaction if aborted */
> > > +                       if (ReorderBufferTruncateTXNIfAborted(rb, txn))
> > > +                       {
> > > +                               /* All changes should be discarded */
> > > +                               Assert(txn->size == 0 && txn->total_size == 0);
> > > +                               continue;
> > > +                       }
> > > +
> > >                         ReorderBufferStreamTXN(rb, txn);
> > >                 }
> > >                 else
> > > @@ -3687,6 +3781,14 @@ ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
> > >                         Assert(txn->size > 0);
> > >                         Assert(rb->size >= txn->size);
> > >
> > > +                       /* skip the transaction if aborted */
> > > +                       if (ReorderBufferTruncateTXNIfAborted(rb, txn))
> > > +                       {
> > > +                               /* All changes should be discarded */
> > > +                               Assert(txn->size == 0 && txn->total_size == 0);
> > > +                               continue;
> > > +                       }
> >
> > Moved.
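
To make the result concrete for archive readers, the relocated assertion
ends up along these lines (a simplified sketch, not the exact patch hunk;
the CLOG lookup itself is elided here):

/*
 * Sketch: discard all changes of the given transaction if it is already
 * aborted, returning true if anything was discarded.
 */
static bool
ReorderBufferTruncateTXNIfAborted(ReorderBuffer *rb, ReorderBufferTXN *txn)
{
    /*
     * ... check the transaction status via CLOG, caching the result in
     * txn->txn_flags; return false if it is not known to be aborted ...
     */

    ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn), false);

    /* All changes should be discarded */
    Assert(txn->size == 0 && txn->total_size == 0);

    return true;
}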
> >
> > >
> > > 2) txn->txn_flags can be moved to the next line to keep it within 80
> > > chars in this case:
> > >  * Check the transaction status by looking CLOG and discard all changes if
> > >  * the transaction is aborted. The transaction status is cached in
> > > txn->txn_flags
> > >  * so we can skip future changes and avoid CLOG lookups on the next call. Return
> >
> > Fixed.
> >
> > >
> > > 3) Is there any scenario where the Assert can fail as the toast is not reset:
> > > +        * Since we don't check the transaction status while replaying the
> > > +        * transaction, we don't need to reset toast reconstruction data here.
> > > +        */
> > > +       ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn), false);
> > >
> > > +                       if (ReorderBufferTruncateTXNIfAborted(rb, txn))
> > > +                       {
> > > +                               /* All changes should be discarded */
> > > +                               Assert(txn->size == 0 && txn->total_size == 0);
> > > +                               continue;
> > > +                       }
> >
> > IIUC we reconstruct TOAST data when replaying the transaction. On the
> > other hand, this function is called while adding a decoded change but
> > not when replaying the transaction. So we should not have any toast
> > reconstruction data at this point unless I'm missing something. Do you
> > have any scenario where we call ReorderBufferTruncateTXNIfAborted()
> > while a transaction has TOAST reconstruction data?
>
> I have checked further regarding the toast and verified the population
> of the toast hash. I agree with you on this. Overall, the patch
> appears to be in good shape.

Thank you for the confirmation!
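
To spell the TOAST point out for archive readers: the toast hash is built
only while replaying a transaction, so at the point where we truncate an
aborted transaction no reconstruction data should exist yet. As an
illustration (not necessarily a check the patch adds), something like the
following would be expected to hold there:

    /* No TOAST data has been reconstructed at change-accumulation time */
    Assert(txn->toast_hash == NULL);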

I thought we'd already done performance tests with this patch, but
Michael-san pointed out that we haven't yet. So I've run benchmark tests
in two scenarios:

A. Skip decoding large aborted transactions.

1. Preparation (SQL commands)

create table test (c int);
select pg_create_logical_replication_slot('s', 'test_decoding');
begin;
insert into test select generate_series(1, 1_000_000);
commit;
begin;
insert into test select generate_series(1, 1_000_000);
rollback;
begin;
insert into test select generate_series(1, 1_000_000);
rollback;

2. Performance tests (results are w/o patch vs. w/ patch)

-- causes some spill/streamed transactions
set logical_decoding_work_mem to '64MB';

select 'non-streaming', count(*) from
pg_logical_slot_peek_changes('s', null, null, 'stream-changes',
'false');
  -> 2636.208 ms vs. 2070.906 ms

select 'streaming', count(*) from pg_logical_slot_peek_changes('s',
null, null, 'stream-changes', 'true');
  -> 910.579 ms vs. 653.574 ms

-- no spill/streamed transactions
set logical_decoding_work_mem to '5GB';

select 'non-streaming', count(*) from
pg_logical_slot_peek_changes('s', null, null, 'stream-changes',
'false');
   -> 962.863 ms vs. 956.910 ms

select 'streaming', count(*) from pg_logical_slot_peek_changes('s',
null, null, 'stream-changes', 'true');
   -> 973.426 ms vs. 973.033 ms

According to the results, skipping logical decoding of already-aborted
transactions yields a clear performance improvement (roughly 20-30% in
the cases that spill or stream), with no regression when the
transactions fit within the memory limit.

B. Decoding medium-size transactions, to check the overhead of CLOG lookups.

1. Preparation (shell script)

pgbench -i -s 1 postgres
psql -c "create table test (c int)"
psql -c "select pg_create_logical_replication_slot('s', 'test_decoding')"
echo "insert into test select generate_series(1, 100)" > /tmp/bench.sql
pgbench -t 10000 -c 10 -j 5 -f /tmp/bench.sql postgres

2. Performance tests

-- spill/streamed transactions ('64' is in kB, i.e., the 64kB minimum)
set logical_decoding_work_mem to '64';

select 'non-streaming', count(*) from
pg_logical_slot_peek_changes('s', null, null, 'stream-changes',
'false');
  -> 7230.537 ms vs. 7154.322 ms

select 'streaming', count(*) from pg_logical_slot_peek_changes('s',
null, null, 'stream-changes', 'true');
   -> 6702.438 ms vs. 6678.232 ms

Overall, I don't see any noticeable overhead from the CLOG lookups; the
differences are around 1% or less.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


