Thread: pgsql: Implement streaming mode in ReorderBuffer.

pgsql: Implement streaming mode in ReorderBuffer.

From
Amit Kapila
Date:
Implement streaming mode in ReorderBuffer.

Instead of serializing the transaction to disk after reaching the
logical_decoding_work_mem limit in memory, we consume the changes we have
in memory and invoke stream API methods added by commit 45fdc9738b.
However, sometimes if we have incomplete toast or speculative insert we
spill to the disk because we can't generate the complete tuple and stream.
And, as soon as we get the complete tuple we stream the transaction
including the serialized changes.

We can do this incremental processing thanks to having assignments
(associating subxact with toplevel xacts) in WAL right away, and
thanks to logging the invalidation messages at each command end. These
features are added by commits 0bead9af48 and c55040ccd0 respectively.

Now that we can stream in-progress transactions, the concurrent aborts
may cause failures when the output plugin consults catalogs (both system
and user-defined).

We handle such failures by returning ERRCODE_TRANSACTION_ROLLBACK
sqlerrcode from system table scan APIs to the backend or WALSender
decoding a specific uncommitted transaction. The decoding logic on the
receipt of such a sqlerrcode aborts the decoding of the current
transaction and continue with the decoding of other transactions.

We have ReorderBufferTXN pointer in each ReorderBufferChange by which we
know which xact it belongs to.  The output plugin can use this to decide
which changes to discard in case of stream_abort_cb (e.g. when a subxact
gets discarded).

We also provide a new option via SQL APIs to fetch the changes being
streamed.

Author: Dilip Kumar, Tomas Vondra, Amit Kapila, Nikhil Sontakke
Reviewed-by: Amit Kapila, Kuntal Ghosh, Ajin Cherian
Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/7259736a6e5b7c7588fff9578370736a6648acbb

Modified Files
--------------
contrib/test_decoding/Makefile                  |   2 +-
contrib/test_decoding/expected/stream.out       |  94 +++
contrib/test_decoding/expected/truncate.out     |   6 +
contrib/test_decoding/sql/stream.sql            |  30 +
contrib/test_decoding/sql/truncate.sql          |   1 +
contrib/test_decoding/test_decoding.c           |  13 +
doc/src/sgml/logicaldecoding.sgml               |   9 +-
doc/src/sgml/test-decoding.sgml                 |  22 +
src/backend/access/heap/heapam.c                |  13 +
src/backend/access/heap/heapam_visibility.c     |  42 +-
src/backend/access/index/genam.c                |  53 ++
src/backend/access/table/tableam.c              |   8 +
src/backend/access/transam/xact.c               |  19 +
src/backend/replication/logical/decode.c        |  17 +-
src/backend/replication/logical/logical.c       |  10 +
src/backend/replication/logical/reorderbuffer.c | 981 +++++++++++++++++++++---
src/include/access/heapam_xlog.h                |   1 +
src/include/access/tableam.h                    |  55 ++
src/include/access/xact.h                       |   4 +
src/include/replication/logical.h               |   1 +
src/include/replication/reorderbuffer.h         |  56 +-
21 files changed, 1331 insertions(+), 106 deletions(-)


Re: pgsql: Implement streaming mode in ReorderBuffer.

From
Amit Kapila
Date:
On Sat, Aug 8, 2020 at 7:59 AM Amit Kapila <akapila@postgresql.org> wrote:
>
> Implement streaming mode in ReorderBuffer.
>

There is one failure [1] due to this commit.  Looking into same.

[1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2020-08-08%2004%3A26%3A04

-- 
With Regards,
Amit Kapila.



Re: pgsql: Implement streaming mode in ReorderBuffer.

From
Amit Kapila
Date:
On Sat, Aug 8, 2020 at 11:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Aug 8, 2020 at 7:59 AM Amit Kapila <akapila@postgresql.org> wrote:
> >
> > Implement streaming mode in ReorderBuffer.
> >
>
> There is one failure [1] due to this commit.  Looking into same.
>

The problem seems to be due to some parallel transactions (like
autovacuum) WAL which will be shown as an empty transaction as that
doesn't perform any Insert/Update/Delete to the non-catalog table. The
solution is to skip empty transactions while getting streamed changes.
We normally do this in other tests but missed doing it for the
streaming changes test. I'll push the fix in some time after some
verification.

-- 
With Regards,
Amit Kapila.



Re: pgsql: Implement streaming mode in ReorderBuffer.

From
Tom Lane
Date:
Amit Kapila <akapila@postgresql.org> writes:
> Implement streaming mode in ReorderBuffer.

Looks like this test is somewhat unstable still:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2020-09-09%2003%3A42%3A19

diff -U3 /home/andres/build/buildfarm/HEAD/pgsql.build/../pgsql/contrib/test_decoding/expected/stream.out
/home/andres/build/buildfarm/HEAD/pgsql.build/contrib/test_decoding/results/stream.out
--- /home/andres/build/buildfarm/HEAD/pgsql.build/../pgsql/contrib/test_decoding/expected/stream.out    2020-08-08
09:18:34.493194657+0000 
+++ /home/andres/build/buildfarm/HEAD/pgsql.build/contrib/test_decoding/results/stream.out    2020-09-09
04:19:12.985912853+0000 
@@ -71,6 +71,8 @@
                    data
 ------------------------------------------
  opening a streamed block for transaction
+ closing a streamed block for transaction
+ opening a streamed block for transaction
  streaming change for transaction
  streaming change for transaction
  streaming change for transaction
@@ -83,7 +85,7 @@
  streaming change for transaction
  closing a streamed block for transaction
  committing streamed transaction
-(13 rows)
+(15 rows)

It's likely relevant that skink uses valgrind, which'd slow it
down a lot.  That suggests some timing dependency is still there.
(Note that skink has passed this test multiple times.)

            regards, tom lane



Re: pgsql: Implement streaming mode in ReorderBuffer.

From
Amit Kapila
Date:
On Thu, Sep 10, 2020 at 4:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <akapila@postgresql.org> writes:
> > Implement streaming mode in ReorderBuffer.
>
> Looks like this test is somewhat unstable still:
>
>

Thanks for pointing out. I'll analyze and share my findings.

-- 
With Regards,
Amit Kapila.



Re: pgsql: Implement streaming mode in ReorderBuffer.

From
Amit Kapila
Date:
On Thu, Sep 10, 2020 at 9:18 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 10, 2020 at 4:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <akapila@postgresql.org> writes:
> > > Implement streaming mode in ReorderBuffer.
> >
> > Looks like this test is somewhat unstable still:
> >
> >
>
> Thanks for pointing out. I'll analyze and share my findings.
>

I have started a discussion for this on hackers [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BOqgFNZkf7%3DETe_y5ntjgDk3T0wcdkd4Sot_u1hySGfw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



Re: pgsql: Implement streaming mode in ReorderBuffer.

From
Tom Lane
Date:
Amit Kapila <akapila@postgresql.org> writes:
> Implement streaming mode in ReorderBuffer.

I notice that new buildfarm member wrasse is unhappy about some of the
code added by this commit:

"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/replication/logical/reorderbuffer.c", line
2510:Warning: Likely null pointer dereference (*(curtxn+272)): ReorderBufferProcessTXN 

It's griping about this:

            curtxn->concurrent_abort = true;

and I think it's got a point.  There is little if any reason to have
confidence that curtxn must be non-NULL when this code is reached,
because it's in a PG_CATCH segment and there is a lot of code within
the PG_TRY that could throw an error before the spot where curtxn
is set.  Not to mention that curtxn is set only conditionally.

So I think

            if (curtxn)
                curtxn->concurrent_abort = true;

would be cheap insurance against a core dump.  Or maybe we should
do something else, but this seems really rickety as-is.

(wrasse seems to be generating boatloads of utterly useless warnings
otherwise, but after sifting through the chaff I did find this one.)

            regards, tom lane



Re: pgsql: Implement streaming mode in ReorderBuffer.

From
Amit Kapila
Date:
On Wed, Apr 28, 2021 at 5:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <akapila@postgresql.org> writes:
> > Implement streaming mode in ReorderBuffer.
>
> I notice that new buildfarm member wrasse is unhappy about some of the
> code added by this commit:
>
> "/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/replication/logical/reorderbuffer.c", line
2510:Warning: Likely null pointer dereference (*(curtxn+272)): ReorderBufferProcessTXN
 
>
> It's griping about this:
>
>                         curtxn->concurrent_abort = true;
>
> and I think it's got a point.  There is little if any reason to have
> confidence that curtxn must be non-NULL when this code is reached,
> because it's in a PG_CATCH segment and there is a lot of code within
> the PG_TRY that could throw an error before the spot where curtxn
> is set.  Not to mention that curtxn is set only conditionally.
>
> So I think
>
>                         if (curtxn)
>                                 curtxn->concurrent_abort = true;
>
> would be cheap insurance against a core dump.
>

We can do that to silence this Warning, otherwise, there won't be any
problem because it is set only when we get a particular error code and
we can get that error code only when the conditions that lead to
curtxn being set are met.

-- 
With Regards,
Amit Kapila.



Re: pgsql: Implement streaming mode in ReorderBuffer.

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Wed, Apr 28, 2021 at 5:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's griping about this:
>>         curtxn->concurrent_abort = true;
>> and I think it's got a point.  There is little if any reason to have
>> confidence that curtxn must be non-NULL when this code is reached,
>> because it's in a PG_CATCH segment and there is a lot of code within
>> the PG_TRY that could throw an error before the spot where curtxn
>> is set.  Not to mention that curtxn is set only conditionally.

> We can do that to silence this Warning, otherwise, there won't be any
> problem because it is set only when we get a particular error code and
> we can get that error code only when the conditions that lead to
> curtxn being set are met.

To be blunt, that argument is hopelessly naive.  You cannot safely
make assumptions about a CATCH block having omniscient knowledge
of where an error was thrown from.  At least, not with a check
no stronger than checking the SQLSTATE.  All you need is some
extension ... or a user-defined function ... deciding to throw
that same SQLSTATE, and you're hosed.

If this code really does make the assumption that there's
only one possible cause of that SQLSTATE, I wonder whether
it's broken in ways worse than a mere core dump.

            regards, tom lane



Re: pgsql: Implement streaming mode in ReorderBuffer.

From
Amit Kapila
Date:
On Wed, Apr 28, 2021 at 8:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > On Wed, Apr 28, 2021 at 5:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> It's griping about this:
> >>              curtxn->concurrent_abort = true;
> >> and I think it's got a point.  There is little if any reason to have
> >> confidence that curtxn must be non-NULL when this code is reached,
> >> because it's in a PG_CATCH segment and there is a lot of code within
> >> the PG_TRY that could throw an error before the spot where curtxn
> >> is set.  Not to mention that curtxn is set only conditionally.
>
> > We can do that to silence this Warning, otherwise, there won't be any
> > problem because it is set only when we get a particular error code and
> > we can get that error code only when the conditions that lead to
> > curtxn being set are met.
>
> To be blunt, that argument is hopelessly naive.  You cannot safely
> make assumptions about a CATCH block having omniscient knowledge
> of where an error was thrown from.  At least, not with a check
> no stronger than checking the SQLSTATE.  All you need is some
> extension ... or a user-defined function ... deciding to throw
> that same SQLSTATE, and you're hosed.
>

I see your point and it is possible that some callback can throw this
error code. I think we need a stronger check than just SQLSTATE to
ensure that we set concurrent abort flag and do other things in that
check only when we are decoding non-committed xacts. I'll write about
this on -hackers and do a further discussion there.

-- 
With Regards,
Amit Kapila.