Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date
Msg-id CAFiTN-uY3by6E6pFjN6RuDRMKMiNZQiRL81u5ppzWQ7N3VVBAQ@mail.gmail.com
In response to Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions  (Mahendra Singh Thalor <mahi6run@gmail.com>)
List pgsql-hackers
On Wed, Apr 29, 2020 at 12:37 PM Mahendra Singh Thalor
<mahi6run@gmail.com> wrote:
>
> On Wed, 29 Apr 2020 at 11:15, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
> >
> > On Fri, 24 Apr 2020 at 11:55, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Apr 23, 2020 at 2:28 PM Erik Rijkers <er@xs4all.nl> wrote:
> > > >
> > > > On 2020-04-23 05:24, Dilip Kumar wrote:
> > > > > On Wed, Apr 22, 2020 at 9:31 PM Erik Rijkers <er@xs4all.nl> wrote:
> > > > >>
> > > > > >> The 'ddl' one is apparently not quite fixed - I get this in '(cd
> > > > > >> contrib; make check)' (in both assert-enabled and non-assert-enabled
> > > > > >> builds)
> > > > >
> > > > > Can you send me the contrib/test_decoding/regression.diffs file?
> > > >
> > > > Attached.
> > >
> > > So from regression.diffs, it appears that it is failing in memory
> > > allocation (+ERROR:  invalid memory alloc request size
> > > 94119198201896).  My colleague tried to reproduce this in a different
> > > environment but has had no success so far.  One more thing that
> > > surprises me is that after
> > > (v15-0011-Provide-new-api-to-get-the-streaming-changes.patch)
> > > it should never take the streaming path.  However, we cannot
> > > ignore the fact that some of the changes might impact the
> > > non-streaming path as well.  Is it possible for you to somehow stop or
> > > break the code and send the stack trace?  One idea is to check the
> > > log to see where the error is raised, i.e. MemoryContextAlloc
> > > or palloc or some other similar function.  Once we know that, we can
> > > convert that error to an assert and find the call stack.
> > >
> > > --
> >
> > Thanks, Erik, for reporting this issue.
> >
> > I am able to reproduce this issue (+ERROR:  invalid memory alloc
> > request size) on top of the v16 patch set. I applied all 12 patches
> > of the v16 series and then ran "make check -i" from the
> > "contrib/test_decoding" folder. Below is the stack trace of the error:
> >
> > #0 0x0000560b1350902d in MemoryContextAlloc (context=0x560b14188d70,
> > size=94605581787992) at mcxt.c:806
> > #1 0x0000560b130f0ad5 in ReorderBufferRestoreChange
> > (rb=0x560b14188e90, txn=0x560b141baf08, data=0x560b1418a5e8 "K") at
> > reorderbuffer.c:3680
> > #2 0x0000560b130f0662 in ReorderBufferRestoreChanges
> > (rb=0x560b14188e90, txn=0x560b141baf08, file=0x560b1418ad10,
> > segno=0x560b1418ad20) at reorderbuffer.c:3564
> > #3 0x0000560b130e918a in ReorderBufferIterTXNInit (rb=0x560b14188e90,
> > txn=0x560b141baf08, iter_state=0x7ffef18b1600) at reorderbuffer.c:1186
> > #4 0x0000560b130eaee1 in ReorderBufferProcessTXN (rb=0x560b14188e90,
> > txn=0x560b141baf08, commit_lsn=25986584, snapshot_now=0x560b141b74d8,
> > command_id=0, streaming=false)
> > at reorderbuffer.c:1785
> > #5 0x0000560b130ecae1 in ReorderBufferCommit (rb=0x560b14188e90,
> > xid=508, commit_lsn=25986584, end_lsn=25989088,
> > commit_time=641449268431600, origin_id=0, origin_lsn=0)
> > at reorderbuffer.c:2315
> > #6 0x0000560b130d14a1 in DecodeCommit (ctx=0x560b1416ea80,
> > buf=0x7ffef18b19b0, parsed=0x7ffef18b1850, xid=508) at decode.c:654
> > #7 0x0000560b130cff98 in DecodeXactOp (ctx=0x560b1416ea80,
> > buf=0x7ffef18b19b0) at decode.c:261
> > #8 0x0000560b130cf99a in LogicalDecodingProcessRecord
> > (ctx=0x560b1416ea80, record=0x560b1416ee00) at decode.c:130
> > #9 0x0000560b130dbbbc in pg_logical_slot_get_changes_guts
> > (fcinfo=0x560b1417ee50, confirm=true, binary=false, streaming=false)
> > at logicalfuncs.c:285
> > #10 0x0000560b130dbe71 in pg_logical_slot_get_changes
> > (fcinfo=0x560b1417ee50) at logicalfuncs.c:354
> > #11 0x0000560b12e294d4 in ExecMakeTableFunctionResult
> > (setexpr=0x560b14177838, econtext=0x560b14177748,
> > argContext=0x560b1417ed30, expectedDesc=0x560b141814a0,
> > randomAccess=false) at execSRF.c:234
> > #12 0x0000560b12e5490f in FunctionNext (node=0x560b14177630) at
> > nodeFunctionscan.c:94
> > #13 0x0000560b12e2c108 in ExecScanFetch (node=0x560b14177630,
> > accessMtd=0x560b12e54836 <FunctionNext>, recheckMtd=0x560b12e54e15
> > <FunctionRecheck>) at execScan.c:133
> > #14 0x0000560b12e2c227 in ExecScan (node=0x560b14177630,
> > accessMtd=0x560b12e54836 <FunctionNext>, recheckMtd=0x560b12e54e15
> > <FunctionRecheck>) at execScan.c:199
> > #15 0x0000560b12e54e9b in ExecFunctionScan (pstate=0x560b14177630) at
> > nodeFunctionscan.c:270
> > #16 0x0000560b12e24e23 in ExecProcNodeFirst (node=0x560b14177630) at
> > execProcnode.c:450
> > #17 0x0000560b12e3e172 in ExecProcNode (node=0x560b14177630) at
> > ../../../src/include/executor/executor.h:245
> > #18 0x0000560b12e3e998 in fetch_input_tuple (aggstate=0x560b14176f40)
> > at nodeAgg.c:566
> > #19 0x0000560b12e4398f in agg_fill_hash_table
> > (aggstate=0x560b14176f40) at nodeAgg.c:2518
> > #20 0x0000560b12e42c9a in ExecAgg (pstate=0x560b14176f40) at nodeAgg.c:2139
> > #21 0x0000560b12e24e23 in ExecProcNodeFirst (node=0x560b14176f40) at
> > execProcnode.c:450
> > #22 0x0000560b12e8bb58 in ExecProcNode (node=0x560b14176f40) at
> > ../../../src/include/executor/executor.h:245
> > #23 0x0000560b12e8bd59 in ExecSort (pstate=0x560b14176d28) at nodeSort.c:108
> > #24 0x0000560b12e24e23 in ExecProcNodeFirst (node=0x560b14176d28) at
> > execProcnode.c:450
> > #25 0x0000560b12e10e71 in ExecProcNode (node=0x560b14176d28) at
> > ../../../src/include/executor/executor.h:245
> > #26 0x0000560b12e15c4c in ExecutePlan (estate=0x560b14176af0,
> > planstate=0x560b14176d28, use_parallel_mode=false,
> > operation=CMD_SELECT, sendTuples=true, numberTuples=0,
> > direction=ForwardScanDirection, dest=0x560b1419d188,
> > execute_once=true) at execMain.c:1646
> > #27 0x0000560b12e11a19 in standard_ExecutorRun
> > (queryDesc=0x560b1412db10, direction=ForwardScanDirection, count=0,
> > execute_once=true) at execMain.c:364
> > #28 0x0000560b12e116e1 in ExecutorRun (queryDesc=0x560b1412db10,
> > direction=ForwardScanDirection, count=0, execute_once=true) at
> > execMain.c:308
> > #29 0x0000560b131f2177 in PortalRunSelect (portal=0x560b140db860,
> > forward=true, count=0, dest=0x560b1419d188) at pquery.c:912
> > #30 0x0000560b131f1b14 in PortalRun (portal=0x560b140db860,
> > count=9223372036854775807, isTopLevel=true, run_once=true,
> > dest=0x560b1419d188, altdest=0x560b1419d188,
> > qc=0x7ffef18b2350) at pquery.c:756
> > #31 0x0000560b131e550b in exec_simple_query (
> > query_string=0x560b14076720 "/* display results, but hide most of the
> > output */\nSELECT count(*), min(data), max(data)\nFROM
> > pg_logical_slot_get_changes('regression_slot', NULL, NULL,
> > 'include-xids', '0', 'skip-empty-xacts', '1')\nG"...) at
> > postgres.c:1239
> > #32 0x0000560b131ee343 in PostgresMain (argc=1, argv=0x560b1409faa0,
> > dbname=0x560b1409f858 "contrib_regression", username=0x560b1409f830
> > "mahendrathalor") at postgres.c:4315
> > #33 0x0000560b130a325b in BackendRun (port=0x560b14096880) at postmaster.c:4510
> > #34 0x0000560b130a22c3 in BackendStartup (port=0x560b14096880) at
> > postmaster.c:4202
> > #35 0x0000560b1309a5cc in ServerLoop () at postmaster.c:1727
> > #36 0x0000560b130997c9 in PostmasterMain (argc=8, argv=0x560b1406f010)
> > at postmaster.c:1400
> > #37 0x0000560b12ee9530 in main (argc=8, argv=0x560b1406f010) at main.c:210
> >
> > I have an Ubuntu setup. I think this reproduces only on Ubuntu. I
> > am looking into this issue with Dilip.
>
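> This error is due to an invalid size being passed to MemoryContextAlloc:
> the invalidation case in ReorderBufferRestoreChange used
> change->data.msg.message_size instead of inval_size.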
>
> diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
> index eed9a5048b..487c1b4252 100644
> --- a/src/backend/replication/logical/reorderbuffer.c
> +++ b/src/backend/replication/logical/reorderbuffer.c
> @@ -3678,7 +3678,7 @@ ReorderBufferRestoreChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
>
>                                 change->data.inval.invalidations =
>                                                 MemoryContextAlloc(rb->context,
> -                                                                  change->data.msg.message_size);
> +                                                                  inval_size);
>                                 /* read the message */
>                                 memcpy(change->data.inval.invalidations, data, inval_size);
>                                 data += inval_size;
>
> The above change fixes the error. Thanks, Dilip, for helping.

Thanks, Mahendra, for reproducing this and helping to fix it.  I will
include this change in my next patch set.
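
For readers following the fix, here is a minimal standalone C sketch of
this class of bug.  It is not the reorderbuffer.c code: SketchChangeData,
SKETCH_MAX_ALLOC_SIZE and the sketch_* functions are hypothetical
stand-ins, the 0xFF fill only simulates stale memory, and computing the
size as ninvalidations times an entry size is an assumption about how
inval_size is derived.  It shows that sizing a buffer from the msg member
of a union, when only the inval member was filled in, picks up leftover
bytes that fail an allocator-style limit check.  Consistent with that, the
size in the trace, 94605581787992, is 0x560b1418a358 in hex, which is in
the same range as the heap pointers shown in the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; mirrors PostgreSQL's 1 GB - 1 allocation limit. */
#define SKETCH_MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/*
 * Simplified, hypothetical stand-in for the per-change union: a change is
 * either a logical message or a set of invalidations, never both.
 */
typedef union SketchChangeData
{
    struct
    {
        char   *prefix;
        size_t  message_size;
        char   *message;
    } msg;

    struct
    {
        uint32_t ninvalidations;
        void    *invalidations;
    } inval;
} SketchChangeData;

/* Allocator-style sanity check on a requested size. */
int
sketch_alloc_size_is_valid(size_t size)
{
    return size <= SKETCH_MAX_ALLOC_SIZE;
}

/*
 * Set up an invalidation change.  The 0xFF fill simulates stale bytes; only
 * the invalidation count is then written, so the rest of the union keeps
 * whatever was there before.
 */
void
sketch_init_inval_change(SketchChangeData *out, uint32_t ninvalidations)
{
    assert(out != NULL);
    memset(out, 0xFF, sizeof(*out));
    out->inval.ninvalidations = ninvalidations;
}

/* Buggy: reads the size from the msg member, which was never set. */
size_t
sketch_buggy_size(const SketchChangeData *data)
{
    return data->msg.message_size;
}

/* Fixed: derives the size from the invalidation count (assumed formula). */
size_t
sketch_fixed_size(const SketchChangeData *data, size_t entry_size)
{
    return (size_t) data->inval.ninvalidations * entry_size;
}
```

On common platforms, after sketch_init_inval_change(&d, 3), the value from
sketch_buggy_size(&d) is rejected by sketch_alloc_size_is_valid, while
sketch_fixed_size(&d, 16) returns 48 and passes the check.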


