Re: Parallel scan with SubTransGetTopmostTransaction assert coredump - Mailing list pgsql-hackers

From Greg Nancarrow
Subject Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Date
Msg-id CAJcOf-ebgfWeAqDFQfwRN=h49VzkjPunR0TvkY8nax9c1e2szw@mail.gmail.com
In response to RE: Parallel scan with SubTransGetTopmostTransaction assert coredump  ("Pengchengliu" <pengchengliu@tju.edu.cn>)
Responses RE: Parallel scan with SubTransGetTopmostTransaction assert coredump  ("Pengchengliu" <pengchengliu@tju.edu.cn>)
List pgsql-hackers
On Fri, May 14, 2021 at 12:25 PM Pengchengliu <pengchengliu@tju.edu.cn> wrote:
>
> Hi Greg,
>
>   Thanks for your reply and testing.
>
>
>
>   When the main process gets the transaction snapshot in InitializeParallelDSM->GetTransactionSnapshot, the transaction snapshot xmin very likely follows the active snapshot xmin.
>
>
>
>   It is easy to verify with gdb.
>
>   Create the environment as below:
>
>
>
>   1, Use PG13.2 (3fb4c75e857adee3da4386e947ba58a75f3e74b7) to initialize a database cluster.
>
>   2, Append the following to postgresql.conf:
>
>
>
>   max_connections = 2000
>
>   parallel_setup_cost=0
>
>   parallel_tuple_cost=0
>
>   min_parallel_table_scan_size=0
>
>   max_parallel_workers_per_gather=8
>
>   max_parallel_workers = 128
>
>
>
>   3, Start the cluster. Use the attached init_test.sql script to create some test tables.
>
>   4, Use the attached sub_120.sql script with pgbench to test it.
>
>
>
>   pgbench  -d postgres -p 33550  -n -r -f sub_120.sql   -c 200 -j 200 -T 1800
>
>
>
>
>
>
>
>   Then you can log in to the database and use GDB to verify it.
>
>   1, First use EXPLAIN to make sure forced parallel execution works.
>
>
>
>   postgres=# explain (verbose,analyze) select count(*) from contend1;
>
>                                                                      QUERY PLAN
>
> -------------------------------------------------------------------------------------------------------------------------------------------------
>
> Finalize Aggregate  (cost=12006.11..12006.12 rows=1 width=8) (actual time=1075.214..1075.449 rows=1 loops=1)
>
>    Output: count(*)
>
>    ->  Gather  (cost=12006.08..12006.09 rows=8 width=8) (actual time=1075.198..1075.433 rows=1 loops=1)
>
>          Output: (PARTIAL count(*))
>
>          Workers Planned: 8
>
>          Workers Launched: 0
>
>          ->  Partial Aggregate  (cost=12006.08..12006.09 rows=1 width=8) (actual time=1074.674..1074.676 rows=1 loops=1)
>
>                Output: PARTIAL count(*)
>
>                ->  Parallel Seq Scan on public.contend1  (cost=0.00..11690.06 rows=126406 width=0) (actual time=0.008..587.454 rows=1010200 loops=1)
>
>                      Output: id, val, c2, c3, c4, c5, c6, c7, c8, c9, c10, crt_time
>
> Planning Time: 0.123 ms
>
> Execution Time: 1075.588 ms
>
>  postgres=# select pg_backend_pid();
>
> pg_backend_pid
>
> ----------------
>
>         2182678
>
>
>
>
>
>   2, Use gdb to attach to the backend process. Add a breakpoint at parallel.c:219 and continue.
>
>
>
>   gdb  -q -p 2182678
>
>   ...
>
>   (gdb) b parallel.c:219
>
> Breakpoint 1 at 0x55d085: file /home/liupc/build/build_postgres2/../../devel/postgres2/src/backend/access/transam/parallel.c, line 219.
>
>   (gdb) c
>
> Continuing.
>
>
>
>   3, In the psql client, we can execute the explain command from step 1 again.
>
>      After gdb hits the breakpoint, wait a moment, then execute next.
>
>      Use gdb to check active_snapshot and transaction_snapshot: active_snapshot->xmin is 158987 and transaction_snapshot->xmin is 162160.
>
> When I tested it with gdb, sometimes active_snapshot was the same as transaction_snapshot. If so, try it multiple times, and wait a longer time before executing next.
>
>
>
>    Breakpoint 1, InitializeParallelDSM (pcxt=0x2d53670)
>
>     at /home/liupc/build/build_postgres2/../../devel/postgres2/src/backend/access/transam/parallel.c:219
>
> 219             Snapshot        transaction_snapshot = GetTransactionSnapshot();
>
> (gdb) n
>
> 220             Snapshot        active_snapshot = GetActiveSnapshot();
>
> (gdb)
>
> 223             oldcontext = MemoryContextSwitchTo(TopTransactionContext);
>
> (gdb) p *transaction_snapshot
>
> $1 = {snapshot_type = SNAPSHOT_MVCC, xmin = 162160, xmax = 183011, xip = 0x2d50d10, xcnt = 179, subxip = 0x148a9cddf010,
>
>   subxcnt = 0, suboverflowed = true, takenDuringRecovery = false, copied = false, curcid = 0, speculativeToken = 0,
>
>   active_count = 0, regd_count = 0, ph_node = {first_child = 0x0, next_sibling = 0x0, prev_or_parent = 0x0}, whenTaken = 0, lsn = 0}
>
> (gdb) p *active_snapshot
>
> $2 = {snapshot_type = SNAPSHOT_MVCC, xmin = 158987, xmax = 173138, xip = 0x2d53288, xcnt = 178, subxip = 0x0, subxcnt = 0,
>
>   suboverflowed = true, takenDuringRecovery = false, copied = true, curcid = 0, speculativeToken = 0, active_count = 1,
>
>   regd_count = 2, ph_node = {first_child = 0x0, next_sibling = 0x0, prev_or_parent = 0x2d52e48}, whenTaken = 0, lsn = 0}
>
> (gdb)
>
>

Hi Pengcheng,

I followed all your steps, but I see somewhat different behavior in my
environment: 99% of the time, the xmin and xmax of the active_snapshot
and transaction_snapshot are the same (regardless of how long I wait at
different points after the breakpoint is hit). I've had one or two
instances where the xmax values differ. I managed to catch just one
case where there were different xmin and xmax values in the snapshots,
but this occurred just prior to the pgbench client completing and
terminating, and when I continued in the debugger, there was no
crash/coredump.

However, I think I've spotted something potentially important to this issue:
For me, almost always "suboverflowed = false" in the snapshots (except
in that one case, right at the end of the pgbench run), yet in your
gdb example "suboverflowed = true" in both of the snapshots (i.e. the
snapshot subxip array has overflowed). I'm guessing that this may be
related to the coredump issue, but I'm not exactly sure how it has
happened, and why it seemingly isn't being handled correctly and
causes that Assert to fire in your case.
Can you try to find out how the snapshot suboverflowed flag is being
set in your case, since you are seeing it readily in your examined
snapshots? I think there are only a few places where it can be set to
"true" (e.g. procarray.c:1641).
Also, does increasing PGPROC_MAX_CACHED_SUBXIDS avoid, or delay, the
problem for you? It's currently defined as 64.
I notice that there have been some changes related to snapshot data
handling and subxid overflow since 13.2, so I'm wondering whether your
coredump issue can be reproduced with the latest code?

Regards,
Greg Nancarrow
Fujitsu Australia


