Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range - Mailing list pgsql-hackers

From Vladimir Borodin
Subject Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Date
Msg-id D4C1FFBF-E724-4EFA-941C-4A68AE357D73@simply.name
Whole thread Raw
In response to Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi all.

28 апр. 2017 г., в 0:22, Tom Lane <tgl@sss.pgh.pa.us> написал(а):

Robert Haas <robertmhaas@gmail.com> writes:
Yep, and I've seen that turn into a serious problem in production.

And that’s why we started digging into it :) We have already seen that to be a problem in another project (not so serious) and that time we rewrote the query. But now planning of such query consumes much more CPU (i.e. on [1] each line is for one DB host) and nearly all queries on all hosts (primary and all standbys) work much worse.


How so?  Shouldn't the indexscan go back and mark such tuples dead in
the index, such that they'd be visited this way only once?  If that's
not happening, maybe we should try to fix it.

That is definitely not happening. Here is the perf output when the problem happens:
root@pgload01e ~ # perf report | grep -v '^#' | head   56.67%  postgres   postgres                [.] _bt_checkkeys   19.27%  postgres   postgres                [.] _bt_readpage    2.09%  postgres   postgres                [.] pglz_decompress    2.03%  postgres   postgres                [.] LWLockAttemptLock    1.61%  postgres   postgres                [.] PinBuffer.isra.3    1.14%  postgres   postgres                [.] hash_search_with_hash_value    0.68%  postgres   postgres                [.] LWLockRelease    0.42%  postgres   postgres                [.] AllocSetAlloc    0.40%  postgres   postgres                [.] SearchCatCache    0.40%  postgres   postgres                [.] ReadBuffer_common
root@pgload01e ~ #
And here is one of backtrace of one of backends (they all look pretty the same):
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fe9e2660709 in _bt_checkkeys (scan=scan@entry=0x7fe9e48ec838, page=page@entry=0x7fe8e477d780 "\263$", offnum=offnum@entry=338, dir=dir@entry=ForwardScanDirection, continuescan=continuescan@entry=0x7ffe3addfd1f "\001h\341\b") at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/access/nbtree/nbtutils.c:1378
#0  0x00007fe9e2660709 in _bt_checkkeys (scan=scan@entry=0x7fe9e48ec838, page=page@entry=0x7fe8e477d780 "\263$", offnum=offnum@entry=338, dir=dir@entry=ForwardScanDirection, continuescan=continuescan@entry=0x7ffe3addfd1f "\001h\341\b") at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/access/nbtree/nbtutils.c:1378
#1  0x00007fe9e265d301 in _bt_readpage (scan=scan@entry=0x7fe9e48ec838, dir=dir@entry=ForwardScanDirection, offnum=338) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/access/nbtree/nbtsearch.c:1198
#2  0x00007fe9e265dbce in _bt_steppage (scan=scan@entry=0x7fe9e48ec838, dir=dir@entry=ForwardScanDirection) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/access/nbtree/nbtsearch.c:1357
#3  0x00007fe9e265efec in _bt_endpoint (dir=ForwardScanDirection, scan=0x7fe9e48ec838) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/access/nbtree/nbtsearch.c:1739
#4  _bt_first (scan=scan@entry=0x7fe9e48ec838, dir=dir@entry=ForwardScanDirection) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/access/nbtree/nbtsearch.c:746
#5  0x00007fe9e265c119 in btgettuple (scan=0x7fe9e48ec838, dir=ForwardScanDirection) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/access/nbtree/nbtree.c:326
#6  0x00007fe9e2656162 in index_getnext_tid (scan=scan@entry=0x7fe9e48ec838, direction=direction@entry=ForwardScanDirection) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/access/index/indexam.c:415
#7  0x00007fe9e2656354 in index_getnext (scan=scan@entry=0x7fe9e48ec838, direction=direction@entry=ForwardScanDirection) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/access/index/indexam.c:553
#8  0x00007fe9e2942f70 in get_actual_variable_range (vardata=vardata@entry=0x7ffe3ade1620, sortop=<optimized out>, min=0x7fe9e4912ae0, max=0x0, root=0x7fe9e48e89b0) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/utils/adt/selfuncs.c:5137
#9  0x00007fe9e2943c24 in ineq_histogram_selectivity (root=root@entry=0x7fe9e48e89b0, vardata=vardata@entry=0x7ffe3ade1620, opproc=opproc@entry=0x7ffe3ade1550, isgt=isgt@entry=0 '\000', constval=constval@entry=493076, consttype=consttype@entry=23) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/utils/adt/selfuncs.c:833
#10 0x00007fe9e29445f8 in scalarineqsel (root=root@entry=0x7fe9e48e89b0, operator=operator@entry=97, isgt=isgt@entry=0 '\000', vardata=vardata@entry=0x7ffe3ade1620, constval=493076, consttype=23) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/utils/adt/selfuncs.c:563
#11 0x00007fe9e2946233 in mergejoinscansel (root=root@entry=0x7fe9e48e89b0, clause=<optimized out>, opfamily=<optimized out>, strategy=<optimized out>, nulls_first=<optimized out>, leftstart=leftstart@entry=0x7ffe3ade1728, leftend=leftend@entry=0x7ffe3ade1730, rightstart=rightstart@entry=0x7ffe3ade1738, rightend=rightend@entry=0x7ffe3ade1740) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/utils/adt/selfuncs.c:3056
#12 0x00007fe9e27e1b16 in cached_scansel (rinfo=0x7fe9e49100c0, rinfo=0x7fe9e49100c0, pathkey=0x7fe9e490fb80, root=0x7fe9e48e89b0) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/path/costsize.c:2626
#13 initial_cost_mergejoin (root=root@entry=0x7fe9e48e89b0, workspace=workspace@entry=0x7ffe3ade1830, jointype=jointype@entry=JOIN_INNER, mergeclauses=mergeclauses@entry=0x7fe9e4912a60, outer_path=outer_path@entry=0x7fe9e490f810, inner_path=inner_path@entry=0x7fe9e490f5b8, outersortkeys=outersortkeys@entry=0x7fe9e4912a10, innersortkeys=innersortkeys@entry=0x7fe9e4912ab0, sjinfo=0x7ffe3ade1a70) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/path/costsize.c:2226
#14 0x00007fe9e27ebde6 in try_mergejoin_path (root=root@entry=0x7fe9e48e89b0, joinrel=joinrel@entry=0x7fe9e4910b28, outer_path=outer_path@entry=0x7fe9e490f810, inner_path=inner_path@entry=0x7fe9e490f5b8, pathkeys=0x0, mergeclauses=mergeclauses@entry=0x7fe9e4912a60, outersortkeys=outersortkeys@entry=0x7fe9e4912a10, innersortkeys=innersortkeys@entry=0x7fe9e4912ab0, jointype=jointype@entry=JOIN_INNER, extra=extra@entry=0x7ffe3ade1970) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/path/joinpath.c:443
#15 0x00007fe9e27ec705 in sort_inner_and_outer (extra=0x7ffe3ade1970, jointype=JOIN_INNER, innerrel=0x7fe9e48eba00, outerrel=0x7fe9e48ea990, joinrel=0x7fe9e4910b28, root=0x7fe9e48e89b0) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/path/joinpath.c:766
#16 add_paths_to_joinrel (root=root@entry=0x7fe9e48e89b0, joinrel=joinrel@entry=0x7fe9e4910b28, outerrel=outerrel@entry=0x7fe9e48ea990, innerrel=innerrel@entry=0x7fe9e48eba00, jointype=jointype@entry=JOIN_INNER, sjinfo=sjinfo@entry=0x7ffe3ade1a70, restrictlist=restrictlist@entry=0x7fe9e4912850) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/path/joinpath.c:173
#17 0x00007fe9e27ee201 in make_join_rel (root=root@entry=0x7fe9e48e89b0, rel1=rel1@entry=0x7fe9e48ea990, rel2=rel2@entry=0x7fe9e48eba00) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/path/joinrels.c:754
#18 0x00007fe9e27eeabb in make_rels_by_clause_joins (other_rels=<optimized out>, old_rel=<optimized out>, root=<optimized out>) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/path/joinrels.c:274
#19 join_search_one_level (root=root@entry=0x7fe9e48e89b0, level=level@entry=2) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/path/joinrels.c:96
#20 0x00007fe9e27df09b in standard_join_search (root=0x7fe9e48e89b0, levels_needed=2, initial_rels=<optimized out>) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/path/allpaths.c:2186
#21 0x00007fe9e27df41b in make_one_rel (root=root@entry=0x7fe9e48e89b0, joinlist=joinlist@entry=0x7fe9e48ebd58) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/path/allpaths.c:176
#22 0x00007fe9e27fa10e in query_planner (root=root@entry=0x7fe9e48e89b0, tlist=tlist@entry=0x7fe9e48e9020, qp_callback=qp_callback@entry=0x7fe9e27fa740 <standard_qp_callback>, qp_extra=qp_extra@entry=0x7ffe3ade1d60) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/plan/planmain.c:255
#23 0x00007fe9e27fbc14 in grouping_planner (root=root@entry=0x7fe9e48e89b0, inheritance_update=inheritance_update@entry=0 '\000', tuple_fraction=<optimized out>, tuple_fraction@entry=0) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/plan/planner.c:1701
#24 0x00007fe9e27fe861 in subquery_planner (glob=glob@entry=0x7fe9e4533270, parse=parse@entry=0x7fe9e4760290, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=0 '\000', tuple_fraction=tuple_fraction@entry=0) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/plan/planner.c:759
#25 0x00007fe9e27ff7bd in standard_planner (parse=0x7fe9e4760290, cursorOptions=256, boundParams=0x0) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/optimizer/plan/planner.c:292
#26 0x00007fe9dbeb6345 in pathman_planner_hook () from /usr/lib/postgresql/9.6/lib/pg_pathman.so
#27 0x00007fe9e288d1d4 in pg_plan_query (querytree=<optimized out>, cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/tcop/postgres.c:798
#28 0x00007fe9e288d2c4 in pg_plan_queries (querytrees=<optimized out>, cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/tcop/postgres.c:857
#29 0x00007fe9e288f52d in exec_simple_query (query_string=0x7fe9e475e9d0 "SELECT * FROM polygon_table polygon INNER JOIN second_table second ON (polygon.second_id = second.id) WHERE ST_Intersects(poly, ST_SetSrid(ST_MakePoint(52.3433914, 58.7438431), 4326));") at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/tcop/postgres.c:1022
#30 PostgresMain (argc=<optimized out>, argv=argv@entry=0x7fe9e452f428, dbname=0x7fe9e452f350 "small_weather", username=<optimized out>) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/tcop/postgres.c:4076
#31 0x00007fe9e26134e5 in BackendRun (port=0x7fe9e452d0b0) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/postmaster/postmaster.c:4271
#32 BackendStartup (port=0x7fe9e452d0b0) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/postmaster/postmaster.c:3945
#33 ServerLoop () at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/postmaster/postmaster.c:1701
#34 0x00007fe9e282e4db in PostmasterMain (argc=5, argv=<optimized out>) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/postmaster/postmaster.c:1309
#35 0x00007fe9e2614232 in main (argc=5, argv=0x7fe9e44e1210) at /home/robot-gerrit/workspace/postgresql-9.6-deb/src/build/../src/backend/main/main.c:228
And if you disable autovacuum on the table, the problem will continue until you manually run vacuum on it. There could be thousands of index scans during this time. Dmitriy investigated the problem a bit more and may provide some more details.

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: [HACKERS] vcregress support for single TAP tests
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] convert EXSITS to inner join gotcha and bug