Thread: FailedAssertion("pd_idx == pinfo->nparts", File: "execPartition.c", Line: 1689)

FailedAssertion("pd_idx == pinfo->nparts", File: "execPartition.c", Line: 1689)

From: Justin Pryzby
Core was generated by `postgres: telsasoft ts [local] BIND                                           '.

(gdb) bt
#0  0x00007f0951303387 in raise () from /lib64/libc.so.6
#1  0x00007f0951304a78 in abort () from /lib64/libc.so.6
#2  0x0000000000921005 in ExceptionalCondition (conditionName=conditionName@entry=0xa5db3d "pd_idx == pinfo->nparts", errorType=errorType@entry=0x977389 "FailedAssertion", fileName=fileName@entry=0xa5da88 "execPartition.c", lineNumber=lineNumber@entry=1689) at assert.c:67
#3  0x0000000000672806 in ExecCreatePartitionPruneState (planstate=planstate@entry=0x908f6d8, partitionpruneinfo=<optimized out>) at execPartition.c:1689
#4  0x000000000068444a in ExecInitAppend (node=node@entry=0x7036b90, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at nodeAppend.c:132
#5  0x00000000006731fd in ExecInitNode (node=0x7036b90, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at execProcnode.c:179
#6  0x000000000069d03a in ExecInitResult (node=node@entry=0x70363d8, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at nodeResult.c:210
#7  0x000000000067323c in ExecInitNode (node=0x70363d8, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at execProcnode.c:164
#8  0x000000000069e834 in ExecInitSort (node=node@entry=0x7035ca8, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at nodeSort.c:210
#9  0x0000000000672ff0 in ExecInitNode (node=0x7035ca8, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at execProcnode.c:313
#10 0x00000000006812e8 in ExecInitAgg (node=node@entry=0x68311d0, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at nodeAgg.c:3292
#11 0x0000000000672fb1 in ExecInitNode (node=0x68311d0, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at execProcnode.c:328
#12 0x000000000068925a in ExecInitGatherMerge (node=node@entry=0x6830998, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at nodeGatherMerge.c:110
#13 0x0000000000672f33 in ExecInitNode (node=0x6830998, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at execProcnode.c:348
#14 0x00000000006812e8 in ExecInitAgg (node=node@entry=0x682eda8, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at nodeAgg.c:3292
#15 0x0000000000672fb1 in ExecInitNode (node=node@entry=0x682eda8, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at execProcnode.c:328
#16 0x000000000066c8e6 in InitPlan (eflags=16, queryDesc=<optimized out>) at execMain.c:1020
#17 standard_ExecutorStart (queryDesc=<optimized out>, eflags=16) at execMain.c:266
#18 0x00007f0944ca83b5 in pgss_ExecutorStart (queryDesc=0x1239b08, eflags=<optimized out>) at pg_stat_statements.c:1007
#19 0x00007f09117e4891 in explain_ExecutorStart (queryDesc=0x1239b08, eflags=<optimized out>) at auto_explain.c:301
#20 0x00000000007f9983 in PortalStart (portal=0xeff810, params=0xfacc98, eflags=0, snapshot=0x0) at pquery.c:505
#21 0x00000000007f7370 in PostgresMain (argc=<optimized out>, argv=argv@entry=0xeb8500, dbname=0xeb84e0 "ts", username=<optimized out>) at postgres.c:1987
#22 0x000000000048916e in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4523
#23 BackendStartup (port=0xeb1000) at postmaster.c:4215
#24 ServerLoop () at postmaster.c:1727
#25 0x000000000076ec85 in PostmasterMain (argc=argc@entry=13, argv=argv@entry=0xe859b0) at postmaster.c:1400
#26 0x000000000048a82d in main (argc=13, argv=0xe859b0) at main.c:210

#3  0x0000000000672806 in ExecCreatePartitionPruneState (planstate=planstate@entry=0x908f6d8, partitionpruneinfo=<optimized out>) at execPartition.c:1689
        pd_idx = <optimized out>
        pp_idx = <optimized out>
        pprune = 0x908f910
        partdesc = 0x91937f8
        pinfo = 0x7d6ee78
        partrel = <optimized out>
        partkey = 0xfbba28
        lc2__state = {l = 0x7d6ee20, i = 0}
        partrelpruneinfos = 0x7d6ee20
        lc2 = <optimized out>
        npartrelpruneinfos = <optimized out>
        prunedata = 0x908f908
        j = 0
        lc__state = {l = 0x7d6edc8, i = 0}
        estate = 0x11563f0
        prunestate = 0x908f8b0
        n_part_hierarchies = <optimized out>
        lc = <optimized out>
        i = 0

(gdb) p *pinfo
$2 = {type = T_PartitionedRelPruneInfo, rtindex = 7, present_parts = 0x7d6ef10, nparts = 414, subplan_map = 0x7d6ef68,
  subpart_map = 0x7d6f780, relid_map = 0x7d6ff98, initial_pruning_steps = 0x7d707b0,
  exec_pruning_steps = 0x0, execparamids = 0x0}

(gdb) p pd_idx        
$3 = <optimized out>


< 2020-08-02 02:04:17.358 SST  >LOG:  server process (PID 20954) was terminated by signal 6: Aborted
< 2020-08-02 02:04:17.358 SST  >DETAIL:  Failed process was running: 
                    INSERT INTO child.cdrs_data_users_per_cell_20200801 (...list of columns elided...)
                    (
                    SELECT ..., $3::timestamp, $2,
                MODE() WITHIN GROUP (ORDER BY ...) AS ..., STRING_AGG(DISTINCT ..., ',') AS ..., ...

This crashed at 2am, which at first I thought was maybe due to simultaneously
creating today's partition.

Aug  2 02:04:08 telsasoftsky abrt-hook-ccpp: Process 19264 (postgres) of user 26 killed by SIGABRT - dumping core
Aug  2 02:04:17 telsasoftsky abrt-hook-ccpp: Process 20954 (postgres) of user 26 killed by SIGABRT - ignoring (repeated crash)

Running:
postgresql13-server-13-beta2_1PGDG.rhel7.x86_64

Maybe this is a problem tickled by something new in v13.  However, this is a
new VM, and at the time of the crash I was running a shell loop around
pg_restore, in reverse-chronological order.  I have full logs, and I found that
a table which the crashing process would have tried to SELECT FROM had just
been CREATEd:

| 2020-08-02 02:04:01.48-11  | duration: 106.275 ms  statement: CREATE TABLE child.cdrs_huawei_sgwrecord_2019_06_14 (

That table *currently* has:
|Number of partitions: 416 (Use \d+ to list them.)
And the oldest table is still child.cdrs_huawei_sgwrecord_2019_06_14 (since the
shell loop probably quickly spun through hundreds of pg_restores, failing to
connect to the database "in recovery").  And today's partition was already
created, at 2020-08-02 01:30:35.  So I think creating today's partition wasn't
the trigger.

Based on commit logs, I suspect this may be an "older bug", specifically maybe
with:

|commit 898e5e3290a72d288923260143930fb32036c00c
|Author: Robert Haas <rhaas@postgresql.org>
|Date:   Thu Mar 7 11:13:12 2019 -0500
|
|    Allow ATTACH PARTITION with only ShareUpdateExclusiveLock.

I don't think it matters, but the process surrounding the table being INSERTed
INTO is more than a little special, involving renames, detaches, creation,
re-attaching within a transaction.  I think that doesn't matter though, and the
issue is surrounding the table being SELECTed *from*, which is actually behind
a view.



On Sun, Aug 2, 2020 at 2:11 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> Based on commit logs, I suspect this may be an "older bug", specifically maybe
> with:
>
> |commit 898e5e3290a72d288923260143930fb32036c00c
> |Author: Robert Haas <rhaas@postgresql.org>
> |Date:   Thu Mar 7 11:13:12 2019 -0500
> |
> |    Allow ATTACH PARTITION with only ShareUpdateExclusiveLock.
>
> I don't think it matters, but the process surrounding the table being INSERTed
> INTO is more than a little special, involving renames, detaches, creation,
> re-attaching within a transaction.  I think that doesn't matter though, and the
> issue is surrounding the table being SELECTed *from*, which is actually behind
> a view.

That's an entirely reasonable guess, but it doesn't seem easy to
understand exactly what happened here based on the provided
information. The assertion failure probably indicates that
pinfo->relid_map[] and partdesc->oids[] differ in some way other than
additional elements having been inserted into the latter. For example,
some elements might have disappeared, or the order might have changed.
This isn't supposed to happen, because DETACH PARTITION requires
heavier locking, and the order changing without anything getting
detached should be impossible. But evidently it did. If we could dump
out the two arrays in question it might shed more light on exactly how
things went wrong.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Mon, Aug 03, 2020 at 11:41:37AM -0400, Robert Haas wrote:
> On Sun, Aug 2, 2020 at 2:11 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > Based on commit logs, I suspect this may be an "older bug", specifically maybe
> > with:
> >
> > |commit 898e5e3290a72d288923260143930fb32036c00c
> > |Author: Robert Haas <rhaas@postgresql.org>
> > |Date:   Thu Mar 7 11:13:12 2019 -0500
> > |
> > |    Allow ATTACH PARTITION with only ShareUpdateExclusiveLock.
> >
> > I don't think it matters, but the process surrounding the table being INSERTed
> > INTO is more than a little special, involving renames, detaches, creation,
> > re-attaching within a transaction.  I think that doesn't matter though, and the
> > issue is surrounding the table being SELECTed *from*, which is actually behind
> > a view.
> 
> That's an entirely reasonable guess, but it doesn't seem easy to
> understand exactly what happened here based on the provided
> information. The assertion failure probably indicates that
> pinfo->relid_map[] and partdesc->oids[] differ in some way other than
> additional elements having been inserted into the latter. For example,
> some elements might have disappeared, or the order might have changed.
> This isn't supposed to happen, because DETACH PARTITION requires
> heavier locking, and the order changing without anything getting
> detached should be impossible. But evidently it did. If we could dump
> out the two arrays in question it might shed more light on exactly how
> things went wrong.

(gdb) p *pinfo->relid_map@414
$8 = {22652203, 22652104, 22651920, 22651654, 22647359, 22645269, 22644012, 22639600, 22638852, 22621975, 22615355,
22615256,22615069, 22610573, 22606503, 22606404, 22600237, 22600131, 22596610, 22595013, 
 
  22594914, 22594725, 22594464, 22589317, 22589216, 22587504, 22582570, 22580264, 22577047, 22576948, 22576763,
22576656,22574077, 22570911, 22570812, 22564524, 22564113, 22558519, 22557080, 22556981, 22556793, 
 
  22555205, 22550680, 22550579, 22548884, 22543899, 22540822, 22536665, 22536566, 22536377, 22535133, 22528876,
22527780,22526065, 22521131, 22517878, 22513674, 22513575, 22513405, 22513288, 22507520, 22504728, 
 
  22504629, 22493699, 22466016, 22458641, 22457551, 22457421, 22457264, 22452879, 22449864, 22449765, 22443560,
22442952,22436193, 22434644, 22434469, 22434352, 22430792, 22426903, 22426804, 22420732, 22420025, 
 
  22413050, 22411963, 22411864, 22411675, 22407652, 22404156, 22404049, 22397550, 22394622, 22390035, 22389936,
22389752,22388386, 22383211, 22382115, 22381934, 22375210, 22370297, 22367878, 22367779, 22367586, 
 
  22362556, 22359928, 22358236, 22353374, 22348704, 22345692, 22345593, 22345399, 22341347, 22336809, 22336709,
22325812,22292836, 22287756, 22287657, 22287466, 22283194, 22278659, 22278560, 22272041, 22269121, 
 
  22264424, 22264325, 22264135, 22260102, 22255418, 22254818, 22248841, 22245824, 22241490, 22241391, 22241210,
22240354,22236224, 22235123, 22234060, 22228744, 22228345, 22228033, 22222528, 22222429, 22222330, 
 
  22222144, 22222045, 22218408, 22215986, 22215887, 22209311, 22209212, 22207919, 22205203, 22203385, 22203298,
22203211,22203124, 22202954, 22202859, 22202772, 22201869, 22200438, 22197706, 22195027, 22194932, 
 
  22194834, 22191208, 22188412, 22187029, 22182238, 22182134, 22182030, 22181849, 22181737, 22181107, 22175811,
22175710,22169859, 22169604, 22159266, 22158131, 22158021, 22157824, 22153348, 22153236, 22147308, 
 
  22146736, 22143778, 22143599, 22143471, 22138702, 22138590, 22132612, 22132513, 22132271, 22132172, 22131987,
21935599,21932664, 21927997, 21925823, 21885889, 21862973, 21859854, 21859671, 21858869, 21853440, 
 
  21851884, 21845405, 21842901, 21837523, 21837413, 21837209, 21832347, 21829359, 21827652, 21822602, 21816150,
21805995,21805812, 21805235, 21798914, 21798026, 21791895, 21791124, 21783854, 21783744, 21783540, 
 
  21780568, 21774797, 21774687, 21768326, 21764063, 21759627, 21759517, 21759311, 21755697, 21751690, 21751156,
21744906,21738543, 21736176, 21735992, 21735769, 21727603, 21725956, 21716432, 21678682, 21670968, 
 
  21670858, 21670665, 21669342, 21661932, 21661822, 21655311, 21650838, 21646721, 21646611, 21646409, 21640984,
21637816,21637706, 21631061, 21622723, 21621459, 21621320, 21621148, 21612902, 21612790, 21606170, 
 
  21602265, 21597910, 21597800, 21597605, 21592489, 21589415, 21589305, 21582910, 21578017, 21576758, 21576648,
21572692,21566633, 21566521, 21560127, 21560017, 21553910, 21553800, 21553613, 21553495, 21549102, 
 
  21548992, 21542759, 21540922, 21532093, 21531983, 21531786, 21531676, 21531264, 21531154, 21525290, 21524817,
21519470,21519360, 21519165, 21516571, 21514269, 21514159, 21508389, 21508138, 21508028, 21507830, 
 
  21503457, 21502484, 21496897, 21494287, 21493722, 21493527, 21491807, 21488530, 21486122, 21485766, 21485603,
21485383,21481969, 21481672, 21476245, 21472576, 21468851, 21468741, 21468546, 21467832, 21460086, 
 
  21425406, 21420632, 21420506, 21419974, 21417830, 21417365, 21408677, 21401314, 21400808, 21399725, 21399113,
21393312,21393202, 21387393, 21384625, 21384361, 21384172, 21384054, 21379960, 21374013, 21365760, 
 
  21361813, 21361703, 21361504, 21358333, 21358220, 21352848, 21348896, 21348484, 21343591, 21337675, 21337472,
21331017,21330907, 21325895, 21325785, 21325675, 21325565, 21325370, 21319929, 21316068, 21315958, 
 
  21312609, 21284187, 21262186, 21258549, 21258439, 21258279, 21258131, 21254759, 21251782, 21251094, 21250984,
21250874,21250764, 21244302, 21239067, 21238951, 21238831, 21236783, 21235605, 21230205, 21166173, 
 
  21151836, 21151726, 21151608, 21151498, 21151388, 21151278, 21151168, 21151055, 2576248, 2576255, 2576262, 2576269,
2576276,21456497, 22064128, 0}
 

(gdb) p *partdesc->oids@415
$12 = {22653702, 22652203, 22652104, 22651920, 22651654, 22647359, 22645269, 22644012, 22639600, 22638852, 22621975,
22615355,22615256, 22615069, 22610573, 22606503, 22606404, 22600237, 22600131, 22596610,
 
  22595013, 22594914, 22594725, 22594464, 22589317, 22589216, 22587504, 22582570, 22580264, 22577047, 22576948,
22576763,22576656, 22574077, 22570911, 22570812, 22564524, 22564113, 22558519, 22557080, 22556981,
 
  22556793, 22555205, 22550680, 22550579, 22548884, 22543899, 22540822, 22536665, 22536566, 22536377, 22535133,
22528876,22527780, 22526065, 22521131, 22517878, 22513674, 22513575, 22513405, 22513288, 22507520,
 
  22504728, 22504629, 22493699, 22466016, 22458641, 22457551, 22457421, 22457264, 22452879, 22449864, 22449765,
22443560,22442952, 22436193, 22434644, 22434469, 22434352, 22430792, 22426903, 22426804, 22420732,
 
  22420025, 22413050, 22411963, 22411864, 22411675, 22407652, 22404156, 22404049, 22397550, 22394622, 22390035,
22389936,22389752, 22388386, 22383211, 22382115, 22381934, 22375210, 22370297, 22367878, 22367779,
 
  22367586, 22362556, 22359928, 22358236, 22353374, 22348704, 22345692, 22345593, 22345399, 22341347, 22336809,
22336709,22325812, 22292836, 22287756, 22287657, 22287466, 22283194, 22278659, 22278560, 22272041,
 
  22269121, 22264424, 22264325, 22264135, 22260102, 22255418, 22254818, 22248841, 22245824, 22241490, 22241391,
22241210,22240354, 22236224, 22235123, 22234060, 22228744, 22228345, 22228033, 22222528, 22222429,
 
  22222330, 22222144, 22222045, 22218408, 22215986, 22215887, 22209311, 22209212, 22207919, 22205203, 22203385,
22203298,22203211, 22203124, 22202954, 22202859, 22202772, 22201869, 22200438, 22197706, 22195027,
 
  22194932, 22194834, 22191208, 22188412, 22187029, 22182238, 22182134, 22182030, 22181849, 22181737, 22181107,
22175811,22175710, 22169859, 22169604, 22159266, 22158131, 22158021, 22157824, 22153348, 22153236,
 
  22147308, 22146736, 22143778, 22143599, 22143471, 22138702, 22138590, 22132612, 22132513, 22132271, 22132172,
22131987,21935599, 21932664, 21927997, 21925823, 21885889, 21862973, 21859854, 21859671, 21858869,
 
  21853440, 21851884, 21845405, 21842901, 21837523, 21837413, 21837209, 21832347, 21829359, 21827652, 21822602,
21816150,21805995, 21805812, 21805235, 21798914, 21798026, 21791895, 21791124, 21783854, 21783744,
 
  21783540, 21780568, 21774797, 21774687, 21768326, 21764063, 21759627, 21759517, 21759311, 21755697, 21751690,
21751156,21744906, 21738543, 21736176, 21735992, 21735769, 21727603, 21725956, 21716432, 21678682, 
  21670968, 21670858, 21670665, 21669342, 21661932, 21661822, 21655311, 21650838, 21646721, 21646611, 21646409,
21640984,21637816, 21637706, 21631061, 21622723, 21621459, 21621320, 21621148, 21612902, 21612790,
 
  21606170, 21602265, 21597910, 21597800, 21597605, 21592489, 21589415, 21589305, 21582910, 21578017, 21576758,
21576648,21572692, 21566633, 21566521, 21560127, 21560017, 21553910, 21553800, 21553613, 21553495,
 
  21549102, 21548992, 21542759, 21540922, 21532093, 21531983, 21531786, 21531676, 21531264, 21531154, 21525290,
21524817,21519470, 21519360, 21519165, 21516571, 21514269, 21514159, 21508389, 21508138, 21508028,
 
  21507830, 21503457, 21502484, 21496897, 21494287, 21493722, 21493527, 21491807, 21488530, 21486122, 21485766,
21485603,21485383, 21481969, 21481672, 21476245, 21472576, 21468851, 21468741, 21468546, 21467832,
 
  21460086, 21425406, 21420632, 21420506, 21419974, 21417830, 21417365, 21408677, 21401314, 21400808, 21399725,
21399113,21393312, 21393202, 21387393, 21384625, 21384361, 21384172, 21384054, 21379960, 21374013,
 
  21365760, 21361813, 21361703, 21361504, 21358333, 21358220, 21352848, 21348896, 21348484, 21343591, 21337675,
21337472,21331017, 21330907, 21325895, 21325785, 21325675, 21325565, 21325370, 21319929, 21316068,
 
  21315958, 21312609, 21284187, 21262186, 21258549, 21258439, 21258279, 21258131, 21254759, 21251782, 21251094,
21250984,21250874, 21250764, 21244302, 21239067, 21238951, 21238831, 21236783, 21235605, 21230205,
 
  21166173, 21151836, 21151726, 21151608, 21151498, 21151388, 21151278, 21151168, 21151055, 2576248, 2576255, 2576262,
2576269,2576276, 21456497, 22064128, 22628862}
 

ts=# SELECT 22628862 ::regclass; 
regclass | child.cdrs_huawei_msc_voice_2020_08_02

=> This one was *probably* created around 00:30, but I didn't save logs earlier
than 0200.  That table was probably involved in a query around 2020-08-02
02:02:01.

ts=# SELECT 22653702 ::regclass; 
regclass | child.cdrs_huawei_msc_voice_2019_06_15

=> This one was created by pg_restore at: 2020-08-02 02:03:24

Maybe it's significant that the crash happened during BIND.  This is a prepared
query.

-- 
Justin



On Tue, Aug 4, 2020 at 1:11 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Mon, Aug 03, 2020 at 11:41:37AM -0400, Robert Haas wrote:
> > On Sun, Aug 2, 2020 at 2:11 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > Based on commit logs, I suspect this may be an "older bug", specifically maybe
> > > with:
> > >
> > > |commit 898e5e3290a72d288923260143930fb32036c00c
> > > |Author: Robert Haas <rhaas@postgresql.org>
> > > |Date:   Thu Mar 7 11:13:12 2019 -0500
> > > |
> > > |    Allow ATTACH PARTITION with only ShareUpdateExclusiveLock.
> > >
> > > I don't think it matters, but the process surrounding the table being INSERTed
> > > INTO is more than a little special, involving renames, detaches, creation,
> > > re-attaching within a transaction.  I think that doesn't matter though, and the
> > > issue is surrounding the table being SELECTed *from*, which is actually behind
> > > a view.
> >
> > That's an entirely reasonable guess, but it doesn't seem easy to
> > understand exactly what happened here based on the provided
> > information. The assertion failure probably indicates that
> > pinfo->relid_map[] and partdesc->oids[] differ in some way other than
> > additional elements having been inserted into the latter. For example,
> > some elements might have disappeared, or the order might have changed.
> > This isn't supposed to happen, because DETACH PARTITION requires
> > heavier locking, and the order changing without anything getting
> > detached should be impossible. But evidently it did. If we could dump
> > out the two arrays in question it might shed more light on exactly how
> > things went wrong.

It may be this commit that went into PG 12 that is causing the problem:

commit 428b260f87e8861ba8e58807b69d433db491c4f4
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sat Mar 30 18:58:55 2019 -0400

    Speed up planning when partitions can be pruned at plan time.

which had this:

-               /* Double-check that list of relations has not changed. */
-               Assert(memcmp(partdesc->oids, pinfo->relid_map,
-                      pinfo->nparts * sizeof(Oid)) == 0);
+               /*
+                * Double-check that the list of unpruned relations has not
+                * changed.  (Pruned partitions are not in relid_map[].)
+                */
+#ifdef USE_ASSERT_CHECKING
+               for (int k = 0; k < pinfo->nparts; k++)
+               {
+                   Assert(partdesc->oids[k] == pinfo->relid_map[k] ||
+                          pinfo->subplan_map[k] == -1);
+               }
+#endif

to account for partitions that were pruned by the planner for which we
decided to put 0 into relid_map, but it only considered the case where
the number of partitions doesn't change since the plan was created.
The crash reported here is in the other case where the concurrently
added partitions cause the execution-time PartitionDesc to have more
partitions than the one that PartitionedRelPruneInfo is based on.

I was able to reproduce such a crash as follows:

Start with these tables in session 1.

create table foo (a int, b int) partition by list (a);
create table foo1 partition of foo for values in (1);
create table foo2 partition of foo for values in (2);
create table foo3 partition of foo for values in (3);

Attach gdb with a breakpoint set in PartitionDirectoryLookup() and run this:

explain analyze select * from foo where a <> 1 and a = (select 2);
<After hitting the breakpoint in PartitionDirectoryLookup() called by
the planner, step to the end of it and leave it there>

In another session:

create table foo4 (like foo);
alter table foo attach partition foo4 for values in (4);

That should finish without waiting for any lock and send an
invalidation message to session 1.  Go back to gdb attached to session
1 and hit continue, resulting in the plan containing runtime pruning
info being executed.  ExecCreatePartitionPruneState() opens foo which
will now have 4 partitions instead of the 3 that the planner would
have seen, of which foo1 is pruned (a <> 1), so the following block is
executed:

            if (partdesc->nparts == pinfo->nparts)
            ...
            else
            {
                int         pd_idx = 0;
                int         pp_idx;

                /*
                 * Some new partitions have appeared since plan time, and
                 * those are reflected in our PartitionDesc but were not
                 * present in the one used to construct subplan_map and
                 * subpart_map.  So we must construct new and longer arrays
                 * where the partitions that were originally present map to
                 * the same place, and any added indexes map to -1, as if the
                 * new partitions had been pruned.
                 */
                pprune->subpart_map = palloc(sizeof(int) * partdesc->nparts);
                for (pp_idx = 0; pp_idx < partdesc->nparts; ++pp_idx)
                {
                    if (pinfo->relid_map[pd_idx] != partdesc->oids[pp_idx])
                    {
                        pprune->subplan_map[pp_idx] = -1;
                        pprune->subpart_map[pp_idx] = -1;
                    }
                    else
                    {
                        pprune->subplan_map[pp_idx] =
                            pinfo->subplan_map[pd_idx];
                        pprune->subpart_map[pp_idx] =
                            pinfo->subpart_map[pd_idx++];
                    }
                }
                Assert(pd_idx == pinfo->nparts);
            }

where it crashes due to having relid_map[] and partdesc->oids[] that
look like this:

(gdb) p *pinfo->relid_map@pinfo->nparts
$3 = {0, 74106, 74109}

(gdb) p *partdesc->oids@partdesc->nparts
$6 = {74103, 74106, 74109, 74112}

The 0 in relid_map matches nothing in partdesc->oids, so the loop ends
without ever moving forward in the relid_map array, causing the Assert
to fail.
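
To make that concrete, here is a minimal standalone sketch (plain C, not
PostgreSQL source; Oid is mocked up locally and the array values are the ones
from the gdb output above) of the matching loop quoted above.  Because
relid_map[0] is 0, it never matches any partdesc OID, pd_idx never advances,
and the final assertion fails:

#include <assert.h>
#include <stdio.h>

typedef unsigned int Oid;

int
main(void)
{
    Oid         relid_map[] = {0, 74106, 74109};        /* pinfo->relid_map, nparts = 3 */
    Oid         oids[] = {74103, 74106, 74109, 74112};  /* partdesc->oids, 4 partitions */
    int         nparts = 3;
    int         pd_idx = 0;

    for (int pp_idx = 0; pp_idx < 4; ++pp_idx)
    {
        if (relid_map[pd_idx] != oids[pp_idx])
            printf("pp_idx %d: no match, subplan_map/subpart_map -> -1\n", pp_idx);
        else
            printf("pp_idx %d: matches relid_map[%d]\n", pp_idx, pd_idx++);
    }

    printf("pd_idx = %d, nparts = %d\n", pd_idx, nparts);
    assert(pd_idx == nparts);   /* fails: 0 != 3 */
    return 0;
}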

The attached patch should fix that.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Tue, Aug 04, 2020 at 08:12:10PM +0900, Amit Langote wrote:
> It may be this commit that went into PG 12 that is causing the problem:

Thanks for digging into this.

> to account for partitions that were pruned by the planner for which we
> decided to put 0 into relid_map, but it only considered the case where
> the number of partitions doesn't change since the plan was created.
> The crash reported here is in the other case where the concurrently
> added partitions cause the execution-time PartitionDesc to have more
> partitions than the one that PartitionedRelPruneInfo is based on.

Is there anything else needed to check that my crash matches your analysis ?

(gdb) up
#4  0x000000000068444a in ExecInitAppend (node=node@entry=0x7036b90, estate=estate@entry=0x11563f0, eflags=eflags@entry=16) at nodeAppend.c:132
132     nodeAppend.c: No such file or directory.
(gdb) p *node->appendplans   
$17 = {type = T_List, length = 413, max_length = 509, elements = 0x7037400, initial_elements = 0x7037400}

(gdb) down
#3  0x0000000000672806 in ExecCreatePartitionPruneState (planstate=planstate@entry=0x908f6d8, partitionpruneinfo=<optimized out>) at execPartition.c:1689
1689    execPartition.c: No such file or directory.

$27 = {ps = {type = T_AppendState, plan = 0x7036b90, state = 0x11563f0, ExecProcNode = 0x6842c0 <ExecAppend>,
    ExecProcNodeReal = 0x0, instrument = 0x0, worker_instrument = 0x0, worker_jit_instrument = 0x0,
    qual = 0x0, lefttree = 0x0, righttree = 0x0, initPlan = 0x0, subPlan = 0x0, chgParam = 0x0,
    ps_ResultTupleDesc = 0x0, ps_ResultTupleSlot = 0x0, ps_ExprContext = 0x908f7f0, ps_ProjInfo = 0x0, scandesc = 0x0,
    scanops = 0x0, outerops = 0x0, innerops = 0x0, resultops = 0x0, scanopsfixed = false, outeropsfixed = false,
    inneropsfixed = false, resultopsfixed = false, scanopsset = false, outeropsset = false,
    inneropsset = false, resultopsset = false}, appendplans = 0x0, as_nplans = 0, as_whichplan = -1,
  as_first_partial_plan = 0, as_pstate = 0x0, pstate_len = 0, as_prune_state = 0x0, as_valid_subplans = 0x0,
  choose_next_subplan = 0x0}

-- 
Justin



On Mon, Aug 3, 2020 at 12:11 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> (gdb) p *pinfo->relid_map@414
> (gdb) p *partdesc->oids@415

Whoa, I didn't know about using @ in gdb to print multiple elements. Wild!

Anyway, these two arrays differ in that the latter array has 22653702
inserted at the beginning and 22628862 at the end, and also in that a
0 has been removed. This code can't cope with things getting removed,
so kaboom. I think Amit probably has the right idea about what's going
on here and how to fix it, but I haven't yet had time to study it in
detail.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Wed, Aug 5, 2020 at 9:52 AM Amit Langote <amitlangote09@gmail.com> wrote:
> On Wed, Aug 5, 2020 at 9:32 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > On Wed, Aug 05, 2020 at 09:26:20AM +0900, Amit Langote wrote:
> > > On Wed, Aug 5, 2020 at 12:11 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > >
> > > > On Tue, Aug 04, 2020 at 08:12:10PM +0900, Amit Langote wrote:
> > > > > It may be this commit that went into PG 12 that is causing the problem:
> > > >
> > > > Thanks for digging into this.
> > > >
> > > > > to account for partitions that were pruned by the planner for which we
> > > > > decided to put 0 into relid_map, but it only considered the case where
> > > > > the number of partitions doesn't change since the plan was created.
> > > > > The crash reported here is in the other case where the concurrently
> > > > > added partitions cause the execution-time PartitionDesc to have more
> > > > > partitions than the one that PartitionedRelPruneInfo is based on.
> > > >
> > > > Is there anything else needed to check that my crash matches your analysis ?
> > >
> > > If you can spot a 0 in the output of the following, then yes.
> > >
> > > (gdb) p *pinfo->relid_map@pinfo->nparts
> >
> > I guess you knew that an earlier message has just that.  Thanks.
> > https://www.postgresql.org/message-id/20200803161133.GA21372@telsasoft.com
>
> Yeah, you showed:
>
> (gdb) p *pinfo->relid_map@414
>
> And there is indeed a 0 in there, but I wasn't sure if it was actually
> in the array or a stray zero due to forcing gdb to show beyond the
> array bound.  Does pinfo->nparts match 414?

(sorry, I forgot to hit reply all in last two emails.)

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



On Wed, Aug 05, 2020 at 09:53:44AM +0900, Amit Langote wrote:
> On Wed, Aug 5, 2020 at 9:52 AM Amit Langote <amitlangote09@gmail.com> wrote:
> > On Wed, Aug 5, 2020 at 9:32 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > On Wed, Aug 05, 2020 at 09:26:20AM +0900, Amit Langote wrote:
> > > > On Wed, Aug 5, 2020 at 12:11 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > > >
> > > > > On Tue, Aug 04, 2020 at 08:12:10PM +0900, Amit Langote wrote:
> > > > > > It may be this commit that went into PG 12 that is causing the problem:
> > > > >
> > > > > Thanks for digging into this.
> > > > >
> > > > > > to account for partitions that were pruned by the planner for which we
> > > > > > decided to put 0 into relid_map, but it only considered the case where
> > > > > > the number of partitions doesn't change since the plan was created.
> > > > > > The crash reported here is in the other case where the concurrently
> > > > > > added partitions cause the execution-time PartitionDesc to have more
> > > > > > partitions than the one that PartitionedRelPruneInfo is based on.
> > > > >
> > > > > Is there anything else needed to check that my crash matches your analysis ?
> > > >
> > > > If you can spot a 0 in the output of the following, then yes.
> > > >
> > > > (gdb) p *pinfo->relid_map@pinfo->nparts
> > >
> > > I guess you knew that an earlier message has just that.  Thanks.
> > > https://www.postgresql.org/message-id/20200803161133.GA21372@telsasoft.com
> >
> > Yeah, you showed:
> >
> > (gdb) p *pinfo->relid_map@414
> >
> > And there is indeed a 0 in there, but I wasn't sure if it was actually
> > in the array or a stray zero due to forcing gdb to show beyond the
> > array bound.  Does pinfo->nparts match 414?

Yes.  I typed 414 manually since the array lengths were suspect.

(gdb) p pinfo->nparts
$1 = 414
(gdb) set print elements 0
(gdb) p *pinfo->relid_map@pinfo->nparts
$3 = {22652203, 22652104, 22651920, 22651654, 22647359, 22645269, 22644012, 22639600, 22638852, 22621975, 22615355,
22615256,22615069, 22610573, 22606503, 22606404, 22600237, 22600131, 22596610, 22595013, 
 
  22594914, 22594725, 22594464, 22589317, 22589216, 22587504, 22582570, 22580264, 22577047, 22576948, 22576763,
22576656,22574077, 22570911, 22570812, 22564524, 22564113, 22558519, 22557080, 22556981, 22556793, 
 
  22555205, 22550680, 22550579, 22548884, 22543899, 22540822, 22536665, 22536566, 22536377, 22535133, 22528876,
22527780,22526065, 22521131, 22517878, 22513674, 22513575, 22513405, 22513288, 22507520, 22504728, 
 
  22504629, 22493699, 22466016, 22458641, 22457551, 22457421, 22457264, 22452879, 22449864, 22449765, 22443560,
22442952,22436193, 22434644, 22434469, 22434352, 22430792, 22426903, 22426804, 22420732, 22420025, 
 
  22413050, 22411963, 22411864, 22411675, 22407652, 22404156, 22404049, 22397550, 22394622, 22390035, 22389936,
22389752,22388386, 22383211, 22382115, 22381934, 22375210, 22370297, 22367878, 22367779, 22367586, 
 
  22362556, 22359928, 22358236, 22353374, 22348704, 22345692, 22345593, 22345399, 22341347, 22336809, 22336709,
22325812,22292836, 22287756, 22287657, 22287466, 22283194, 22278659, 22278560, 22272041, 22269121, 
 
  22264424, 22264325, 22264135, 22260102, 22255418, 22254818, 22248841, 22245824, 22241490, 22241391, 22241210,
22240354,22236224, 22235123, 22234060, 22228744, 22228345, 22228033, 22222528, 22222429, 22222330, 
 
  22222144, 22222045, 22218408, 22215986, 22215887, 22209311, 22209212, 22207919, 22205203, 22203385, 22203298,
22203211,22203124, 22202954, 22202859, 22202772, 22201869, 22200438, 22197706, 22195027, 22194932, 
 
  22194834, 22191208, 22188412, 22187029, 22182238, 22182134, 22182030, 22181849, 22181737, 22181107, 22175811,
22175710,22169859, 22169604, 22159266, 22158131, 22158021, 22157824, 22153348, 22153236, 22147308, 
 
  22146736, 22143778, 22143599, 22143471, 22138702, 22138590, 22132612, 22132513, 22132271, 22132172, 22131987,
21935599,21932664, 21927997, 21925823, 21885889, 21862973, 21859854, 21859671, 21858869, 21853440, 
 
  21851884, 21845405, 21842901, 21837523, 21837413, 21837209, 21832347, 21829359, 21827652, 21822602, 21816150,
21805995,21805812, 21805235, 21798914, 21798026, 21791895, 21791124, 21783854, 21783744, 21783540, 
 
  21780568, 21774797, 21774687, 21768326, 21764063, 21759627, 21759517, 21759311, 21755697, 21751690, 21751156,
21744906,21738543, 21736176, 21735992, 21735769, 21727603, 21725956, 21716432, 21678682, 21670968, 
 
  21670858, 21670665, 21669342, 21661932, 21661822, 21655311, 21650838, 21646721, 21646611, 21646409, 21640984,
21637816,21637706, 21631061, 21622723, 21621459, 21621320, 21621148, 21612902, 21612790, 21606170, 
 
  21602265, 21597910, 21597800, 21597605, 21592489, 21589415, 21589305, 21582910, 21578017, 21576758, 21576648,
21572692,21566633, 21566521, 21560127, 21560017, 21553910, 21553800, 21553613, 21553495, 21549102, 
 
  21548992, 21542759, 21540922, 21532093, 21531983, 21531786, 21531676, 21531264, 21531154, 21525290, 21524817,
21519470,21519360, 21519165, 21516571, 21514269, 21514159, 21508389, 21508138, 21508028, 21507830, 
 
  21503457, 21502484, 21496897, 21494287, 21493722, 21493527, 21491807, 21488530, 21486122, 21485766, 21485603,
21485383,21481969, 21481672, 21476245, 21472576, 21468851, 21468741, 21468546, 21467832, 21460086, 
 
  21425406, 21420632, 21420506, 21419974, 21417830, 21417365, 21408677, 21401314, 21400808, 21399725, 21399113,
21393312,21393202, 21387393, 21384625, 21384361, 21384172, 21384054, 21379960, 21374013, 21365760, 
 
  21361813, 21361703, 21361504, 21358333, 21358220, 21352848, 21348896, 21348484, 21343591, 21337675, 21337472,
21331017,21330907, 21325895, 21325785, 21325675, 21325565, 21325370, 21319929, 21316068, 21315958, 
 
  21312609, 21284187, 21262186, 21258549, 21258439, 21258279, 21258131, 21254759, 21251782, 21251094, 21250984,
21250874,21250764, 21244302, 21239067, 21238951, 21238831, 21236783, 21235605, 21230205, 21166173, 
 
  21151836, 21151726, 21151608, 21151498, 21151388, 21151278, 21151168, 21151055, 2576248, 2576255, 2576262, 2576269,
2576276,21456497, 22064128, 0}
 

-- 
Justin



On Wed, Aug 5, 2020 at 10:04 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Wed, Aug 05, 2020 at 09:53:44AM +0900, Amit Langote wrote:
> > On Wed, Aug 5, 2020 at 9:52 AM Amit Langote <amitlangote09@gmail.com> wrote:
> > > On Wed, Aug 5, 2020 at 9:32 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > > On Wed, Aug 05, 2020 at 09:26:20AM +0900, Amit Langote wrote:
> > > > > On Wed, Aug 5, 2020 at 12:11 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > > > >
> > > > > > On Tue, Aug 04, 2020 at 08:12:10PM +0900, Amit Langote wrote:
> > > > > > > It may be this commit that went into PG 12 that is causing the problem:
> > > > > >
> > > > > > Thanks for digging into this.
> > > > > >
> > > > > > > to account for partitions that were pruned by the planner for which we
> > > > > > > decided to put 0 into relid_map, but it only considered the case where
> > > > > > > the number of partitions doesn't change since the plan was created.
> > > > > > > The crash reported here is in the other case where the concurrently
> > > > > > > added partitions cause the execution-time PartitionDesc to have more
> > > > > > > partitions than the one that PartitionedRelPruneInfo is based on.
> > > > > >
> > > > > > Is there anything else needed to check that my crash matches your analysis ?
> > > > >
> > > > > If you can spot a 0 in the output of the following, then yes.
> > > > >
> > > > > (gdb) p *pinfo->relid_map@pinfo->nparts
> > > >
> > > > I guess you knew that an earlier message has just that.  Thanks.
> > > > https://www.postgresql.org/message-id/20200803161133.GA21372@telsasoft.com
> > >
> > > Yeah, you showed:
> > >
> > > (gdb) p *pinfo->relid_map@414
> > >
> > > And there is indeed a 0 in there, but I wasn't sure if it was actually
> > > in the array or a stray zero due to forcing gdb to show beyond the
> > > array bound.  Does pinfo->nparts match 414?
>
> Yes.  I typed 414 manually since the array lengths were suspect.
>
> (gdb) p pinfo->nparts
> $1 = 414
> (gdb) set print elements 0
> (gdb) p *pinfo->relid_map@pinfo->nparts
> $3 = {....
>   21151836, 21151726, 21151608, 21151498, 21151388, 21151278, 21151168, 21151055, 2576248, 2576255, 2576262, 2576269,
2576276,21456497, 22064128, 0}
 

Thanks.  There is a 0 in there, which can only be there if the planner was
able to prune that last partition.  So the planner saw a table with
414 partitions, was able to prune the last one, and constructed an
Append plan with 413 subplans for the unpruned partitions, as you showed
upthread:

> (gdb) p *node->appendplans
> $17 = {type = T_List, length = 413, max_length = 509, elements = 0x7037400, initial_elements = 0x7037400}

This suggests that the crash I was able to produce is similar to what you saw.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



Amit Langote <amitlangote09@gmail.com> writes:
> The crash reported here is in the other case where the concurrently
> added partitions cause the execution-time PartitionDesc to have more
> partitions than the one that PartitionedRelPruneInfo is based on.
> I was able to reproduce such a crash as follows:

Yeah, I can repeat the case per these directions.  I concur that the
issue is that ExecCreatePartitionPruneState is failing to cope with
zeroes in the relid_map.

> The attached patch should fix that.

I don't like this patch at all though; I do not think it is being nearly
careful enough to ensure that it's matched the surviving relation OIDs
correctly.  In particular it blithely assumes that a zero in relid_map
*must* match the immediately next entry in partdesc->oids, which is easy
to break if the new partition is adjacent to the one the planner managed
to prune.  So I think we should do it more like the attached.

I'm strongly tempted to convert the trailing Assert to an actual
test-and-elog, too, but didn't do so here.
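
For illustration, here is a standalone sketch (plain C, not the patch or
PostgreSQL source; Oid and OidIsValid are mocked up locally) of how the loop
behaves with this change, using the example arrays from upthread: the
InvalidOid entry is simply skipped, both the planner-pruned partition's OID
and the concurrently added partition map to -1, and pd_idx ends up equal to
pinfo->nparts.

#include <assert.h>
#include <stdio.h>

typedef unsigned int Oid;
#define OidIsValid(oid) ((oid) != 0)

int
main(void)
{
    Oid         relid_map[] = {0, 74106, 74109};        /* planner saw 3 parts, pruned one */
    Oid         oids[] = {74103, 74106, 74109, 74112};  /* executor now sees 4 parts */
    int         nparts = 3;
    int         pd_idx = 0;

    for (int pp_idx = 0; pp_idx < 4; ++pp_idx)
    {
        /* skip InvalidOid entries for planner-pruned partitions */
        while (pd_idx < nparts && !OidIsValid(relid_map[pd_idx]))
            pd_idx++;

        if (pd_idx >= nparts || relid_map[pd_idx] != oids[pp_idx])
            printf("pp_idx %d: treated as pruned (-1)\n", pp_idx);
        else
            printf("pp_idx %d: matches relid_map[%d]\n", pp_idx, pd_idx++);
    }

    assert(pd_idx == nparts);   /* now holds: 3 == 3 */
    return 0;
}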

            regards, tom lane

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index fb6ce49056..221a34e738 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1673,7 +1673,19 @@ ExecCreatePartitionPruneState(PlanState *planstate,
                 pprune->subpart_map = palloc(sizeof(int) * partdesc->nparts);
                 for (pp_idx = 0; pp_idx < partdesc->nparts; ++pp_idx)
                 {
-                    if (pinfo->relid_map[pd_idx] != partdesc->oids[pp_idx])
+                    /*
+                     * pinfo->relid_map can contain InvalidOid entries for
+                     * partitions pruned by the planner.  We cannot tell
+                     * exactly which of the partdesc entries these correspond
+                     * to, but we don't have to; just skip over them.  The
+                     * non-pruned relid_map entries, however, had better be a
+                     * subset of the partdesc entries and in the same order.
+                     */
+                    while (pd_idx < pinfo->nparts &&
+                           !OidIsValid(pinfo->relid_map[pd_idx]))
+                        pd_idx++;
+                    if (pd_idx >= pinfo->nparts ||
+                        pinfo->relid_map[pd_idx] != partdesc->oids[pp_idx])
                     {
                         pprune->subplan_map[pp_idx] = -1;
                         pprune->subpart_map[pp_idx] = -1;
@@ -1686,6 +1698,14 @@ ExecCreatePartitionPruneState(PlanState *planstate,
                             pinfo->subpart_map[pd_idx++];
                     }
                 }
+
+                /*
+                 * It might seem that we need to skip any trailing InvalidOid
+                 * entries in pinfo->relid_map before asserting that we
+                 * scanned all of the relid_map.  But we will have skipped
+                 * them above, because they must correspond to some
+                 * partdesc->oids entries; we just couldn't tell which.
+                 */
                 Assert(pd_idx == pinfo->nparts);
             }


On Wed, Aug 5, 2020 at 1:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't like this patch at all though; I do not think it is being nearly
> careful enough to ensure that it's matched the surviving relation OIDs
> correctly.  In particular it blithely assumes that a zero in relid_map
> *must* match the immediately next entry in partdesc->oids, which is easy
> to break if the new partition is adjacent to the one the planner managed
> to prune.  So I think we should do it more like the attached.

Ooh, nice catch.

> I'm strongly tempted to convert the trailing Assert to an actual
> test-and-elog, too, but didn't do so here.

I was thinking about that, too. +1 for taking that step.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Aug 5, 2020 at 1:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm strongly tempted to convert the trailing Assert to an actual
>> test-and-elog, too, but didn't do so here.

> I was thinking about that, too. +1 for taking that step.

Will do.

In the longer term, it's annoying that we have no test methodology
for this other than "manually set a breakpoint here".  If we're going
to allow plan-relevant DDL changes to happen with less than full table
lock, I think we need to improve that.  I spent a little bit of time
just now trying to build an isolationtester case for this, and failed
completely.  So I wonder if we can create some sort of test module that
allows capture of a plan tree and then execution of that plan tree later
(even after relcache inval would normally have forced replanning).
Obviously that could not be a normal SQL-accessible feature, because
some types of invals would make the plan completely wrong, but for
testing purposes it'd be mighty helpful to check that a stale plan
still works.

            regards, tom lane



On Wed, Aug 5, 2020 at 2:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In the longer term, it's annoying that we have no test methodology
> for this other than "manually set a breakpoint here".  If we're going
> to allow plan-relevant DDL changes to happen with less than full table
> lock, I think we need to improve that.  I spent a little bit of time
> just now trying to build an isolationtester case for this, and failed
> completely.  So I wonder if we can create some sort of test module that
> allows capture of a plan tree and then execution of that plan tree later
> (even after relcache inval would normally have forced replanning).
> Obviously that could not be a normal SQL-accessible feature, because
> some types of invals would make the plan completely wrong, but for
> testing purposes it'd be mighty helpful to check that a stale plan
> still works.

That's an interesting idea. I don't know exactly how it would work,
but I agree that it would allow useful testing that we can't do today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Aug 5, 2020 at 2:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In the longer term, it's annoying that we have no test methodology
>> for this other than "manually set a breakpoint here".  If we're going
>> to allow plan-relevant DDL changes to happen with less than full table
>> lock, I think we need to improve that.  I spent a little bit of time
>> just now trying to build an isolationtester case for this, and failed
>> completely.  So I wonder if we can create some sort of test module that
>> allows capture of a plan tree and then execution of that plan tree later
>> (even after relcache inval would normally have forced replanning).
>> Obviously that could not be a normal SQL-accessible feature, because
>> some types of invals would make the plan completely wrong, but for
>> testing purposes it'd be mighty helpful to check that a stale plan
>> still works.

> That's an interesting idea. I don't know exactly how it would work,
> but I agree that it would allow useful testing that we can't do today.

After thinking about it for a little bit, I'm envisioning a test module
that can be loaded into a session, and then it gets into the planner_hook,
and what it does is after each planner execution, take and release an
advisory lock with some selectable ID.  Then we can construct
isolationtester specs that do something like

    session 1                session 2

    LOAD test-module;
    SET custom_guc_for_lock_id = n;
    prepare test tables;

                        SELECT pg_advisory_lock(n);
        
    SELECT victim-query-here;
    ... after planning, query blocks on lock

                        perform DDL changes;
                        SELECT pg_advisory_unlock(n);

    ... query executes with now-stale plan

            regards, tom lane



On Wed, Aug 5, 2020 at 4:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> After thinking about it for a little bit, I'm envisioning a test module
> that can be loaded into a session, and then it gets into the planner_hook,
> and what it does is after each planner execution, take and release an
> advisory lock with some selectable ID.  Then we can construct
> isolationtester specs that do something like
>
>         session 1                               session 2
>
>         LOAD test-module;
>         SET custom_guc_for_lock_id = n;
>         prepare test tables;
>
>                                                 SELECT pg_advisory_lock(n);
>
>         SELECT victim-query-here;
>         ... after planning, query blocks on lock
>
>                                                 perform DDL changes;
>                                                 SELECT pg_advisory_unlock(n);
>
>         ... query executes with now-stale plan

Very sneaky!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Thu, Aug 6, 2020 at 2:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > The attached patch should fix that.
>
> I don't like this patch at all though; I do not think it is being nearly
> careful enough to ensure that it's matched the surviving relation OIDs
> correctly.  In particular it blithely assumes that a zero in relid_map
> *must* match the immediately next entry in partdesc->oids, which is easy
> to break if the new partition is adjacent to the one the planner managed
> to prune.

Indeed, you're right.

>  So I think we should do it more like the attached.

Thanks for pushing that.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com





On Thu, Aug 6, 2020 at 2:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Aug 5, 2020 at 1:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm strongly tempted to convert the trailing Assert to an actual
>> test-and-elog, too, but didn't do so here.

> I was thinking about that, too. +1 for taking that step.

Will do.

In the longer term, it's annoying that we have no test methodology
for this other than "manually set a breakpoint here". 

One of the methods I can see is to add a GUC variable for action injection;
basically, it adds some code gated by the GUC, like this:

if (shall_delay_planning)
    sleep(10);

AFAIK, MongoDB uses much the same technique in its test framework: first it
defines a fail point [1], then it injects code that runs when the fail point
is set [2], and finally, during a test, it can enable a fail point much like a
GUC, but with more attributes [3].  If that would be useful in PG as well and
it is not an urgent task, I would like to help in this direction.
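
To illustrate the pattern I have in mind (plain C, not PostgreSQL code; all
names here are invented for illustration), a fail point is just a named,
globally visible switch that the injected code consults:

#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* A "fail point" is a named flag the test framework can flip at run time. */
struct fail_point
{
    const char *name;
    int         enabled;
};

static struct fail_point fail_points[] = {
    {"delay_after_planning", 0},
};

static struct fail_point *
find_fail_point(const char *name)
{
    for (size_t i = 0; i < sizeof(fail_points) / sizeof(fail_points[0]); i++)
        if (strcmp(fail_points[i].name, name) == 0)
            return &fail_points[i];
    return NULL;
}

/* The injection site: the extra code runs only if the fail point is enabled. */
static void
plan_query(void)
{
    printf("planning done\n");
    if (find_fail_point("delay_after_planning")->enabled)
        sleep(10);              /* give another session time to do DDL */
    printf("execution starts\n");
}

int
main(void)
{
    /* A test would enable the fail point much like setting a GUC. */
    find_fail_point("delay_after_planning")->enabled = 1;
    plan_query();
    return 0;
}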



If we're going
to allow plan-relevant DDL changes to happen with less than full table
lock, I think we need to improve that.  I spent a little bit of time
just now trying to build an isolationtester case for this, and failed
completely.  So I wonder if we can create some sort of test module that
allows capture of a plan tree and then execution of that plan tree later
(even after relcache inval would normally have forced replanning).
Obviously that could not be a normal SQL-accessible feature, because
some types of invals would make the plan completely wrong, but for
testing purposes it'd be mighty helpful to check that a stale plan
still works.

                        regards, tom lane




--
Best Regards
Andy Fan
Andy Fan <zhihui.fan1213@gmail.com> writes:
> On Thu, Aug 6, 2020 at 2:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In the longer term, it's annoying that we have no test methodology
>> for this other than "manually set a breakpoint here".

> One of the methods I see is we can just add some GUC variable for some
> action injection.   basically it adds some code based on the GUC like this;

See my straw-man proposal downthread.  I'm not very excited about putting
things like this into the standard build, because it's really hard to be
sure that there are no security-hazard-ish downsides of putting in ways to
get at testing behaviors from standard SQL.  And then there's the question
of whether you're adding noticeable overhead to production builds.  So a
loadable module that can use some existing hook to provide the needed
behavior seems like a better plan to me, whenever we can do it that way.

In general, though, it seems like we've seldom regretted investments in
test tooling.

            regards, tom lane





On Thu, Aug 6, 2020 at 12:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
> On Thu, Aug 6, 2020 at 2:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In the longer term, it's annoying that we have no test methodology
>> for this other than "manually set a breakpoint here".

> One of the methods I see is we can just add some GUC variable for some
> action injection.   basically it adds some code based on the GUC like this;

See my straw-man proposal downthread.  I'm not very excited about putting
things like this into the standard build, because it's really hard to be
sure that there are no security-hazard-ish downsides of putting in ways to
get at testing behaviors from standard SQL.  And then there's the question
of whether you're adding noticeable overhead to production builds.  So a
loadable module that can use some existing hook to provide the needed
behavior seems like a better plan to me, whenever we can do it that way.

In general, though, it seems like we've seldom regretted investments in
test tooling.

                        regards, tom lane


Thanks for your explanation, I checked it again and it looks a very clean 
method.  The attached is a draft patch based on my understanding.  Hope
I didn't  misunderstand you..

--
Best Regards
Andy Fan
Attachment
Andy Fan <zhihui.fan1213@gmail.com> writes:
> On Thu, Aug 6, 2020 at 12:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> See my straw-man proposal downthread.

> Thanks for your explanation, I checked it again and it looks a very clean
> method.  The attached is a draft patch based on my understanding.  Hope
> I didn't  misunderstand you..

Ah, I was going to play around with that today, but you beat me to it ;-)

A few thoughts after a quick look at the patch:

* I had envisioned that there's a custom GUC controlling the lock ID
used; this would allow blocking different sessions at different points,
if we ever need that.  Also, I'd make the GUC start out as zero which
means "do nothing", so that merely loading the module has no immediate
effect.

* Don't really see the point of the before-planning lock.

* Rather than exposing internal declarations from lockfuncs.c, you
could just write calls to pg_advisory_lock_int8 etc. using
DirectFunctionCall1.

* We need some better name than "test_module".  I had vaguely thought
about "delay_execution", but am surely open to better ideas.

            regards, tom lane





On Thu, Aug 6, 2020 at 10:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
> On Thu, Aug 6, 2020 at 12:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> See my straw-man proposal downthread.

> Thanks for your explanation, I checked it again and it looks a very clean
> method.  The attached is a draft patch based on my understanding.  Hope
> I didn't  misunderstand you..

Ah, I was going to play arond with that today, but you beat me to it ;-)


Very glad to be helpful. 
 
A few thoughts after a quick look at the patch:

* I had envisioned that there's a custom GUC controlling the lock ID
used; this would allow blocking different sessions at different points,
if we ever need that.  Also, I'd make the GUC start out as zero which
means "do nothing", so that merely loading the module has no immediate
effect.

 
I forgot to say that I didn't get the point of the GUC variable in the last
message; now I think it is a smart idea, so I added it.  This way, one session
can only be blocked at one place, but that may not be an issue in practice.

* Don't really see the point of the before-planning lock.

 
Yes, it has been removed now.

* Rather than exposing internal declarations from lockfuncs.c, you
could just write calls to pg_advisory_lock_int8 etc. using
DirectFunctionCall1.


Thanks for sharing it; this method looks pretty good.
 
* We need some better name than "test_module".  I had vaguely thought
about "delay_execution", but am surely open to better ideas.


Both names look good to me; delay_execution looks better, so it is used in v2.

Attached is the v2 patch. 

--
Best Regards
Andy Fan
Attachment
Andy Fan <zhihui.fan1213@gmail.com> writes:
> Attached is the v2 patch.

Forgot to mention that I'd envisioned adding this as a src/test/modules/
module; contrib/ is for things that we intend to expose to users, which
I think this isn't.

I played around with this and got the isolation test I'd experimented
with yesterday to work with it.  If you revert 7a980dfc6 then the
attached patch will expose that bug.  Interestingly, I had to add an
explicit AcceptInvalidationMessages() call to reproduce the bug; so
apparently we do none of those between planning and execution in the
ordinary query code path.  Arguably, that means we're testing a scenario
somewhat different from what can happen in live databases, but I think
it's OK.  Amit's recipe for reproducing the bug works because there are
other relation lock acquisitions (and hence AcceptInvalidationMessages
calls) later in planning than where he asked us to wait.  So this
effectively tests a scenario where a very late A.I.M. call within the
planner detects an inval event for some already-planned relation, and
that seems like a valid-enough scenario.

Anyway, attached find a reviewed version of your patch plus a test
scenario contributed by me (I was too lazy to split it into two
patches).  Barring objections, I'll push this tomorrow or so.

(BTW, I checked and found that this test does *not* expose the problems
with Amit's original patch.  Not sure if it's worth trying to finagle
it so it does.)

            regards, tom lane

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 29de73c060..1428529b04 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -7,6 +7,7 @@ include $(top_builddir)/src/Makefile.global
 SUBDIRS = \
           brin \
           commit_ts \
+          delay_execution \
           dummy_index_am \
           dummy_seclabel \
           snapshot_too_old \
diff --git a/src/test/modules/delay_execution/.gitignore b/src/test/modules/delay_execution/.gitignore
new file mode 100644
index 0000000000..ba2160b66c
--- /dev/null
+++ b/src/test/modules/delay_execution/.gitignore
@@ -0,0 +1,3 @@
+# Generated subdirectories
+/output_iso/
+/tmp_check_iso/
diff --git a/src/test/modules/delay_execution/Makefile b/src/test/modules/delay_execution/Makefile
new file mode 100644
index 0000000000..f270aebf3a
--- /dev/null
+++ b/src/test/modules/delay_execution/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/delay_execution/Makefile
+
+PGFILEDESC = "delay_execution - allow delay between parsing and execution"
+
+MODULE_big = delay_execution
+OBJS = \
+    $(WIN32RES) \
+    delay_execution.o
+
+ISOLATION = partition-addition
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/delay_execution
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/delay_execution/delay_execution.c b/src/test/modules/delay_execution/delay_execution.c
new file mode 100644
index 0000000000..03ea23d0f2
--- /dev/null
+++ b/src/test/modules/delay_execution/delay_execution.c
@@ -0,0 +1,104 @@
+/*-------------------------------------------------------------------------
+ *
+ * delay_execution.c
+ *        Test module to allow delay between parsing and execution of a query.
+ *
+ * The delay is implemented by taking and immediately releasing a specified
+ * advisory lock.  If another process has previously taken that lock, the
+ * current process will be blocked until the lock is released; otherwise,
+ * there's no effect.  This allows an isolationtester script to reliably
+ * test behaviors where some specified action happens in another backend
+ * between parsing and execution of any desired query.
+ *
+ * Copyright (c) 2020, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *      src/test/modules/delay_execution/delay_execution.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include <limits.h>
+
+#include "optimizer/planner.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/inval.h"
+
+
+PG_MODULE_MAGIC;
+
+/* GUC: advisory lock ID to use.  Zero disables the feature. */
+static int    post_planning_lock_id = 0;
+
+/* Save previous planner hook user to be a good citizen */
+static planner_hook_type prev_planner_hook = NULL;
+
+/* Module load/unload functions */
+void        _PG_init(void);
+void        _PG_fini(void);
+
+
+/* planner_hook function to provide the desired delay */
+static PlannedStmt *
+delay_execution_planner(Query *parse, const char *query_string,
+                        int cursorOptions, ParamListInfo boundParams)
+{
+    PlannedStmt *result;
+
+    /* Invoke the planner, possibly via a previous hook user */
+    if (prev_planner_hook)
+        result = prev_planner_hook(parse, query_string, cursorOptions,
+                                   boundParams);
+    else
+        result = standard_planner(parse, query_string, cursorOptions,
+                                  boundParams);
+
+    /* If enabled, delay by taking and releasing the specified lock */
+    if (post_planning_lock_id != 0)
+    {
+        DirectFunctionCall1(pg_advisory_lock_int8,
+                            Int64GetDatum((int64) post_planning_lock_id));
+        DirectFunctionCall1(pg_advisory_unlock_int8,
+                            Int64GetDatum((int64) post_planning_lock_id));
+
+        /*
+         * Ensure that we notice any pending invalidations, since the advisory
+         * lock functions don't do this.
+         */
+        AcceptInvalidationMessages();
+    }
+
+    return result;
+}
+
+/* Module load function */
+void
+_PG_init(void)
+{
+    /* Set up the GUC to control which lock is used */
+    DefineCustomIntVariable("delay_execution.post_planning_lock_id",
+                            "Sets the advisory lock ID to be locked/unlocked after planning.",
+                            "Zero disables the delay.",
+                            &post_planning_lock_id,
+                            0,
+                            0, INT_MAX,
+                            PGC_USERSET,
+                            0,
+                            NULL,
+                            NULL,
+                            NULL);
+
+    /* Install our hook */
+    prev_planner_hook = planner_hook;
+    planner_hook = delay_execution_planner;
+}
+
+/* Module unload function (pro forma, not used currently) */
+void
+_PG_fini(void)
+{
+    planner_hook = prev_planner_hook;
+}
diff --git a/src/test/modules/delay_execution/expected/partition-addition.out b/src/test/modules/delay_execution/expected/partition-addition.out
new file mode 100644
index 0000000000..7c91090eef
--- /dev/null
+++ b/src/test/modules/delay_execution/expected/partition-addition.out
@@ -0,0 +1,21 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s2lock s1exec s2addp s2unlock
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+
+
+step s1exec: LOAD 'delay_execution';
+          SET delay_execution.post_planning_lock_id = 12345;
+          SELECT * FROM foo WHERE a <> 1 AND a <> (SELECT 3); <waiting ...>
+step s2addp: CREATE TABLE foo2 (LIKE foo);
+          ALTER TABLE foo ATTACH PARTITION foo2 FOR VALUES IN (2);
+          INSERT INTO foo VALUES (2, 'ADD2');
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+
+t
+step s1exec: <... completed>
+a              b
+
+4              GHI
diff --git a/src/test/modules/delay_execution/specs/partition-addition.spec b/src/test/modules/delay_execution/specs/partition-addition.spec
new file mode 100644
index 0000000000..2a0948247e
--- /dev/null
+++ b/src/test/modules/delay_execution/specs/partition-addition.spec
@@ -0,0 +1,38 @@
+# Test addition of a partition with less-than-exclusive locking.
+
+setup
+{
+  CREATE TABLE foo (a int, b text) PARTITION BY LIST(a);
+  CREATE TABLE foo1 PARTITION OF foo FOR VALUES IN (1);
+  CREATE TABLE foo3 PARTITION OF foo FOR VALUES IN (3);
+  CREATE TABLE foo4 PARTITION OF foo FOR VALUES IN (4);
+  INSERT INTO foo VALUES (1, 'ABC');
+  INSERT INTO foo VALUES (3, 'DEF');
+  INSERT INTO foo VALUES (4, 'GHI');
+}
+
+teardown
+{
+  DROP TABLE foo;
+}
+
+# The SELECT will be planned with just the three partitions shown above,
+# of which we expect foo1 to be pruned at planning and foo3 at execution.
+# Then we'll block, and by the time the query is actually executed,
+# partition foo2 will also exist.  We expect that not to be scanned.
+# This test is specifically designed to check ExecCreatePartitionPruneState's
+# code for matching up the partition lists in such cases.
+
+session "s1"
+step "s1exec"    { LOAD 'delay_execution';
+          SET delay_execution.post_planning_lock_id = 12345;
+          SELECT * FROM foo WHERE a <> 1 AND a <> (SELECT 3); }
+
+session "s2"
+step "s2lock"    { SELECT pg_advisory_lock(12345); }
+step "s2unlock"    { SELECT pg_advisory_unlock(12345); }
+step "s2addp"    { CREATE TABLE foo2 (LIKE foo);
+          ALTER TABLE foo ATTACH PARTITION foo2 FOR VALUES IN (2);
+          INSERT INTO foo VALUES (2, 'ADD2'); }
+
+permutation "s2lock" "s1exec" "s2addp" "s2unlock"



On Fri, Aug 7, 2020 at 8:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
> Attached is the v2 patch.

Forgot to mention that I'd envisioned adding this as a src/test/modules/
module; contrib/ is for things that we intend to expose to users, which
I think this isn't.

I played around with this and got the isolation test I'd experimented
with yesterday to work with it.  If you revert 7a980dfc6 then the
attached patch will expose that bug.  Interestingly, I had to add an
explicit AcceptInvalidationMessages() call to reproduce the bug; so
apparently we do none of those between planning and execution in the
ordinary query code path.  Arguably, that means we're testing a scenario
somewhat different from what can happen in live databases, but I think
it's OK.  Amit's recipe for reproducing the bug works because there are
other relation lock acquisitions (and hence AcceptInvalidationMessages
calls) later in planning than where he asked us to wait.  So this
effectively tests a scenario where a very late A.I.M. call within the
planner detects an inval event for some already-planned relation, and
that seems like a valid-enough scenario.

Anyway, attached find a reviewed version of your patch plus a test
scenario contributed by me (I was too lazy to split it into two
patches).  Barring objections, I'll push this tomorrow or so.

(BTW, I checked and found that this test does *not* expose the problems
with Amit's original patch.  Not sure if it's worth trying to finagle
it so it does.)

                        regards, tom lane

 
+ * delay_execution.c
+ * Test module to allow delay between parsing and execution of a query.

I am not sure if we need to limit the scope to "between parsing and execution".
IMO, it can be used at any place where we have a hook, so that delay_execution
can inject the lock/unlock logic with a predefined lock ID.  Probably the test
writer only wants one place blocked; then delay_execution.xxx_lock_id can be set
so that only the given lock ID is considered.  Just my 0.01 cents.
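
For illustration, an untested sketch of that idea (the extra GUC and the choice
of ExecutorStart_hook here are made up, and none of this is in the posted patch):

#include "postgres.h"

#include "executor/executor.h"
#include "utils/builtins.h"
#include "utils/inval.h"

static int    pre_execution_lock_id = 0;    /* hypothetical GUC, 0 = disabled */
static ExecutorStart_hook_type prev_ExecutorStart_hook = NULL;

static void
delay_execution_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
    if (pre_execution_lock_id != 0)
    {
        /* Block until any concurrent holder of this advisory lock lets go. */
        DirectFunctionCall1(pg_advisory_lock_int8,
                            Int64GetDatum((int64) pre_execution_lock_id));
        DirectFunctionCall1(pg_advisory_unlock_int8,
                            Int64GetDatum((int64) pre_execution_lock_id));
        /* Make sure any invalidations sent meanwhile are noticed. */
        AcceptInvalidationMessages();
    }

    if (prev_ExecutorStart_hook)
        prev_ExecutorStart_hook(queryDesc, eflags);
    else
        standard_ExecutorStart(queryDesc, eflags);
}

/*
 * _PG_init() would register delay_execution.pre_execution_lock_id with
 * DefineCustomIntVariable (just like the existing GUC) and then install
 * this function in ExecutorStart_hook.
 */

The point is simply that each hook point gets its own delay_execution.xxx_lock_id,
so a test writer can choose exactly one place to block.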

--
Best Regards
Andy Fan
Andy Fan <zhihui.fan1213@gmail.com> writes:
> I am not sure if we need to limit the scope to "between parsing and
> execution",

Yeah, there might be reason to add similar functionality in other
places later.  I'm not sure where yet --- but that idea does make
me slightly unhappy with the "delay_execution" moniker.  I don't
have a better name though ...

            regards, tom lane



On Fri, Aug 7, 2020 at 9:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andy Fan <zhihui.fan1213@gmail.com> writes:
> > Attached is the v2 patch.

Thanks Andy and Tom for this.

> Forgot to mention that I'd envisioned adding this as a src/test/modules/
> module; contrib/ is for things that we intend to expose to users, which
> I think this isn't.
>
> I played around with this and got the isolation test I'd experimented
> with yesterday to work with it.  If you revert 7a980dfc6 then the
> attached patch will expose that bug.  Interestingly, I had to add an
> explicit AcceptInvalidationMessages() call to reproduce the bug; so
> apparently we do none of those between planning and execution in the
> ordinary query code path.  Arguably, that means we're testing a scenario
> somewhat different from what can happen in live databases, but I think
> it's OK.  Amit's recipe for reproducing the bug works because there are
> other relation lock acquisitions (and hence AcceptInvalidationMessages
> calls) later in planning than where he asked us to wait.  So this
> effectively tests a scenario where a very late A.I.M. call within the
> planner detects an inval event for some already-planned relation, and
> that seems like a valid-enough scenario.

Agreed.

Curiously, Justin mentioned upthread that the crash occurred during
BIND of a prepared query, so it had better have been a custom plan that
was being executed, because a generic one based on fewer partitions would
be thrown away due to A.I.M. invoked during AcquireExecutorLocks().

> Anyway, attached find a reviewed version of your patch plus a test
> scenario contributed by me (I was too lazy to split it into two
> patches).  Barring objections, I'll push this tomorrow or so.
>
> (BTW, I checked and found that this test does *not* expose the problems
> with Amit's original patch.  Not sure if it's worth trying to finagle
> it so it does.)

I tried to figure out a scenario where my patch would fail but
couldn't come up with one either, but that's no proof that it isn't
wrong.  For example, I can see that pinfo->relid_map[pinfo->nparts]
can be accessed with my patch, which is not correct.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



Amit Langote <amitlangote09@gmail.com> writes:
> On Fri, Aug 7, 2020 at 9:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ...  Amit's recipe for reproducing the bug works because there are
>> other relation lock acquisitions (and hence AcceptInvalidationMessages
>> calls) later in planning than where he asked us to wait.  So this
>> effectively tests a scenario where a very late A.I.M. call within the
>> planner detects an inval event for some already-planned relation, and
>> that seems like a valid-enough scenario.

> Agreed.

> Curiously, Justin mentioned upthread that the crash occurred during
> BIND of a prepared query, so it better had been that a custom plan was
> being executed, because a generic one based on fewer partitions would
> be thrown away due to A.I.M. invoked during AcquireExecutorLocks().

Based on the above, it seems plausible that the plancache did throw away
an old plan and try to replan, but the inval message announcing partition
addition arrived too late during that planning cycle.  Just like the
normal execution path, the plancache code path won't do more than one
iteration of planning on the way to a demanded query execution.

>> (BTW, I checked and found that this test does *not* expose the problems
>> with Amit's original patch.  Not sure if it's worth trying to finagle
>> it so it does.)

> I tried to figure out a scenario where my patch would fail but
> couldn't come up with one either, but it's no proof that it isn't
> wrong.  For example, I can see that pinfo->relid_map[pinfo->nparts]
> can be accessed with my patch which is not correct.

Yeah, touching array entries off the end of the relid_map array definitely
seems possible with that coding.  But the scenario I was worried about
was that the loop actually attaches the wrong subplan (one for a different
partition) to a partdesc entry.  In an assert-enabled build, that would
have led to assertion failure just below, because then we could not match
up all the remaining relid_map entries; but in a non-assert build, we'd
plow through and bad things would likely happen during execution.
You might need further conditions, like the partitions not being all
identical, for that to actually cause any problem.  I'd poked at this
for a little bit without causing an obvious crash, but I can't claim
to have tried hard.
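
To make that concrete, the kind of list-matching at issue is roughly shaped
like the sketch below (made-up names, not the actual execPartition.c code);
the two hazards are exactly (a) letting an index run off the end of either
array and (b) attaching a subplan to the wrong slot:

#include "postgres.h"

/*
 * Attach planned subplans to a run-time partition list that may contain
 * extra, newly added partitions.  Assumes both arrays list partitions in
 * the same bound order and that every planned partition is still present.
 * Run-time partitions without a planned subplan are marked -1 so they
 * would not be scanned.
 */
static void
match_planned_partitions(const Oid *planned_oids, int nplanned,
                         const Oid *runtime_oids, int nruntime,
                         int *subplan_map)
{
    int         rt_idx = 0;

    /* by default, a run-time partition has no subplan */
    for (int i = 0; i < nruntime; i++)
        subplan_map[i] = -1;

    for (int pl_idx = 0; pl_idx < nplanned; pl_idx++)
    {
        /* skip over run-time partitions that did not exist at plan time */
        while (rt_idx < nruntime &&
               runtime_oids[rt_idx] != planned_oids[pl_idx])
            rt_idx++;
        if (rt_idx >= nruntime)
            elog(ERROR, "planned partition %u not found at run time",
                 planned_oids[pl_idx]);
        subplan_map[rt_idx++] = pl_idx;
    }
}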

            regards, tom lane



On Fri, Aug 07, 2020 at 12:16:11PM +0900, Amit Langote wrote:
> Curiously, Justin mentioned upthread that the crash occurred during
> BIND of a prepared query, so it better had been that a custom plan was
> being executed,

I'm looking at how to check that ... can you give a hint?

-- 
Justin



On Fri, Aug 07, 2020 at 12:16:11PM +0900, Amit Langote wrote:
> Curiously, Justin mentioned upthread that the crash occurred during
> BIND of a prepared query, so it better had been that a custom plan was
> being executed, because a generic one based on fewer partitions would
> be thrown away due to A.I.M. invoked during AcquireExecutorLocks().

Well this statement should only be executed once, and should be using
PQexecParams and not PQexecPrepared (pygresql: pg.DB().query_prepared()).

(gdb) p portal->name
$30 = 0xf03238 ""

(gdb) p portal->prepStmtName 
$31 = 0x0

(gdb) p *portal->cplan
$24 = {magic = 953717834, stmt_list = 0x682ec38, is_oneshot = false, is_saved = true, is_valid = true, planRoleId = 16554, dependsOnRole = false, saved_xmin = 0, generation = 1, refcount = 1, context = 0x682dfd0}

I'm not sure why is_oneshot=false, though...

-- 
Justin



On Fri, Aug 7, 2020 at 1:05 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Fri, Aug 07, 2020 at 12:16:11PM +0900, Amit Langote wrote:
> > Curiously, Justin mentioned upthread that the crash occurred during
> > BIND of a prepared query, so it better had been that a custom plan was
> > being executed, because a generic one based on fewer partitions would
> > be thrown away due to A.I.M. invoked during AcquireExecutorLocks().
>
> Well this statement should only be executed once, and should be using
> PQexecParams and not PQexecPrepared (pygresql: pg.DB().query_prepared()).
>
> (gdb) p portal->name
> $30 = 0xf03238 ""
>
> (gdb) p portal->prepStmtName
> $31 = 0x0
>
> (gdb) p *portal->cplan
> $24 = {magic = 953717834, stmt_list = 0x682ec38, is_oneshot = false, is_saved = true, is_valid = true, planRoleId = 16554, dependsOnRole = false, saved_xmin = 0, generation = 1, refcount = 1, context = 0x682dfd0}
>
> I'm not sure why is_oneshot=false, though...

Perhaps printing *unnamed_stmt_psrc (CachedPlanSource for an unnamed
statement) would put out more information.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



On Fri, Aug 07, 2020 at 01:13:51PM +0900, Amit Langote wrote:
> On Fri, Aug 7, 2020 at 1:05 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > On Fri, Aug 07, 2020 at 12:16:11PM +0900, Amit Langote wrote:
> > > Curiously, Justin mentioned upthread that the crash occurred during
> > > BIND of a prepared query, so it better had been that a custom plan was
> > > being executed, because a generic one based on fewer partitions would
> > > be thrown away due to A.I.M. invoked during AcquireExecutorLocks().
> >
> > Well this statement should only be executed once, and should be using
> > PQexecParams and not PQexecPrepared (pygresql: pg.DB().query_prepared()).
> >
> > (gdb) p portal->name
> > $30 = 0xf03238 ""
> >
> > (gdb) p portal->prepStmtName
> > $31 = 0x0
> >
> > (gdb) p *portal->cplan
> > > $24 = {magic = 953717834, stmt_list = 0x682ec38, is_oneshot = false, is_saved = true, is_valid = true, planRoleId = 16554, dependsOnRole = false, saved_xmin = 0, generation = 1, refcount = 1, context = 0x682dfd0}
> >
> > I'm not sure why is_oneshot=false, though...
> 
> Perhaps printing *unnamed_stmt_psrc (CachedPlanSource for an unnamed
> statement) would put out more information.

(gdb) p *unnamed_stmt_psrc
$49 = {magic = 195726186, raw_parse_tree = 0xfae788,
  query_string = 0xfaddc0 "\n", ' ' <repeats 20 times>, "SELECT $3::timestamp as start_time, $2::int as interval_seconds,\n", ' ' <repeats 20 times>, "first_cgi as cgi, gsm_carr_mcc||gsm_carr_mnc as home_plmn,\n", ' ' <repeats 20 times>, "SUM(chargeable_"..., commandTag = CMDTAG_SELECT, param_types = 0x1254400, num_params = 3, parserSetup = 0x0, parserSetupArg = 0x0, cursor_options = 256, fixed_result = true,
  resultDesc = 0x1376670, context = 0xfae550, query_list = 0x103c9a8, relationOids = 0x11aa580, invalItems = 0x11aa600, search_path = 0x11aa878, query_context = 0xf85790, rewriteRoleId = 16554,
  rewriteRowSecurity = true, dependsOnRLS = false, gplan = 0x0, is_oneshot = false, is_complete = true, is_saved = true, is_valid = false, generation = 1, node = {prev = 0x12fcf28,
    next = 0xdf2c80 <saved_plan_list>}, generic_cost = -1, total_custom_cost = 12187136.696805555, num_custom_plans = 1}

-- 
Justin



On Fri, Aug 7, 2020 at 1:21 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Fri, Aug 07, 2020 at 01:13:51PM +0900, Amit Langote wrote:
> > On Fri, Aug 7, 2020 at 1:05 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > On Fri, Aug 07, 2020 at 12:16:11PM +0900, Amit Langote wrote:
> > > > Curiously, Justin mentioned upthread that the crash occurred during
> > > > BIND of a prepared query, so it better had been that a custom plan was
> > > > being executed, because a generic one based on fewer partitions would
> > > > be thrown away due to A.I.M. invoked during AcquireExecutorLocks().
> > >
> > > Well this statement should only be executed once, and should be using
> > > PQexecParams and not PQexecPrepared (pygresql: pg.DB().query_prepared()).
> > >
> > > (gdb) p portal->name
> > > $30 = 0xf03238 ""
> > >
> > > (gdb) p portal->prepStmtName
> > > $31 = 0x0
> > >
> > > (gdb) p *portal->cplan
> > > $24 = {magic = 953717834, stmt_list = 0x682ec38, is_oneshot = false, is_saved = true, is_valid = true, planRoleId = 16554, dependsOnRole = false, saved_xmin = 0, generation = 1, refcount = 1, context = 0x682dfd0}
> > >
> > > I'm not sure why is_oneshot=false, though...
> >
> > Perhaps printing *unnamed_stmt_psrc (CachedPlanSource for an unnamed
> > statement) would put out more information.
>
> (gdb) p *unnamed_stmt_psrc
> $49 = {... gplan = 0x0, is_oneshot = false, is_complete = true, is_saved = true, is_valid = false, generation = 1, node = {prev = 0x12fcf28,
>     next = 0xdf2c80 <saved_plan_list>}, generic_cost = -1, total_custom_cost = 12187136.696805555, num_custom_plans = 1}

From this part, I think it's clear that a custom plan was used and
that's the only one that this portal seems to know about.  Also, I can
see that only SPI ever builds "oneshot" plans, so is_oneshot would
always be false in your use case.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com