Re: inconsistent results querying table partitioned by date - Mailing list pgsql-bugs

From Amit Langote
Subject Re: inconsistent results querying table partitioned by date
Date
Msg-id 3c21289a-ad03-ad06-f354-cadbb807a63c@lab.ntt.co.jp
Whole thread Raw
In response to Re: inconsistent results querying table partitioned by date  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On 2019/05/17 4:25, Tom Lane wrote:
> I wrote:
>> Anyway, I've pushed David's patch now, on to look at Amit's.
> 
> So, the changes in gen_prune_steps_from_opexps seem okay, but
> the change in perform_pruning_base_step is just wrong, as can
> be seen from this extended form of the test case:
> 
> create table mc3p (a int, b int, c int) partition by range (a, abs(b), c);
> create table mc3p1 partition of mc3p
>   for values from (1, 1, 1) to (1, 1, 10);
> create table mc3p2 partition of mc3p
>   for values from (1, 1, 10) to (2, minvalue, minvalue);
> create table mc3p3 partition of mc3p
>  for values from (2, minvalue, minvalue) to (3, maxvalue, maxvalue);
> insert into mc3p values (1, 1, 1), (1,1,10), (2, 1, 1);
> 
> set plan_cache_mode = force_generic_plan;
> prepare init_exec_prune_q1 as
>   select * from mc3p where a = $1 and abs(b) <= $2 and c < (select 13);
> explain (analyze, costs off, summary off, timing off)
> execute init_exec_prune_q1 (1,2);
> deallocate init_exec_prune_q1;
> reset plan_cache_mode;
> 
> drop table mc3p;
>
> This should of course find the (1,1,1) row, but it does not because
> it converts the initial pruning condition to be "a = 1 and abs(b) = 2",
> whereas the correct thing would be "a = 1 and abs(b) <= 2".

Oops, you're right.

> It seems to me that the correct fix, rather than forcing equality
> strategy when you've missed out some of the quals, is to back up
> and use the strategy of the last clause you did use.  But unless
> I'm missing something, that information is just not available from
> the PartitionPruneStepOp data structure.

It's not.  PartitionPruneStepOp's opstrategy refers to the strategy of the
operator for the last matched key.

> We could extend the PartitionPruneStepOp struct to have an array
> or list of StrategyNumbers (but I'm unsure about how safe that'd
> be to backpatch into v11).

I did consider that, but I too feared that it won't be back-patchable.

> Another idea is that we ought to generate separate partitioning
> steps lists for the initial-pruning and runtime-pruning cases,
> ie invoke gen_partprune_steps yet a third time.  Ugh (and that
> still means a data structure change, just in a different place).
>
> The bottom line here is that we've got an inflexible data structure
> that was designed on the assumption that gen_partprune_steps has all
> the intelligence, and trying to adjust things later is just trouble.
> 
> I think this is really quite closely tied to my previous complaints
> about the design for plan-time versus run-time pruning, just mutatis
> mutandis.  The lack of any holistic, principled approach to
> which-quals-apply-when is what's biting us here.
> 
> Moreover, now that I understand what's going on here a bit better,
> I am deathly afraid that there are yet other bugs hiding in this
> same area.  What we've basically got is that analyze_partkey_exprs
> and perform_pruning_base_step are trying to reverse-engineer what
> gen_partprune_steps would have done differently if it used only the
> quals not containing exec params.  They don't really have enough
> information available to do that correctly, and even if they did,
> the fact that the logic looks nothing like the same is going to
> be a fertile source of bugs and maintenance gotchas.
> 
> In other words, as I compose this I'm talking myself into the
> idea that invoking gen_partprune_steps three times (for planner,
> executor startup, and executor runtime, with corresponding
> filters on which clauses can be used) is the only safe near-term
> fix.  In the future we can look at cutting down the repetitive
> costs that entails.

I have to agree that running gen_partprune_steps separately for each
instance of performing get_matching_partitions() would result in cleaner
code overall.  Cleaner in the sense that the logic of which set of clauses
are applicable to which instance of pruning is now centralized and doesn't
require code to reverse-engineering that in multiple places, as you say.

Maybe, we *can* avoid doing full-blown gen_partprune_steps() a 2nd and a
3rd time, an idea for which I'll post in reply to your other email.

Thanks,
Amit




pgsql-bugs by date:

Previous
From: Michael Paquier
Date:
Subject: Re: BUG #15804: Assertion failure when using logging_collector withEXEC_BACKEND
Next
From: PG Bug reporting form
Date:
Subject: BUG #15811: luoxiang