Re: inconsistent results querying table partitioned by date - Mailing list pgsql-bugs

From Alan Jackson
Subject Re: inconsistent results querying table partitioned by date
Date
Msg-id 9B8A2306-1C2D-48F0-8A2B-2E5B795D32E6@tvsquared.com
Whole thread Raw
In response to Re: inconsistent results querying table partitioned by date  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: inconsistent results querying table partitioned by date  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Hi

Many thanks for the quick and patch-filled response to this issue. It’s great to see such an able and active response!

I’m glad my bug report highlighted the problem sufficiently clearly.

Is this likely to turn into an official patch, and if so, where can I track which release etc it will be included in?

Thanks again,
Alan Jackson
Data Architect
TVSquared



> On 10 May 2019, at 10:18, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> Horiguchi-san,
>
> Thanks for checking.
>
>> On 2019/05/10 15:33, Kyotaro HORIGUCHI wrote:
>> At Fri, 10 May 2019 14:37:34 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>> I've attached a patch to fix that.  Actually, I've attached two patches --
>>> the 1st one adds a test for the misbehaving case with *wrong* output
>>> wherein a partition is incorrectly pruned, and the 2nd actually fixes the
>>> bug and updates the output of the test added by the 1st patch.  Divided
>>> the patch this way just to show the bug clearly.
>>
>> But this seems a bit wrong.
>>
>> If the two partition keys were in reverse order, pruning still
>> fails.
>>
>> CREATE TABLE dataid2 (
>>       datadatetime timestamp without time zone NOT NULL,
>>       id integer not null,
>>       CONSTRAINT dataid2_pkey PRIMARY KEY (datadatetime, id)
>> ) PARTITION BY RANGE (datadatetime, id);
>>
>> CREATE TABLE dataid2_201902 PARTITION OF dataid2 FOR VALUES FROM ('2019-02-01 00:00:00', 1) TO ('2019-03-01
00:00:00',1); 
>>
>> CREATE TABLE dataid2_default PARTITION OF dataid2 DEFAULT;
>>
>> insert into dataid2 values ('2019-02-24T00:00:00', 1);
>>
>> select * from dataid2 where id = 1 and datadatetime <  (('2019-02-26T00:00:00'::timestamp::timestamp at time zone
'America/New_York'+ '2 days'::interval) at time zone 'UTC'); 
>> datadatetime | id
>> --------------+----
>> (0 rows)
>>
>> This is wrong.
>
> Ah, indeed.
>
>> The condition is divided into two part (id = 1) and (datadatetime
>> < ..) and the latter reduces to nothing and the former remains
>> unchanged. Pruning continues using id = 1 and (I suppose) but
>> that is not partition_range_datum_bsearch()'s assumption. As the
>> result all partitions (other than default) are gone.
>
> I'd have expected this to result in no values being passed to
> get_matching_range_bounds (nvalues = 0), because the value of the
> expression compared against the 1st key is unavailable at planning time,
> meaning the values for subsequent keys should be ignored.  That would have
> meant there's nothing to prune with and hence no pruning should have
> occurred.  The problem however does not seem to be with the new logic I
> proposed but what turned out to be another bug in
> gen_prune_steps_from_opexps().
>
> Given that datadatetime is the first key in your example table and it's
> being compared using a non-inclusive operator, clauses for subsequent key
> columns (in this case id = 1) should have been ignored by pruning, which
> fails to happen due to a bug in gen_prune_steps_from_opexps().  I've fixed
> that in the attached updated patch.  With the patch, queries like yours
> return the correct result as shown below:
>
> truncate dataid2;
>
> insert into dataid2 (datadatetime, id) select
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '2 days'::interval) at time zone 'UTC'), 1;
>
> insert into dataid2 (datadatetime, id) select
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '1 days'::interval) at time zone 'UTC'), 1;
>
> -- pruning steps generated for only 1st column (key1 < expr1)
> -- but no pruning occurs during planning because the value for
> -- datadatetime's clause is unavailable
>
> explain select * from dataid2 where id = 1 and datadatetime <
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '2 days'::interval) at time zone 'UTC');
>
> QUERY PLAN
>
>
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> Append  (cost=10.96..36.21 rows=6 width=12)
>   ->  Bitmap Heap Scan on dataid2_201902  (cost=10.96..18.09 rows=3 width=12)
>         Recheck Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22
> 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1))
>         ->  Bitmap Index Scan on dataid2_201902_pkey  (cost=0.00..10.96
> rows=3 width=0)
>               Index Cond: ((datadatetime < timezone('UTC'::text,
> ('2019-02-22 14:00:00+09'::timestamp with time zone + '2
> days'::interval))) AND (id = 1))
>   ->  Bitmap Heap Scan on dataid2_default  (cost=10.96..18.09 rows=3
> width=12)
>         Recheck Cond: ((datadatetime < timezone('UTC'::text, ('2019-02-22
> 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1))
>         ->  Bitmap Index Scan on dataid2_default_pkey  (cost=0.00..10.96
> rows=3 width=0)
>               Index Cond: ((datadatetime < timezone('UTC'::text,
> ('2019-02-22 14:00:00+09'::timestamp with time zone + '2
> days'::interval))) AND (id = 1))
> (9 rows)
>
> select * from dataid2 where id = 1 and datadatetime <
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '2 days'::interval) at time zone 'UTC');
>    datadatetime     │ id
> ─────────────────────┼────
> 2019-02-23 05:00:00 │  1
> (1 row)
>
> -- pruning steps generated for both columns (key1 = expr1, key2 = expr2),
> -- but no pruning occurs during planning because the value for
> -- datadatetime's clause is unavailable
>
> explain select * from dataid2 where id = 1 and datadatetime =
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '2 days'::interval) at time zone 'UTC');
>
> QUERY PLAN
>
>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> Append  (cost=0.16..16.37 rows=2 width=12)
>   Subplans Removed: 1
>   ->  Index Only Scan using dataid2_201902_pkey on dataid2_201902
> (cost=0.16..8.18 rows=1 width=12)
>         Index Cond: ((datadatetime = timezone('UTC'::text, ('2019-02-22
> 14:00:00+09'::timestamp with time zone + '2 days'::interval))) AND (id = 1))
> (4 rows)
>
>
> select * from dataid2 where id = 1 and datadatetime =
> (('2019-02-22T00:00:00'::timestamp::timestamp at time zone
> 'America/New_York' + '2 days'::interval) at time zone 'UTC');
>    datadatetime     │ id
> ─────────────────────┼────
> 2019-02-24 05:00:00 │  1
> (1 row)
>
>> In passing I found a typo while looking this issue.
>>
>> |   case BTLessStrategyNumber:
>> |
>> |     /*
>> |      * Look for the greatest bound that is < or <= lookup value and
>> |      * set minoff to its offset.
>>
>> I think the "minoff" is typo of "maxoff".
>
> Ah, fixed in the updated patch.
>
>>>> Seems to be equally broken in v11 and HEAD.  I didn't try v10.
>>>
>>> v10 is fine, as it uses constraint exclusion.
>>>
>>> Attached patches apply to both v11 and HEAD.
>>
>> Mmm. This doesn't apply on head on my environment.
>>
>>> patching file src/test/regress/expected/partition_prune.out
>>> Hunk #1 FAILED at 951.
>>
>> git rev-parse --short HEAD
>> d0bbf871ca
>
> Hmm, I rebased the patch's branch over master and didn't get any
> conflicts.  Please check with the new patch.
>
> Thanks,
> Amit
> <v2-0001-Add-test.patch>
> <v2-0002-Bug-fix.patch>

--
TV Squared Limited is a company registered in Scotland.  Registered number:
SC421072.  Registered office: CodeBase, Argyle House, 3 Lady Lawson Street,
Edinburgh, EH3 9DR.
 
TV Squared Inc (File No. 5600204) is an Incorporated
company registered in Delaware. Principal office: 1412 Broadway, 22 Fl, New
York, New York, 10018

TV Squared GmbH is a company registered in Munich.
Registered number: HRB 236077. Registered office: Oskar-von-Miller-Ring 20,
c/o wework, 80333 Munchen

This message is private and confidential.  If
you have received this message in error, please notify us and remove it
from your system.



pgsql-bugs by date:

Previous
From: Amit Langote
Date:
Subject: Re: inconsistent results querying table partitioned by date
Next
From: Tom Lane
Date:
Subject: Re: inconsistent results querying table partitioned by date