Re: default range partition and constraint exclusion - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: default range partition and constraint exclusion
Date
Msg-id 20171127.180457.257636417.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: default range partition and constraint exclusion  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: default range partition and constraint exclusion  (Robert Haas <robertmhaas@gmail.com>)
Re: default range partition and constraint exclusion  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
At Fri, 24 Nov 2017 10:49:07 -0500, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoarK4aCcSjYheH7QDchb7uJRpiKkGpPo7O9kMBNf13N3w@mail.gmail.com>
> On Wed, Nov 22, 2017 at 4:21 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> >>> If all predicate_refuted_by() receives is the expression tree (AND/OR)
> >>> with individual nodes being strict clauses involving partition keys (and
> >>> nothing about the nullness of the keys), the downstream code is just
> >>> playing by the rules as explained in the header comment of
> >>> predicate_refuted_by_recurse() in concluding that query's restriction
> >>> clause a = 2 refutes it.
> >>
> >> Oh, wait a minute.  Actually, I think predicate_refuted_by() is doing
> >> the right thing here.  Isn't the problem that mc2p2 shouldn't be
> >> accepting a (2, null) tuple at all?
> >
> > It doesn't.  But, for a query, it does contain (2, <unknown>) tuples,
> > where <unknown> would always be non-null.  So, it should be scanned in the
> > plan for the query that has only a = 2 as restriction and no restriction
> > on b.  That seems to work.
> 
> OK, so I am still confused about whether the constraint is wrong or
> the constraint exclusion logic is wrong.  One of them, at least, has
> to be wrong, and we have to fix whichever one is wrong.  Fixing broken
> constraint exclusion logic by hacking up the constraint, or conversely
> fixing a broken constraint by hacking up the constraint exclusion
> logic, wouldn't be right.
> 
> I think my last email was confused: I thought that the (2, null) tuple
> was ending up in mc2p2, but it's really ending up in mc2p_default,
> whose constraint currently looks like this:
> 
> NOT (
>       ((a < 1) OR ((a = 1) AND (b < 1)))
>         OR
>       ((a > 1) OR ((a = 1) AND (b >= 1)))
>     )
> 
> Now where exactly is constraint exclusion going wrong here?  a = 2
> refutes a < 1 and a = 1, which means that (a < 1) OR ((a = 1) AND (b <
> 1)) must be false and that (a = 1) AND (b >= 1) must also be false.
> But (a > 1) could be either true or null, which means (a > 1) OR ((a =

a > 1 is true when a = 2, so the second term is true?

> 1) AND (b >= 1)) can be true or null, which means the whole thing can
> be false or null, which means that it is not refuted by a = 2.  It

Then the whole thing is false.

> should be possible to dig down in there step by step and figure out
> where the wheels are coming off -- have you tried to do that?

| select  NOT (
|              ((a < 1) OR ((a = 1) AND (b < 1)))
|                OR
|              ((a > 1) OR ((a = 1) AND (b >= 1)))
|             )
| from (values (2::int, null::int)) as t(a, b);
|  ?column? 
| ----------
|  f

The problem here I think is that get_qual_for_range() for default
partition returns an inconsistent qual with what partition
get_partition_for_tuple chooses for keys containing nulls. It
chooses default partition if any of the key values is null,
without referring the constraint expression.

The current behavior is apparently odd.

| select pg_get_partition_constraintdef('mc2p2'::regclass);
|                         pg_get_partition_constraintdef                     
| ----------------------------------------------------------------------------
|  ((a IS NOT NULL) AND (b IS NOT NULL) AND ((a > 1) OR ((a = 1) AND (b >= 1))))


| select pg_get_partition_constraintdef('mc2p_default'::regclass);
|                           pg_get_partition_constraintdef                   
|    
| ---------------------------------------------------------------------------
|  (NOT (((a < 1) OR ((a = 1) AND (b < 1))) OR ((a > 1) OR ((a = 1) AND (b >= 1)))))


| insert into mc2p2 values (2);
| ERROR:  new row for relation "mc2p2" violates partition constraint
| DETAIL:  Failing row contains (2, null).

This is the correct behavior.

| insert into mc2p_default values (2);
| ERROR:  new row for relation "mc2p_default" violates partition constraint
| DETAIL:  Failing row contains (2, null).

This is the correct behavior in terms of constraint, but
incorrect in terms of partition routing.

But interestingly, the following *works*, in a way contradicting
to the constraint.

| insert into mc2p values (2);
| INSERT 0 1
|
| select * from mc2p_default;
|  a | b 
| ---+---
|  2 |  
| (1 row)

After applying the patch upthread, get_qual_for_range() returns
the consistent qual and "insert into mc2p_default values (2)" is
accepted correctly and everything become consistent.

This is the story in my understanding.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)?
Next
From: Rui Hai Jiang
Date:
Subject: RE: How is the PostgreSQL debuginfo file generated