Thread: Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

From

Tom Lane

Date:

12 November 2008, 11:47:48

"Sergey Konoplev" <gray.ru@gmail.com> writes:
> You are right. I've found the odd thing (that completely drives me
> mad) in postgresql.conf.

> You are able to reproduce slow-not-in queries by switching
> constraint_exclusion to on in your postgresql.conf and running my test
> (which is attached to the first message).

Hmph.  It's trying to see if the NOT IN condition is self-contradictory,
which of course it isn't, but the predicate_refuted_by machinery isn't
smart enough to determine that except by running through all N^2
combinations of the individual x <> const conditions :-(.

(It's not really any smarter about the IN case either, but that only
takes constant time not O(N^2) because it will stop after finding that
the first equality condition doesn't refute itself.)

We could respond to this in a number of ways:

1. "Tough, don't do that."

2. Put some arbitrary limit on the number of subconditions in an AND or
OR clause before we give up and don't attempt to prove anything about
it.

3. Put in a narrow hack that will get us out of this specific case,
but might still allow very slow proof attempts in other large cases.

The specific narrow hack I'm considering for #3 goes like this: in this
case, we repeatedly pass btree_predicate_proof two clauses "x <> const1"
and "x <> const2", and after some fairly expensive probing of the system
catalogs it finds out that there's no way to prove that the former
refutes the latter.  But when considering two ScalarArrayOps, the two
operators will be the same for all of the sub-clauses, and so we could
check once to find out that we can't refute anything.  (It also seems
interesting to cache that catalog lookup in cases where we might be able
to prove something.)

Comments?
        regards, tom lane

Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

From

Richard Huxton

Date:

12 November 2008, 12:29:43

Tom Lane wrote:
> "Sergey Konoplev" <gray.ru@gmail.com> writes:
>> You are right. I've found the odd thing (that completely drives me
>> mad) in postgresql.conf.
> 
>> You are able to reproduce slow-not-in queries by switching
>> constraint_exclusion to on in your postgresql.conf and running my test
>> (which is attached to the first message).
> 
> Hmph.  It's trying to see if the NOT IN condition is self-contradictory,
> which of course it isn't, but the predicate_refuted_by machinery isn't
> smart enough to determine that except by running through all N^2
> combinations of the individual x <> const conditions :-(.

So it's not checking the table, it's looking to see whether <clause1> OR
<clause2> end up excluding each other? Presumably becuase "OR" is just
another operator?

> We could respond to this in a number of ways:
> 
> 1. "Tough, don't do that."
> 
> 2. Put some arbitrary limit on the number of subconditions in an AND or
> OR clause before we give up and don't attempt to prove anything about
> it.

Do we know the estimated cost of just executing the planner-node at this
point? You could scale with the cost of actually doing the tests.

> 3. Put in a narrow hack that will get us out of this specific case,
> but might still allow very slow proof attempts in other large cases.
> 
> The specific narrow hack I'm considering for #3 goes like this: 

The specific hack goes right over my head :-)

--  Richard Huxton Archonet Ltd

Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

From

Tom Lane

Date:

12 November 2008, 13:52:32

Richard Huxton <dev@archonet.com> writes:
> Tom Lane wrote:
>> Hmph.  It's trying to see if the NOT IN condition is self-contradictory,
>> which of course it isn't, but the predicate_refuted_by machinery isn't
>> smart enough to determine that except by running through all N^2
>> combinations of the individual x <> const conditions :-(.

> So it's not checking the table, it's looking to see whether <clause1> OR
> <clause2> end up excluding each other? Presumably becuase "OR" is just
> another operator?

Yeah.  An example of a closely related expression that it *would* be
able to prove self-contradictory isWHERE x = ALL (ARRAY[1, 2, ...])
or perhaps slightly more realisticallyWHERE x = ANY (ARRAY[1, 2, 3]) AND x > 4

The NOT IN is equivalent toWHERE x <> ALL (ARRAY[1, 2, ...])
which can't be proved false.  (Well, it could if x is of a finite domain
and all the possible values are listed, but we aren't gonna check for
that.)  So you can see that some fairly close analysis is needed to
determine whether anything can be done or not.

>> 2. Put some arbitrary limit on the number of subconditions in an AND or
>> OR clause before we give up and don't attempt to prove anything about
>> it.

> Do we know the estimated cost of just executing the planner-node at this
> point? You could scale with the cost of actually doing the tests.

No, this is long before we've developed any cost estimates.
        regards, tom lane

Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

From

"Brendan Jurd"

Date:

12 November 2008, 14:03:53

On Thu, Nov 13, 2008 at 4:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah.  An example of a closely related expression that it *would* be
> able to prove self-contradictory is
>        WHERE x = ALL (ARRAY[1, 2, ...])
> or perhaps slightly more realistically
>        WHERE x = ANY (ARRAY[1, 2, 3]) AND x > 4

It seems like the cure is worse than the disease here.  Surely a user
who has a self-contradictory clause will realise the problem pretty
quickly (i.e., when he receives zero rows) and then just fix it.

I guess my question is, what's the real benefit of going to all this
trouble trying to prove that clauses are false? What real-world
problem does it address?

Cheers,
BJ

Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

From

Tom Lane

Date:

12 November 2008, 14:16:56

"Brendan Jurd" <direvus@gmail.com> writes:
> I guess my question is, what's the real benefit of going to all this
> trouble trying to prove that clauses are false?

Not having to scan gigabytes of data in an excluded partition, for
instance.

Now the docs do say
Currently, constraint_exclusion is disabled by default becausethe constraint checks are relatively expensive, and in
manycircumstanceswill yield no savings. It is recommended to turnthis on only if you are actually using partitioned
tablesdesignedto take advantage of the feature. 

so we could argue that it's the OP's own fault if he turns this option
on for queries where long planning time isn't worth the trouble.
        regards, tom lane

Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

From

Heikki Linnakangas

Date:

12 November 2008, 14:26:38

Brendan Jurd wrote:
> On Thu, Nov 13, 2008 at 4:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah.  An example of a closely related expression that it *would* be
>> able to prove self-contradictory is
>>        WHERE x = ALL (ARRAY[1, 2, ...])
>> or perhaps slightly more realistically
>>        WHERE x = ANY (ARRAY[1, 2, 3]) AND x > 4
> 
> It seems like the cure is worse than the disease here.  Surely a user
> who has a self-contradictory clause will realise the problem pretty
> quickly (i.e., when he receives zero rows) and then just fix it.
> 
> I guess my question is, what's the real benefit of going to all this
> trouble trying to prove that clauses are false? What real-world
> problem does it address?

Constraint exclusion partitioning?

Which brings to mind an interesting customer case. They are running 
queries like "WHERE id IN (...)", where ... is a *very* long list of 
keys, against a table that's partitioned by ranges of id. The query was 
running slow, because while constraint exclusion was able to eliminate 
completely useless partitions, if there was even one id in the list that 
falls into a given partition, the partition was probed for *all* of the 
ids, even those that belong to other partitions. Ideally, we would not 
only prove/refute the whole "x = ANY" expression, but individual values 
within it.

Actually, the long list of keys was obtained by running another query 
first. They originally had a single query with a join, but they split it 
to two queries because constraint exclusion doesn't work at run-time..

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

From

"Brendan Jurd"

Date:

12 November 2008, 14:34:14

On Thu, Nov 13, 2008 at 5:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Brendan Jurd" <direvus@gmail.com> writes:
>> I guess my question is, what's the real benefit of going to all this
>> trouble trying to prove that clauses are false?
>
> Not having to scan gigabytes of data in an excluded partition, for
> instance.

[after RTFMing ...]

The docs also say:

"When this parameter is on, the planner compares query conditions with
table CHECK constraints, and omits scanning tables for which the
conditions contradict the constraints."

I would normally interpret the above to mean that the planner *only*
performs these checks where a table CHECK constraint is relevant.  I
dug up the original test case posted by Sergey, and his test table
didn't have any CHECK constraint on it at all (unless you count the
NOT NULL implied by PRIMARY KEY).

Cheers,
BJ

Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

From

Tom Lane

Date:

12 November 2008, 14:46:59

I wrote:
> We could respond to this in a number of ways:

> 1. "Tough, don't do that."

> 2. Put some arbitrary limit on the number of subconditions in an AND or
> OR clause before we give up and don't attempt to prove anything about
> it.

> 3. Put in a narrow hack that will get us out of this specific case,
> but might still allow very slow proof attempts in other large cases.

> The specific narrow hack I'm considering for #3 goes like this: in this
> case, we repeatedly pass btree_predicate_proof two clauses "x <> const1"
> and "x <> const2", and after some fairly expensive probing of the system
> catalogs it finds out that there's no way to prove that the former
> refutes the latter.  But when considering two ScalarArrayOps, the two
> operators will be the same for all of the sub-clauses, and so we could
> check once to find out that we can't refute anything.  (It also seems
> interesting to cache that catalog lookup in cases where we might be able
> to prove something.)

I find that it's not too hard to cache the operator lookup stuff, and
that helps some, but putting in a short-circuit path to make the test
only once for a ScalarArrayOpExpr is a lot harder than I expected.  The
problem is the double recursion in predicate_refuted_by_recurse --- you
can stop the recursion when you are looking at two ScalarArrayOpExprs
at the same time, but that only shuts off one of three recursion paths
that are going to end up iterating over the lists.

So option #2 with a cutoff of 100 items or so is looking like the
best response.

Thoughts?
        regards, tom lane

Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

From

Tom Lane

Date:

12 November 2008, 17:10:04

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Which brings to mind an interesting customer case. They are running 
> queries like "WHERE id IN (...)", where ... is a *very* long list of 
> keys, against a table that's partitioned by ranges of id. The query was 
> running slow, because while constraint exclusion was able to eliminate 
> completely useless partitions, if there was even one id in the list that 
> falls into a given partition, the partition was probed for *all* of the 
> ids, even those that belong to other partitions. Ideally, we would not 
> only prove/refute the whole "x = ANY" expression, but individual values 
> within it.

> Actually, the long list of keys was obtained by running another query 
> first. They originally had a single query with a join, but they split it 
> to two queries because constraint exclusion doesn't work at run-time..

Yeah, at some point (after we have an explicit notion of partitioning in
the system, instead of the current build-it-from-spare-parts approach)
we ought to look at managing this stuff at runtime rather than expecting
that exclusion can be proven at plan time.  In particular a plan type
that acted like an indexscan across the whole partitioned table (select
proper partition, then indexscan) would be real handy.
        regards, tom lane

Re: [GENERAL] Very slow queries w/ NOT IN preparation (seems like a bug, test case)

From

Tom Lane

Date:

12 November 2008, 20:24:17

I wrote:
>> 2. Put some arbitrary limit on the number of subconditions in an AND or
>> OR clause before we give up and don't attempt to prove anything about
>> it.

> So option #2 with a cutoff of 100 items or so is looking like the
> best response.

I've applied a patch along this line to 8.2 and up, and also installed
some code (in HEAD only) to cache the results of proof-operator lookup.
        regards, tom lane