Thread: queries with subquery constraints on partitioned tables not optimized?

queries with subquery constraints on partitioned tables not optimized?

From
"Davor J."
Date:
 Let's say you have one partitioned table, "tbl_p", partitioned according to
the PK "p_pk". I have made something similar with triggers, basing myself on
the manual for making partitioned tables.
According to the manual, optimizer searches the CHECKs of the partitions to
determine which table(s) to use (if applicable).

So if one has CHECKs of kind "p_pk = some number", queries like "SELECT *
from tbl_p where p_pk = 1" will only be searched in the appropriate table.
One can check this with EXPLAIN. So far so good.

Now, if one takes a subquery for "1", the optimizer evaluates it first
(let's say to "1"), but then searches for it (sequentially) in every
partition, which, for large partitions, can be very time-consuming and goes
beyond the point of partitioning.

Is this normal, or am I missing something?

Kind regards,
Davor




Re: queries with subquery constraints on partitioned tables not optimized?

From
Tom Lane
Date:
"Davor J." <DavorJ@live.com> writes:
> Now, if one takes a subquery for "1", the optimizer evaluates it first
> (let's say to "1"), but then searches for it (sequentially) in every
> partition, which, for large partitions, can be very time-consuming and goes
> beyond the point of partitioning.

No, the optimizer doesn't "evaluate it first".  Subqueries aren't ever
assumed to reduce to constants.  (If you actually do have a constant
expression, why don't you just leave out the word SELECT?)

            regards, tom lane

Re: queries with subquery constraints on partitioned tables not optimized?

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> "Davor J." <DavorJ@live.com> writes:
>> Now, if one takes a subquery for "1", the optimizer evaluates it first
>> (let's say to "1"), but then searches for it (sequentially) in every
>> partition, which, for large partitions, can be very time-consuming and goes
>> beyond the point of partitioning.
>
> No, the optimizer doesn't "evaluate it first".  Subqueries aren't ever
> assumed to reduce to constants.  (If you actually do have a constant
> expression, why don't you just leave out the word SELECT?)

It's easy to experience the same problem with a JOIN you'd want to
happen at the partition level that the planner will apply on the Append
Node.

I'm yet to figure out if 8.4 is smarter about this, meanwhile I'm using
array tricks to force the push-down.

 WHERE ...
   AND service = ANY ((SELECT array_accum(id) FROM services WHERE x=281)
                   || (SELECT array_accum(id) FROM services WHERE y=281))

It happens that I need the array concatenation more than the = ANY
operator (as compared to IN), so I also have queries using = ANY
('{}':int[] || (SELECT array_accum(x) ...))  to really force the planner
into doing the join in the partitions rather than after the Append has
taken place.

Regards,
--
dim

PS: If you're interrested into complete examples, I'll be able to
provide for them in private.

Re: queries with subquery constraints on partitioned tables not optimized?

From
Nikolas Everett
Date:


On Tue, Feb 2, 2010 at 7:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Davor J." <DavorJ@live.com> writes:
> Now, if one takes a subquery for "1", the optimizer evaluates it first
> (let's say to "1"), but then searches for it (sequentially) in every
> partition, which, for large partitions, can be very time-consuming and goes
> beyond the point of partitioning.

No, the optimizer doesn't "evaluate it first".  Subqueries aren't ever
assumed to reduce to constants.  (If you actually do have a constant
expression, why don't you just leave out the word SELECT?)

                       regards, tom lane

If you don't have a constant expression then you can either explicitly loop in the calling code or a function or you could index the key in all the subtables.  The index isn't really optimal but it gets the job done.

Nik