Thread: PostgreSQL seems to create inefficient plans in simple conditional joins

PostgreSQL seems to create inefficient plans in simple conditional joins

From: Hedayat Vatankhah
Dear all,
First of all, I should apologize if my email doesn't follow all the
guidelines; I'm trying to follow them, though!

If referencing links is OK, you can find the full description of
the issue at:
http://dba.stackexchange.com/questions/127082/postgresql-seems-to-create-inefficient-plans-in-simple-conditional-joins

It contains the table definitions, the queries, EXPLAIN / EXPLAIN ANALYZE
output for them, and a description of the test conditions. But I'll
provide a summary of the planning issue below.

I'm using PostgreSQL 9.3. I've run VACUUM ANALYZE on the DB, and it
has not been modified since.

Consider these tables:

-- Sequence referenced by t1.id's default:
CREATE SEQUENCE ids_seq;

CREATE TABLE t1
(
  id bigint NOT NULL DEFAULT nextval('ids_seq'::regclass),
  total integer NOT NULL,
  price integer NOT NULL,
  CONSTRAINT pk_t1 PRIMARY KEY (id)
);

CREATE TABLE t2
(
  id bigint NOT NULL,
  category smallint NOT NULL,
  CONSTRAINT pk_t2 PRIMARY KEY (id),
  CONSTRAINT fk_id FOREIGN KEY (id)
      REFERENCES t1 (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
);

Personally, I expect both queries below to perform exactly the same:

SELECT
    t1.id, *
FROM
    t1
INNER JOIN
    t2 ON t1.id = t2.id
    where t1.id > -9223372036513411363;

And:

SELECT
    t1.id, *
FROM
    t1
INNER JOIN
    t2 ON t1.id = t2.id
    where t1.id > -9223372036513411363 and t2.id > -9223372036513411363;

Unfortunately, they do not. PostgreSQL creates different plans for these
queries, which results in very poor performance for the first one compared
to the second. (I'm testing against a DB with around 350 million rows in
t1, and slightly fewer in t2.)

EXPLAIN output:
First query: http://explain.depesz.com/s/uauk
Second query: http://explain.depesz.com/s/uQd

The problem with the plan for the first query is that it applies the
where condition to the index scan on t1, but not to the one on t2.

The same behavior occurs if you replace INNER JOIN with LEFT JOIN,
and if you use "USING (id) where id > -9223372036513411363" instead
of "ON ...".

But it is important to get the first query right. Consider that I want
to create a view on the SELECT statement (without the condition) to
simplify building queries on the data. If the view provides a single id
column, a SELECT on the view with such a condition on the id column
results in a query similar to the first one. With this problem, I would
have to expose both ID columns in the view so that queries can repeat
the condition on the ID column of each table. Now imagine what happens
when many tables are joined together on the ID column, as in the sketch
below...
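For illustration, a minimal sketch of such a view (the name t1t2 is made up):

CREATE VIEW t1t2 AS
SELECT t1.id, t1.total, t1.price, t2.category
FROM t1
INNER JOIN t2 ON t1.id = t2.id;

-- The condition below reaches only t1.id inside the view, so the scan
-- on t2 stays unconstrained, exactly as in the first query above:
SELECT * FROM t1t2 WHERE id > -9223372036513411363;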

Is there anything wrong with my queries, or with me expecting both queries
to be the same? Can I do anything so that PostgreSQL will behave similarly
for the first query? Or is this fixed in newer versions?

Thanks in advance,
Hedayat


Re: PostgreSQL seems to create inefficient plans in simple conditional joins

From: David Rowley
On 31 January 2016 at 01:30, Hedayat Vatankhah <hedayat.fwd@gmail.com> wrote:
> Personally, I expect both queries below to perform exactly the same:
>
> SELECT
>     t1.id, *
> FROM
>     t1
> INNER JOIN
>     t2 ON t1.id = t2.id
>     where t1.id > -9223372036513411363;
>
> And:
>
> SELECT
>     t1.id, *
> FROM
>     t1
> INNER JOIN
>     t2 ON t1.id = t2.id
>     where t1.id > -9223372036513411363 and t2.id > -9223372036513411363;
>
> Unfortunately, they do not. PostgreSQL creates different plans for these
> queries, which results in very poor performance for the first one compared
> to the second. (I'm testing against a DB with around 350 million rows in
> t1, and slightly fewer in t2.)
>
> EXPLAIN output:
> First query: http://explain.depesz.com/s/uauk
> Second query: http://explain.depesz.com/s/uQd

Yes, unfortunately you've done about the only thing that you can do,
and that's to include both conditions in the query. Is there some
special reason why you can't just write the t2.id > ... condition in
the query too? Or is the query generated dynamically by some software
that you have no control over?

I'd personally quite like to see improvements in this area, and even
wrote a patch [1] which fixes this problem too. The problem I had when
proposing the fix was that I was unable to provide details about how
many people are hit by this planner limitation. The patch I proposed
caused a very small impact on planning time for many queries, and was
thought by many not to apply in enough cases for it to be worth
slowing down queries which cannot possibly benefit. Of course I agree
with this; I've no interest in slowing down planning of queries, but
at the same time I understand how annoying the poor optimisation in
this area is.

Although please remember the patch I proposed was merely a first-draft
proposal, not for production use.

[1]
http://www.postgresql.org/message-id/flat/CAKJS1f9FK_X_5HKcPcSeimy16Owe3EmPmmGsGWLcKkj_rW9s6A@mail.gmail.com#CAKJS1f9FK_X_5HKcPcSeimy16Owe3EmPmmGsGWLcKkj_rW9s6A@mail.gmail.com

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: PostgreSQL seems to create inefficient plans in simple conditional joins

From: Vitalii Tymchyshyn

It may be more for -hackers, but I often hear "this won't be used because of the planning time increase". As far as I know, we have statistics on real query time after a few runs, which are used to decide if the plan should be switched.
Can these statistics be used to apply advanced planning features to relatively long-running queries? E.g. a parameter like sophisticated_planning_l1_threshold=500ms: if a query runs over this threshold, replan it with more sophisticated features, taking a few more milliseconds. Different levels could be introduced, and setting the threshold to 0 would mean "apply to all queries right away".
Another good option is to threshold against cumulative query time. E.g. if there were 10000 runs at 0.5 ms each, it may be beneficial to spend a few extra milliseconds planning to get 0.2 ms per run (roughly 3 seconds saved over those 10000 runs).
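As a sketch, such settings might look like this in postgresql.conf (both parameters are hypothetical; neither exists today):

# Hypothetical: replan with more expensive optimizations once a query's
# observed runtime exceeds this threshold (0 = apply to all queries).
sophisticated_planning_l1_threshold = 500ms

# Hypothetical: also replan when the cumulative time across executions
# of the same query exceeds this threshold.
sophisticated_planning_cumulative_threshold = 5s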

Best regards, Vitalii Tymchyshyn


Re: PostgreSQL seems to create inefficient plans in simple conditional joins

From: David Rowley
On 31 January 2016 at 06:14, Vitalii Tymchyshyn <vit@tym.im> wrote:
> It may be more for -hackers, but I often hear "this won't be used because
> of the planning time increase". As far as I know, we have statistics on
> real query time after a few runs, which are used to decide if the plan
> should be switched.
> Can these statistics be used to apply advanced planning features to
> relatively long-running queries? E.g. a parameter like
> sophisticated_planning_l1_threshold=500ms: if a query runs over this
> threshold, replan it with more sophisticated features, taking a few more
> milliseconds. Different levels could be introduced, and setting the
> threshold to 0 would mean "apply to all queries right away".
> Another good option is to threshold against cumulative query time. E.g.
> if there were 10000 runs at 0.5 ms each, it may be beneficial to spend a
> few extra milliseconds planning to get 0.2 ms per run.

I agree with you. I was recently working with long-running queries on
a large 3TB database. I discovered a new optimisation was possible,
and wrote a patch to implement it. In testing, the extra work the
optimiser performed took 7 microseconds, and it saved 6 hours of
execution time. Now, I've never been much of an investor in my life,
but a 3-billion-times return on an investment seems quite favourable
(6 hours is about 2.16e10 microseconds, and 2.16e10 / 7 is roughly
3e9). Of course, that's quite an extreme case, but it's hard to ignore
that the benefit is still significant in less extreme cases.

The idea you've mentioned here is very similar to what I brought up at
the developer meeting a few days ago; see the AOB section in [1].

Unfortunately I didn't really get many of the right people on my side
with it, and some wanted examples of specific patches, which is
completely not what I wanted to talk about. I was aiming for some
agreement on generic infrastructure to do exactly as you describe.

[1]  https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2016_Developer_Meeting


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: PostgreSQL seems to create inefficient plans in simple conditional joins

From: Vitalii Tymchyshyn

Well, as far as I can see, it was just a few phrases, unless I'm missing something. Maybe it's worth bringing it to -hackers for a wider discussion?

Best regards, Vitalii Tymchyshyn


Re: PostgreSQL seems to create inefficient plans in simple conditional joins

From: Hedayat Vatankhah
Hi,

David Rowley wrote on Sun, 31 Jan 2016 04:57:04 +1300:
> On 31 January 2016 at 01:30, Hedayat Vatankhah <hedayat.fwd@gmail.com> wrote:
>> Personally, I expect both queries below to perform exactly the same:
>>
>> SELECT
>>      t1.id, *
>> FROM
>>      t1
>> INNER JOIN
>>      t2 ON t1.id = t2.id
>>      where t1.id > -9223372036513411363;
>>
>> And:
>>
>> SELECT
>>      t1.id, *
>> FROM
>>      t1
>> INNER JOIN
>>      t2 ON t1.id = t2.id
>>      where t1.id > -9223372036513411363 and t2.id > -9223372036513411363;
>>
>> Unfortunately, they do not. PostgreSQL creates different plans for these
>> queries, which results in very poor performance for the first one compared
>> to the second. (I'm testing against a DB with around 350 million rows in
>> t1, and slightly fewer in t2.)
>>
>> EXPLAIN output:
>> First query: http://explain.depesz.com/s/uauk
>> Second query: http://explain.depesz.com/s/uQd
> Yes, unfortunately you've done about the only thing that you can do,
> and that's to include both conditions in the query. Is there some
> special reason why you can't just write the t2.id > ... condition in
> the query too? Or is the query generated dynamically by some software
> that you have no control over?
I can, but it would make my application code much more complex. I was
hoping to hide the complexity of the DB data model inside the DB itself
using views, triggers, etc. If I have to add such conditions, the query
generator in my application code becomes more complex, and the internal
structure of the DB certainly becomes visible to it.

I'm working on re-designing a DB which can grow large and slow, as I
guess we can find a better design before trying optimizations like
materialized views and other common techniques. I've found two
completely different approaches to such problems: de-normalizing data,
or highly normalizing it (6NF), as in the Anchor Modeling approach. I
decided to experiment with something similar to the latter (not that
extreme!), especially since our current design was not that normalized,
and it performs poorly. I'm investigating why it performs so badly with
my queries, and this problem was one of the reasons. In such a design,
views are used to present the JOIN of many tables as a single table, so
that using the model is easy and transparent. But usually a single
table doesn't have 10 ID columns (which can change as the model
changes) on which you must repeat every condition to get acceptable
results!
While it can be done, it is annoying: the application must know how
many tables are joined together, and repeat the condition for each such
column. The problem becomes worse when you relate two different IDs of
different data, e.g. relating customer info (composed of joining 5
tables) with info about the items (s)he bought (composed of joining 3
tables), as in the sketch below.
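A minimal sketch of that repetition (all table and column names here are invented for illustration):

SELECT c.id, n.name, a.city
FROM customer c
INNER JOIN customer_name n ON c.id = n.id
INNER JOIN customer_address a ON c.id = a.id
-- The same range condition must be repeated for every joined table,
-- or the scans on the unconstrained tables stay unbounded:
WHERE c.id > 1000000
  AND n.id > 1000000
  AND a.id > 1000000;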

Anyway, it seems that this is what I should implement in my application
code. I just hope that adding explicit conditions for each joined table
will not turn off any other optimizations!

Such an optimization seemed so natural to me that I couldn't believe
PostgreSQL doesn't understand that a condition on the ID applies to all
id columns in a JOINed query, so I simplified my query step by step
until I reached the minimal problematic query, which is very similar to
the one I posted here. It was only at that point that I realized that
maybe PostgreSQL really doesn't understand it, and I was ... shocked!


> I'd personally quite like to see improvements in this area, and even
> wrote a patch [1] which fixes this problem too. The problem I had when
> proposing the fix was that I was unable to provide details about how
> many people are hit by this planner limitation. The patch I proposed
> caused a very small impact on planning time for many queries, and was
> thought by many not to apply in enough cases for it to be worth
> slowing down queries which cannot possibly benefit. Of course I agree
> with this; I've no interest in slowing down planning of queries, but
> at the same time I understand how annoying the poor optimisation in
> this area is.
>
> Although please remember the patch I proposed was merely a first-draft
> proposal, not for production use.
>
> [1]
http://www.postgresql.org/message-id/flat/CAKJS1f9FK_X_5HKcPcSeimy16Owe3EmPmmGsGWLcKkj_rW9s6A@mail.gmail.com#CAKJS1f9FK_X_5HKcPcSeimy16Owe3EmPmmGsGWLcKkj_rW9s6A@mail.gmail.com
>
That's great, I might consider experimenting with this too.

Regards,
Hedayat




Re: PostgreSQL seems to create inefficient plans in simple conditional joins

From: Hedayat Vatankhah
Hi again,

Hedayat Vatankhah wrote on Sun, 31 Jan 2016 01:20:53 +0330:
> Hi,
>
> David Rowley wrote on Sun, 31 Jan 2016 04:57:04 +1300:
>> On 31 January 2016 at 01:30, Hedayat Vatankhah
>> <hedayat.fwd@gmail.com> wrote:
>>> Personally, I expect both queries below to perform exactly the same:
>>>
>>> SELECT
>>>      t1.id, *
>>> FROM
>>>      t1
>>> INNER JOIN
>>>      t2 ON t1.id = t2.id
>>>      where t1.id > -9223372036513411363;
>>>
>>> And:
>>>
>>> SELECT
>>>      t1.id, *
>>> FROM
>>>      t1
>>> INNER JOIN
>>>      t2 ON t1.id = t2.id
>>>      where t1.id > -9223372036513411363 and t2.id > -9223372036513411363;
>>>
>>> Unfortunately, they do not. PostgreSQL creates different plans for
>>> these queries, which results in very poor performance for the first
>>> one compared to the second. (I'm testing against a DB with around 350
>>> million rows in t1, and slightly fewer in t2.)
>>>
>>> EXPLAIN output:
>>> First query: http://explain.depesz.com/s/uauk
>>> Second query: http://explain.depesz.com/s/uQd
>> Yes, unfortunately you've done about the only thing that you can do,
>> and that's to include both conditions in the query. Is there some
>> special reason why you can't just write the t2.id > ... condition in
>> the query too? Or is the query generated dynamically by some software
>> that you have no control over?

I just found another issue with using a query like the second one
(using LEFT JOINs instead of INNER JOINs): referencing the id columns
of the joined tables explicitly disables PostgreSQL's join removal
optimization when you only select column(s) from t1! :(
So I should forget about creating views on top of JOIN queries, and
instead build each JOIN query manually, with only the referenced tables
and the appropriate conditions, which means the whole data model must
be exposed to the application.

If I'm not wrong, PostgreSQL should understand that any condition on t2
doesn't change the LEFT JOIN output when no t2 columns are SELECTed.
A minimal sketch of the two cases is below.
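For illustration, using the t1/t2 tables from the first message (the unique index behind pk_t2 is what makes join removal possible):

-- Join removal applies: t2.id is unique and no t2 column is referenced
-- anywhere else, so the planner can drop the join and scan only t1.
SELECT t1.id, t1.total
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t1.id > -9223372036513411363;

-- Join removal does not apply: t2.id is referenced in the WHERE clause.
-- (Note that this condition can also filter out t1 rows that have no
-- matching t2 row, which is why the planner must keep the join.)
SELECT t1.id, t1.total
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t1.id > -9223372036513411363
  AND t2.id > -9223372036513411363;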

Regards,
Hedayat