Thread: Sub-optimal plan for a paginated query on a view with another view inside of it.

Good day,
 
I have a performance issue when JOINing a view within another view more than once.
The query takes over three seconds to execute, which is too long in this case. It's not a problem if the tables are
nearlyempty, but that isn't the case on the production database. 
 
I suspect the planner thinks it's better to first put together the v_address view and JOIN it to the parcel table later
on,but the function "fx_get_user_tree_subordinates_by_id" should be JOINed to the parcel table first, as it reduces the
numberof rows to less than 200 and any following JOINs would be much faster. 
 
I have also ran vacuum, reindex and analyze on the whole database, but it seems to have had to effect.
 
Is there any way to nudge the planner toward that way of execution?
 
This is the query:
https://app.box.com/s/jzxiuuxoyj28q4q8rzxr
 
This is the query plan:
https://app.box.com/s/u8nk6qvkjs4ae7l7dh4h (plain text)
https://app.box.com/s/jzxiuuxoyj28q4q8rzxr (graphical output)
 
These are the views:
https://app.box.com/s/uibzidsazwv3eeauovuk (paginated view)
https://app.box.com/s/v71vyexmdyl97m4f3m6u (used three times in the paginated view).
 
 
Thank you.
 
Peter Slapansky


I apologise, I have neglected to mention Postgres versions tested. It occurs with 9.0 and 9.2

I have typo in my previous message - the sentence about vacuum, reindex and analyze should be:

"I had also run vacuum, reindex and analyze on the whole database, but it seems to have had no effect."

 

Thanks for any thoughts on the issue.

 

Peter Slapansky


______________________________________________________________
> Od: <slapo@centrum.sk>
> Komu: <pgsql-performance@postgresql.org>
> Dátum: 02.08.2013 15:43
> Predmet: Sub-optimal plan for a paginated query on a view with another view inside of it.
>


Good day,
 
I have a performance issue when JOINing a view within another view more than once.
The query takes over three seconds to execute, which is too long in this case. It's not a problem if the tables are nearly empty, but that isn't the case on the production database.
 
I suspect the planner thinks it's better to first put together the v_address view and JOIN it to the parcel table later on, but the function "fx_get_user_tree_subordinates_by_id" should be JOINed to the parcel table first, as it reduces the number of rows to less than 200 and any following JOINs would be much faster.
 
I have also ran vacuum, reindex and analyze on the whole database, but it seems to have had to effect.
 
Is there any way to nudge the planner toward that way of execution?
 
This is the query:
https://app.box.com/s/jzxiuuxoyj28q4q8rzxr
 
This is the query plan:
https://app.box.com/s/u8nk6qvkjs4ae7l7dh4h (plain text)
https://app.box.com/s/jzxiuuxoyj28q4q8rzxr (graphical output)
 
These are the views:
https://app.box.com/s/uibzidsazwv3eeauovuk (paginated view)
https://app.box.com/s/v71vyexmdyl97m4f3m6u (used three times in the paginated view).
 
 
Thank you.
 
Peter Slapansky

Hello

please, send result of EXPLAIN ANALYZE

please, use a http://explain.depesz.com/ for saving a plan

there is a more than 8 joins - so try to set geqo_threshold to 16,
join_collapse_limit to 16, and from_collapse_limit to 16.

Regards

Pavel Stehule

2013/8/2  <slapo@centrum.sk>:
> Good day,
>
> I have a performance issue when JOINing a view within another view more than once.
> The query takes over three seconds to execute, which is too long in this case. It's not a problem if the tables are
nearlyempty, but that isn't the case on the production database. 
>
> I suspect the planner thinks it's better to first put together the v_address view and JOIN it to the parcel table
lateron, but the function "fx_get_user_tree_subordinates_by_id" should be JOINed to the parcel table first, as it
reducesthe number of rows to less than 200 and any following JOINs would be much faster. 
>
> I have also ran vacuum, reindex and analyze on the whole database, but it seems to have had to effect.
>
> Is there any way to nudge the planner toward that way of execution?
>
> This is the query:
> https://app.box.com/s/jzxiuuxoyj28q4q8rzxr
>
> This is the query plan:
> https://app.box.com/s/u8nk6qvkjs4ae7l7dh4h (plain text)
> https://app.box.com/s/jzxiuuxoyj28q4q8rzxr (graphical output)
>
> These are the views:
> https://app.box.com/s/uibzidsazwv3eeauovuk (paginated view)
> https://app.box.com/s/v71vyexmdyl97m4f3m6u (used three times in the paginated view).
>
>
> Thank you.
>
> Peter Slapansky
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


Good day,

 

I have included a link to the result of EXPLAIN ANALYZE. It's this one:

https://app.box.com/s/u8nk6qvkjs4ae7l7dh4h

 

Here's a link to Depesz's explain (if links to the site are okay):

http://explain.depesz.com/s/gCk

 

I have just tried setting geqo_threshold, join_collapse_limit and from_collapse_limit to 16, but it yielded no improvement.

Changing those three parameters to 32 did speed up the query from about 3.3 seconds to about a second (give or take 50 ms), which is a pretty good improvement, but not quite there, as I'm looking to bring it down to about 300 ms if possible. Changing those three settings to 48 yielded no improvements over 32.

Is there possibly something something else to tweak there?

 

Here's EXPLAIN ANALYZE output when the three settings have been set to 32:

http://explain.depesz.com/s/cj2

 

Thank you.

 

Peter Slapansky

 

______________________________________________________________
> Od: Pavel Stehule <pavel.stehule@gmail.com>
> Komu: <slapo@centrum.sk>
> Dátum: 06.08.2013 21:01
> Predmet: Re: [PERFORM] Sub-optimal plan for a paginated query on a view with another view inside of it.
>

> CC: pgsql-performance@postgresql.org

Hello

please, send result of EXPLAIN ANALYZE

please, use a http://explain.depesz.com/ for saving a plan

there is a more than 8 joins - so try to set geqo_threshold to 16,
join_collapse_limit to 16, and from_collapse_limit to 16.

Regards

Pavel Stehule

2013/8/2  <slapo@centrum.sk>:
> Good day,
>
> I have a performance issue when JOINing a view within another view more than once.
> The query takes over three seconds to execute, which is too long in this case. It's not a problem if the tables are nearly empty, but that isn't the case on the production database.
>
> I suspect the planner thinks it's better to first put together the v_address view and JOIN it to the parcel table later on, but the function "fx_get_user_tree_subordinates_by_id" should be JOINed to the parcel table first, as it reduces the number of rows to less than 200 and any following JOINs would be much faster.
>
> I have also ran vacuum, reindex and analyze on the whole database, but it seems to have had to effect.
>
> Is there any way to nudge the planner toward that way of execution?
>
> This is the query:
> https://app.box.com/s/jzxiuuxoyj28q4q8rzxr
>
> This is the query plan:
> https://app.box.com/s/u8nk6qvkjs4ae7l7dh4h (plain text)
> https://app.box.com/s/jzxiuuxoyj28q4q8rzxr (graphical output)
>
> These are the views:
> https://app.box.com/s/uibzidsazwv3eeauovuk (paginated view)
> https://app.box.com/s/v71vyexmdyl97m4f3m6u (used three times in the paginated view).
>
>
> Thank you.
>
> Peter Slapansky
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of
slapo@centrum.sk
Sent: Wednesday, August 07, 2013 8:43 AM
To: Pavel Stehule
Cc: pgsql-performance@postgresql.org
Subject: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated query on a view with another view inside of it.

Good day,
 
I have included a link to the result of EXPLAIN ANALYZE. It's this one:
https://app.box.com/s/u8nk6qvkjs4ae7l7dh4h
 
Here's a link to Depesz's explain (if links to the site are okay):
http://explain.depesz.com/s/gCk
 
I have just tried setting geqo_threshold, join_collapse_limit and from_collapse_limit to 16, but it yielded no
improvement.
Changing those three parameters to 32 did speed up the query from about 3.3 seconds to about a second (give or take 50
ms),which is a pretty good improvement, but not quite there, as I'm looking to bring it down to about 300 ms if
possible.Changing those three settings to 48 yielded no improvements over 32.
 
Is there possibly something something else to tweak there?
 
Here's EXPLAIN ANALYZE output when the three settings have been set to 32:
http://explain.depesz.com/s/cj2
 
Thank you.
 
Peter Slapansky

-----

Your last explain analyze (with 3 settings set to 32)  shows query duration 10ms, not 1sec.
Am I wrong? 

Regards,
Igor Neyman


You're right, it does... but it's quite odd, because I re-ran the explain-analyze statement and got the same results.

Still, the query now runs for about a second as mentioned before, so it's almost like something's missing from the explain, but I'm certain I copied it all.

 

I did this via pgadmin, but that shouldn't matter, should it?

 

Thank you,

 

Peter Slapansky

______________________________________________________________
> Od: Igor Neyman <ineyman@perceptron.com>
> Komu: "slapo@centrum.sk" <slapo@centrum.sk>, Pavel Stehule <pavel.stehule@gmail.com>
> Dátum: 07.08.2013 15:47
> Predmet: RE: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated query on a view with another view inside of it.
>

> CC: "pgsql-performance@postgresql.org"

Your last explain analyze (with 3 settings set to 32)  shows query duration 10ms, not 1sec.
Am I wrong? 

Regards,
Igor Neyman

 

______________________________________________________________
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of slapo@centrum.sk
Sent: Wednesday, August 07, 2013 8:43 AM
To: Pavel Stehule
Cc: pgsql-performance@postgresql.org
Subject: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated query on a view with another view inside of it.

Good day,
 
I have included a link to the result of EXPLAIN ANALYZE. It's this one:
https://app.box.com/s/u8nk6qvkjs4ae7l7dh4h
 
Here's a link to Depesz's explain (if links to the site are okay):
http://explain.depesz.com/s/gCk
 
I have just tried setting geqo_threshold, join_collapse_limit and from_collapse_limit to 16, but it yielded no improvement.
Changing those three parameters to 32 did speed up the query from about 3.3 seconds to about a second (give or take 50 ms), which is a pretty good improvement, but not quite there, as I'm looking to bring it down to about 300 ms if possible. Changing those three settings to 48 yielded no improvements over 32.
Is there possibly something something else to tweak there?
 
Here's EXPLAIN ANALYZE output when the three settings have been set to 32:
http://explain.depesz.com/s/cj2
 
Thank you.
 
Peter Slapansky


From: slapo@centrum.sk [mailto:slapo@centrum.sk] 
Sent: Wednesday, August 07, 2013 10:43 AM
To: Igor Neyman; Pavel Stehule
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated query on a view with another view inside of it.

You're right, it does... but it's quite odd, because I re-ran the explain-analyze statement and got the same results.
Still, the query now runs for about a second as mentioned before, so it's almost like something's missing from the
explain,but I'm certain I copied it all.
 
 
I did this via pgadmin, but that shouldn't matter, should it?
 
Thank you,
 
Peter Slapansky
______________________________________________________________
_________________________________________________________

At very end of explain analyze output there should be a line:

Total runtime: ....

What do you get there?

Regards,
Igor Neyman

2013/8/7 Igor Neyman <ineyman@perceptron.com>:
>
>
> From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of
slapo@centrum.sk
> Sent: Wednesday, August 07, 2013 8:43 AM
> To: Pavel Stehule
> Cc: pgsql-performance@postgresql.org
> Subject: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated query on a view with another view inside of it.
>
> Good day,
>
> I have included a link to the result of EXPLAIN ANALYZE. It's this one:
> https://app.box.com/s/u8nk6qvkjs4ae7l7dh4h
>
> Here's a link to Depesz's explain (if links to the site are okay):
> http://explain.depesz.com/s/gCk
>
> I have just tried setting geqo_threshold, join_collapse_limit and from_collapse_limit to 16, but it yielded no
improvement.
> Changing those three parameters to 32 did speed up the query from about 3.3 seconds to about a second (give or take
50ms), which is a pretty good improvement, but not quite there, as I'm looking to bring it down to about 300 ms if
possible.Changing those three settings to 48 yielded no improvements over 32. 
> Is there possibly something something else to tweak there?
>
> Here's EXPLAIN ANALYZE output when the three settings have been set to 32:
> http://explain.depesz.com/s/cj2
>
> Thank you.
>
> Peter Slapansky
>
> -----
>
> Your last explain analyze (with 3 settings set to 32)  shows query duration 10ms, not 1sec.
> Am I wrong?

I afraid so 1 sec is planning time :( .. So execution is fast, but
planning is expensive and relatively slow .. maybe prepared statements
can helps in this case.

Regards

Pavel

>
> Regards,
> Igor Neyman
>


2013/8/7  <slapo@centrum.sk>:
> You're right, it does... but it's quite odd, because I re-ran the
> explain-analyze statement and got the same results.
>
> Still, the query now runs for about a second as mentioned before, so it's
> almost like something's missing from the explain, but I'm certain I copied
> it all.

what is time of EXPLAIN only ?

Pavel

>
>
>
> I did this via pgadmin, but that shouldn't matter, should it?
>
>
>
> Thank you,
>
>
>
> Peter Slapansky
>
> ______________________________________________________________
>> Od: Igor Neyman <ineyman@perceptron.com>
>> Komu: "slapo@centrum.sk" <slapo@centrum.sk>, Pavel Stehule
>> <pavel.stehule@gmail.com>
>> Dátum: 07.08.2013 15:47
>> Predmet: RE: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated
>> query on a view with another view inside of it.
>>
>
>> CC: "pgsql-performance@postgresql.org"
>
> Your last explain analyze (with 3 settings set to 32)  shows query duration
> 10ms, not 1sec.
> Am I wrong?
>
> Regards,
> Igor Neyman
>
>
>
> ______________________________________________________________
>
>
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of
> slapo@centrum.sk
> Sent: Wednesday, August 07, 2013 8:43 AM
> To: Pavel Stehule
> Cc: pgsql-performance@postgresql.org
> Subject: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated query on a
> view with another view inside of it.
>
> Good day,
>
> I have included a link to the result of EXPLAIN ANALYZE. It's this one:
> https://app.box.com/s/u8nk6qvkjs4ae7l7dh4h
>
> Here's a link to Depesz's explain (if links to the site are okay):
> http://explain.depesz.com/s/gCk
>
> I have just tried setting geqo_threshold, join_collapse_limit and
> from_collapse_limit to 16, but it yielded no improvement.
> Changing those three parameters to 32 did speed up the query from about 3.3
> seconds to about a second (give or take 50 ms), which is a pretty good
> improvement, but not quite there, as I'm looking to bring it down to about
> 300 ms if possible. Changing those three settings to 48 yielded no
> improvements over 32.
> Is there possibly something something else to tweak there?
>
> Here's EXPLAIN ANALYZE output when the three settings have been set to 32:
> http://explain.depesz.com/s/cj2
>
> Thank you.
>
> Peter Slapansky
>


I got:

"Total runtime: 9.313 ms" in pgAdmin

"Total runtime: 9.363 ms" in psql.

But timing after the query finished was 912.842 ms in psql.

 

Cheers,

 

Peter Slapansky

______________________________________________________________
> Od: Igor Neyman <ineyman@perceptron.com>
> Komu: "slapo@centrum.sk" <slapo@centrum.sk>, Pavel Stehule <pavel.stehule@gmail.com>
> Dátum: 07.08.2013 16:48
> Predmet: RE: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated query on a view with another view inside of it.
>

> CC: "pgsql-performance@postgresql.org"



From: slapo@centrum.sk [mailto:slapo@centrum.sk]
Sent: Wednesday, August 07, 2013 10:43 AM
To: Igor Neyman; Pavel Stehule
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated query on a view with another view inside of it.

You're right, it does... but it's quite odd, because I re-ran the explain-analyze statement and got the same results.
Still, the query now runs for about a second as mentioned before, so it's almost like something's missing from the explain, but I'm certain I copied it all.
 
I did this via pgadmin, but that shouldn't matter, should it?
 
Thank you,
 
Peter Slapansky
______________________________________________________________
_________________________________________________________

At very end of explain analyze output there should be a line:

Total runtime: ....

What do you get there?

Regards,
Igor Neyman


From: slapo@centrum.sk [mailto:slapo@centrum.sk] 
Sent: Wednesday, August 07, 2013 11:34 AM
To: Igor Neyman; Pavel Stehule
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated query on a view with another view inside of it.

I got:
"Total runtime: 9.313 ms" in pgAdmin
"Total runtime: 9.363 ms" in psql.
But timing after the query finished was 912.842 ms in psql.
 
Cheers,
 
Peter Slapansky
______________________________________________________________

That proves what Pavel suggested regarding planning time.

Regards,
Igor Neyman

<slapo@centrum.sk> writes:
> "Total runtime: 9.313 ms" in pgAdmin
> "Total runtime: 9.363 ms" in psql.
> But timing after the query finished was 912.842 ms in psql.

Well, that's the downside of increasing join_collapse_limit and
from_collapse_limit: you might get a better plan, but it takes a lot
longer to get it because the planner is considering many more options.

If you're sufficiently desperate, you could consider rewriting the query
so that its JOIN structure matches the join order that the planner chooses
at the high collapse_limit settings.  Then you can reduce the limits back
down and it'll still find the same plan.  This tends to suck from a query
readability/maintainability standpoint though :-(.

The prepared-query approach might offer a solution too, if the good plan
isn't dependent on specific parameter values.

            regards, tom lane


I was afraid of something worse but hoping for something better in terms of maintainability. At least now I have a good
explanation.:-) 
I just hope the embedded view use won't interfere too much.
 
Thanks everyone.
 
Regards,
 
Peter Slapansky
 
______________________________________________________________
> Od: Tom Lane <tgl@sss.pgh.pa.us>
> Komu: <slapo@centrum.sk>
> Dátum: 07.08.2013 17:53
> Predmet: Re: [PERFORM] RE: [PERFORM] Re: [PERFORM] Sub-optimal plan for a paginated query on a view with another view
insideof it. 
>
> CC: "Igor Neyman", "Pavel Stehule", "pgsql-performance@postgresql.org"
<slapo@centrum.sk> writes:
> "Total runtime: 9.313 ms" in pgAdmin
> "Total runtime: 9.363 ms" in psql.
> But timing after the query finished was 912.842 ms in psql.

Well, that's the downside of increasing join_collapse_limit and
from_collapse_limit: you might get a better plan, but it takes a lot
longer to get it because the planner is considering many more options.

If you're sufficiently desperate, you could consider rewriting the query
so that its JOIN structure matches the join order that the planner chooses
at the high collapse_limit settings.  Then you can reduce the limits back
down and it'll still find the same plan.  This tends to suck from a query
readability/maintainability standpoint though :-(.

The prepared-query approach might offer a solution too, if the good plan
isn't dependent on specific parameter values.

 regards, tom lane