Thread: Query works when kludged, but would prefer "best practice" solution
Hi all,

Please see the section marked as "PROBLEM" in the "ORIGINAL QUERY" plan
below. You can see it's pretty slow. Oddly enough, an index for
facility_address_id is available but not being used, though I suspect it's
questionable whether it would be an improvement.

I knew that the filter was best applied to the results of the join - my
attempts to restructure the query with subqueries, etc. didn't fool the
planner - it always figured out a plan with this problematic SEQ SCAN +
FILTER in it. Finally, I "hid" the condition from the planner with a
coalesce function - see "SOLUTION" in the "KLUDGED QUERY" plan below. Sure
enough, a new plan appeared with a remarkable performance improvement!

The purpose of this query is to find facilities within a geographical area
when the complete address data is missing (hence the facility_address_id is
NULL).

PG is 8.2.4 on a RH Linux server with 1GB RAM; HDD is RAID 1.

I don't like kludging like this - so any and all help or advice is
appreciated!

Carlo

ORIGINAL QUERY

select
   pp.provider_id,
   pp.provider_practice_id,
   nearby.distance
from mdx_core.provider_practice as pp
join mdx_core.facility as f
   on f.facility_id = pp.facility_id
join (select * from mdx_core.zips_in_mile_range('08820', 10)) as nearby
   on f.default_country_code = 'US'
   and f.default_postal_code = nearby.zip
where facility_address_id is null

Hash Join  (cost=30258.99..107702.53 rows=9438 width=16) (actual time=169.516..3064.188 rows=872 loops=1)
  Hash Cond: (pp.facility_id = f.facility_id)
PROBLEM:
------------
  ->  Seq Scan on provider_practice pp  (cost=0.00..74632.55 rows=724429 width=12) (actual time=0.039..1999.457 rows=728396 loops=1)
        Filter: (facility_address_id IS NULL)
------------
  ->  Hash  (cost=29954.15..29954.15 rows=24387 width=12) (actual time=156.668..156.668 rows=907 loops=1)
        ->  Nested Loop  (cost=0.00..29954.15 rows=24387 width=12) (actual time=149.891..155.343 rows=907 loops=1)
              ->  Function Scan on zips_in_mile_range  (cost=0.00..12.50 rows=1000 width=40) (actual time=149.850..149.920 rows=66 loops=1)
              ->  Index Scan using facility_country_postal_code_idx on facility f  (cost=0.00..29.64 rows=24 width=15) (actual time=0.015..0.048 rows=14 loops=66)
                    Index Cond: ((f.default_country_code = 'US'::bpchar) AND ((f.default_postal_code)::text = zips_in_mile_range.zip))
Total runtime: 3065.338 ms

KLUDGED QUERY

select
   pp.provider_id,
   pp.provider_practice_id,
   nearby.distance
from mdx_core.provider_practice as pp
join mdx_core.facility as f
   on f.facility_id = pp.facility_id
join (select * from mdx_core.zips_in_mile_range('08820', 10)) as nearby
   on f.default_country_code = 'US'
   and f.default_postal_code = nearby.zip
   and coalesce(pp.facility_address_id, -1) = -1

Nested Loop  (cost=0.00..112618.87 rows=180 width=16) (actual time=149.680..167.261 rows=872 loops=1)
  ->  Nested Loop  (cost=0.00..29954.15 rows=24387 width=12) (actual time=149.659..155.018 rows=907 loops=1)
        ->  Function Scan on zips_in_mile_range  (cost=0.00..12.50 rows=1000 width=40) (actual time=149.620..149.698 rows=66 loops=1)
        ->  Index Scan using facility_country_postal_code_idx on facility f  (cost=0.00..29.64 rows=24 width=15) (actual time=0.015..0.045 rows=14 loops=66)
              Index Cond: ((f.default_country_code = 'US'::bpchar) AND ((f.default_postal_code)::text = zips_in_mile_range.zip))
SOLUTION
-------------
  ->  Index Scan using provider_practice_facility_idx on provider_practice pp  (cost=0.00..3.38 rows=1 width=12) (actual time=0.007..0.009 rows=1 loops=907)
        Index Cond: (f.facility_id = pp.facility_id)
        Filter: (COALESCE(facility_address_id, -1) = -1)
-------------
Total runtime: 168.275 ms
On 9/17/07, Carlo Stonebanks <stonec.register@sympatico.ca> wrote:
> Hi all,
>
> Please see the section marked as "PROBLEM" in "ORIGINAL QUERY" plan below.
> You can see it's pretty slow. Oddly enough, an index for facility_address_id
> is available but not being used, but I suspect it's questionable whether it
> would be an improvement.

This looks like it might be the problem Tom caught and rigged a solution to:
http://people.planetpostgresql.org/dfetter/index.php?/archives/134-PostgreSQL-Weekly-News-September-03-2007.html
(look for "band-aid").

If that's the case, the solution is to wait for 8.2.5 (coming soon).

merlin
Re: Query works when kludged, but would prefer "best practice" solution
Well, there goes my dream of getting a recommendation that will deliver a
blinding insight into how to speed up all of my queries a thousand-fold.

Thanks Merlin!

Carlo
On 9/17/07, Carlo Stonebanks <stonec.register@sympatico.ca> wrote:
> Well, there goes my dream of getting a recommendation that will deliver a
> blinding insight into how to speed up all of my queries a thousand-fold.

that's easy...delete your data! :-)

merlin
Re: Query works when kludged, but would prefer "best practice" solution
Thanks, it worked. Client happy. Big bonus in the mail.
"Merlin Moncure" <mmoncure@gmail.com> writes: > On 9/17/07, Carlo Stonebanks <stonec.register@sympatico.ca> wrote: >> Please see the section marked as "PROBLEM" in "ORIGINAL QUERY" plan below. > This looks like it might be the problem tom caught and rigged a solution to: > http://people.planetpostgresql.org/dfetter/index.php?/archives/134-PostgreSQL-Weekly-News-September-03-2007.html > (look fro band-aid). No, fraid not, that was about misestimation of outer joins, and I see no outer join here. What I do see is misestimation of a set-returning-function's output: -> Function Scan on zips_in_mile_range (cost=0.00..12.50 rows=1000 width=40) (actual time=149.850..149.920 rows=66 loops=1) There's not any very nice way to improve that in existing releases :-(. In 8.3 it will be possible to add a ROWS option to function definitions to replace the default "1000 rows" estimate with some other number, but that still helps little if the number of result rows is widely variable. As far as kluges go: rather than kluging conditions affecting unrelated tables, maybe you could put in a dummy constraint on the function's output --- ie, a condition you know is always true, but the planner won't know that, and will scale down its result-rows estimate accordingly. regards, tom lane
Re: Query works when kludged, but would prefer "best practice" solution
Hi Tom,

Thanks for the suggestion - this concept is pretty new to me. Can you
expand a bit on the idea of how to place such a "dummy" constraint on a
function, and the conditions under which it affects the planner? Would this
require that constraint_exclusion be set on?

(When I go to sleep, I have a dream -- and in this dream Tom writes a
brilliant three-line code sample that makes it all clear to me, and I wake
up a PostgreSQL guru) ;-)

Carlo
"Carlo Stonebanks" <stonec.register@sympatico.ca> writes: > Thanks for the suggestion - this concept is pretty new to me. Can you expand > a bit on the idea of how to place such a "dummy" constraint on a function, > and the conditions on which it affects the planner? Let's say that you know that the function's result column "x" can only range from 1 to 1000. The planner does not know that, and has no statistics from which it could guess, so it's going to fall back on default selectivity estimates for any WHERE clause involving x. So for instance you could tack on something like FROM ... (select * from myfunc() where x <= 1000) ... which will change the actual query result not at all, but will cause the planner to reduce its estimate of the number of rows out by whatever the default selectivity estimate for an inequality is (from memory, 0.333, but try it and see). If that's too much or not enough, you could try some other clauses that will never really reject any rows, for instance where x >= 1 and x <= 1000 where x <> -1 where x is not null Of course this technique depends on knowing something that will always be true about your data, but most people can think of something... Now this is not going to affect the evaluation of the function itself at all. What it will do is affect the shape of a join plan built atop that function scan, since joins are pretty much all about minimizing the number of intermediate rows. > Would this require that > constraint_exclusion be set on? No. regards, tom lane
I think Tom is talking about something like this:

explain select * from foo();

                          QUERY PLAN
----------------------------------------------------------------------
 Function Scan on foo  (cost=0.00..12.50 rows=1000 width=50)

The planner is estimating the function will return 1000 rows.

explain select * from foo() where id > 0;

                          QUERY PLAN
---------------------------------------------------------------------
 Function Scan on foo  (cost=0.00..15.00 rows=333 width=50)
   Filter: (id > 0)

In the second case I am asking for all ids greater than zero, but my ids
are all positive integers. The planner doesn't know that, so it assumes
the where clause will decrease the number of results.

I would still say this is a kludge, and since you already found a kludge
that works, this may not help you at all.

Dave
My client "publishes" an "edition" of their DB from his production site to
his hosted web/db server. This is done by FTPing a backup of the DB to his
hosting provider. Immediately after a "publication" (restore to the web/db
server) we run VACUUM ANALYZE to make sure the statistics and row estimates
are correct.

The problem is, after this initial VACUUM ANALYZE, the row estimates in
query plans are off by several orders of magnitude. For example, a
disastrous plan was created because the planner estimated 4K rows when in
fact it returned 980K rows. Sometimes - a day or two later - the plans
return to "normal" and row estimates are closer to realistic values.
Guessing that there may be background events that are correcting the row
estimates over time, I ran an ANALYZE on the DB - and sure enough - the
row estimates corrected themselves.

The puzzling thing is, there have been no writes of any sort to the data -
there is no reason for the stats to have changed. I believe that a VACUUM
may not be necessary for a newly restored DB, but I assumed that VACUUM
ANALYZE and ANALYZE have the same net result. Am I wrong?

If I am not wrong (i.e. VACUUM ANALYZE and ANALYZE should produce the same
results), why would the performance improve on a DB that has seen no
transactional activity only after the SECOND try?

PG 8.2.4 on RH Linux, 1GB RAM, SCSI RAID 1

Carlo
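P.S. For reference, the sequence is roughly the sketch below. The pg_stats
query at the end is just my own sanity check that per-column statistics
exist (schema/table names are from our environment):

   -- immediately after the restore completes:
   VACUUM ANALYZE;

   -- a day or two later, after plans had gone bad, this alone fixed the estimates:
   ANALYZE;

   -- sanity check: do per-column statistics exist for a key table?
   select attname, n_distinct
   from pg_stats
   where schemaname = 'mdx_core' and tablename = 'provider_practice';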
I am noticing that my queries are spending a lot of time in nested loops. The table/index row estimates are not bad, but the row estimates for the nested loops can be off by a factor of 50 - and they are always too high.
If this is always occurring, is this an indication of a general configuration problem?
Carlo
select
pp.provider_id,
pp.provider_practice_id,
nearby.distance
from mdx_core.provider_practice as pp
join mdx_core.facility as f
on f.facility_id = pp.facility_id
join (select * from mdx_core.zips_in_mile_range('08820', 10) where zip > '') as nearby
on f.default_country_code = 'US'
and f.default_postal_code = nearby.zip
and pp.facility_address_id is NULL
union select
pp.provider_id,
pp.provider_practice_id,
nearby.distance
from mdx_core.provider_practice as pp
join mdx_core.facility_address as fa
on fa.facility_address_id = pp.facility_address_id
join mdx_core.address as a
on a.address_id = fa.address_id
join (select * from mdx_core.zips_in_mile_range('08820', 10) where zip > '') as nearby
on a.country_code = 'US'
and a.postal_code = nearby.zip
Unique  (cost=67605.91..67653.18 rows=4727 width=16) (actual time=8634.618..8637.918 rows=907 loops=1)
  ->  Sort  (cost=67605.91..67617.73 rows=4727 width=16) (actual time=8634.615..8635.651 rows=907 loops=1)
        Sort Key: provider_id, provider_practice_id, distance
        ->  Append  (cost=0.00..67317.41 rows=4727 width=16) (actual time=176.056..8632.429 rows=907 loops=1)
              ->  Nested Loop  (cost=0.00..38947.07 rows=3143 width=16) (actual time=176.054..7867.962 rows=872 loops=1)
                    ->  Nested Loop  (cost=0.00..11520.79 rows=8121 width=12) (actual time=169.372..3041.010 rows=907 loops=1)
                          ->  Function Scan on zips_in_mile_range  (cost=0.00..15.00 rows=333 width=40) (actual time=151.479..151.671 rows=66 loops=1)
                                Filter: (zip > ''::text)
                          ->  Index Scan using facility_country_postal_code_idx on facility f  (cost=0.00..34.25 rows=24 width=15) (actual time=4.969..43.740 rows=14 loops=66)
                                Index Cond: ((f.default_country_code = 'US'::bpchar) AND ((f.default_postal_code)::text = zips_in_mile_range.zip))
                    ->  Index Scan using provider_practice_facility_idx on provider_practice pp  (cost=0.00..3.36 rows=1 width=12) (actual time=4.915..5.316 rows=1 loops=907)
                          Index Cond: (f.facility_id = pp.facility_id)
                          Filter: (facility_address_id IS NULL)
              ->  Nested Loop  (cost=0.00..28323.07 rows=1584 width=16) (actual time=170.310..762.472 rows=35 loops=1)
                    ->  Nested Loop  (cost=0.00..7791.77 rows=1579 width=12) (actual time=170.289..612.579 rows=36 loops=1)
                          ->  Nested Loop  (cost=0.00..2595.96 rows=712 width=12) (actual time=167.017..354.261 rows=29 loops=1)
                                ->  Function Scan on zips_in_mile_range  (cost=0.00..15.00 rows=333 width=40) (actual time=150.188..150.312 rows=66 loops=1)
                                      Filter: (zip > ''::text)
                                ->  Index Scan using address_country_postal_code_address_idx on address a  (cost=0.00..7.73 rows=2 width=17) (actual time=2.483..3.086 rows=0 loops=66)
                                      Index Cond: ((a.country_code = 'US'::bpchar) AND ((a.postal_code)::text = zips_in_mile_range.zip))
                          ->  Index Scan using facility_address_address_idx on facility_address fa  (cost=0.00..7.15 rows=12 width=8) (actual time=7.652..8.901 rows=1 loops=29)
                                Index Cond: (a.address_id = fa.address_id)
                    ->  Index Scan using provider_practice_facility_address_idx on provider_practice pp  (cost=0.00..12.80 rows=16 width=12) (actual time=4.156..4.158 rows=1 loops=36)
                          Index Cond: (fa.facility_address_id = pp.facility_address_id)
Total runtime: 8639.066 ms