Thread: vacuum analyze slows sql query

vacuum analyze slows sql query

From: patrick ~
Greetings pgsql-performance :)

Yesterday I posted to the pgsql-sql list about an issue with VACUUM
while trying to track down a performance issue with a SQL SELECT
statement involving a stored function.  It was suggested that I bring
the discussion over to -performance.

Instead of reposting the message, here is a link to the original
message followed by a brief summary:

  http://marc.theaimsgroup.com/?l=postgresql-sql&m=109945118928530&w=2


Summary:

Our customer complains about sluggishness in the web/PHP-based UI when
accessing the data in the db.  I created a "stripped down" version of
the tables in question so I could post to the pgsql-sql list asking for
hints on improving the SQL query.  While doing this I noticed
that if I 'createdb' and populate it with the "sanitized" data, the
query in question is quite fast: 618 rows returned in 864.522 ms.
This was puzzling.  Next I noticed that after a VACUUM the very same
query would slow down to a crawl: 618 rows returned in 1080688.921 ms.

This was reproduced on PostgreSQL 7.4.2 running on an Intel PIII 700MHz
with 512MB RAM.  This system is my /personal/ test system/sandbox, i.e.,
it isn't being stressed by any other processes.


Thanks for reading,
--patrick






Re: vacuum analyze slows sql query

From: Doug Y
Given that the plan doesn't change after an analyze, my guess would be
that the first query is hitting cached data; then you vacuum, and that
chews through all the cache with its own data, pushing the good data out
of the cache so it has to be re-fetched from disk.

If you run the select a 2nd time after the vacuum, what is the time?

Not sure what your pkk_offer_has_pending_purch function does, that might be something to look at as well.

I could be wrong, but that's the only thing that makes sense to me. ARC
is supposed to help with that type of behavior in 8.0.

patrick ~ wrote:
> Greetings pgsql-performance :)
>
> Yesterday I posted to the pgsql-sql list about an issue with VACUUM
> while trying to track down a performance issue with a SQL SELECT
> statement involving a stored function.  It was suggested that I bring
> the discussion over to -performance.
>
> Instead of reposting the message, here is a link to the original
> message followed by a brief summary:
>
>   http://marc.theaimsgroup.com/?l=postgresql-sql&m=109945118928530&w=2
>
>
> Summary:
>
> Our customer complains about sluggishness in the web/PHP-based UI when
> accessing the data in the db.  I created a "stripped down" version of
> the tables in question so I could post to the pgsql-sql list asking for
> hints on improving the SQL query.  While doing this I noticed
> that if I 'createdb' and populate it with the "sanitized" data, the
> query in question is quite fast: 618 rows returned in 864.522 ms.
> This was puzzling.  Next I noticed that after a VACUUM the very same
> query would slow down to a crawl: 618 rows returned in 1080688.921 ms.
>
> This was reproduced on PostgreSQL 7.4.2 running on an Intel PIII 700MHz
> with 512MB RAM.  This system is my /personal/ test system/sandbox, i.e.,
> it isn't being stressed by any other processes.
>
>
> Thanks for reading,
> --patrick
>


Re: vacuum analyze slows sql query

From: Tom Lane
patrick ~ <sidsrr@yahoo.com> writes:
> that if I 'createdb' and populate it with the "sanitized" data, the
> query in question is quite fast: 618 rows returned in 864.522 ms.
> This was puzzling.  Next I noticed that after a VACUUM the very same
> query would slow down to a crawl: 618 rows returned in 1080688.921 ms.

The outer query is too simple to have more than one possible plan,
so the issue is certainly a change in query plans inside the function.
You need to be investigating what's happening inside that function.
7.1 doesn't have adequate tools for this, but in 7.4 you can use
PREPARE and EXPLAIN ANALYZE EXECUTE to examine the query plans used
for parameterized statements, which is what you've got here.
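
[For readers following along, the pattern Tom describes looks roughly
like the sketch below.  This is an editor's illustration, not something
posted in the thread: the statement name `probe` is made up, the body is
paraphrased from the function quoted later, and 795 is just a sample
offer_id that appears further down the thread.]

```sql
-- PREPARE fixes a plan for the parameterized query once;
-- EXPLAIN ANALYZE EXECUTE then shows the plan actually used
-- when that prepared statement runs with a concrete parameter.
PREPARE probe(integer) AS
    select p0.purchase_id
      from pkk_purchase p0
     where p0.offer_id = $1
     limit 1;

EXPLAIN ANALYZE EXECUTE probe(795);
DEALLOCATE probe;
```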

My bet is that with ANALYZE stats present, the planner guesses wrong
about which index to use; but without looking at EXPLAIN ANALYZE output
there's no way to be sure.

BTW, why the bizarrely complicated substitute for a NOT NULL test?
ISTM you only need

create function
pkk_offer_has_pending_purch( integer )
    returns bool
as  '
                select  p0.purchase_id is not null
                  from  pkk_purchase p0
                 where  p0.offer_id = $1
                        and ( p0.pending = true
                            or ( ( p0.expire_time > now()
                                 or p0.expire_time isnull )
                               and p0.cancel_date isnull ) )
                  limit 1
' language 'sql' ;

(Actually, seeing that pkk_purchase.purchase_id is defined as NOT NULL,
I wonder why the function exists at all ... but I suppose you've
"stripped" the function to the point of being nonsense.)

            regards, tom lane

Re: vacuum analyze slows sql query

From: patrick ~
Here is a fresh run with 'explain analyze' run before and after the
VACUUM statement:

-- begin
% dropdb pkk
DROP DATABASE
% createdb pkk
CREATE DATABASE
% psql pkk < pkk_db.sql
ERROR:  function pkk_offer_has_pending_purch(integer) does not exist
ERROR:  function pkk_offer_has_pending_purch2(integer) does not exist
ERROR:  table "pkk_billing" does not exist
ERROR:  table "pkk_purchase" does not exist
ERROR:  table "pkk_offer" does not exist
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "pkk_offer_pkey"
for table "pkk_offer"
CREATE TABLE
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"pkk_purchase_pkey" for table "pkk_purchase"
CREATE TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE TABLE
CREATE INDEX
CREATE FUNCTION
CREATE FUNCTION
% zcat pkk.20041028_00.sql.gz | psql pkk
SET
SET
SET
SET
% psql pkk
pkk=# select offer_id, pkk_offer_has_pending_purch( offer_id ) from pkk_offer ;
<omitting output />
(618 rows)

Time: 877.348 ms
pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id )
from pkk_offer ;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Seq Scan on pkk_offer  (cost=0.00..22.50 rows=1000 width=4) (actual
time=1.291..845.485 rows=618 loops=1)
 Total runtime: 849.475 ms
(2 rows)

Time: 866.613 ms
pkk=# vacuum analyze ;
VACUUM
Time: 99344.399 ms
pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id )
from pkk_offer ;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Seq Scan on pkk_offer  (cost=0.00..13.72 rows=618 width=4) (actual
time=3636.401..1047412.851 rows=618 loops=1)
 Total runtime: 1047415.525 ms
(2 rows)

Time: 1047489.477 ms
-- end



Tom,

The reason for the extra "case" part in the function is to ensure
non-null fields in the result.  I tried your version as well and I get
similar performance results:

-- begin
pkk=# create function toms_pending_purch( integer ) returns bool as 'select
p0.purchase_id is not null from  pkk_purchase p0 where  p0.offer_id = $1 and (
p0.pending = true or ( ( p0.expire_time > now() or p0.expire_time isnull ) and
p0.cancel_date isnull ) ) limit 1 ' language 'sql' ;
CREATE FUNCTION
Time: 2.496 ms
pkk=# select offer_id, toms_pending_purch( offer_id ) from pkk_offer ;
(618 rows)

Time: 1052339.506 ms
-- end


Right now, I'm studying the documentation section on PREPARE and will
attempt to play around with it.


I was asked (in a prior post) whether running the statement a second
time after the VACUUM improves performance.  It does not.  After
the VACUUM the statement remains slow.


Thanks for your help,
--patrick



--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

> patrick ~ <sidsrr@yahoo.com> writes:
> > that if I 'createdb' and populate it with the "sanitized" data, the
> > query in question is quite fast: 618 rows returned in 864.522 ms.
> > This was puzzling.  Next I noticed that after a VACUUM the very same
> > query would slow down to a crawl: 618 rows returned in 1080688.921 ms.
>
> The outer query is too simple to have more than one possible plan,
> so the issue is certainly a change in query plans inside the function.
> You need to be investigating what's happening inside that function.
> 7.1 doesn't have adequate tools for this, but in 7.4 you can use
> PREPARE and EXPLAIN ANALYZE EXECUTE to examine the query plans used
> for parameterized statements, which is what you've got here.
>
> My bet is that with ANALYZE stats present, the planner guesses wrong
> about which index to use; but without looking at EXPLAIN ANALYZE output
> there's no way to be sure.
>
> BTW, why the bizarrely complicated substitute for a NOT NULL test?
> ISTM you only need
>
> create function
> pkk_offer_has_pending_purch( integer )
>     returns bool
> as  '
>                 select  p0.purchase_id is not null
>                   from  pkk_purchase p0
>                  where  p0.offer_id = $1
>                         and ( p0.pending = true
>                             or ( ( p0.expire_time > now()
>                                  or p0.expire_time isnull )
>                                and p0.cancel_date isnull ) )
>                   limit 1
> ' language 'sql' ;
>
> (Actually, seeing that pkk_purchase.purchase_id is defined as NOT NULL,
> I wonder why the function exists at all ... but I suppose you've
> "stripped" the function to the point of being nonsense.)
>
>             regards, tom lane






Re: vacuum analyze slows sql query

From: patrick ~
Just wanted to know if there were any insights after looking at the
requested 'explain analyze select ...' output?


Thanks,
--patrick






Re: vacuum analyze slows sql query

From: patrick ~
Looking around at the pg_ tables and some PostgreSQL online
docs, prompted by another post/reply on this list regarding
ALTER TABLE SET STATISTICS, I found out that prior to a VACUUM
the following select (taken from the online docs) shows:

pkk=# select relname, relkind, reltuples, relpages from pg_class where relname
like 'pkk_%';
      relname      | relkind | reltuples | relpages
-------------------+---------+-----------+----------
 pkk_billing       | r       |      1000 |       10
 pkk_offer         | r       |      1000 |       10
 pkk_offer_pkey    | i       |      1000 |        1
 pkk_purchase      | r       |      1000 |       10
 pkk_purchase_pkey | i       |      1000 |        1
(5 rows)

Time: 1097.263 ms


and after a VACUUM:

pkk=# vacuum analyze ;
VACUUM
Time: 100543.359 ms


it shows:

pkk=# select relname, relkind, reltuples, relpages from pg_class where relname
like 'pkk_%';
      relname      | relkind |  reltuples  | relpages
-------------------+---------+-------------+----------
 pkk_billing       | r       |      714830 |     4930
 pkk_offer         | r       |         618 |        6
 pkk_offer_pkey    | i       |         618 |        4
 pkk_purchase      | r       | 1.14863e+06 |     8510
 pkk_purchase_pkey | i       | 1.14863e+06 |     8214
(5 rows)

Time: 3.868 ms



Further, I noticed that if I delete rows from the
pg_statistic table I get the db into a state where the query
is fast again:

pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id )
from pkk_offer ;
                                                   QUERY PLAN

-----------------------------------------------------------------------------------------------------------------
 Seq Scan on pkk_offer  (cost=0.00..13.72 rows=618 width=4) (actual
time=2415.739..1065709.092 rows=618 loops=1)
 Total runtime: 1065711.651 ms
(2 rows)

Time: 1065713.446 ms



pkk=# delete from pg_statistic where pg_statistic.starelid = pg_class.oid and
pg_class.relname like 'pkk_%';
DELETE 11
Time: 3.368 ms



pkk=# select offer_id, pkk_offer_has_pending_purch( offer_id ) from pkk_offer ;
(618 rows)

Time: 876.377 ms


pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id )
from pkk_offer ;
                                                QUERY PLAN

----------------------------------------------------------------------------------------------------------
 Seq Scan on pkk_offer  (cost=0.00..13.72 rows=618 width=4) (actual
time=1.329..846.786 rows=618 loops=1)
 Total runtime: 848.170 ms
(2 rows)

Time: 849.958 ms




Now, I'm sure someone (a PostgreSQL developer, most likely)
is about to shoot me for doing such a thing :-)

But, however *ugly, wrong, sacrilegious* this may be, if this is
the only solution... er, workaround... I have that will help me,
I must resort to it.

The only two questions I have about this are:

1. Is this really the only solution left for me?
2. Am I in any way screwing up the db by doing this?


Best regards,
--patrick






Re: vacuum analyze slows sql query

From: Tom Lane
patrick ~ <sidsrr@yahoo.com> writes:
> 1. Is this really the only solution left for me?

You still haven't followed the suggestions that were given to you
(ie, find out what is happening with the plan for the query inside
the problematic function).

            regards, tom lane

Re: vacuum analyze slows sql query

From: patrick ~
Hi Tom, -performance@,

I apologize if I didn't follow through with the PREPARE and
EXECUTE.  I assume that is what you are referring to.  After
reading the PostgreSQL docs on the PREPARE statement I realized
two things: a) a PREPAREd statement lasts only for the session,
and b) I can not (at least I haven't figured out how) PREPARE a
statement which would mimic my original select statement and
which I could EXECUTE over all rows of the pkk_offer table.

Best I could do is either:

 PREPARE pkk_01 ( integer ) AS select $1, pkk_offer_has_pending_purch( $1 )
from pkk_offer ;

or

 PREPARE pkk_00 ( integer ) <the def of pkk_offer_has_pending_purch( integer )>

In the former case the EXPLAIN ANALYZE doesn't give enough
data (it is the same as without the PREPARE statement).  In the
latter case, I can only execute it with one offer_id at a
time.  Is this sufficient?

If so, here are the results before and after VACUUM ANALYZE:

pkk=# explain analyze execute pkk_00( 795 ) ;
                          QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=8.57..8.58 rows=1 width=0) (actual time=0.095..0.096 rows=1
loops=1)
   InitPlan
     ->  Limit  (cost=0.00..8.57 rows=1 width=4) (actual time=0.083..0.084
rows=1 loops=1)
           ->  Index Scan using pur_offer_id_idx on pkk_purchase p0
(cost=0.00..17.13 rows=2 width=4) (actual time=0.079..0.079 rows=1 loops=1)
                 Index Cond: (offer_id = $1)
                 Filter: ((((expire_time)::timestamp with time zone > now()) OR
(expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL) OR
(pending = true)))
 Total runtime: 0.238 ms
(7 rows)

pkk=# VACUUM ANALYZE ;
VACUUM
Time: 97105.589 ms

pkk=# explain analyze execute pkk_00( 795 ) ;
                          QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=8.57..8.58 rows=1 width=0) (actual time=0.329..0.330 rows=1
loops=1)
   InitPlan
     ->  Limit  (cost=0.00..8.57 rows=1 width=4) (actual time=0.311..0.312
rows=1 loops=1)
           ->  Index Scan using pur_offer_id_idx on pkk_purchase p0
(cost=0.00..17.13 rows=2 width=4) (actual time=0.307..0.307 rows=1 loops=1)
                 Index Cond: (offer_id = $1)
                 Filter: ((((expire_time)::timestamp with time zone > now()) OR
(expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL) OR
(pending = true)))
 Total runtime: 0.969 ms
(7 rows)

Time: 16.252 ms



In both before and after "Index Scan" is used on pur_offer_id_idx.
So, unless I'm missing something obvious here I am at a loss.

I went as far as doing the EXPLAIN ANALYZE EXECUTE pkk_00( offer_id )
for each offer_id in pkk_offer table one at a time (not manually but
by scripting it).  All instances use "Index Scan".

I only noticed a couple that had quite large "actual times", like the
following:


pkk=# explain analyze execute pkk_00( 2312 ) ;

 QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=8.57..8.58 rows=1 width=0) (actual time=21.279..21.282 rows=1
loops=1)
   InitPlan
     ->  Limit  (cost=0.00..8.57 rows=1 width=4) (actual time=21.256..21.258
rows=1 loops=1)
           ->  Index Scan using pur_offer_id_idx on pkk_purchase p0
(cost=0.00..17.13 rows=2 width=4) (actual time=21.249..21.249 rows=1 loops=1)
                 Index Cond: (offer_id = $1)
                 Filter: ((((expire_time)::timestamp with time zone > now()) OR
(expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL) OR
(pending = true)))
 Total runtime: 21.435 ms
(7 rows)

Time: 22.541 ms


Which makes sense when you look at the number of entries this
offer_id has in pkk_purchase table vs offer_id = 795:

pkk=# select offer_id, count(*) from pkk_purchase where offer_id in ( 795, 2312
) group by offer_id ;
 offer_id | count
----------+-------
      795 |     4
     2312 |  1015
(2 rows)

Time: 21.118 ms


--patrick




--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

> patrick ~ <sidsrr@yahoo.com> writes:
> > 1. Is this really the only solution left for me?
>
> You still haven't followed the suggestions that were given to you
> (ie, find out what is happening with the plan for the query inside
> the problematic function).
>
>             regards, tom lane






Re: vacuum analyze slows sql query

From: Tom Lane
patrick ~ <sidsrr@yahoo.com> writes:
>  PREPARE pkk_00 ( integer ) <the def of pkk_offer_has_pending_purch( integer )>

This is what you want to do, but not quite like that.  The PREPARE
determines the plan and so VACUUMing and re-EXECUTing is going to show
the same plan.  What we need to look at is
    - standing start
    PREPARE pkk_00 ...
    EXPLAIN ANALYZE EXECUTE pkk_00 ...
    VACUUM ANALYZE;
    PREPARE pkk_01 ...
    EXPLAIN ANALYZE EXECUTE pkk_01 ...

            regards, tom lane

Re: vacuum analyze slows sql query

From: patrick ~
Sorry for the late reply.  Was feeling a bit under the weather
this weekend and didn't get a chance to look at this.


--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

> patrick ~ <sidsrr@yahoo.com> writes:
> >  PREPARE pkk_00 ( integer ) <the def of pkk_offer_has_pending_purch( integer )>
>
> This is what you want to do, but not quite like that.  The PREPARE
> determines the plan and so VACUUMing and re-EXECUTing is going to show
> the same plan.  What we need to look at is
>     - standing start
>     PREPARE pkk_00 ...
>     EXPLAIN ANALYZE EXECUTE pkk_00 ...
>     VACUUM ANALYZE;
>     PREPARE pkk_01 ...
>     EXPLAIN ANALYZE EXECUTE pkk_01 ...

But of course!  I feel a bit silly now.

This is what I get after following Tom's directions:

pkk=# prepare pkk_00 ( integer ) as select ...
PREPARE
Time: 1.753 ms
pkk=# execute pkk_00(    241 );
 case
------
 f
(1 row)

Time: 0.788 ms
pkk=# explain analyze execute pkk_00(    241 );
      QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=10.73..10.74 rows=1 width=0) (actual time=0.067..0.068 rows=1
loops=1)
   InitPlan
     ->  Limit  (cost=0.00..10.73 rows=1 width=4) (actual time=0.055..0.055
rows=0 loops=1)
           ->  Index Scan using pur_offer_id_idx on pkk_purchase p0
(cost=0.00..20690.18 rows=1929 width=4) (actual time=0.052..0.052 rows=0
loops=1)
                 Index Cond: (offer_id = $1)
                 Filter: ((((expire_time)::timestamp with time zone > now()) OR
(expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL) OR
(pending = true)))
 Total runtime: 0.213 ms
(7 rows)

Time: 24.654 ms
pkk=# vacuum analyze ;
VACUUM
Time: 128826.078 ms
pkk=# prepare pkk_01 ( integer ) as select ...
PREPARE
Time: 104.658 ms
pkk=# execute pkk_01(    241 );
 case
------
 f
(1 row)

Time: 7652.708 ms
pkk=# explain analyze execute pkk_01(    241 );
      QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=2.66..2.67 rows=1 width=0) (actual time=2872.211..2872.213
rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..2.66 rows=1 width=4) (actual
time=2872.189..2872.189 rows=0 loops=1)
           ->  Seq Scan on pkk_purchase p0  (cost=0.00..37225.83 rows=13983
width=4) (actual time=2872.180..2872.180 rows=0 loops=1)
                 Filter: ((offer_id = $1) AND (((expire_time)::timestamp with
time zone > now()) OR (expire_time IS NULL) OR (pending = true)) AND
((cancel_date IS NULL) OR (pending = true)))
 Total runtime: 2872.339 ms
(6 rows)

Time: 2873.479 ms


So it looks like after the VACUUM the planner resorts to a Seq Scan
rather than an Index Scan.

This is because the value of the correlation field in pg_stats
(according to the PostgreSQL docs) is closer to 0 rather than
to ±1:

pkk=# select tablename,attname,correlation from pg_stats where tablename =
'pkk_purchase' and attname = 'offer_id' ;
  tablename   | attname  | correlation
--------------+----------+-------------
 pkk_purchase | offer_id |    0.428598
(1 row)


So I started to experiment with ALTER TABLE SET STATISTICS
values to see which gets the correlation closer to ±1.  The
trend seems to indicate that the higher the stat value is set,
the closer the correlation value is pushed toward 0:

set statistics   correlation
----------------------------
         800     0.393108
         500     0.408137
         200     0.43197
          50     0.435211
           1     0.45758
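
[The transcript shows the resulting correlation values but not the
commands behind the table above.  The following is an editor's hedged
sketch of what they presumably looked like; the table and column names
match the earlier pg_stats output, but the exact statements patrick ran
are not shown in the thread.]

```sql
-- Adjust the per-column statistics target, re-analyze, and
-- re-check the correlation reported in pg_stats:
ALTER TABLE pkk_purchase ALTER COLUMN offer_id SET STATISTICS 800;
ANALYZE pkk_purchase;

SELECT tablename, attname, correlation
  FROM pg_stats
 WHERE tablename = 'pkk_purchase' AND attname = 'offer_id';
```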

And a subsequent PREPARE and EXPLAIN ANALYZE confirms that
the planner reverts back to using the Index Scan after setting
stats to 1 (even though the correlation value is still closer
to 0 than to 1):

pkk=# explain analyze execute pkk_02(    241 );

 QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=2.95..2.96 rows=1 width=0) (actual time=0.068..0.069 rows=1
loops=1)
   InitPlan
     ->  Limit  (cost=0.00..2.95 rows=1 width=4) (actual time=0.056..0.056
rows=0 loops=1)
           ->  Index Scan using pur_offer_id_idx on pkk_purchase p0
(cost=0.00..35810.51 rows=12119 width=4) (actual time=0.053..0.053 rows=0
loops=1)
                 Index Cond: (offer_id = $1)
                 Filter: ((((expire_time)::timestamp with time zone > now()) OR
(expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL) OR
(pending = true)))
 Total runtime: 0.200 ms
(7 rows)



So, is this the ultimate solution to this issue?

--patrick






Re: vacuum analyze slows sql query

From: John Meinel
patrick ~ wrote:
[...]
> pkk=# explain analyze execute pkk_01(    241 );
>       QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Result  (cost=2.66..2.67 rows=1 width=0) (actual time=2872.211..2872.213
> rows=1 loops=1)
>    InitPlan
>      ->  Limit  (cost=0.00..2.66 rows=1 width=4) (actual
> time=2872.189..2872.189 rows=0 loops=1)
>            ->  Seq Scan on pkk_purchase p0  (cost=0.00..37225.83 rows=13983
> width=4) (actual time=2872.180..2872.180 rows=0 loops=1)
>                  Filter: ((offer_id = $1) AND (((expire_time)::timestamp with
> time zone > now()) OR (expire_time IS NULL) OR (pending = true)) AND
> ((cancel_date IS NULL) OR (pending = true)))
>  Total runtime: 2872.339 ms
> (6 rows)
>
> Time: 2873.479 ms
>

[...]

> So, is this the ultimate solution to this issue?
>
> --patrick

It's not so much that correlation is < 0.5. It sounds like you're
running into the same issue that I ran into in the past. You have a
column with lots of repeated values, and a few exceptional ones. Notice
this part of the query:
->  Seq Scan on pkk_purchase p0  (cost rows=13983) (actual rows=0)

For a general number, it thinks it might return 14,000 rows, hence the
sequential scan. Before you do ANALYZE, it uses whatever defaults exist,
which are probably closer to reality.

The problem is that you probably have some values for pkk_purchase where
it could return 14,000 rows (possibly much much more). And for those,
seq scan is the best plan. However, for the particular value that you
are testing, there are very few (no) entries in the table.

With a prepared statement (or a function) it has to determine ahead of
time what the best plan is without knowing what value you are going to
ask for.

Let's say for a second that you manage to trick it into using an index
scan, and then you actually call the function with one of the values
that returns 1,000s of rows. It will probably take 10-100 times longer
than if it used a seq scan.

So what is the solution? The only one I'm aware of is to turn your
static function into a dynamic one.

So somewhere within the function you build up a SQL query string and
call EXECUTE str. This forces the query planner to be run every time you
call the function. This means that if you call it with a "nice" value,
you will get the fast index scan, and if you call it with a "bad" value,
it will switch back to seq scan.

The downside is you don't get much of a benefit from using a stored
procedure, as it has to run the query planner all the time (as though
you issued the query manually each time). But it still might be better
for you in the long run.

Example:

instead of

create function test(int) returns int as '
declare
   x alias for $1;
   y int;
begin
   select into y ... from ... where id=x limit ...;
   return y;
end;
' language 'plpgsql';

use this format

create function test(int) returns int as '
declare
   x alias for $1;
   y int;
   r record;
begin
   -- In 7.x plpgsql, EXECUTE cannot assign with SELECT INTO;
   -- loop over the dynamic query result instead.
   for r in execute ''select ... as val from ... where id=''
       || quote_literal(x)
       || '' limit ...'' loop
      y := r.val;
   end loop;
   return y;
end;
' language 'plpgsql';

I think that will point you in the right direction.

John
=:->


Re: vacuum analyze slows sql query

From: patrick ~
Hi John,

Thanks for your reply and analysis.


--- John Meinel <john@johnmeinel.com> wrote:

> patrick ~ wrote:
> [...]
> > pkk=# explain analyze execute pkk_01(    241 );
> >       QUERY PLAN
> >
>

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >  Result  (cost=2.66..2.67 rows=1 width=0) (actual time=2872.211..2872.213
> > rows=1 loops=1)
> >    InitPlan
> >      ->  Limit  (cost=0.00..2.66 rows=1 width=4) (actual
> > time=2872.189..2872.189 rows=0 loops=1)
> >            ->  Seq Scan on pkk_purchase p0  (cost=0.00..37225.83 rows=13983
> > width=4) (actual time=2872.180..2872.180 rows=0 loops=1)
> >                  Filter: ((offer_id = $1) AND (((expire_time)::timestamp
> with
> > time zone > now()) OR (expire_time IS NULL) OR (pending = true)) AND
> > ((cancel_date IS NULL) OR (pending = true)))
> >  Total runtime: 2872.339 ms
> > (6 rows)
> >
> > Time: 2873.479 ms
> >
>
> [...]
>
> > So, is this the ultimate solution to this issue?
> >
> > --patrick
>
> It's not so much that correlation is < 0.5. It sounds like you're
> running into the same issue that I ran into in the past. You have a
> column with lots of repeated values, and a few exceptional ones. Notice
> this part of the query:
> ->  Seq Scan on pkk_purchase p0  (cost rows=13983) (actual rows=0)
>
> For a general number, it thinks it might return 14,000 rows, hence the
> sequential scan. Before you do ANALYZE, it uses whatever defaults exist,
> which are probably closer to reality.
>
> The problem is that you probably have some values for pkk_purchase where
> it could return 14,000 rows (possibly much much more). And for those,
> seq scan is the best plan. However, for the particular value that you
> are testing, there are very few (no) entries in the table.

You are absolutely correct:

pkk=# select offer_id,count(*) from pkk_purchase group by offer_id order by
count ;
 offer_id | count
----------+--------
     1019 |      1
     1018 |      1
     1016 |      1 (many of these)
      ... |    ...
     2131 |      6
      844 |      6
     1098 |      6 (a dozen or so of these)
      ... |    ...
     2263 |    682
     2145 |    723
     2258 |    797
     2091 |    863
      ... |    ...
     1153 |  96330 (the few heavy weights)
      244 | 122163
      242 | 255719
      243 | 273427
      184 | 348476


> With a prepared statement (or a function) it has to determine ahead of
> time what the best plan is without knowing what value you are going to
> ask for.
>
> Let's say for a second that you manage to trick it into using an index
> scan, and then you actually call the function with one of the values
> that returns 1,000s of rows. It will probably take 10-100 times longer
> than if it used a seq scan.


Hmm... The fact is I am selecting (in this example anyway) over all
values in the pkk_offer table and calling the stored function with
each pkk_offer.offer_id, which in turn does a select on the
pkk_purchase table.  Note that offer_id is a foreign key in
pkk_purchase referencing the pkk_offer table.

I don't know if it matters (I suspect that it does), but I am using
LIMIT 1 in the sub-query/stored function.  All I need is one single
row meeting any of the criteria laid out in the stored procedure to
establish that an offer_id is "pending".
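
[Editor's aside, not something proposed in the thread: an existence test
of this shape is often phrased with EXISTS rather than LIMIT 1.  A
hedged sketch of that equivalent phrasing, reusing the predicate from
the function quoted earlier:]

```sql
-- Same "is there at least one pending purchase?" test phrased
-- with EXISTS; the executor also stops at the first qualifying
-- row, so the semantics match the LIMIT 1 sub-query.
select exists (
    select 1
      from pkk_purchase p0
     where p0.offer_id = $1
       and ( p0.pending = true
           or ( ( p0.expire_time > now() or p0.expire_time is null )
              and p0.cancel_date is null ) ) );
```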


> So what is the solution? The only one I'm aware of is to turn your
> static function into a dynamic one.
>
> So somewhere within the function you build up a SQL query string and
> call EXECUTE str. This forces the query planner to be run every time you
> call the function. This means that if you call it with a "nice" value,
> you will get the fast index scan, and if you call it with a "bad" value,
> it will switch back to seq scan.
>
> The downside is you don't get much of a benefit from using a stored
> procedure, as it has to run the query planner all the time (as though
> you issued the query manually each time). But it still might be better
> for you in the long run.


Well, running the query without the stored function, basically typing
out the stored function as a sub-query, shows me:


pkk=# explain analyze select o0.offer_id, ( select  case when ( select
p0.purchase_id from  pkk_purchase p0 where  p0.offer_id = o0.offer_id and (
p0.pending = true or ( ( p0.expire_time > now() or p0.expire_time isnull ) and
p0.cancel_date isnull ) ) limit 1 ) isnull then false else true end ) from
pkk_offer o0 ;
      QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on pkk_offer o0  (cost=0.00..1834.11 rows=618 width=4) (actual
time=2413.398..1341885.084 rows=618 loops=1)
   SubPlan
     ->  Result  (cost=2.94..2.95 rows=1 width=0) (actual
time=2171.287..2171.289 rows=1 loops=618)
           InitPlan
             ->  Limit  (cost=0.00..2.94 rows=1 width=4) (actual
time=2171.264..2171.266 rows=1 loops=618)
                   ->  Seq Scan on pkk_purchase p0  (cost=0.00..37225.83
rows=12670 width=4) (actual time=2171.245..2171.245 rows=1 loops=618)
                         Filter: ((offer_id = $0) AND
(((expire_time)::timestamp with time zone > now()) OR (expire_time IS NULL) OR
(pending = true)) AND ((cancel_date IS NULL) OR (pending = true)))
 Total runtime: 1341887.523 ms
(8 rows)


whereas after deleting all statistics on the pkk_% tables I get:

pkk=# delete from pg_statistic where pg_statistic.starelid = pg_class.oid and
pg_class.relname like 'pkk_%';
DELETE 11

pkk=# explain analyze select o0.offer_id, ( select  case when ( select
p0.purchase_id from  pkk_purchase p0 where  p0.offer_id = o0.offer_id and (
p0.pending = true or ( ( p0.expire_time > now() or p0.expire_time isnull ) and
p0.cancel_date isnull ) ) limit 1 ) isnull then false else true end ) from
pkk_offer o0 ;
      QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on pkk_offer o0  (cost=0.00..6646.94 rows=618 width=4) (actual
time=0.190..799.930 rows=618 loops=1)
   SubPlan
     ->  Result  (cost=10.73..10.74 rows=1 width=0) (actual time=1.277..1.278
rows=1 loops=618)
           InitPlan
             ->  Limit  (cost=0.00..10.73 rows=1 width=4) (actual
time=1.266..1.267 rows=1 loops=618)
                   ->  Index Scan using pur_offer_id_idx on pkk_purchase p0
(cost=0.00..20690.18 rows=1929 width=4) (actual time=1.258..1.258 rows=1
loops=618)
                         Index Cond: (offer_id = $0)
                         Filter: ((((expire_time)::timestamp with time zone >
now()) OR (expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL)
OR (pending = true)))
 Total runtime: 801.234 ms
(9 rows)


As you can see, this query (over all values of pkk_offer) without
any pg_statistics on the pkk_purchase table is extremely fast.

Is this a bug in the PostgreSQL planner that misjudges the best
choice with pg_statistics at hand?

--patrick






Re: vacuum analyze slows sql query

From
John Meinel
Date:
patrick ~ wrote:
> Hi John,
>
> Thanks for your reply and analysis.
>

No problem. It just happens that this is a problem we ran into recently.

>
> --- John Meinel <john@johnmeinel.com> wrote:
>
>
>>patrick ~ wrote:
[...]

>
> Hmm... The fact is I am selecting (in this example anyway) over all
> values in pkk_offer table and calling the stored function with each
> pkk_offer.offer_id which in turn does a select on pkk_purchase table.
> Note that offer_id is a foreign key in pkk_purchase referencing
> pkk_offer table.
>
> I don't know if it matters (I suspect that it does) but I am using
> LIMIT 1 in the sub-query/stored function.  All I need is one single
> row meeting any of the criteria laid out in the stored procedure to
> establish an offer_id is "pending".
>

If you are trying to establish existence, we also had a whole thread on
this. Basically what we found was that adding an ORDER BY clause helped
tremendously in getting the planner to switch to an Index scan. You
might try something like:

SELECT column FROM mytable WHERE column='myval' ORDER BY column LIMIT 1;

There seems to be a big difference between the above statement and:

SELECT column FROM mytable WHERE column='myval' LIMIT 1;

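Applied to the query from this thread, the rewrite might look like the
following (a sketch only; table and column names are taken from the
EXPLAIN output quoted above, and 241 is just an example offer_id):

```sql
-- Hypothetical rewrite of the existence test using the ORDER BY trick.
-- Ordering by the indexed column nudges the planner toward
-- pur_offer_id_idx instead of a sequential scan.
SELECT p0.purchase_id
  FROM pkk_purchase p0
 WHERE p0.offer_id = 241          -- example offer_id
   AND ( p0.pending = true
         OR ( p0.cancel_date ISNULL
              AND ( p0.expire_time > now() OR p0.expire_time ISNULL ) ) )
 ORDER BY p0.offer_id             -- the "trick": order by the indexed column
 LIMIT 1;
```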

>
>
>>So what is the solution? The only one I'm aware of is to turn your
>>static function into a dynamic one.
>>
>>So somewhere within the function you build up a SQL query string and
>>call EXECUTE str. This forces the query planner to be run every time you
>>call the function. This means that if you call it with a "nice" value,
>>you will get the fast index scan, and if you call it with a "bad" value,
>>it will switch back to seq scan.
>>
>>The downside is you don't get much of a benefit from using a stored
>>procedure, as it has to run the query planner all the time (as though
>>you issue the query manually each time.) But it still might be better
>>for you in the long run.
>
>
>
> Well, running the query without the stored function, basically typing
> out the stored function as a sub-query shows me:
>
>
> pkk=# explain analyze select o0.offer_id, ( select  case when ( select
> p0.purchase_id from  pkk_purchase p0 where  p0.offer_id = o0.offer_id and (
> p0.pending = true or ( ( p0.expire_time > now() or p0.expire_time isnull ) and
> p0.cancel_date isnull ) ) limit 1 ) isnull then false else true end ) from
> pkk_offer o0 ;
>       QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on pkk_offer o0  (cost=0.00..1834.11 rows=618 width=4) (actual
> time=2413.398..1341885.084 rows=618 loops=1)
>    SubPlan
>      ->  Result  (cost=2.94..2.95 rows=1 width=0) (actual
> time=2171.287..2171.289 rows=1 loops=618)
>            InitPlan
>              ->  Limit  (cost=0.00..2.94 rows=1 width=4) (actual
> time=2171.264..2171.266 rows=1 loops=618)
>                    ->  Seq Scan on pkk_purchase p0  (cost=0.00..37225.83
> rows=12670 width=4) (actual time=2171.245..2171.245 rows=1 loops=618)
>                          Filter: ((offer_id = $0) AND
> (((expire_time)::timestamp with time zone > now()) OR (expire_time IS NULL) OR
> (pending = true)) AND ((cancel_date IS NULL) OR (pending = true)))
>  Total runtime: 1341887.523 ms
> (8 rows)
>
>
> whereas after deleting all statistics on the pkk_% tables I get:
>
> pkk=# delete from pg_statistic where pg_statistic.starelid = pg_class.oid and
> pg_class.relname like 'pkk_%';
> DELETE 11
>
> pkk=# explain analyze select o0.offer_id, ( select  case when ( select
> p0.purchase_id from  pkk_purchase p0 where  p0.offer_id = o0.offer_id and (
> p0.pending = true or ( ( p0.expire_time > now() or p0.expire_time isnull ) and
> p0.cancel_date isnull ) ) limit 1 ) isnull then false else true end ) from
> pkk_offer o0 ;
>       QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on pkk_offer o0  (cost=0.00..6646.94 rows=618 width=4) (actual
> time=0.190..799.930 rows=618 loops=1)
>    SubPlan
>      ->  Result  (cost=10.73..10.74 rows=1 width=0) (actual time=1.277..1.278
> rows=1 loops=618)
>            InitPlan
>              ->  Limit  (cost=0.00..10.73 rows=1 width=4) (actual
> time=1.266..1.267 rows=1 loops=618)
>                    ->  Index Scan using pur_offer_id_idx on pkk_purchase p0
> (cost=0.00..20690.18 rows=1929 width=4) (actual time=1.258..1.258 rows=1
> loops=618)
>                          Index Cond: (offer_id = $0)
>                          Filter: ((((expire_time)::timestamp with time zone >
> now()) OR (expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL)
> OR (pending = true)))
>  Total runtime: 801.234 ms
> (9 rows)
>
>
> As you can see, this query (over all values of pkk_offer) without
> any pg_statistics on the pkk_purchase table is extremely fast.
>
> Is this a bug in the PostgreSQL planner that misjudges the best
> choice with pg_statistics at hand?
>
> --patrick
>

In order to understand your query I broke it up and restructured it as
follows.
You might try to add the ORDER BY line, and see what you get.

  EXPLAIN ANALYZE
  SELECT o0.offer_id,
      ( SELECT  CASE WHEN
          ( SELECT p0.purchase_id FROM  pkk_purchase p0
             WHERE p0.offer_id = o0.offer_id
               AND ( p0.pending = true
                     OR ( p0.cancel_date ISNULL
                          AND ( p0.expire_time > NOW() or p0.expire_time
ISNULL )
                        )
                   )
             ORDER BY p0.purchase_id --Insert this line
             LIMIT 1
          ) ISNULL THEN false
          ELSE true
          END
      ) FROM pkk_offer o0 ;

I also wonder about some parts of your query. I don't know your business
logic but you are tacking a lot of the query into the WHERE, and I
wonder if postgres just thinks it's going to need to analyze all the
data before it gets a match.

I also don't remember what columns you have indices on. Or whether it is
common to have cancel_date null, or expire_time > NOW() or expire_time
null, etc.

So is your function just everything within the CASE statement?

You might try rewriting it as a loop using a cursor, as I believe using
a cursor again lends itself to index scans (as it is even more likely
that you will not get all the data.)

Something like this (untested; the loop condition uses the same
"pending" criteria as your original query):

create function is_pending(int) returns bool as '
declare
   p_id alias for $1;
   rec  record;
begin
   for rec in
      select p0.pending, p0.expire_time, p0.cancel_date
        from pkk_purchase p0
       where p0.offer_id = p_id
   loop
      -- same criteria as in the original query
      if rec.pending = true
         or ( rec.cancel_date isnull
              and ( rec.expire_time > now() or rec.expire_time isnull ) )
      then
         return true;
      end if;
   end loop;
   return false;
end;
' language plpgsql;

I don't know cursors terribly well, but this might get you going.
Probably in your case you also have a large portion of the records with
pending = true, which means that with an index scan it doesn't have to
hit very many records. Either you have a low record count for a
particular purchase_id, or you have a lot of pendings. seq scan just
hurts because it has to sift out all the other id's that you don't care
about.

But remember, I'm not a guru, just someone who has been hit by the
unequal distribution problem.

John
=:->


Re: vacuum analyze slows sql query

From
Pierre-Frédéric Caillaud
Date:
>> Lets say for a second that you manage to trick it into using index scan,
>> and then you actually call the function with one of the values that
>> returns 1,000s of rows. Probably it will take 10-100 times longer than
>> if it used a seq scan.

> I don't know if it matters (I suspect that it does) but I am using
> LIMIT 1 in the sub-query/stored function.  All I need is one single
> row meeting any of the criteria laid out in the stored procedure to
> establish an offer_id is "pending".

    So, in your case, if you LIMIT, the index scan will always be fast and the
seq scan catastrophic, because you don't need to retrieve all the
rows, just one. (IMHO the planner gets these LIMIT clauses wrong because
it expects the data to be randomly distributed in the first pages while in
real life it's not).

    You could use EXISTS to test the existence of a subquery (after all, that's
its purpose), or you could:

    When SELECT ... FROM table WHERE stuff=value LIMIT 1
    obstinately uses a seq scan, sprinkle in a little ORDER BY:

    SELECT ... FROM table WHERE stuff=value ORDER BY stuff LIMIT 1

    The ORDER BY will make the planner think "I could use the index to
order"...

Re: vacuum analyze slows sql query

From
John Meinel
Date:
patrick ~ wrote:
> Hi John,
>
> Thanks for your reply and analysis.
>
>
> --- John Meinel <john@johnmeinel.com> wrote:
>
>
>>patrick ~ wrote:
>>[...]
>>
>>>pkk=# explain analyze execute pkk_01(    241 );
>>>      QUERY PLAN
>>>
>>

One other thing that I just thought of. I think it is actually possible
to add an index on a function of a column. So if you have the
"is_really_pending" function, you might be able to do:

CREATE INDEX pkk_is_really_pending ON pkk_purchase
    (is_really_pending(purchase_id));

But you need a better guru than I to make sure of that.

This *might* do what you need.

John
=:->



Re: vacuum analyze slows sql query

From
patrick ~
Date:
--- John Meinel <john@johnmeinel.com> wrote:

> If you are trying to establish existence, we also had a whole thread on
> this. Basically what we found was that adding an ORDER BY clause, helped
> tremendously in getting the planner to switch to an Index scan. You
> might try something like:
>
> SELECT column FROM mytable WHERE column='myval' ORDER BY column LIMIT 1;
>
> There seems to be a big difference between the above statement and:
>
> SELECT column FROM mytable WHERE column='myval' LIMIT 1;


The ORDER BY "trick" worked beautifully!  I just hope it'll
continue to work consistently in production code.



> I also wonder about some parts of your query. I don't know your business
> logic but you are tacking a lot of the query into the WHERE, and I
> wonder if postgres just thinks it's going to need to analyze all the
> data before it gets a match.


I have a table of offers (pkk_offer) and a table keeping track
of all purchases against each offer (pkk_purchase) and a third
table keeping track of billing for each purchase (pkk_billing).

That's the basic setup of my db.  In actuality there are more
tables and views involved keeping track of usage, etc.

The one UI page that lists all offers in the system needs to
indicate to the user (operator) which offers are "pending".
The term "pending" is used to mean that the particular offer
has either an active purchase against it or has a purchase
which hasn't yet been entered into the billing system
(doesn't yet show up in pkk_billing).

An active purchase is indicated by pkk_purchase.expire_time
being in the future or IS NULL, where IS NULL indicates a
subscription-type purchase (a recurring purchase).  Offers are
created to be either one-time purchasable or subscription type.

The pkk_purchase.pending (boolean) column indicates whether
or not the purchase has been entered into the billing system.
It is a rare case where this flag remains true for a long
period of time (which would indicate something wrong with
the billing sub-system).

There is a foreign key pkk_purchase.offer_id referencing
pkk_offer.offer_id.  And likewise, pkk_billing.client_id
references pkk_purchase.client_id and pkk_billing.purchase_id
references pkk_purchase.purchase_id.
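A rough DDL sketch of the schema as described (hypothetical
reconstruction: column types are guesses, and only the columns
mentioned in this thread are included):

```sql
CREATE TABLE pkk_offer (
    offer_id     integer PRIMARY KEY
);

CREATE TABLE pkk_purchase (
    purchase_id  integer PRIMARY KEY,
    client_id    integer,
    offer_id     integer REFERENCES pkk_offer (offer_id),
    pending      boolean,      -- not yet entered into billing
    expire_time  timestamp,    -- NULL means a subscription (recurring)
    cancel_date  timestamp
);
CREATE INDEX pur_offer_id_idx ON pkk_purchase (offer_id);

CREATE TABLE pkk_billing (
    client_id    integer,      -- matches pkk_purchase.client_id
    purchase_id  integer REFERENCES pkk_purchase (purchase_id)
);
```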


> So is your function just everything within the CASE statement?

Yes.  I posted a "stripped down" version of the database
on the pgsql-sql@ list earlier this month if you are interested
in looking at it:

http://marc.theaimsgroup.com/?l=postgresql-sql&m=109945118928530&w=2

(Just a side note: MARC doesn't seem to carry -performance
or -hackers.  I have requested that they carry these two lists; I
think if more people request this it might happen.  I like their
archiving system.)


> You might try rewriting it as a loop using a cursor, as I believe using
> a cursor again lends itself to index scans (as it is even more likely
> that you will not get all the data.)

I may try this as well as trying a suggestion by Pierre-Frédéric
Caillaud to use EXISTS, though my initial attempt to use it didn't
seem to be any faster than my original stored function.

So far the ORDER BY "trick" seems to be the best solution.

I appreciate everyone's help and suggestions on this topic!

Best wishes,
--patrick






Re: vacuum analyze slows sql query

From
Gaetano Mendola
Date:
patrick ~ wrote:
> --- John Meinel <john@johnmeinel.com> wrote:
>
>
>>If you are trying to establish existence, we also had a whole thread on
>>this. Basically what we found was that adding an ORDER BY clause, helped
>>tremendously in getting the planner to switch to an Index scan. You
>>might try something like:
>>
>>SELECT column FROM mytable WHERE column='myval' ORDER BY column LIMIT 1;
>>
>>There seems to be a big difference between the above statement and:
>>
>>SELECT column FROM mytable WHERE column='myval' LIMIT 1;
>
>
>
> The ORDER BY "trick" worked beautifully!  I just hope it'll
> continue to work consistently in production code.

It certainly won't break the goal of checking existence.



Regards
Gaetano Mendola