Re: vacuum analyze slows sql query - Mailing list pgsql-performance

From patrick ~
Subject Re: vacuum analyze slows sql query
Date
Msg-id 20041103222257.73682.qmail@web52104.mail.yahoo.com
Whole thread Raw
In response to Re: vacuum analyze slows sql query  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Here is a fresh run with 'explain analyze' run before and after the
VACUUM statement:

-- begin
% dropdb pkk
DROP DATABASE
% createdb pkk
CREATE DATABASE
% psql pkk < pkk_db.sql
ERROR:  function pkk_offer_has_pending_purch(integer) does not exist
ERROR:  function pkk_offer_has_pending_purch2(integer) does not exist
ERROR:  table "pkk_billing" does not exist
ERROR:  table "pkk_purchase" does not exist
ERROR:  table "pkk_offer" does not exist
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "pkk_offer_pkey"
for table "pkk_offer"
CREATE TABLE
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"pkk_purchase_pkey" for table "pkk_purchase"
CREATE TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE TABLE
CREATE INDEX
CREATE FUNCTION
CREATE FUNCTION
% zcat pkk.20041028_00.sql.gz | psql pkk
SET
SET
SET
SET
% psql pkk
pkk=# select offer_id, pkk_offer_has_pending_purch( offer_id ) from pkk_offer ;
<ommitting output />
(618 rows)

Time: 877.348 ms
pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id )
from pkk_offer ;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Seq Scan on pkk_offer  (cost=0.00..22.50 rows=1000 width=4) (actual
time=1.291..845.485 rows=618 loops=1)
 Total runtime: 849.475 ms
(2 rows)

Time: 866.613 ms
pkk=# vacuum analyze ;
VACUUM
Time: 99344.399 ms
pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id )
from pkk_offer ;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Seq Scan on pkk_offer  (cost=0.00..13.72 rows=618 width=4) (actual
time=3636.401..1047412.851 rows=618 loops=1)
 Total runtime: 1047415.525 ms
(2 rows)

Time: 1047489.477 ms
-- end



Tom,

The reason of the extra "case" part in the function is to ensure non-null
fields on the result.  I tried your version as well and i get similar
performance results:

-- begin
pkk=# create function toms_pending_purch( integer ) returns bool as 'select
p0.purchase_id is not null from  pkk_purchase p0 where  p0.offer_id = $1 and (
p0.pending = true or ( ( p0.expire_time > now() or p0.expire_time isnull ) and
p0.cancel_date isnull ) ) limit 1 ' language 'sql' ;
CREATE FUNCTION
Time: 2.496 ms
pkk=# select offer_id, toms_pending_purch( offer_id ) from pkk_offer ;
(618 rows)

Time: 1052339.506 ms
-- end


Right now, I'm studying the document section on PREPARE and will
attempt to play around with it.


I was asked (in a prior post) whether running the statement a second
time after the VACUUM improves in performance.  It does not.  After
the VACUUM the statement remains slow.


Thanks for your help,
--patrick



--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

> patrick ~ <sidsrr@yahoo.com> writes:
> > that if I 'createdb' and populate it with the "sanatized" data the
> > query in question is quite fast; 618 rows returned in 864.522 ms.
> > This was puzzling.  Next I noticed that after a VACUUM the very same
> > query would slow down to a crawl; 618 rows returned in 1080688.921 ms).
>
> The outer query is too simple to have more than one possible plan,
> so the issue is certainly a change in query plans inside the function.
> You need to be investigating what's happening inside that function.
> 7.1 doesn't have adequate tools for this, but in 7.4 you can use
> PREPARE and EXPLAIN ANALYZE EXECUTE to examine the query plans used
> for parameterized statements, which is what you've got here.
>
> My bet is that with ANALYZE stats present, the planner guesses wrong
> about which index to use; but without looking at EXPLAIN ANALYZE output
> there's no way to be sure.
>
> BTW, why the bizarrely complicated substitute for a NOT NULL test?
> ISTM you only need
>
> create function
> pkk_offer_has_pending_purch( integer )
>     returns bool
> as  '
>                 select  p0.purchase_id is not null
>                   from  pkk_purchase p0
>                  where  p0.offer_id = $1
>                         and ( p0.pending = true
>                             or ( ( p0.expire_time > now()
>                                  or p0.expire_time isnull )
>                                and p0.cancel_date isnull ) )
>                   limit 1
> ' language 'sql' ;
>
> (Actually, seeing that pkk_purchase.purchase_id is defined as NOT NULL,
> I wonder why the function exists at all ... but I suppose you've
> "stripped" the function to the point of being nonsense.)
>
>             regards, tom lane



__________________________________
Do you Yahoo!?
Check out the new Yahoo! Front Page.
www.yahoo.com



pgsql-performance by date:

Previous
From: Martin Foster
Date:
Subject: Re: Restricting Postgres
Next
From: John A Meinel
Date:
Subject: Re: Restricting Postgres