Re: Improve performance of query - Mailing list pgsql-performance

From Richard Huxton
Subject Re: Improve performance of query
Date
Msg-id 41C1C2B1.90000@archonet.com
Whole thread Raw
In response to Improve performance of query  (Richard Rowell <richard@bowmansystems.com>)
List pgsql-performance
Richard Rowell wrote:
> I'm trying to port our application from MS-SQL to Postgres.  We have
> implemented all of our rather complicated application security in the
> database.  The query that follows takes a half of a second or less on
> MS-SQL server and around 5 seconds on Postgres.  My concern is that this
> data set is rather "small" by our applications standards.  It is not
> unusual for the da_answer table to have 2-4 million records.  I'm
> worried that if this very small data set is taking 5 seconds, then a
> "regular sized" data set will take far too long.
>
> I originally thought the NOT EXISTS on the
> "da_data_restrict_except_open" table was killing performance, but the
> query took the exact same amount of time after I deleted all rows from
> this table.  Note that the hard-coded 999999999.0, and 4000 parameters,
> as well as the parameter to svp_getparentproviders are the three
> variables that change from one run of this query to the next.
>
> I'm using Postgres 7.4.5 as packaged in Debian.  shared_buffers is set
> to 57344 and sort_mem=4096.

That shared_buffers value sounds too large for 1GB RAM - rewind to 10000
say. Also make sure you've read the "performance tuning" article at:
   http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

> I have included an EXPLAIN ANALYZE, relevant table counts, and relevant
> indexing information.  If anyone has any suggestions on how to improve
> performance....  TIA!

I think it's the function call(s).

> SELECT tab.answer_id, client_id, question_id, recordset_id,
> date_effective, virt_field_name
> FROM
> (
>     SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
> date_effective
>     FROM da_answer a
>     WHERE a.date_effective <= 9999999999.0
>     AND a.inactive != 1
>     AND
>     (
>             5000 = 4000
>             OR
>             (EXISTS (SELECT * FROM svp_getparentproviderids(1) WHERE
> svp_getparentproviderids = a.provider_id))
>     )
...
>SubPlan
>  ->  Function Scan on svp_getparentproviderids  (cost=0.00..15.00 rows=5 width=4) (actual time=0.203..0.203 rows=0
loops=21089)
>        Filter: (svp_getparentproviderids = $1)

Here it's running 21,089 loops around your function. Each one isn't
costing much, but it's the total that's killing you I think. It might be
possible to mark the function STABLE or such, depending on what it does
- see http://www.postgresql.org/docs/7.4/static/sql-createfunction.html

--
   Richard Huxton
   Archonet Ltd

pgsql-performance by date:

Previous
From: Jon Anderson
Date:
Subject: Seqscan rather than Index
Next
From: Tom Lane
Date:
Subject: Re: Improve performance of query