Improve performance of query

From: Richard Rowell
Subject: Improve performance of query
Date: ,
Msg-id: 1103213467.9908.65.camel@richard
(view: Whole thread, Raw)
Responses: Re: Improve performance of query  (Stephen Frost)
Re: Improve performance of query  (Richard Huxton)
Re: Improve performance of query  (Tom Lane)
Re: Improve performance of query  (John A Meinel)
Re: Improve performance of query  (John A Meinel)
List: pgsql-performance

Tree view

Improve performance of query  (Richard Rowell, )
 Re: Improve performance of query  (Stephen Frost, )
 Re: Improve performance of query  (Richard Huxton, )
 Re: Improve performance of query  (Tom Lane, )
 Re: Improve performance of query  (John A Meinel, )
 Re: Improve performance of query  (John A Meinel, )

I'm trying to port our application from MS-SQL to Postgres.  We have
implemented all of our rather complicated application security in the
database.  The query that follows takes a half of a second or less on
MS-SQL server and around 5 seconds on Postgres.  My concern is that this
data set is rather "small" by our applications standards.  It is not
unusual for the da_answer table to have 2-4 million records.  I'm
worried that if this very small data set is taking 5 seconds, then a
"regular sized" data set will take far too long.

I originally thought the NOT EXISTS on the
"da_data_restrict_except_open" table was killing performance, but the
query took the exact same amount of time after I deleted all rows from
this table.  Note that the hard-coded 999999999.0, and 4000 parameters,
as well as the parameter to svp_getparentproviders are the three
variables that change from one run of this query to the next.

I'm using Postgres 7.4.5 as packaged in Debian.  shared_buffers is set
to 57344 and sort_mem=4096.

The machine has an AMD 1.8+ and ` gig of RAM.  Here are some relevant
performance statistics:
richard:/usr/share/cups/model# cat /proc/sys/kernel/shmmax
536870912
richard:/usr/share/cups/model# cat /proc/sys/kernel/shmall
536870912
richard:/home/richard# hdparm -tT /dev/hda
 Timing cached reads:   1112 MB in  2.00 seconds = 556.00 MB/sec
 Timing buffered disk reads:  176 MB in  3.02 seconds =  58.28 MB/sec

I have included an EXPLAIN ANALYZE, relevant table counts, and relevant
indexing information.  If anyone has any suggestions on how to improve
performance....  TIA!

SELECT tab.answer_id, client_id, question_id, recordset_id,
date_effective, virt_field_name
FROM
(
    SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
date_effective
    FROM da_answer a
    WHERE a.date_effective <= 9999999999.0
    AND a.inactive != 1
    AND
    (
            5000 = 4000
            OR
            (EXISTS (SELECT * FROM svp_getparentproviderids(1) WHERE
svp_getparentproviderids = a.provider_id))
    )
    UNION
    SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
date_effective
    FROM da_answer a,
    (
        SELECT main_id
        FROM da_data_restrict
        WHERE type_id = 2
        AND (provider_id IN (SELECT * FROM svp_getparentproviderids(1)))

        UNION

            SELECT sa.uid AS main_id
            FROM da_answer sa
                JOIN da_data_restrict_except_closed dr ON dr.main_id =
sa.uid AND dr.type_id = 2 AND dr.except_provider_id = 1
            WHERE (restricted = 1)
                AND (restricted_closed_except = 1)
                AND sa.covered_by_roi = 1
            UNION
            SELECT sa.uid AS main_id
            FROM da_answer sa
            WHERE (restricted = 0)
                AND (restricted_open_except = 1)
                AND (NOT EXISTS (SELECT dr.main_id FROM
da_data_restrict_except_open dr WHERE (dr.main_id = sa.uid) AND
(dr.type_id = 2) AND (dr.except_provider_id in (select * from
svp_getparentproviderids(1)))))
                AND sa.covered_by_roi = 1
            UNION
            SELECT sa.uid AS main_id FROM da_answer sa WHERE (restricted
= 0) AND (restricted_open_except = 0)
                AND sa.covered_by_roi = 1
    ) sec
    WHERE a.covered_by_roi = 1
        AND a.date_effective <= 9999999999.0
        AND a.inactive != 1
        AND a.uid = sec.main_id
        AND 5000 > 4000
) tab, da_question q
WHERE tab.question_id = q.uid AND (min_access_level <= 4000 OR
min_access_level IS NULL)

Table counts from relevant tables
da_question                      1095
da_answer                       21117
da_question                      1095
da_data_restrict_except_closed   3087
da_data_restrict_except_open    13391
svp_getparentproviderids(1)         1

Relevant Index
create index in_da_data_restrict_provider_id on
da_data_restrict(provider_id);
create index in_da_data_restrict_main_id on da_data_restrict(main_id);
create index in_da_data_restrict_type_id on da_data_restrict(type_id);
create index in_da_data_restrict_client_id on
da_data_restrict(client_id);
create index in_da_dr_type_provider on
da_data_restrict(type_id,provider_id);

create index in_da_data_rec_provider_id ON
da_data_restrict_except_closed(provider_id);
create index in_da_data_rec_type_id ON
da_data_restrict_except_closed(type_id);
create index in_da_data_rec_main_id ON
da_data_restrict_except_closed(main_id);
create index in_da_data_rec_except_provider_id ON
da_data_restrict_except_closed(except_provider_id);

create index in_da_data_reo_provider_id ON
da_data_restrict_except_open(provider_id);
create index in_da_data_reo_type_id ON
da_data_restrict_except_open(type_id);
create index in_da_data_reo_main_id ON
da_data_restrict_except_open(main_id);
create index in_da_data_reo_except_provider_id ON
da_data_restrict_except_open(except_provider_id);

create index in_da_answer_client_id ON da_answer(client_id);
create index in_da_answer_provider_id ON da_answer(provider_id);
create index in_da_answer_question_id ON da_answer(question_id);
create index in_da_answer_recordset_id ON da_answer(recordset_id);
create index in_da_answer_restricted ON da_answer(restricted);
create index in_da_answer_restricted_open_except ON
da_answer(restricted_open_except);
create index in_da_answer_restricted_closed_except ON
da_answer(restricted_closed_except);
create index in_da_answer_date_effective ON da_answer(date_effective);
create index in_da_answer_inactive ON da_answer(inactive);
create index in_da_answer_covered_by_roi ON da_answer(covered_by_roi);

create index in_da_ed_inactive_roi ON da_answer(date_effective,inactive,
covered_by_roi);

create index in_da_question_mal ON da_question(min_access_level);


Attachment

pgsql-performance by date:

From: Adi Alurkar
Date:
Subject: UNION ALL vs INHERITANCE
From: Tom Lane
Date:
Subject: Re: UNION ALL vs INHERITANCE