Thread: performance advice needed: join vs explicit subselect

performance advice needed: join vs explicit subselect

From

Karsten Hilbert

Date:

27 January 2009, 14:12:21

Hello all,

maybe some general advice can be had on this:

table test_results
    modified_by integer foreign key staff(pk),
    intended_reviewer integer foreign key staff(pk),
    actual_reviewer integer foreign key staff(pk)

(this table will contain millions of rows)

table staff
    pk integer
    name text

(this table will contain at most 50 rows)

Now I want to set up a view which aggregates test results
with staff names for all three foreign keys. This would mean
I would either have to

- join test_results to staff three times, once for each
  of the foreign keys, this is going to be messy with
  tracking table aliases, duplicate column names etc

- write three explicit sub-selects for the columns I want
  to denormalize into the view definition

Is there general advice as to which of the alternatives is
worse under most if not all circumstances ?

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: performance advice needed: join vs explicit subselect

From

justin

Date:

27 January 2009, 14:25:20

Karsten Hilbert wrote:
> Hello all,
>
> maybe some general advice can be had on this:
>
> table test_results
>     modified_by integer foreign key staff(pk),
>     intended_reviewer integer foreign key staff(pk),
>     actual_reviewer integer foreign key staff(pk)
>
> (this table will contain millions of rows)
>
> table staff
>     pk integer
>     name text
>
> (this table will contain at most 50 rows)
>
> Now I want to set up a view which aggregates test results
> with staff names for all three foreign keys. This would mean
> I would either have to
>
> - join test_results to staff three times, once for each
>   of the foreign keys, this is going to be messy with
>   tracking table aliases, duplicate column names etc
>
> - write three explicit sub-selects for the columns I want
>   to denormalize into the view definition
>
>
Select testresults.*, Modifer.Name, Intended.name,  Actual.name   from
testresults
    left join (Select pk, name  from staff) Modifer
         on Modifer.pk  = testresults.modified_by
    left join (Select pk, name  from staff) Intended
        on Reviewer.pk  = testresults.intended_reviewer
    left join (Select pk, name  from staff) Actual
        on pk  = testresults.actual_reviewer


This is what i think you are after.  You can do this via nested queries
also for each name

Re: performance advice needed: join vs explicit subselect

From

justin

Date:

27 January 2009, 14:27:52

typo sorry

justin wrote:
> Karsten Hilbert wrote:
>> Hello all,
>>
>> maybe some general advice can be had on this:
>>
>> table test_results
>>     modified_by integer foreign key staff(pk),
>>     intended_reviewer integer foreign key staff(pk),
>>     actual_reviewer integer foreign key staff(pk)
>>
>> (this table will contain millions of rows)
>>
>> table staff
>>     pk integer
>>     name text
>>
>> (this table will contain at most 50 rows)
>>
>> Now I want to set up a view which aggregates test results
>> with staff names for all three foreign keys. This would mean
>> I would either have to
>>
>> - join test_results to staff three times, once for each
>>   of the foreign keys, this is going to be messy with
>>   tracking table aliases, duplicate column names etc
>>
>> - write three explicit sub-selects for the columns I want
>>   to denormalize into the view definition
>>
>>
> Select testresults.*, Modifer.Name, Intended.name,  Actual.name   from
> testresults
>    left join (Select pk, name  from staff) Modifer
>         on Modifer.pk  = testresults.modified_by
>    left join (Select pk, name  from staff) Intended
>        on Inteded.pk  = testresults.intended_reviewer
>    left join (Select pk, name  from staff) Actual
>        on Actual.pk  = testresults.actual_reviewer
>
>
> This is what i think you are after.  You can do this via nested
> queries also for each name
>

Re: performance advice needed: join vs explicit subselect

From

Sam Mason

Date:

27 January 2009, 14:48:18

On Tue, Jan 27, 2009 at 07:12:05PM +0100, Karsten Hilbert wrote:
> Hello all,
>
> maybe some general advice can be had on this:
>
> table test_results
>     modified_by integer foreign key staff(pk),
>     intended_reviewer integer foreign key staff(pk),
>     actual_reviewer integer foreign key staff(pk)
>
> (this table will contain millions of rows)
>
> table staff
>     pk integer
>     name text
>
> (this table will contain at most 50 rows)
>
> Now I want to set up a view which aggregates test results
> with staff names for all three foreign keys. This would mean
> I would either have to
>
> - join test_results to staff three times, once for each
>   of the foreign keys, this is going to be messy with
>   tracking table aliases, duplicate column names etc

if you've only got three columns it shouldn't be too bad should it?

> - write three explicit sub-selects for the columns I want
>   to denormalize into the view definition

This would look a bit prettier, but PG tends not to optimize at all.  It
always executes it as a subplan and hence will only work nicely when
you've got a very small subset of the test_results coming back.  PG will
*sometimes* remove subexpressions, but doesn't seem very predictable
about it:

  SELECT id
  FROM (
    SELECT a.id, (SELECT b.name FROM bar b WHERE a.tid = b.tid)
    FROM foo a) x;

PG seems to recognize that it can remove the subselect in the above
which is nice, but in other situations it doesn't seem to.

--
  Sam  http://samason.me.uk/

Re: performance advice needed: join vs explicit subselect

From

Alban Hertroys

Date:

27 January 2009, 15:04:15

On Jan 27, 2009, at 7:12 PM, Karsten Hilbert wrote:

> Hello all,
>
> maybe some general advice can be had on this:
>
> table test_results
>     modified_by integer foreign key staff(pk),
>     intended_reviewer integer foreign key staff(pk),
>     actual_reviewer integer foreign key staff(pk)
>
> (this table will contain millions of rows)
>
> table staff
>     pk integer
>     name text
>
> (this table will contain at most 50 rows)
>
> Now I want to set up a view which aggregates test results
> with staff names for all three foreign keys. This would mean
> I would either have to
>
> - join test_results to staff three times, once for each
>  of the foreign keys, this is going to be messy with
>  tracking table aliases, duplicate column names etc
>
> - write three explicit sub-selects for the columns I want
>  to denormalize into the view definition
>
> Is there general advice as to which of the alternatives is
> worse under most if not all circumstances ?


I did something similar once using expression logic for my aggregates:

SELECT
    SUM(CASE WHEN modified_by = pk THEN 1 ELSE 0 END) AS modified_by_count,
    SUM(CASE WHEN intended_reviewer = pk THEN 1 ELSE 0 END) AS
intended_reviewer_count,
    SUM(CASE WHEN actual_reviewer = pk THEN 1 ELSE 0 END) AS
actual_reviewer_count
   FROM test_results, staff
  WHERE pk IN (modified_by, intended_reviewer, actual_reviewer)

Mind, this will very probably do a sequential scan over the product of
both tables, but at least now the staff table is in that product only
once.

In actuality I didn't use CASE statements but cast the boolean results
of the expressions directly to integer, something like
SUM((modified_by = pk)::int), but that cast may no longer work since
8.3.

I no longer have access to the project that I used this on, so I can't
verify unfortunately.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,497f5aa8747035160810079!

Re: performance advice needed: join vs explicit subselect

From

Karsten Hilbert

Date:

27 January 2009, 17:12:36

On Tue, Jan 27, 2009 at 06:48:11PM +0000, Sam Mason wrote:

> > table test_results
> >     modified_by integer foreign key staff(pk),
> >     intended_reviewer integer foreign key staff(pk),
> >     actual_reviewer integer foreign key staff(pk)
> >
> > (this table will contain millions of rows)
> >
> > table staff
> >     pk integer
> >     name text
> >
> > (this table will contain at most 50 rows)
> >
> > Now I want to set up a view which aggregates test results
> > with staff names for all three foreign keys. This would mean
> > I would either have to
> >
> > - join test_results to staff three times, once for each
> >   of the foreign keys, this is going to be messy with
> >   tracking table aliases, duplicate column names etc
>
> if you've only got three columns it shouldn't be too bad should it?

This is what one deserves for thinking to be able to distill
the essence of a problem :-)

The view in question is in fact a lot more complicated. This
is the best I've been able to come up with so far (and it is
still slow - slow as in 3-4 seconds for 20 records out of
(currently only) 50 !):

create view clin.v_test_results as

select
    cenc.fk_patient
        as pk_patient,
    -- test_result
    tr.pk as pk_test_result,
    tr.clin_when,
    -- unified
    vttu.unified_code,
    vttu.unified_name,
    case when coalesce(trim(both from tr.val_alpha), '') = ''
        then tr.val_num::text
        else case when tr.val_num is null
            then tr.val_alpha
            else tr.val_num::text || ' (' || tr.val_alpha || ')'
        end
    end as unified_val,
    coalesce(tr.val_target_min, tr.val_normal_min)
        as unified_target_min,
    coalesce(tr.val_target_max, tr.val_normal_max)
        as unified_target_max,
    coalesce(tr.val_target_range, tr.val_normal_range)
        as unified_target_range,
    tr.soap_cat,
    tr.narrative
        as comment,
    -- test result data
    tr.val_num,
    tr.val_alpha,
    tr.val_unit,
    vttu.conversion_unit,
    tr.val_normal_min,
    tr.val_normal_max,
    tr.val_normal_range,
    tr.val_target_min,
    tr.val_target_max,
    tr.val_target_range,
    tr.abnormality_indicator,
    tr.norm_ref_group,
    tr.note_test_org,
    tr.material,
    tr.material_detail,
    -- test type data
    vttu.code_tt,
    vttu.name_tt,
    vttu.coding_system_tt,
    vttu.comment_tt,
    vttu.code_unified,
    vttu.name_unified,
    vttu.coding_system_unified,
    vttu.comment_unified,

    -- episode/issue data
    epi.description
        as episode,

    -- status of last review
    coalesce(rtr.fk_reviewed_row, 0)::bool
        as reviewed,
    rtr.is_technically_abnormal
        as is_technically_abnormal,
    rtr.clinically_relevant
        as is_clinically_relevant,
    rtr.comment
        as review_comment,

    (select
        short_alias || ' (' ||
        coalesce(title || ' ', '') ||
        coalesce(firstnames || ' ', '') ||
        coalesce(lastnames, '') ||
        ')'
     from dem.v_staff
     where pk_staff = rtr.fk_reviewer
    ) as last_reviewer,

    rtr.modified_when
        as last_reviewed,

    coalesce (
        (rtr.fk_reviewer = (select pk from dem.staff where db_user = current_user)),
        False
    )
        as review_by_you,

    coalesce (
        (tr.fk_intended_reviewer = rtr.fk_reviewer),
        False
    )
        as review_by_responsible_reviewer,

    -- potential review status
    (select
        short_alias || ' (' ||
        coalesce(title || ' ', '') ||
        coalesce(firstnames || ' ', '') ||
        coalesce(lastnames, '') ||
        ')'
     from dem.v_staff
     where pk_staff = tr.fk_intended_reviewer
    ) as responsible_reviewer,

    coalesce (
        (tr.fk_intended_reviewer = (select pk from dem.staff where db_user = current_user)),
        False
    )
        as you_are_responsible,

    case when ((select 1 from dem.staff where db_user = tr.modified_by) is null)
        then '<' || tr.modified_by || '>'
        else (select short_alias from dem.staff where db_user = tr.modified_by)
    end
        as modified_by,

    tr.modified_when,
    tr.row_version as row_version,

    -- management keys
    -- clin.clin_root_item
    tr.pk_item,
    tr.fk_encounter as pk_encounter,
    tr.fk_episode as pk_episode,
    -- test_result
    tr.fk_type as pk_test_type,
    tr.fk_intended_reviewer as pk_intended_reviewer,
    tr.xmin as xmin_test_result,
    -- v_unified_test_types
    vttu.pk_test_org,
    vttu.pk_test_type_unified,
    -- v_pat_episodes
    epi.fk_health_issue
        as pk_health_issue,
    -- reviewed_test_results
    rtr.fk_reviewer as pk_last_reviewer
from
    clin.test_result tr
        left join clin.encounter cenc on (tr.fk_encounter = cenc.pk)
            left join clin.episode epi on (tr.fk_episode = epi.pk)
                left join clin.reviewed_test_results rtr on (tr.pk = rtr.fk_reviewed_row)
    ,
    clin.v_unified_test_types vttu
where
    tr.fk_type = vttu.pk_test_type
;

> > - write three explicit sub-selects for the columns I want
> >   to denormalize into the view definition
>
> This would look a bit prettier, but PG tends not to optimize at all.  It
> always executes it as a subplan and hence will only work nicely when
> you've got a very small subset of the test_results coming back.
Potentially in the low hundreds.

Thanks !
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: performance advice needed: join vs explicit subselect

From

Tom Lane

Date:

27 January 2009, 17:42:11

Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:
> The view in question is in fact a lot more complicated. This
> is the best I've been able to come up with so far (and it is
> still slow - slow as in 3-4 seconds for 20 records out of
> (currently only) 50 !):

What does EXPLAIN ANALYZE say about it?  Also, what is the use-case
you are concerned about --- selecting the whole view contents, or
selecting WHERE something-or-other?

            regards, tom lane

Re: performance advice needed: join vs explicit subselect

From

Karsten Hilbert

Date:

27 January 2009, 17:55:48

On Tue, Jan 27, 2009 at 04:42:02PM -0500, Tom Lane wrote:

> Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:
> > The view in question is in fact a lot more complicated. This
> > is the best I've been able to come up with so far (and it is
> > still slow - slow as in 3-4 seconds for 20 records out of
> > (currently only) 50 !):
>
> What does EXPLAIN ANALYZE say about it?  Also, what is the use-case
> you are concerned about --- selecting the whole view contents, or
> selecting WHERE something-or-other?

The query that's run by my application (wiki.gnumed.de) is

select *, xmin_test_result from clin.v_test_results
    where pk_patient = 138                    <--- this is a variable
    order by clin_when desc, pk_episode, unified_name
;

the explain analyze of which is (I've actually gotten it to
work better in the meantime as you can see):

SET
BEGIN
                                                                                   QUERY PLAN
                                                         

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=8512.91..8512.92 rows=1 width=721) (actual time=2039.771..2039.787 rows=14 loops=1)
   Sort Key: tr.clin_when, tr.fk_episode, (COALESCE(ttu.name, tt1.name))
   Sort Method:  quicksort  Memory: 22kB
   InitPlan
     ->  Seq Scan on staff  (cost=0.00..1.07 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)
           Filter: (db_user = "current_user"())
     ->  Seq Scan on staff  (cost=0.00..1.07 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)
           Filter: (db_user = "current_user"())
   ->  Nested Loop Left Join  (cost=3.29..8510.75 rows=1 width=721) (actual time=145.824..2039.427 rows=14 loops=1)
         ->  Nested Loop Left Join  (cost=3.29..27.82 rows=1 width=671) (actual time=4.230..5.298 rows=14 loops=1)
               ->  Nested Loop Left Join  (cost=3.29..23.66 rows=1 width=646) (actual time=4.209..5.061 rows=14
loops=1)
                     Join Filter: (tt1.pk = ltt2ut.fk_test_type)
                     ->  Nested Loop  (cost=2.20..21.42 rows=1 width=565) (actual time=4.089..4.444 rows=14 loops=1)
                           ->  Merge Join  (cost=2.20..20.79 rows=1 width=469) (actual time=4.069..4.201 rows=14
loops=1)
                                 Merge Cond: (cenc.pk = tr.fk_encounter)
                                 ->  Index Scan using encounter_pkey on encounter cenc  (cost=0.00..294.43 rows=16
width=8)(actual time=1.470..3.691 rows=29 loops=1) 
                                       Filter: (fk_patient = 138)
                                 ->  Sort  (cost=2.20..2.29 rows=34 width=465) (actual time=0.279..0.330 rows=34
loops=1)
                                       Sort Key: tr.fk_encounter
                                       Sort Method:  quicksort  Memory: 25kB
                                       ->  Seq Scan on test_result tr  (cost=0.00..1.34 rows=34 width=465) (actual
time=0.027..0.141rows=34 loops=1) 
                           ->  Index Scan using test_type_pkey on test_type tt1  (cost=0.00..0.62 rows=1 width=96)
(actualtime=0.007..0.009 rows=1 loops=14) 
                                 Index Cond: (tt1.pk = tr.fk_type)
                     ->  Hash Join  (cost=1.09..2.19 rows=4 width=89) (actual time=0.012..0.031 rows=4 loops=14)
                           Hash Cond: (ttu.pk = ltt2ut.fk_test_type_unified)
                           ->  Seq Scan on test_type_unified ttu  (cost=0.00..1.04 rows=4 width=85) (actual
time=0.003..0.008rows=4 loops=14) 
                           ->  Hash  (cost=1.04..1.04 rows=4 width=8) (actual time=0.023..0.023 rows=4 loops=1)
                                 ->  Seq Scan on lnk_ttype2unified_type ltt2ut  (cost=0.00..1.04 rows=4 width=8)
(actualtime=0.006..0.013 rows=4 loops=1) 
               ->  Index Scan using episode_pkey on episode epi  (cost=0.00..4.15 rows=1 width=29) (actual
time=0.009..0.011rows=1 loops=14) 
                     Index Cond: (tr.fk_episode = epi.pk)
         ->  Index Scan using unique_review_per_row on reviewed_test_results rtr  (cost=0.00..0.62 rows=1 width=50)
(actualtime=0.005..0.008 rows=1 loops=14) 
               Index Cond: (tr.pk = rtr.fk_reviewed_row)
         SubPlan
           ->  Seq Scan on staff  (cost=0.00..1.06 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=14)
                 Filter: (db_user = $20)
           ->  Seq Scan on staff  (cost=0.00..1.06 rows=1 width=0) (actual time=0.006..0.007 rows=1 loops=14)
                 Filter: (db_user = $20)
           ->  Subquery Scan v_staff  (cost=15.73..4240.07 rows=1 width=128) (actual time=29.739..74.520 rows=1
loops=14)
                 ->  Nested Loop  (cost=15.73..4240.04 rows=1 width=279) (actual time=29.731..74.510 rows=1 loops=14)
                       Join Filter: (s.fk_identity = i.pk)
                       ->  Nested Loop  (cost=0.00..2.31 rows=1 width=143) (actual time=0.015..0.051 rows=1 loops=14)
                             Join Filter: (s.fk_role = sr.pk)
                             ->  Seq Scan on staff s  (cost=0.00..1.06 rows=1 width=131) (actual time=0.006..0.008
rows=1loops=14) 
                                   Filter: (pk = $12)
                             ->  Seq Scan on staff_role sr  (cost=0.00..1.11 rows=11 width=16) (actual
time=0.002..0.018rows=11 loops=14) 
                       ->  Hash Join  (cost=15.73..4210.50 rows=207 width=120) (actual time=0.409..73.865 rows=209
loops=14)
                             Hash Cond: (n.id_identity = i.pk)
                             ->  Seq Scan on names n  (cost=0.00..4.27 rows=210 width=29) (actual time=0.007..0.387
rows=211loops=14) 
                                   Filter: active
                             ->  Hash  (cost=13.12..13.12 rows=209 width=95) (actual time=0.883..0.883 rows=209
loops=1)
                                   ->  Seq Scan on identity i  (cost=0.00..13.12 rows=209 width=95) (actual
time=0.007..0.510rows=209 loops=1) 
                                         Filter: ((deleted IS FALSE) AND (deceased IS NULL))
                             SubPlan
                               ->  Nested Loop  (cost=0.00..9.86 rows=1 width=8) (never executed)
                                     ->  Seq Scan on marital_status ms1  (cost=0.00..1.07 rows=1 width=8) (never
executed)
                                           Filter: (pk = $9)
                                     ->  Index Scan using identity_pkey on identity i1  (cost=0.00..8.27 rows=1
width=0)(never executed) 
                                           Index Cond: (i1.pk = $10)
                               ->  Nested Loop  (cost=0.00..9.35 rows=1 width=8) (never executed)
                                     ->  Seq Scan on marital_status ms  (cost=0.00..1.07 rows=1 width=8) (never
executed)
                                           Filter: (pk = $9)
                                     ->  Index Scan using identity_pkey on identity i1  (cost=0.00..8.27 rows=1
width=0)(never executed) 
                                           Index Cond: (i1.pk = $10)
                       SubPlan
                         ->  Result  (cost=22.06..22.07 rows=1 width=0) (actual time=0.140..0.141 rows=1 loops=14)
                               InitPlan
                                 ->  Result  (cost=11.02..11.03 rows=1 width=0) (actual time=0.068..0.069 rows=1
loops=14)
                                       InitPlan
                                         ->  Index Scan using pg_authid_rolname_index on pg_authid  (cost=1.25..11.02
rows=1width=0) (actual time=0.062..0.062 rows=1 loops=14) 
                                               Index Cond: (rolname = 'gnumed_v10'::name)
                                               Filter: ((NOT rolcanlogin) AND ($13 = ANY ((subplan))))
                                               InitPlan
                                                 ->  Seq Scan on pg_authid  (cost=0.00..1.25 rows=1 width=4) (actual
time=0.011..0.015rows=1 loops=14) 
                                                       Filter: (rolcanlogin AND (rolname = $1))
                                               SubPlan
                                                 ->  Seq Scan on pg_auth_members  (cost=0.00..1.49 rows=4 width=4)
(actualtime=0.015..0.021 rows=4 loops=14) 
                                                       Filter: (roleid = $3)
                                 ->  Result  (cost=11.02..11.03 rows=1 width=0) (actual time=0.060..0.061 rows=1
loops=14)
                                       InitPlan
                                         ->  Index Scan using pg_authid_rolname_index on pg_authid  (cost=1.25..11.02
rows=1width=0) (actual time=0.056..0.056 rows=1 loops=14) 
                                               Index Cond: (rolname = 'gm-logins'::name)
                                               Filter: ((NOT rolcanlogin) AND ($16 = ANY ((subplan))))
                                               InitPlan
                                                 ->  Seq Scan on pg_authid  (cost=0.00..1.25 rows=1 width=4) (actual
time=0.010..0.013rows=1 loops=14) 
                                                       Filter: (rolcanlogin AND (rolname = $1))
                                               SubPlan
                                                 ->  Seq Scan on pg_auth_members  (cost=0.00..1.49 rows=4 width=4)
(actualtime=0.005..0.021 rows=5 loops=14) 
                                                       Filter: (roleid = $3)
           ->  Subquery Scan v_staff  (cost=15.73..4240.07 rows=1 width=128) (actual time=29.500..70.712 rows=1
loops=14)
                 ->  Nested Loop  (cost=15.73..4240.04 rows=1 width=279) (actual time=29.492..70.701 rows=1 loops=14)
                       Join Filter: (s.fk_identity = i.pk)
                       ->  Nested Loop  (cost=0.00..2.31 rows=1 width=143) (actual time=0.016..0.048 rows=1 loops=14)
                             Join Filter: (s.fk_role = sr.pk)
                             ->  Seq Scan on staff s  (cost=0.00..1.06 rows=1 width=131) (actual time=0.007..0.008
rows=1loops=14) 
                                   Filter: (pk = $0)
                             ->  Seq Scan on staff_role sr  (cost=0.00..1.11 rows=11 width=16) (actual
time=0.003..0.016rows=11 loops=14) 
                       ->  Hash Join  (cost=15.73..4210.50 rows=207 width=120) (actual time=0.562..70.059 rows=209
loops=14)
                             Hash Cond: (n.id_identity = i.pk)
                             ->  Seq Scan on names n  (cost=0.00..4.27 rows=210 width=29) (actual time=0.009..0.356
rows=211loops=14) 
                                   Filter: active
                             ->  Hash  (cost=13.12..13.12 rows=209 width=95) (actual time=0.925..0.925 rows=209
loops=1)
                                   ->  Seq Scan on identity i  (cost=0.00..13.12 rows=209 width=95) (actual
time=0.012..0.554rows=209 loops=1) 
                                         Filter: ((deleted IS FALSE) AND (deceased IS NULL))
                             SubPlan
                               ->  Nested Loop  (cost=0.00..9.86 rows=1 width=8) (never executed)
                                     ->  Seq Scan on marital_status ms1  (cost=0.00..1.07 rows=1 width=8) (never
executed)
                                           Filter: (pk = $9)
                                     ->  Index Scan using identity_pkey on identity i1  (cost=0.00..8.27 rows=1
width=0)(never executed) 
                                           Index Cond: (i1.pk = $10)
                               ->  Nested Loop  (cost=0.00..9.35 rows=1 width=8) (never executed)
                                     ->  Seq Scan on marital_status ms  (cost=0.00..1.07 rows=1 width=8) (never
executed)
                                           Filter: (pk = $9)
                                     ->  Index Scan using identity_pkey on identity i1  (cost=0.00..8.27 rows=1
width=0)(never executed) 
                                           Index Cond: (i1.pk = $10)
                       SubPlan
                         ->  Result  (cost=22.06..22.07 rows=1 width=0) (actual time=0.144..0.145 rows=1 loops=14)
                               InitPlan
                                 ->  Result  (cost=11.02..11.03 rows=1 width=0) (actual time=0.070..0.071 rows=1
loops=14)
                                       InitPlan
                                         ->  Index Scan using pg_authid_rolname_index on pg_authid  (cost=1.25..11.02
rows=1width=0) (actual time=0.064..0.064 rows=1 loops=14) 
                                               Index Cond: (rolname = 'gnumed_v10'::name)
                                               Filter: ((NOT rolcanlogin) AND ($2 = ANY ((subplan))))
                                               InitPlan
                                                 ->  Seq Scan on pg_authid  (cost=0.00..1.25 rows=1 width=4) (actual
time=0.011..0.015rows=1 loops=14) 
                                                       Filter: (rolcanlogin AND (rolname = $1))
                                               SubPlan
                                                 ->  Seq Scan on pg_auth_members  (cost=0.00..1.49 rows=4 width=4)
(actualtime=0.016..0.022 rows=4 loops=14) 
                                                       Filter: (roleid = $3)
                                 ->  Result  (cost=11.02..11.03 rows=1 width=0) (actual time=0.062..0.063 rows=1
loops=14)
                                       InitPlan
                                         ->  Index Scan using pg_authid_rolname_index on pg_authid  (cost=1.25..11.02
rows=1width=0) (actual time=0.057..0.057 rows=1 loops=14) 
                                               Index Cond: (rolname = 'gm-logins'::name)
                                               Filter: ((NOT rolcanlogin) AND ($6 = ANY ((subplan))))
                                               InitPlan
                                                 ->  Seq Scan on pg_authid  (cost=0.00..1.25 rows=1 width=4) (actual
time=0.010..0.013rows=1 loops=14) 
                                                       Filter: (rolcanlogin AND (rolname = $1))
                                               SubPlan
                                                 ->  Seq Scan on pg_auth_members  (cost=0.00..1.49 rows=4 width=4)
(actualtime=0.006..0.022 rows=5 loops=14) 
                                                       Filter: (roleid = $3)
 Total runtime: 2041.314 ms
(140 Zeilen)

 count
-------
    14
(1 Zeile)

ROLLBACK

(the count is simply there to verify the view selects the
same number of rows as I am expecting from the base table,
the rollback is there because I have been experimenting
with additional indices)

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: performance advice needed: join vs explicit subselect

From

Karsten Hilbert

Date:

27 January 2009, 17:58:39

On Tue, Jan 27, 2009 at 04:42:02PM -0500, Tom Lane wrote:

> What does EXPLAIN ANALYZE say about it?  Also, what is the use-case
> you are concerned about --- selecting the whole view contents, or
> selecting WHERE something-or-other?

Oh, and the use case is to select all the test_results which
belong to a certain patient:

    where pk_patient = <scalar>

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: performance advice needed: join vs explicit subselect

From

Tom Lane

Date:

27 January 2009, 18:30:30

Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:
> the explain analyze of which is (I've actually gotten it to
> work better in the meantime as you can see):

Looks like most of the problem is in the subquery scans on v_staff,
which seems to be a rather expensive view :-(.  Maybe you can
simplify that a bit.

            regards, tom lane

Re: performance advice needed: join vs explicit subselect

From

Karsten Hilbert

Date:

27 January 2009, 18:58:36

On Tue, Jan 27, 2009 at 05:30:23PM -0500, Tom Lane wrote:

> Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:
> > the explain analyze of which is (I've actually gotten it to
> > work better in the meantime as you can see):
>
> Looks like most of the problem is in the subquery scans on v_staff,
> which seems to be a rather expensive view :-(.  Maybe you can
> simplify that a bit.

Thanks so much. I wasn't quite sure how to correlate the
seemingly expensive parts of the explain with the view/query
parts. Will experiment with that...

Well, going directly to the dem.staff table below
dem.v_staff forces me to forego the actual name of the staff
entry - but the alias will need to suffice ;-)

This brings down query time from 2000ms to 7ms.

Our doctors won't complain about slow lab data retrieval
anymore ... ;-)

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346