Thread: More questions about plans & explain (long)

More questions about plans & explain (long)

From: "Nick Fankhauser"
Thanks to everyone who responded to my question about outer joins.

Now I have some more <grin>.

I'm trying to learn what explain can tell me & whether there are any
reasonable ways to push its choices in one direction based on my knowledge
of the tables without really screwing up general performance. I haven't
found many resources on the explain info in the documentation, so I'm making
a few guesses about what is meant. Corrections are, of course, welcome!

I'm trying to get better performance out of this query:


explain
select
 event.event_date_time,
 event.event_duration,
 event.event_ical_status,
 event.event_location_name,
 event.event_type_code,
 event.event_hearing_type_desc,
 event.event_summary,
 case_data.case_public_id || ' ' || case_data.case_title,
 court_config.court_name,
 event.event_id,
 case_data.case_id,
 court_config.court_id,
 actor.actor_id,
 actor_identifier.actor_identifier_text,
 actor_identifier.actor_identifier_type
from
 actor_identifier,
 actor,
 actor_case_assignment,
 case_data,
 event,
 court_config
where
 actor_identifier.actor_id = actor.actor_id and
 actor.actor_id = actor_case_assignment.actor_id and
 actor_case_assignment.case_id = case_data.case_id and
 case_data.case_id = event.case_id and
 case_data.court_id = court_config.court_id and
 actor_identifier_text = '7313 53' and
 actor_identifier_type = 'AttorneyStateBarID' and
 event_date_time > '4/1/2002' and
 event_date_time < '6/1/2002';

All of the columns used for a join or constraint have an index.
A vacuum analyze was done just before this explain.

The explain looks like this (I've numbered the lines so I can refer to
them):


1)Merge Join  (cost=1399914.02..1460681.09 rows=154522243 width=248)
2)  ->  Sort  (cost=1374154.11..1374154.11 rows=3605411 width=72)
3)        ->  Merge Join  (cost=138124.99..144473.64 rows=3605411 width=72)
4)              ->  Sort  (cost=93198.01..93198.01 rows=118189 width=60)
5)                    ->  Hash Join  (cost=3285.17..75120.12 rows=118189 width=60)
6)                          ->  Seq Scan on actor_case_assignment  (cost=0.00..18880.77 rows=814677 width=24)
7)                          ->  Hash  (cost=3285.13..3285.13 rows=15 width=36)
8)                                ->  Index Scan using actor_identifier_actor_id_text on actor_identifier  (cost=0.00..3285.13 rows=15 width=36)
9)              ->  Sort  (cost=44926.98..44926.98 rows=305054 width=12)
10)                    ->  Seq Scan on actor  (cost=0.00..7728.54 rows=305054 width=12)
11)  ->  Materialize  (cost=26017.06..26017.06 rows=428584 width=176)
12)        ->  Merge Join  (cost=25759.91..26017.06 rows=428584 width=176)
13)              ->  Sort  (cost=18978.66..18978.66 rows=18219 width=72)
14)                    ->  Hash Join  (cost=1.12..17251.58 rows=18219 width=72)
15)                          ->  Seq Scan on case_data  (cost=0.00..5407.91 rows=182191 width=48)
16)                          ->  Hash  (cost=1.10..1.10 rows=10 width=24)
17)                                ->  Seq Scan on court_config  (cost=0.00..1.10 rows=10 width=24)
18)              ->  Sort  (cost=6781.25..6781.25 rows=2352 width=104)
19)                    ->  Index Scan using event_event_date_time on event  (cost=0.00..6649.52 rows=2352 width=104)

Here are my questions-

(14-17)It looks like on lines 14-17, case_data and court_config are being
joined without any constraint, thus returning 182191 rows, but the join &
sort on 13 & 14 indicate 18219 rows are expected - why does the planner
expect 1/10th of what I expect? Is there a way to affect this expectation?

10)This table has a unique index on the field being scanned, and this field
is also the only column returned from the table- why is a seq scan chosen here?

15)This table has a unique index on the field being scanned, and only 3 rows
are returned to the query- why the seq scan?

(18-19)Is the row count of 2352 from lines 18 and 19 based on a reasonable
guess that out of the 235239 rows in the table, we'll select one out of 100?
Again, where does this number come from, and can I affect it?

12)Line 12 seems to indicate that the planner expects this join to produce
more rows rather than fewer as in the case of line 14 and line 18. In fact,
the real situation is that the join on line 14 will produce exactly 182191
rows, and since the results of line 19 will produce 8537 rows, 12 will also
produce about 8537 rows. So... how does the planner decide whether a join
will increase or decrease the number of rows?

8)The planner seems to expect 15 rows when the table actually has 66559.
This is so different from the 1:100 ratio in 19 that I wonder how it
was arrived at? (It *is* a very good guess, since the actual number returned
will be 1.)

6)Indicates a seq scan occurring on a field that is indexed. The table has
800K rows, but it is only 4 columns wide. Is the narrow width making this
choice a good one?

5)I would expect this join to typically result in 1 to 1000 cases, so the
estimate seems very high in this case.


Some other questions-

What is the difference between a hash join & a merge join?

Does materialize fetch the other fields given a list of index values?

Does a seq scan read the entire table, or can it just retrieve the field?

I guess that's enough questions for now. If you've made it this far, thanks
for reading all of this! I look forward to hearing your thoughts.

-Nick
--------------------------------------------------------------------------
Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
Ray Ontko & Co.     Software Consulting Services     http://www.ontko.com/


Re: More questions about plans & explain (long)

From: Tom Lane
"Nick Fankhauser" <nickf@ontko.com> writes:
> 14)                    ->  Hash Join  (cost=1.12..17251.58 rows=18219 width=72)
> 15)                          ->  Seq Scan on case_data  (cost=0.00..5407.91 rows=182191 width=48)
> 16)                          ->  Hash  (cost=1.10..1.10 rows=10 width=24)
> 17)                                ->  Seq Scan on court_config  (cost=0.00..1.10 rows=10 width=24)
> 18)              ->  Sort  (cost=6781.25..6781.25 rows=2352 width=104)
> 19)                    ->  Index Scan using event_event_date_time on event  (cost=0.00..6649.52 rows=2352 width=104)

> Here are my questions-

> (14-17)It looks like on lines 14-17, case_data and court_config are being
> joined without any constraint, thus returning 182191 rows, but the join &
> sort on 13 & 14 indicate 18219 rows are expected - why does the planner
> expect 1/10th of what I expect? Is there a way to affect this expectation?

A hash join always has some constraint --- otherwise there'd be nothing
to hash on.  In this case it must be using case_data.court_id =
court_config.court_id as the hash clause; I can't see any other
candidate.  It would appear that the system is estimating that the join
selectivity is 0.01 (ie, the estimated number of output rows is only
0.01 of what a full Cartesian product would be).  This seems a
suspiciously round number.  Since it is in fact the default selectivity
estimate (at least pre-7.2) I wonder whether you've ever VACUUM ANALYZEd
these tables.  You seem to have number-of-rows stats but nothing more
for most of these tables.
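
A minimal way to double-check that (a sketch, assuming a 7.2 server, where
the pg_stats view exists):

-- Refresh planner statistics, then peek at what ANALYZE stored
-- for the join column in question.
VACUUM ANALYZE case_data;
VACUUM ANALYZE court_config;

SELECT attname, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'case_data' AND attname = 'court_id';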

BTW, current sources display the constraint clauses being used at each
plan step, which takes some of the guesswork out of interpreting EXPLAIN
displays.

> 10)This table has a unique index on the field being scanned, and this field
> is also the only column returned from the table- why is a seq scan chosen here?

It's unlikely that an indexscan would be faster than a seqscan+sort for
obtaining the whole table in sorted order.  Sad but true.
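
One way to test that claim on your own data is to temporarily forbid seq
scans and compare the plans; a diagnostic sketch, not a setting to leave on:

-- Diagnostic only: force the planner away from the seq scan
-- (if an index scan is possible at all) and compare the costs.
SET enable_seqscan TO off;
EXPLAIN SELECT actor_id FROM actor ORDER BY actor_id;
SET enable_seqscan TO on;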

> 15)This table has a unique index on the field being scanned, and only 3 rows
> are returned to the query- why the seq scan?

AFAICT there is no constraint that would allow the three rows to be
identified in advance of doing the join.

> (18-19)Is the row count of 2352 from lines 18 and 19 based on a reasonable
> guess that out of the 235239 rows in the table, we'll select one out of 100?
> Again, where does this number come from, and can I affect it?

See above.

> 12)Line 12 seems to indicate that the planner expects this join to produce
> more rows rather than fewer as in the case of line 14 and line 18.

Still looks like an 0.01 estimate to me ...

> 8)The planner seems to expect 15 rows when the table actually has 66559.
> This is so different from the 1:100 ratio in 19 that I wonder how it
> was arrived at? (It *is* a very good guess, since the actual number returned
> will be 1.)

It would seem that you do have ANALYZE stats for actor_identifier.

> What is the difference between a hash join & a merge join?

Hash builds a hashtable of the inner relation and then probes it for
each outer-relation tuple.  Merge wants the two relations to be
pre-sorted on the join key, and then it scans them in parallel.
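
You can watch the planner pick each strategy by disabling the other; a
diagnostic sketch using the enable_* run-time settings:

-- Diagnostic only: disable each join strategy in turn and compare EXPLAIN.
SET enable_hashjoin TO off;
EXPLAIN SELECT actor.actor_id FROM actor, actor_case_assignment
WHERE actor.actor_id = actor_case_assignment.actor_id;
SET enable_hashjoin TO on;

SET enable_mergejoin TO off;
EXPLAIN SELECT actor.actor_id FROM actor, actor_case_assignment
WHERE actor.actor_id = actor_case_assignment.actor_id;
SET enable_mergejoin TO on;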

> Does materialize fetch the other fields given a list of index values?

No, it instantiates the result of the subplan (hopefully all in memory,
but it may spill to a temp file if it gets too big) so that the parent
plan can rescan it multiple times at relatively low cost.
In this case the reason for the materialize is that the outer mergejoin
may want to back up and rescan portions of its subplan's result, and
the inner mergejoin can't cope.  This is actually a pretty dumb plan;
if we were properly accounting for the cost of the materialize then this
wouldn't have been chosen.  (Simply reversing the order of the outer
and inner plans of the top merge would have eliminated the need for the
materialize.)

> Does a seq scan read the entire table, or can it just retrieve the field?

Reading a tuple is reading a tuple; there's not really any difference.

            regards, tom lane

ALTER TABLE ... SET DEFAULT

From: Brian McCane

I have a field called 'updated' of type 'integer' in a table.  I have
created a function called 'since_epoch' which is declared with iscachable
and returns the value of 'extract(epoch from now())'.  I want to make
'updated' have a default of 'since_epoch()'.  I cannot get it to work with
the following command:

alter table foo alter column updated set default 'since_epoch()' ;

this gets the error:

ERROR:  pg_atoi: error in "since_epoch()": can't parse "since_epoch()"

I tried using 'since_epoch()::integer', but that failed as well.
Is it possible to make the default value of an integer field use a
function similar to how we use 'now()' for date and timestamp fields?

- brian

Wm. Brian McCane                    | Life is full of doors that won't open
Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"


Re: ALTER TABLE ... SET DEFAULT

From: Brian McCane
Okay, I can't wait until morning for an answer, so I made one up ;-).  I
already have a function on this particular table which does processing
'BEFORE INSERT'.  As a temporary hack (or maybe permanent, depending on
anyone's response), I have added code that checks whether 'updated' IS NULL
and sets it to the return value of 'since_epoch()' if it is.  I suspect
that since I am already calling this function, it might actually be faster
than adding the 'set default' to the table anyway, especially since the
function is 'iscachable' and is being used in a transaction block.
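
For the archives, that hack looks roughly like this (hypothetical names,
assuming plpgsql is installed; trigger functions return 'opaque' in this
era):

CREATE FUNCTION foo_set_updated() RETURNS opaque AS '
BEGIN
    -- Fill in updated only when the INSERT did not supply it.
    IF NEW.updated IS NULL THEN
        NEW.updated := since_epoch();
    END IF;
    RETURN NEW;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER foo_set_updated_trig BEFORE INSERT ON foo
    FOR EACH ROW EXECUTE PROCEDURE foo_set_updated();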

-brian

On Fri, 12 Apr 2002, Brian McCane wrote:

> I have a field called 'updated' of type 'integer' in a table.  I have
> created a function called 'since_epoch' which is declared with iscachable
> and returns the value of 'extract(epoch from now())'.  I want to make
> 'updated' have a default of 'since_epoch()'.  I cannot get it to work with
> the following command:
>
> alter table foo alter column updated set default 'since_epoch()' ;
>
> this gets the error:
>
> ERROR:  pg_atoi: error in "since_epoch()": can't parse "since_epoch()"
>
> I tried using 'since_epoch()::integer', but that failed as well.
> Is it possible to make the default value of an integer field use a
> function similar to how we use 'now()' for date and timestamp fields?
>
> - brian


Re: ALTER TABLE ... SET DEFAULT

From: Tom Lane
Brian McCane <bmccane@mccons.net> writes:
> alter table foo alter column updated set default 'since_epoch()' ;
> this gets the error:
> ERROR:  pg_atoi: error in "since_epoch()": can't parse "since_epoch()"

Try it without the quotes.
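
That is, presumably:

alter table foo alter column updated set default since_epoch() ;

(With the quotes, the default is the literal string 'since_epoch()', which
the integer input routine pg_atoi then fails to parse; hence the error.)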

            regards, tom lane

Re: ALTER TABLE ... SET DEFAULT

From: Brian McCane
Been there, done that :).  It put the current epoch value in as the
default value for updated.

- brian

On Fri, 12 Apr 2002, Tom Lane wrote:

>
> Brian McCane <bmccane@mccons.net> writes:
> > alter table foo alter column updated set default 'since_epoch()' ;
> > this gets the error:
> > ERROR:  pg_atoi: error in "since_epoch()": can't parse "since_epoch()"
>
> Try it without the quotes.
>
>             regards, tom lane
>

Wm. Brian McCane                    | Life is full of doors that won't open
Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"


Re: ALTER TABLE ... SET DEFAULT

From: Tom Lane
Brian McCane <bmccane@mccons.net> writes:
> Been there, done that :).  It put the current epoch value in as the
> default value for updated.

That would suggest that you marked since_epoch() as cachable.  Wrong
thing to do, if you want it evaluated again every time the default
is used.
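
A hypothetical fix, assuming a 7.2-era server: drop and recreate the
function without the iscachable attribute (there is no ALTER FUNCTION yet),
then re-issue the SET DEFAULT so it stores the function call instead of the
already-folded constant:

DROP FUNCTION since_epoch();
CREATE FUNCTION since_epoch() RETURNS integer
    AS 'SELECT extract(epoch FROM now())::integer'
    LANGUAGE 'sql';
ALTER TABLE foo ALTER COLUMN updated SET DEFAULT since_epoch();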

            regards, tom lane

Re: ALTER TABLE ... SET DEFAULT

From: Brian McCane
Okay, so if I create a function as 'iscachable', it assumes that the value
will never change, calls the function, and places the value in as the
default?  This doesn't make sense to me, since 'iscachable' only works
inside of a transaction block.  If I call my function a hundred times, I
get a hundred different answers.  Once I do a 'BEGIN', the result never
changes again until I 'COMMIT', which is what I would expect according to
the docs I have found.

- brian



On Fri, 12 Apr 2002, Tom Lane wrote:

>
> Brian McCane <bmccane@mccons.net> writes:
> > Been there, done that :).  It put the current epoch value in as the
> > default value for updated.
>
> That would suggest that you marked since_epoch() as cachable.  Wrong
> thing to do, if you want it evaluated again every time the default
> is used.
>
>             regards, tom lane
>

Wm. Brian McCane                    | Life is full of doors that won't open
Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"


Re: ALTER TABLE ... SET DEFAULT

From: Tom Lane
Brian McCane <bmccane@mccons.net> writes:
> This doesn't make sense to me, since 'iscachable' only works
> inside of a transaction block.

Nope, wrong definition.

This has been refined for 7.3, but in existing releases "iscachable"
really means "result never changes, *ever*".

            regards, tom lane

Re: ALTER TABLE ... SET DEFAULT

From: Tom Lane
Brian McCane <bmccane@mccons.net> writes:
> Okay, so if I create a function as 'iscachable', it assumes that the value
> will never change, calls the function, and places the value in as the
> default?  This doesn't make sense to me, since 'iscachable' only works
> inside of a transaction block.

Um ... I already told you, that is *not* the meaning of 'isCachable'.

7.3 will split this term into 'isImmutable' and 'isStable', where the
latter means approximately what you seem to expect.  But the practical
behavior of existing releases is that 'isCachable' means 'fixed forever
given fixed input arguments'.  If you don't like it, argue with the
code.

            regards, tom lane