Thread: Premature view materialization in 8.2?

Premature view materialization in 8.2?

From
"Jonathan Ellis"
Date:
I have a query hitting a view that takes 0.15s on 8.1 but 6s on 8.2.3:

   select party_id from clan_members_v cm, clans c
   where cm.clan_id = c.id
     and c.type = 'standard'

The problem seems to be that clan_members_v contains a call to an
expensive function:

create or replace view clan_members_v as
select cm.clan_id, cm.user_id, cp.party_id, cm.date_accepted,
   p.name as party_name, p_tp_total(p.id)::int as tp_total
from clan_members cm, clan_participants cp, parties p
where cm.user_id = p.user_id
 and p.id = cp.party_id
;

p_tp_total takes around 50ms per row.

If I create clan_members_v without the function call, the original
query's speed goes to the 150ms range on 8.2 as well.

Is this a regression, or a "feature" of 8.2?

Re: Premature view materialization in 8.2?

From
Tom Lane
Date:
"Jonathan Ellis" <jonathan@utahpython.org> writes:
> I have a query hitting a view that takes 0.15s on 8.1 but 6s on 8.2.3:
> ...
> Is this a regression, or a "feature" of 8.2?

Hard to say without EXPLAIN ANALYZE output to compare.

            regards, tom lane

Re: Premature view materialization in 8.2?

From
"Jonathan Ellis"
Date:
On 4/5/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Jonathan Ellis" <jonathan@utahpython.org> writes:
> > I have a query hitting a view that takes 0.15s on 8.1 but 6s on 8.2.3:
> > ...
> > Is this a regression, or a "feature" of 8.2?
>
> Hard to say without EXPLAIN ANALYZE output to compare.

To my eye they are identical other than the speed but perhaps I am
missing something.

8.2:

 Hash Join  (cost=91.94..560.71 rows=259 width=4) (actual
time=22.120..6388.754 rows=958 loops=1)
   Hash Cond: (cm.clan_id = c.id)
   ->  Hash Join  (cost=75.34..536.90 rows=336 width=36) (actual
time=19.542..6375.827 rows=1298 loops=1)
         Hash Cond: (p.user_id = cm.user_id)
         ->  Hash Join  (cost=36.32..487.94 rows=1303 width=24)
(actual time=9.019..95.583 rows=1299 loops=1)
               Hash Cond: (p.id = cp.party_id)
               ->  Seq Scan on parties p  (cost=0.00..397.52
rows=10952 width=20) (actual time=0.013..40.558 rows=10952 loops=1)
               ->  Hash  (cost=20.03..20.03 rows=1303 width=4) (actual
time=8.545..8.545 rows=1299 loops=1)
                     ->  Seq Scan on clan_participants cp
(cost=0.00..20.03 rows=1303 width=4) (actual time=0.013..4.063
rows=1299 loops=1)
         ->  Hash  (cost=22.90..22.90 rows=1290 width=16) (actual
time=8.748..8.748 rows=1294 loops=1)
               ->  Seq Scan on clan_members cm  (cost=0.00..22.90
rows=1290 width=16) (actual time=0.013..4.307 rows=1294 loops=1)
   ->  Hash  (cost=11.99..11.99 rows=369 width=4) (actual
time=2.550..2.550 rows=368 loops=1)
         ->  Seq Scan on clans c  (cost=0.00..11.99 rows=369 width=4)
(actual time=0.025..1.341 rows=368 loops=1)
               Filter: (("type")::text = 'standard'::text)
 Total runtime: 6391.999 ms


8.1:

 Hash Join  (cost=62.37..681.10 rows=254 width=4) (actual
time=25.316..138.613 rows=967 loops=1)
   Hash Cond: ("outer".clan_id = "inner".id)
   ->  Hash Join  (cost=49.46..664.00 rows=331 width=8) (actual
time=21.331..126.194 rows=1305 loops=1)
         Hash Cond: ("outer".user_id = "inner".user_id)
         ->  Hash Join  (cost=23.32..628.02 rows=1306 width=8) (actual
time=10.674..105.352 rows=1306 loops=1)
               Hash Cond: ("outer".id = "inner".party_id)
               ->  Seq Scan on parties p  (cost=0.00..537.09
rows=10909 width=8) (actual time=0.018..49.754 rows=10855 loops=1)
               ->  Hash  (cost=20.06..20.06 rows=1306 width=4) (actual
time=10.334..10.334 rows=1306 loops=1)
                     ->  Seq Scan on clan_participants cp
(cost=0.00..20.06 rows=1306 width=4) (actual time=0.020..5.172
rows=1306 loops=1)
         ->  Hash  (cost=22.91..22.91 rows=1291 width=8) (actual
time=10.621..10.621 rows=1291 loops=1)
               ->  Seq Scan on clan_members cm  (cost=0.00..22.91
rows=1291 width=8) (actual time=0.019..5.381 rows=1291 loops=1)
   ->  Hash  (cost=11.99..11.99 rows=368 width=4) (actual
time=3.834..3.834 rows=368 loops=1)
         ->  Seq Scan on clans c  (cost=0.00..11.99 rows=368 width=4)
(actual time=0.043..2.373 rows=368 loops=1)
               Filter: (("type")::text = 'standard'::text)
 Total runtime: 142.209 ms

Re: Premature view materialization in 8.2?

From
Tom Lane
Date:
"Jonathan Ellis" <jonathan@utahpython.org> writes:
> On 4/5/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Is this a regression, or a "feature" of 8.2?
>>
>> Hard to say without EXPLAIN ANALYZE output to compare.

> To my eye they are identical other than the speed but perhaps I am
> missing something.

Yeah, it sure is the same plan, and 8.2 seems to be a tad faster right
up to the hash join on user_id.  Is user_id a textual datatype?  I'm
wondering if the 8.2 installation is using a different locale --- the
speed of simple string comparisons can be horrifically worse in some
locales compared to others.

            regards, tom lane

Re: Premature view materialization in 8.2?

From
"Jonathan Ellis"
Date:
On 4/6/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Jonathan Ellis" <jonathan@utahpython.org> writes:
> > On 4/5/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> Is this a regression, or a "feature" of 8.2?
> >>
> >> Hard to say without EXPLAIN ANALYZE output to compare.
>
> > To my eye they are identical other than the speed but perhaps I am
> > missing something.
>
> Yeah, it sure is the same plan, and 8.2 seems to be a tad faster right
> up to the hash join on user_id.  Is user_id a textual datatype?  I'm
> wondering if the 8.2 installation is using a different locale --- the
> speed of simple string comparisons can be horrifically worse in some
> locales compared to others.

user_id is an int; they are both C locale.

Re: Premature view materialization in 8.2?

From
Tom Lane
Date:
"Jonathan Ellis" <jonathan@utahpython.org> writes:
> On 4/6/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, it sure is the same plan, and 8.2 seems to be a tad faster right
>> up to the hash join on user_id.  Is user_id a textual datatype?

> user_id is an int; they are both C locale.

Really!?  So much for that theory.

Is work_mem set similarly on both installations?

The only other thing I can think is that you've exposed some unfortunate
corner case in the hash join logic.  Would you be willing to send me
(off-list) the lists of user_ids being joined?  That would be the
clan_members.user_id column and the user_id column from the join of
parties and clan_participants.

            regards, tom lane

Re: Premature view materialization in 8.2?

From
"Jonathan Ellis"
Date:
On 4/6/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Jonathan Ellis" <jonathan@utahpython.org> writes:
> > On 4/6/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Yeah, it sure is the same plan, and 8.2 seems to be a tad faster right
> >> up to the hash join on user_id.  Is user_id a textual datatype?
>
> > user_id is an int; they are both C locale.
>
> Really!?  So much for that theory.

Yeah, this db goes back to 7.0 so I've been careful to keep the locale
set to C to avoid surprises.

> Is work_mem set similarly on both installations?

work_mem is 8MB on 8.2; work_mem is 1MB and sort_mem is 8MB on 8.1.
(there's no disk io going on with the 8.2 installation either, so it's
not swapping or anything like that.)

> The only other thing I can think is that you've exposed some unfortunate
> corner case in the hash join logic.  Would you be willing to send me
> (off-list) the lists of user_ids being joined?  That would be the
> clan_members.user_id column and the user_id column from the join of
> parties and clan_participants.

I can do that...  you don't think the fact I mentioned, that
redefining the view to leave out the expensive function fixes the
problem, is relevant?

Re: Premature view materialization in 8.2?

From
Tom Lane
Date:
"Jonathan Ellis" <jonathan@utahpython.org> writes:
> I can do that...  you don't think the fact I mentioned, that
> redefining the view to leave out the expensive function fixes the
> problem, is relevant?

Hm, I'd not have thought that an expensive function would get evaluated
partway up the join tree, but maybe that's wrong.  You never did show
us the actual view definition ...

            regards, tom lane

Re: Premature view materialization in 8.2?

From
"Jonathan Ellis"
Date:
On 4/6/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Jonathan Ellis" <jonathan@utahpython.org> writes:
> > I can do that...  you don't think the fact I mentioned, that
> > redefining the view to leave out the expensive function fixes the
> > problem, is relevant?
>
> Hm, I'd not have thought that an expensive function would get evaluated
> partway up the join tree, but maybe that's wrong.  You never did show
> us the actual view definition ...

It was in my original post unless it got clipped:

The problem seems to be that clan_members_v contains a call to an
expensive function:

create or replace view clan_members_v as
select cm.clan_id, cm.user_id, cp.party_id, cm.date_accepted,
  p.name as party_name, p_tp_total(p.id)::int as tp_total
from clan_members cm, clan_participants cp, parties p
where cm.user_id = p.user_id
and p.id = cp.party_id
;

p_tp_total takes around 50ms per row.

If I create clan_members_v without the function call, the original
query's speed goes to the 150ms range on 8.2 as well.

Re: Premature view materialization in 8.2?

From
Tom Lane
Date:
"Jonathan Ellis" <jonathan@utahpython.org> writes:
> It was in my original post unless it got clipped:

Sorry, I had forgotten.

> The problem seems to be that clan_members_v contains a call to an
> expensive function:

I'll bet that the function is marked VOLATILE.  8.2 is more conservative
about optimizing away volatile functions than previous releases.  If
it has no side effects, mark it STABLE (or can it even be IMMUTABLE?).

In some quick testing, I verified that 8.2 does evaluate the function at
the join level corresponding to the view's join (and I think this is
preventing it from considering other join orders, too).  If you change
the function's marking to be nonvolatile then the function disappears
from the plan entirely, and also it seems to prefer joining "clans" sooner.

            regards, tom lane

Re: Premature view materialization in 8.2?

From
"Jonathan Ellis"
Date:
On 4/6/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > The problem seems to be that clan_members_v contains a call to an
> > expensive function:
>
> I'll bet that the function is marked VOLATILE.  8.2 is more conservative
> about optimizing away volatile functions than previous releases.  If
> it has no side effects, mark it STABLE (or can it even be IMMUTABLE?).

That's exactly right, it should have been STABLE.

Thanks a lot for figuring that out for me!