Re: severe performance issue with planner - Mailing list pgsql-performance

From Greg Stark
Subject Re: severe performance issue with planner
Date
Msg-id 873c8e54ji.fsf@stark.xeocode.com
Whole thread Raw
In response to severe performance issue with planner  ("Eric Brown" <bigwhitecow@hotmail.com>)
Responses Re: severe performance issue with planner
List pgsql-performance
The other posts about using explicit joins and using stored procedures are
both good points. But I have a few other comments to make:

"Eric Brown" <bigwhitecow@hotmail.com> writes:

> WHERE
>     w0.wid > 0 AND
>     w0.pinyin = 'zheng4' AND
>     w0.def_exists = 't' AND
>     w0.sequence = 0 AND
>     w1.wid = w0.wid AND
>     w1.pinyin LIKE 'fu_' AND
>     w1.variant = w0.variant AND
>     w1.sequence = (w0.sequence + 1) AND

I'm not sure it'll help the planner, but w0.sequence+1 is always just going to
be 1, and so on with the others. I think the planner might be able to figure
that out but the plan doesn't seem to show it doing so. I'm not sure it would
help the plan though.

Similarly you have w1.wid=w0.wid and w2.wid=w1.wid and w3.wid=w2.wid etc. And
also with the "variant" column. You might be able to get this planned better
by writing it as a join from w0 to all the others rather than a chain of
w0->w1->w2->... Again I'm not sure; you would have to experiment.


But I wonder if there isn't a way to do this in a single pass using an
aggregate. I'm not sure I understand the schema exactly, but perhaps something
like this?

select w8.wid,
       w8.variant,
       w8.num_variants,
       sum_text(w8.unicode) as unicodes,
       sum_text(w8.pinyin) as pinyins
  from (
        select wid,variant,
          from words
         where (sequence = 0 and pinyin = 'zheng4')
            OR (sequence = 1 and pinyin like 'ji_')
            OR (sequence = 2 and pinyin like 'guan_')
            OR (sequence = 3 and pinyin like 'kai_')
            OR (sequence = 4 and pinyin like 'fang_')
            OR (sequence = 5 and pinyin like 'xi_')
            OR (sequence = 6 and pinyin like 'tong_')
            OR (sequence = 7 and pinyin like 'fu_')
        group by wid,variant
        having count(*) = 8
       ) as w
  join words as w8 using (wid,variant)

This might be helped by having an index on <sequence,pinyin> but it might not
even need it.


--
greg

pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: optimizing large query with IN (...)
Next
From: Greg Stark
Date:
Subject: Re: severe performance issue with planner