Thread: won't drop the view

won't drop the view

From
Oleg Lebedev
Date:
Hi everybody,
I just vacuumed and vacuumed analyzed my database. Now, I am trying to
execute a view, which was perfectly working before, but it seems to be
very slow. It was sitting there for 10 mins before I cancelled it.
Usually it was taking on the order of 5 secs to execute the view.
I just recreated the view, but the problem still persists.
Here is what EXPLAIN tells me:
EXPLAIN select * from progress_report;
Subquery Scan progress_report  (cost=16386.56..16386.56 rows=2
width=128) ->  Sort  (cost=16386.56..16386.56 rows=2 width=128)       ->  Nested Loop  (cost=16299.45..16386.55 rows=2
width=128)

How can I "restore" the "before-the-vacuum" performance?
thanks,

Oleg



Re: won't drop the view

From
"Josh Berkus"
Date:
Oleg,

> I just vacuumed and vacuumed analyzed my database. Now, I am trying
> to
> execute a view, which was perfectly working before, but it seems to
> be
> very slow. It was sitting there for 10 mins before I cancelled it.
> Usually it was taking on the order of 5 secs to execute the view.
> I just recreated the view, but the problem still persists.
> Here is what EXPLAIN tells me:
> EXPLAIN select * from progress_report;
> Subquery Scan progress_report  (cost=16386.56..16386.56 rows=2
> width=128)
>   ->  Sort  (cost=16386.56..16386.56 rows=2 width=128)
>         ->  Nested Loop  (cost=16299.45..16386.55 rows=2 width=128)
> 
> How can I "restore" the "before-the-vacuum" performance?
> thanks,

This is not normal.  I suspect that you have something wrong with your
Postgres system configuration or your system in general.

Please post:
1. Your postgres version
2. Your platform (OS and version)
3. Your hardware statistics, including:Processor & RAMDisk space free on your root drive and postgres drive
4. The view definition
5. Row counts on all tables involved in the view, as well as whether
those tables have very large text fields or BLOBs.
6. Finally, check your Postgres log to see if VACUUM raised any errors,
and to see if selecting the view causes any errors.

-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________                                      Josh Berkus Complete
informationtechnology      josh@agliodbs.com  and data management solutions       (415) 565-7293 for law firms, small
businesses       fax 621-2533   and non-profit organizations.      San Francisco
 


Re: won't drop the view

From
Chris Ruprecht
Date:
Hi Oleg,
At 19:28 -0700 12/14/2001, Oleg Lebedev wrote:
>Hi everybody,
>I just vacuumed and vacuumed analyzed my database. Now, I am trying to
>execute a view, which was perfectly working before, but it seems to be
>very slow. It was sitting there for 10 mins before I cancelled it.
>Usually it was taking on the order of 5 secs to execute the view.
>I just recreated the view, but the problem still persists.
>Here is what EXPLAIN tells me:
>EXPLAIN select * from progress_report;
>Subquery Scan progress_report  (cost=16386.56..16386.56 rows=2
>width=128)
>   ->  Sort  (cost=16386.56..16386.56 rows=2 width=128)
>         ->  Nested Loop  (cost=16299.45..16386.55 rows=2 width=128)
>
>How can I "restore" the "before-the-vacuum" performance?
>thanks,

I don't know much about the internal workings about 'vacuum'. My 
guess is that there is something wrong with one of the indexes in 
your db somewhere. I would try and rebuild them - starting with the 
ones involved in the view. You might have to drop and re-create them 
- as well as the view.

Good luck,
Chris
-- 
Chris Ruprecht
Network grunt and bit pusher extraordinaíre

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com



Re: won't drop the view

From
Oleg Lebedev
Date:
Josh,
I just recreated all the indexes on the tables, but it didn't help.
Please let me know if there is any other information you need to help me
resolve performance issue.
Here are answers to your questions:

1. I am using PostgreSQL 7.1.2
2. Linux RedHat 7.1
3. Pentium II 400 with 256 Megs of RAM   14 out of 17 Gigs are free (there is only one disk on this system)
4. Here is the view definition:
CREATE VIEW progress_report AS
SELECT  acts.product_code AS product_code,       acts.component AS component,       acts.priority AS priority,
acts.statusAS status,       COALESCE(media_acts.art_checked_in, 0) AS art_in,       COALESCE(media_acts.art_waiting, 0)
ASart_wait,       COALESCE(media_acts.audio_checked_in, 0) AS audio_in,       COALESCE(media_acts.audio_waiting, 0) AS
audio_wait,      COALESCE(media_acts.video_checked_in, 0) AS video_in,       COALESCE(media_acts.video_waiting, 0) AS
video_wait
FROM       (SELECT objectid AS actid,               productcode AS product_code,               actname AS component,
          status AS status,               priority AS priority       FROM activity       WHERE   activity.productcode ~
'^m3')acts
 

LEFT OUTER JOIN       (       SELECT  actid,               SUM     (CASE WHEN
lower(stats.media_status)~'^checked'                                       AND lower(stats.type)='art'
    THEN 1 ELSE 0 END)               AS art_checked_in,               SUM     (CASE WHEN
 
lower(stats.media_status)~'^waiting'                                       AND lower(stats.type)='art'
    THEN 1 ELSE 0 END)               AS art_waiting,               SUM     (CASE WHEN
 
lower(stats.media_status)~'^checked'                                       AND lower(stats.type)='audio'
      THEN 1 ELSE 0 END)               AS audio_checked_in,               SUM     (CASE WHEN
 
lower(stats.media_status)~'^waiting'                                       AND lower(stats.type)='audio'
      THEN 1 ELSE 0 END)               AS audio_waiting,               SUM     (CASE WHEN
 
lower(stats.media_status)~'^checked'                                       AND lower(stats.type)='video'
      THEN 1 ELSE 0 END)               AS video_checked_in,               SUM     (CASE WHEN
 
lower(stats.media_status)~'^waiting'                                       AND lower(stats.type)='video'
      THEN 1 ELSE 0 END)               AS video_waiting       FROM       (       (SELECT media.objectid AS mediaid,
         media.status AS media_status,               mediatypemap.typecategory AS type,               media.activity AS
actid      FROM media, mediatypemap       WHERE media.mediatype = mediatypemap.mediatype)               UNION
(SELECTintsetmedia.media AS mediaid,               media.status AS media_status,
mediatypemap.typecategoryAS type,               set.activity AS actid       FROM    intsetmedia, set, media,
mediatypemap      WHERE   media.mediatype = mediatypemap.mediatype               AND intsetmedia.set = set.objectid
         AND intsetmedia.media = media.objectid       )               UNION       (SELECT dtrowmedia.media AS mediaid,
            media.status AS media_status,               mediatypemap.typecategory AS type,
datatable.activityAS actid       FROM media, mediatypemap, dtrowmedia, dtrow, dtcol, datatable       WHERE
media.mediatype= mediatypemap.mediatype               AND dtrowmedia.media = media.objectid               AND
dtrowmedia.dtrow= dtrow.objectid               AND dtrow.dtcol = dtcol.objectid               AND dtcol.datatable =
datatable.objectid
       )       ) stats       GROUP BY actid ) media_acts

ON       acts.actid = media_acts.actid
ORDER BY product_code;


5. None of the tables involved in the view has BLOBs in it.   Row count for the tables are as follows:   activity:
253  media:    12406   set:         826   intsetmedia:    22916   mediatypemap: 25   datatable:    318   dtcol:
1698  dtrow:      18406   dtrowmedia: 10238
 

6. I didn't have debug on when running vacuum, so I can't tell you if it
raised any errors.


thanks,





Josh Berkus wrote:

> Oleg,
>
> > I just vacuumed and vacuumed analyzed my database. Now, I am trying
> > to
> > execute a view, which was perfectly working before, but it seems to
> > be
> > very slow. It was sitting there for 10 mins before I cancelled it.
> > Usually it was taking on the order of 5 secs to execute the view.
> > I just recreated the view, but the problem still persists.
> > Here is what EXPLAIN tells me:
> > EXPLAIN select * from progress_report;
> > Subquery Scan progress_report  (cost=16386.56..16386.56 rows=2
> > width=128)
> >   ->  Sort  (cost=16386.56..16386.56 rows=2 width=128)
> >         ->  Nested Loop  (cost=16299.45..16386.55 rows=2 width=128)
> >
> > How can I "restore" the "before-the-vacuum" performance?
> > thanks,
>
> This is not normal.  I suspect that you have something wrong with your
> Postgres system configuration or your system in general.
>
> Please post:
> 1. Your postgres version
> 2. Your platform (OS and version)
> 3. Your hardware statistics, including:
>         Processor & RAM
>         Disk space free on your root drive and postgres drive
> 4. The view definition
> 5. Row counts on all tables involved in the view, as well as whether
> those tables have very large text fields or BLOBs.
> 6. Finally, check your Postgres log to see if VACUUM raised any errors,
> and to see if selecting the view causes any errors.
>
> -Josh Berkus
>
> ______AGLIO DATABASE SOLUTIONS___________________________
>                                        Josh Berkus
>   Complete information technology      josh@agliodbs.com
>    and data management solutions       (415) 565-7293
>   for law firms, small businesses        fax 621-2533
>     and non-profit organizations.      San Francisco



Re: won't drop the view

From
"Josh Berkus"
Date:
Oleg,

> 1. I am using PostgreSQL 7.1.2
> 2. Linux RedHat 7.1
> 3. Pentium II 400 with 256 Megs of RAM
>     14 out of 17 Gigs are free (there is only one disk on this
> system)
> 4. Here is the view definition:
<snip>
> 5. None of the tables involved in the view has BLOBs in it.
>     Row count for the tables are as follows:
>     activity:    253
>     media:    12406
>     set:         826
>     intsetmedia:    22916
>     mediatypemap: 25
>     datatable:    318
>     dtcol:       1698
>     dtrow:      18406
>     dtrowmedia: 10238
> 
> 6. I didn't have debug on when running vacuum, so I can't tell you if
> it
> raised any errors.

Damn.  I can only really help with the obvious things, and you've
covered most of those.  That view should take a long time, given the
complexity ... but a long time is 30-60 seconds, not 10 minutes.

Therefore:

1. Turn up the debug level in postgres.conf
2. Re-start postgresql, and open a console to tail the log.
3. Watch the log as you:a) Vacuum Analyzeb) SELECT from the view again
4. Hope that one of the core team looks into your question.

Beyond that, we're down to monkeying with Postgres' memory settings;
it's possible (but not likely) that you've exhausted the available sort
memory and Postgres is getting stuck in swap-access.  However, that's
just a stab in the dark.

-Josh

______AGLIO DATABASE SOLUTIONS___________________________                                      Josh Berkus Complete
informationtechnology      josh@agliodbs.com  and data management solutions       (415) 565-7293 for law firms, small
businesses       fax 621-2533   and non-profit organizations.      San Francisco
 


Re: won't drop the view

From
Tom Lane
Date:
"Josh Berkus" <josh@agliodbs.com> writes:
> Damn.  I can only really help with the obvious things, and you've
> covered most of those.  That view should take a long time, given the
> complexity ... but a long time is 30-60 seconds, not 10 minutes.

I'm guessing it's ye olde stupid-choice-of-plan kind of problem.
Unless I missed it, we have not been shown the plan currently being
used (no, the first four lines don't do it for me...)

You could probably get back to the previous, not-so-slow plan choice
by doing "DELETE FROM pg_statistic" to remove the data collected by
VACUUM ANALYZE.  A comparison of the EXPLAIN output in that state
vs. the EXPLAIN output in the analyzed state might be enlightening.
        regards, tom lane


Re: won't drop the view

From
Oleg Lebedev
Date:
Thanks everybody for your help.
Deleting from pg_statistic restored view performance to the way it was
before I ran vacuum analyze.
Below I attach two files that contain explain statistics for the view before
(progress_report.txt) and after (progress_report_analyzed.txt) I ran vacuum
analyze. The first one takes about 15 secs. to complete, the second one
takes 12-13 minutes to complete.
I hope this will help you improve your query optimizer.
thanks,

Oleg

Tom Lane wrote:

> "Josh Berkus" <josh@agliodbs.com> writes:
> > Damn.  I can only really help with the obvious things, and you've
> > covered most of those.  That view should take a long time, given the
> > complexity ... but a long time is 30-60 seconds, not 10 minutes.
>
> I'm guessing it's ye olde stupid-choice-of-plan kind of problem.
> Unless I missed it, we have not been shown the plan currently being
> used (no, the first four lines don't do it for me...)
>
> You could probably get back to the previous, not-so-slow plan choice
> by doing "DELETE FROM pg_statistic" to remove the data collected by
> VACUUM ANALYZE.  A comparison of the EXPLAIN output in that state
> vs. the EXPLAIN output in the analyzed state might be enlightening.
>
>                         regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
execution time ~ 15sec.

NOTICE:  QUERY PLAN:

Subquery Scan progress_report  (cost=14648.65..14648.65 rows=4 width=128)
  ->  Sort  (cost=14648.65..14648.65 rows=4 width=128)
        ->  Merge Join  (cost=14644.00..14648.62 rows=4 width=128)
              ->  Sort  (cost=11.02..11.02 rows=2 width=48)
                    ->  Seq Scan on activity  (cost=0.00..11.00 rows=2 width=48)
              ->  Sort  (cost=14632.99..14632.99 rows=367 width=112)
                    ->  Subquery Scan media_acts  (cost=14553.17..14617.36 rows=367 width=112)
                          ->  Aggregate  (cost=14553.17..14617.36 rows=367 width=112)
                                ->  Group  (cost=14553.17..14562.34 rows=3668 width=112)
                                      ->  Sort  (cost=14553.17..14553.17 rows=3668 width=112)
                                            ->  Subquery Scan stats  (cost=13969.18..14336.00 rows=3668 width=112)
                                                  ->  Unique  (cost=13969.18..14336.00 rows=3668 width=112)
                                                        ->  Sort  (cost=13969.18..13969.18 rows=36682 width=112)
                                                              ->  Append  (cost=1.31..9744.33 rows=36682 width=112)
                                                                    ->  Subquery Scan *SELECT* 1  (cost=1.31..1154.05
rows=11459width=56) 
                                                                          ->  Hash Join  (cost=1.31..1154.05 rows=11459
width=56)
                                                                                ->  Seq Scan on media
(cost=0.00..407.59rows=11459 width=36) 
                                                                                ->  Hash  (cost=1.25..1.25 rows=25
width=20)
                                                                                      ->  Seq Scan on mediatypemap
(cost=0.00..1.25rows=25 width=20) 
                                                                    ->  Subquery Scan *SELECT* 2
(cost=1313.31..4930.60rows=20704 width=80) 
                                                                          ->  Hash Join  (cost=1313.31..4930.60
rows=20704width=80) 
                                                                                ->  Hash Join  (cost=1283.70..3513.64
rows=21196width=64) 
                                                                                      ->  Seq Scan on intsetmedia
(cost=0.00..399.96rows=21196 width=16) 
                                                                                      ->  Hash  (cost=1154.05..1154.05
rows=11459width=48) 
                                                                                            ->  Hash Join
(cost=1.31..1154.05rows=11459 width=48) 
                                                                                                  ->  Seq Scan on media
(cost=0.00..407.59 rows=11459 width=28) 
                                                                                                  ->  Hash
(cost=1.25..1.25rows=25 width=20) 
                                                                                                        ->  Seq Scan on
mediatypemap (cost=0.00..1.25 rows=25 width=20) 
                                                                                ->  Hash  (cost=27.69..27.69 rows=769
width=16)
                                                                                      ->  Seq Scan on set
(cost=0.00..27.69rows=769 width=16) 
                                                                    ->  Subquery Scan *SELECT* 3
(cost=1469.06..3659.68rows=4519 width=112) 
                                                                          ->  Hash Join  (cost=1469.06..3659.68
rows=4519width=112) 
                                                                                ->  Hash Join  (cost=1467.74..3364.32
rows=4519width=92) 
                                                                                      ->  Hash Join
(cost=689.81..1978.41rows=4519 width=64) 
                                                                                            ->  Hash Join
(cost=585.40..1498.78rows=9148 width=32) 
                                                                                                  ->  Seq Scan on
dtrowmedia (cost=0.00..164.48 rows=9148 width=16) 
                                                                                                  ->  Hash
(cost=306.99..306.99rows=16799 width=16) 
                                                                                                        ->  Seq Scan on
dtrow (cost=0.00..306.99 rows=16799 width=16) 
                                                                                            ->  Hash
(cost=102.55..102.55rows=744 width=32) 
                                                                                                  ->  Hash Join
(cost=6.62..102.55rows=744 width=32) 
                                                                                                        ->  Seq Scan on
dtcol (cost=0.00..32.06 rows=1506 width=16) 
                                                                                                        ->  Hash
(cost=5.90..5.90rows=290 width=16) 
                                                                                                              ->  Seq
Scanon datatable  (cost=0.00..5.90 rows=290 width=16) 
                                                                                      ->  Hash  (cost=407.59..407.59
rows=11459width=28) 
                                                                                            ->  Seq Scan on media
(cost=0.00..407.59rows=11459 width=28) 
                                                                                ->  Hash  (cost=1.25..1.25 rows=25
width=20)
                                                                                      ->  Seq Scan on mediatypemap
(cost=0.00..1.25rows=25 width=20) 

EXPLAIN
execution time ~ 12-13 mins.

NOTICE:  QUERY PLAN:

Subquery Scan progress_report  (cost=14687.00..14687.00 rows=2 width=128)
  ->  Sort  (cost=14687.00..14687.00 rows=2 width=128)
        ->  Nested Loop  (cost=14605.17..14686.99 rows=2 width=128)
              ->  Seq Scan on activity  (cost=0.00..11.00 rows=1 width=48)
              ->  Subquery Scan media_acts  (cost=14605.17..14671.27 rows=378 width=112)
                    ->  Aggregate  (cost=14605.17..14671.27 rows=378 width=112)
                          ->  Group  (cost=14605.17..14614.61 rows=3777 width=112)
                                ->  Sort  (cost=14605.17..14605.17 rows=3777 width=112)
                                      ->  Subquery Scan stats  (cost=14003.05..14380.75 rows=3777 width=112)
                                            ->  Unique  (cost=14003.05..14380.75 rows=3777 width=112)
                                                  ->  Sort  (cost=14003.05..14003.05 rows=37771 width=112)
                                                        ->  Append  (cost=1.31..9627.85 rows=37771 width=112)
                                                              ->  Subquery Scan *SELECT* 1  (cost=1.31..1154.05
rows=11459width=56) 
                                                                    ->  Hash Join  (cost=1.31..1154.05 rows=11459
width=56)
                                                                          ->  Seq Scan on media  (cost=0.00..407.59
rows=11459width=36) 
                                                                          ->  Hash  (cost=1.25..1.25 rows=25 width=20)
                                                                                ->  Seq Scan on mediatypemap
(cost=0.00..1.25rows=25 width=20) 
                                                              ->  Subquery Scan *SELECT* 2  (cost=1313.31..4930.60
rows=21196width=80) 
                                                                    ->  Hash Join  (cost=1313.31..4930.60 rows=21196
width=80)
                                                                          ->  Hash Join  (cost=1283.70..3513.64
rows=21196width=64) 
                                                                                ->  Seq Scan on intsetmedia
(cost=0.00..399.96rows=21196 width=16) 
                                                                                ->  Hash  (cost=1154.05..1154.05
rows=11459width=48) 
                                                                                      ->  Hash Join
(cost=1.31..1154.05rows=11459 width=48) 
                                                                                            ->  Seq Scan on media
(cost=0.00..407.59rows=11459 width=28) 
                                                                                            ->  Hash  (cost=1.25..1.25
rows=25width=20) 
                                                                                                  ->  Seq Scan on
mediatypemap (cost=0.00..1.25 rows=25 width=20) 
                                                                          ->  Hash  (cost=27.69..27.69 rows=769
width=16)
                                                                                ->  Seq Scan on set  (cost=0.00..27.69
rows=769width=16) 
                                                              ->  Subquery Scan *SELECT* 3  (cost=1993.99..3543.20
rows=5116width=112) 
                                                                    ->  Hash Join  (cost=1993.99..3543.20 rows=5116
width=112)
                                                                          ->  Hash Join  (cost=1992.68..3209.05
rows=5116width=92) 
                                                                                ->  Seq Scan on media
(cost=0.00..407.59rows=11459 width=28) 
                                                                                ->  Hash  (cost=1979.89..1979.89
rows=5116width=64) 
                                                                                      ->  Hash Join
(cost=690.05..1979.89rows=5116 width=64) 
                                                                                            ->  Hash Join
(cost=585.40..1498.78rows=9148 width=32) 
                                                                                                  ->  Seq Scan on
dtrowmedia (cost=0.00..164.48 rows=9148 width=16) 
                                                                                                  ->  Hash
(cost=306.99..306.99rows=16799 width=16) 
                                                                                                        ->  Seq Scan on
dtrow (cost=0.00..306.99 rows=16799 width=16) 
                                                                                            ->  Hash
(cost=102.55..102.55rows=842 width=32) 
                                                                                                  ->  Hash Join
(cost=6.62..102.55rows=842 width=32) 
                                                                                                        ->  Seq Scan on
dtcol (cost=0.00..32.06 rows=1506 width=16) 
                                                                                                        ->  Hash
(cost=5.90..5.90rows=290 width=16) 
                                                                                                              ->  Seq
Scanon datatable  (cost=0.00..5.90 rows=290 width=16) 
                                                                          ->  Hash  (cost=1.25..1.25 rows=25 width=20)
                                                                                ->  Seq Scan on mediatypemap
(cost=0.00..1.25rows=25 width=20) 

EXPLAIN

Re: won't drop the view

From
Tom Lane
Date:
Oleg Lebedev <olebedev@waterford.org> writes:
> Deleting from pg_statistic restored view performance to the way it was
> before I ran vacuum analyze.
> Below I attach two files that contain explain statistics for the view before
> (progress_report.txt) and after (progress_report_analyzed.txt) I ran vacuum
> analyze. The first one takes about 15 secs. to complete, the second one
> takes 12-13 minutes to complete.

Looks like the issue is the number of rows estimated to be obtained from
the "activity" table:

good plan:

>         ->  Merge Join  (cost=14644.00..14648.62 rows=4 width=128)
>               ->  Sort  (cost=11.02..11.02 rows=2 width=48)
>                     ->  Seq Scan on activity  (cost=0.00..11.00 rows=2 width=48)
>               ->  Sort  (cost=14632.99..14632.99 rows=367 width=112)
>                     ->  Subquery Scan media_acts  (cost=14553.17..14617.36 rows=367 width=112)

bad plan:

>         ->  Nested Loop  (cost=14605.17..14686.99 rows=2 width=128)
>               ->  Seq Scan on activity  (cost=0.00..11.00 rows=1 width=48)
>               ->  Subquery Scan media_acts  (cost=14605.17..14671.27 rows=378 width=112)

The plans for media_acts look about the same, so I have to guess that
activity actually yields 50 or so rows, not just one.  That doesn't
hurt the mergejoin too much, but it is a killer for the nestloop.

You showed the query as (SELECT ...       FROM activity       WHERE   activity.productcode ~ '^m3') acts
How many rows actually match activity.productcode ~ '^m3' ?  How many
rows altogether in activity?
        regards, tom lane


Re: won't drop the view

From
Oleg Lebedev
Date:
You are right Tom,
(SELECT ...       FROM activity       WHERE   activity.productcode ~ '^m3') acts
yeilds 235 rows, with a total of 240 rows in activity table. I wonder why EXPLAIN estimates only 1
row? Are there any online docs on how the planner works?
BTW, I would like to buy a book that would help me tune up postgres performance. Which book would
you suggest?
Thanks,

Oleg

Tom Lane wrote:

> Oleg Lebedev <olebedev@waterford.org> writes:
> > Deleting from pg_statistic restored view performance to the way it was
> > before I ran vacuum analyze.
> > Below I attach two files that contain explain statistics for the view before
> > (progress_report.txt) and after (progress_report_analyzed.txt) I ran vacuum
> > analyze. The first one takes about 15 secs. to complete, the second one
> > takes 12-13 minutes to complete.
>
> Looks like the issue is the number of rows estimated to be obtained from
> the "activity" table:
>
> good plan:
>
> >         ->  Merge Join  (cost=14644.00..14648.62 rows=4 width=128)
> >               ->  Sort  (cost=11.02..11.02 rows=2 width=48)
> >                     ->  Seq Scan on activity  (cost=0.00..11.00 rows=2 width=48)
> >               ->  Sort  (cost=14632.99..14632.99 rows=367 width=112)
> >                     ->  Subquery Scan media_acts  (cost=14553.17..14617.36 rows=367 width=112)
>
> bad plan:
>
> >         ->  Nested Loop  (cost=14605.17..14686.99 rows=2 width=128)
> >               ->  Seq Scan on activity  (cost=0.00..11.00 rows=1 width=48)
> >               ->  Subquery Scan media_acts  (cost=14605.17..14671.27 rows=378 width=112)
>
> The plans for media_acts look about the same, so I have to guess that
> activity actually yields 50 or so rows, not just one.  That doesn't
> hurt the mergejoin too much, but it is a killer for the nestloop.
>
> You showed the query as
>  (SELECT ...
>         FROM activity
>         WHERE   activity.productcode ~ '^m3') acts
> How many rows actually match activity.productcode ~ '^m3' ?  How many
> rows altogether in activity?
>
>                         regards, tom lane



Re: won't drop the view

From
Tom Lane
Date:
Oleg Lebedev <olebedev@waterford.org> writes:
> yeilds 235 rows, with a total of 240 rows in activity table. I wonder
> why EXPLAIN estimates only 1 row?

Ugh, that's a pretty horrible misestimation.  What do you get from

select attname,attdispersion,s.*
from pg_statistic s, pg_attribute a, pg_class c
where starelid = c.oid and attrelid = c.oid and staattnum = attnum
and relname = 'activity';

(you'll need to do "VACUUM ANALYZE activity" first).

> Are there any online docs on how the planner works?

The first two sections of 
http://developer.postgresql.org/docs/postgres/performance-tips.html
are a starting point --- but keep in mind that section 11.2 describes
PG 7.2's stats.  The pg_statistic scheme in 7.1 is much simpler.

I am hoping that 7.2 will get a better result for this query, but can't
tell without more info.
        regards, tom lane


Re: won't drop the view

From
Oleg Lebedev
Date:
Thanks for your comments Tom.
I attached a file with the query you mentioned in your posting and it's output. Those numbers don't
really tell me anything, but let me know if you find anything interesting there.
thanks,

Oleg

Tom Lane wrote:

> Oleg Lebedev <olebedev@waterford.org> writes:
> > Deleting from pg_statistic restored view performance to the way it was
> > before I ran vacuum analyze.
> > Below I attach two files that contain explain statistics for the view before
> > (progress_report.txt) and after (progress_report_analyzed.txt) I ran vacuum
> > analyze. The first one takes about 15 secs. to complete, the second one
> > takes 12-13 minutes to complete.
>
> Looks like the issue is the number of rows estimated to be obtained from
> the "activity" table:
>
> good plan:
>
> >         ->  Merge Join  (cost=14644.00..14648.62 rows=4 width=128)
> >               ->  Sort  (cost=11.02..11.02 rows=2 width=48)
> >                     ->  Seq Scan on activity  (cost=0.00..11.00 rows=2 width=48)
> >               ->  Sort  (cost=14632.99..14632.99 rows=367 width=112)
> >                     ->  Subquery Scan media_acts  (cost=14553.17..14617.36 rows=367 width=112)
>
> bad plan:
>
> >         ->  Nested Loop  (cost=14605.17..14686.99 rows=2 width=128)
> >               ->  Seq Scan on activity  (cost=0.00..11.00 rows=1 width=48)
> >               ->  Subquery Scan media_acts  (cost=14605.17..14671.27 rows=378 width=112)
>
> The plans for media_acts look about the same, so I have to guess that
> activity actually yields 50 or so rows, not just one.  That doesn't
> hurt the mergejoin too much, but it is a killer for the nestloop.
>
> You showed the query as
>  (SELECT ...
>         FROM activity
>         WHERE   activity.productcode ~ '^m3') acts
> How many rows actually match activity.productcode ~ '^m3' ?  How many
> rows altogether in activity?
>
>                         regards, tom lane
select attname, attdispersion, s.*
from pg_statistic s, pg_attribute a, pg_class c
where starelid = c.oid and attrelid=c.oid and staattnum=attnum and relname='activity';

         attname         | attdispersion | starelid | staattnum | staop | stanullfrac | stacommonfrac |
stacommonval            |  staloval   |           stahival            

-------------------------+---------------+----------+-----------+-------+-------------+---------------+--------------------------------------+-------------+------------------------------
 actname                 |      0.110222 | 14559105 |         1 |  1066 |           0 |      0.266667 | ip
                    | 0000        | test2 
 actdesc                 |      0.002625 | 14559105 |         2 |  1066 |           0 |        0.0125 | Operations
extendedpractice.        |             | this is for testing purposes 
 origdesigner            |      0.447556 | 14559105 |         3 |  1066 |    0.633333 |        0.0625 | 52
                    |             | 85 
 origwriter              |      0.537514 | 14559105 |         4 |  1066 |    0.704167 |     0.0583333 |
                    |             | 55 
 origartist              |      0.537514 | 14559105 |         5 |  1066 |    0.704167 |     0.0166667 |
                    |             | 98 
 origvideo               |      0.494181 | 14559105 |         6 |  1066 |    0.670833 |           0.1 | 31
                    |             | 31 
 origaudio               |      0.537514 | 14559105 |         7 |  1066 |    0.704167 |         0.275 |
                    |             | 42 
 origprogrammer          |      0.537514 | 14559105 |         8 |  1066 |    0.704167 |     0.0416667 | 4
                    |             | 4 
 formaldesigner          |      0.883556 | 14559105 |         9 |  1066 |    0.933333 |     0.0208333 | 83
                    |             | 84 
 formalwriter            |      0.883556 | 14559105 |        10 |  1066 |    0.933333 |         0.025 | 116
                    |             | 55 
 formalartist            |      0.883556 | 14559105 |        11 |  1066 |    0.933333 |    0.00833333 | 105
                    |             | 98 
 formalvideo             |      0.875035 | 14559105 |        12 |  1066 |    0.933333 |        0.0625 |
                    |             | 31 
 formalaudio             |      0.883556 | 14559105 |        13 |  1066 |    0.933333 |        0.0375 | 42
                    |             | 42 
 formalprogrammer        |      0.883556 | 14559105 |        14 |  1066 |    0.933333 |     0.0208333 | 124
                    |             | 136 
 status                  |      0.473181 | 14559105 |        15 |  1066 |           0 |      0.654167 | Predesign
                    | Design      | Production 
 artnotes                |      0.767625 | 14559105 |        16 |  1066 |      0.8625 |    0.00833333 | 10 art days
totalfor IP and Assess. | 10 Art Days | sdafasdfasdfasdf 
 productcode             |      0.002625 | 14559105 |        17 |  1066 |           0 |        0.0125 | m3nt22
                    | 0000        | t3nt15 
 artproduction           |      0.540139 | 14559105 |        18 |    58 |           0 |      0.641667 | t
                    | f           | t 
 audioproduction         |      0.540139 | 14559105 |        19 |    58 |           0 |      0.641667 | t
                    | f           | t 
 actclass                |      0.703181 | 14559105 |        20 |  1066 |    0.179167 |      0.820833 |
                    |             |  
 comments                |      0.356514 | 14559105 |        21 |  1066 |    0.554167 |      0.445833 |
                    |             |  
 priority                |     0.0208472 | 14559105 |        22 |    97 |           0 |     0.0791667 | 1
                    | 0           | 148 
 prodaudio               |      0.948181 | 14559105 |        23 |  1066 |   0.0291667 |      0.970833 |
                    |             |  
 prodartist              |      0.948181 | 14559105 |        24 |  1066 |   0.0291667 |      0.970833 |
                    |             |  
 proddesigner            |      0.948181 | 14559105 |        25 |  1066 |   0.0291667 |      0.970833 |
                    |             |  
 prodprogrammer          |      0.948181 | 14559105 |        26 |  1066 |   0.0291667 |      0.970833 |
                    |             |  
 prodvideo               |      0.948181 | 14559105 |        27 |  1066 |   0.0291667 |      0.970833 |
                    |             |  
 prodwriter              |      0.948181 | 14559105 |        28 |  1066 |   0.0291667 |      0.970833 |
                    |             |  
 prodaudioisgroup        |             1 | 14559105 |        29 |    58 |           0 |             1 | f
                    | f           | f 
 prodartistisgroup       |             1 | 14559105 |        30 |    58 |           0 |             1 | f
                    | f           | f 
 proddesignerisgroup     |             1 | 14559105 |        31 |    58 |           0 |             1 | f
                    | f           | f 
 prodprogrammerisgroup   |             1 | 14559105 |        32 |    58 |           0 |             1 | f
                    | f           | f 
 prodvideoisgroup        |             1 | 14559105 |        33 |    58 |           0 |             1 | f
                    | f           | f 
 prodwriterisgroup       |             1 | 14559105 |        34 |    58 |           0 |             1 | f
                    | f           | f 
 origdesignerisgroup     |             1 | 14559105 |        35 |    58 |           0 |             1 | f
                    | f           | f 
 origwriterisgroup       |             1 | 14559105 |        36 |    58 |           0 |             1 | f
                    | f           | f 
 origartistisgroup       |             1 | 14559105 |        37 |    58 |           0 |             1 | f
                    | f           | f 
 origaudioisgroup        |             1 | 14559105 |        38 |    58 |           0 |             1 | f
                    | f           | f 
 origprogrammerisgroup   |             1 | 14559105 |        39 |    58 |           0 |             1 | f
                    | f           | f 
 formaldesignerisgroup   |             1 | 14559105 |        40 |    58 |           0 |             1 | f
                    | f           | f 
 origvideoisgroup        |             1 | 14559105 |        41 |    58 |           0 |             1 | f
                    | f           | f 
 formalwriterisgroup     |             1 | 14559105 |        42 |    58 |           0 |             1 | f
                    | f           | f 
 formalartistisgroup     |             1 | 14559105 |        43 |    58 |           0 |             1 | f
                    | f           | f 
 formalvideoisgroup      |             1 | 14559105 |        44 |    58 |           0 |             1 | f
                    | f           | f 
 formalaudioisgroup      |             1 | 14559105 |        45 |    58 |           0 |             1 | f
                    | f           | f 
 formalprogrammerisgroup |             1 | 14559105 |        46 |    58 |           0 |             1 | f
                    | f           | f 
 objectid                |            -1 | 14559105 |        47 |   412 |           0 |    0.00416667 | 143264
                    | 1668        | 193932 
 objectversion           |      0.152514 | 14559105 |        48 |    97 |           0 |      0.329167 | 3
                    | 0           | 18 
(48 rows)


Re: won't drop the view

From
Tom Lane
Date:
Oleg Lebedev <olebedev@waterford.org> writes:
> select attname, attdispersion, s.* 
> from pg_statistic s, pg_attribute a, pg_class c 
> where starelid = c.oid and attrelid=c.oid and staattnum=attnum and relname='activity';

>          attname         | attdispersion | starelid | staattnum | staop | stanullfrac | stacommonfrac |
stacommonval            |  staloval   |           stahival           
 
>
-------------------------+---------------+----------+-----------+-------+-------------+---------------+--------------------------------------+-------------+------------------------------
>  productcode             |      0.002625 | 14559105 |        17 |  1066 |           0 |        0.0125 | m3nt22
                      | 0000        | t3nt15
 

Well, I can see that 7.1 wouldn't be likely to make a good estimate
about the selectivity of productcode ~ '^m3' on this table; it doesn't
take stacommonval into account (and that's only 1% of the table anyway),
and the range '0000' .. 't3nt15' is too wide to make it plausible that
m3's take up 98% of the table.  I believe 7.2 will do better though.
Care to try out your database with a beta version?
        regards, tom lane


Momjian "Support Functions" section: possible typo and question

From
Terrence Brannon
Date:
I am reading the Momjian "Support Functions" section and have the 
following comments:

1 - there is a comma after "BOTH" that is not before "LEADING" or 
"TRAILING", so I believe it is an error.

     trim(BOTH...)    trim(BOTH, col)    same as trim()
     trim(LEADING...)    trim(LEADING col)    col with leading spaces 
removed
     trim(TRAILING...)    trim(TRAILING col)    col with trailing 
spaces removed

2 - I typed \h trim and no help came up. Is it required to resort 
to HTML documentation for docs on this function



Re: Momjian "Support Functions" section: possible typo and question

From
Bruce Momjian
Date:
Terrence Brannon wrote:
> I am reading the Momjian "Support Functions" section and have the 
> following comments:
> 
> 1 - there is a comma after "BOTH" that is not before "LEADING" or 
> "TRAILING", so I believe it is an error.
> 
> ?    trim(BOTH...)    trim(BOTH, col)    same as trim()
> ?    trim(LEADING...)    trim(LEADING col)    col with leading spaces 
> removed
> ?    trim(TRAILING...)    trim(TRAILING col)    col with trailing 
> spaces removed

Thanks.  The BOTH, is wrong, it should be just BOTH.  I have updated the
book web site with this error.

> 2 - I typed \h trim and no help came up. Is it required to resort 
> to HTML documentation for docs on this function

TRIM is an ANSI word that masks several functions like rtrim, ltrim, and
btrim.  \df rtrim shows these.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026