Thread: Performance issue 6.5 versus 7.0

Performance issue 6.5 versus 7.0

From
Herbert Liechti
Date:
I made an upgrade these days to 7.0 for one machine. I noticed
a better performance in general. Congratulation.

I have one problem with a query which is joining 3 tables and which
is returning one record. In 6.5.3 the results came up immediately.
In 7.0 the same query takes about 5 seconds. Here the explain
results  (Same database with the same amount of records and
same indexes, vacuum was executed before)

Query under 6.5.3:
------------------
EXPLAIN
SELECT MailingListe_Id, MailingStatus, AbschlussDatum, Anschrift.Anzeige,
       Anschrift.Postfach, Anschrift.Strasse, Anschrift.Plz, Anschrift.Ort,
       Anschrift.TelDirekt, Person.Person_Id, Person.Name, Person.Vorname,
       Person.Anzeige
  FROM Anschrift, Person, MailingListe
 WHERE Person.Person_Id = MailingListe.Person_Id
   AND Anschrift.Anschrift_Id = MailingListe.Anschrift_Id
   AND MailingListe_Id = 2104;

 NOTICE:  QUERY PLAN:
Nested Loop  (cost=53.30 rows=23 width=513)
  ->  Nested Loop  (cost=6.15 rows=23 width=232)
        ->  Index Scan using mailingliste_id_ix on mailingliste  (cost=2.05 rows=2 width=36)
        ->  Index Scan using person_id_ix on person  (cost=2.05 rows=24051 width=196)
  ->  Index Scan using anschrift_id_ix on anschrift  (cost=2.05 rows=13661 width=281)

result is returned immediately on a Pentium II

Query under 7.0
---------------
Same query:
NOTICE:  QUERY PLAN:
Merge Join  (cost=5991.20..6395.18 rows=1988449 width=136)
  ->  Sort  (cost=2984.98..2984.98 rows=8268 width=96)
        ->  Merge Join  (cost=2037.87..2209.39 rows=8268 width=96)
              ->  Sort  (cost=1993.14..1993.14 rows=13661 width=76)
                    ->  Seq Scan on anschrift  (cost=0.00..728.61 rows=13661 width=76)
              ->  Sort  (cost=44.73..44.73 rows=61 width=20)
                    ->  Index Scan using mailingliste_id_ix on mailingliste  (cost=0.00..42.94 rows=61 width=20)
  ->  Sort  (cost=3006.22..3006.22 rows=24051 width=40)
        ->  Seq Scan on person  (cost=0.00..896.51 rows=24051 width=40)

Select takes about 5 seconds on Pentium III/550 Mhz

When I do a select * From those tables I get the same query plan
under 6.5.3 and a different query plan und 7.0;

NOTICE:  QUERY PLAN (7.0)
Merge Join  (cost=4201.39..8326.56 rows=1988449 width=513)
  ->  Index Scan using person_id_ix on person  (cost=0.00..3721.19 rows=24051 width=196)
  ->  Sort  (cost=4201.39..4201.39 rows=8268 width=317)
        ->  Merge Join  (cost=44.73..2941.79 rows=8268 width=317)
              ->  Index Scan using anschrift_id_ix on anschrift  (cost=0.00..2725.55 rows=13661 width=281)
              ->  Sort  (cost=44.73..44.73 rows=61 width=36)
                    ->  Index Scan using mailingliste_id_ix on mailingliste  (cost=0.00..42.94 rows=61 width=36)



Is there a way to force 7.0 using the defined indexes? Thanks
for your help.

Greetings Herbie
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Herbert Liechti                     E-Mail: Herbert.Liechti@thinx.ch
ThinX networked business services        Stahlrain 10, CH-5200 Brugg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Re: Performance issue 6.5 versus 7.0

From
Alfred Perlstein
Date:
* Herbert Liechti <Herbert.Liechti@thinx.ch> [000526 13:45] wrote:
>
> Is there a way to force 7.0 using the defined indexes? Thanks
> for your help.

set enable_seqscan = off;

-Alfred

Re: Performance issue 6.5 versus 7.0

From
Herbert Liechti
Date:
Bryan White wrote:
>
> > Is there a way to force 7.0 using the defined indexes? Thanks
> > for your help.
>
> Try set enable_seqscan=off before issuing your query.

A little bit better (3 sec instead of 5 seconds). But fare away
from the performance of 6.5.3. The explain now looks like this:

EXPLAIN
SELECT MailingListe_Id, MailingStatus, AbschlussDatum, Anschrift.Anzeige,
       Anschrift.Postfach, Anschrift.Strasse, Anschrift.Plz, Anschrift.Ort,
       Anschrift.TelDirekt, Person.Person_Id, Person.Name, Person.Vorname,
       Person.Anzeige
  FROM Anschrift, Person, MailingListe
 WHERE Person.Person_Id = MailingListe.Person_Id
   AND Anschrift.Anschrift_Id = MailingListe.Anschrift_Id
   AND MailingListe_Id = 2104;

NOTICE:  QUERY PLAN:
Merge Join  (cost=3717.38..7842.55 rows=1988449 width=136)
  ->  Sort  (cost=3717.38..3717.38 rows=8268 width=96)
        ->  Merge Join  (cost=44.73..2941.79 rows=8268 width=96)
              ->  Index Scan using anschrift_id_ix on anschrift  (cost=0.00..2725.55 rows=13661 width=76)
              ->  Sort  (cost=44.73..44.73 rows=61 width=20)
                    ->  Index Scan using mailingliste_id_ix on mailingliste  (cost=0.00..42.94 rows=61 width=20)
  ->  Index Scan using person_id_ix on person  (cost=0.00..3721.19 rows=24051 width=40)


An other advice?
Herbie
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Herbert Liechti                     E-Mail: Herbert.Liechti@thinx.ch
ThinX networked business services        Stahlrain 10, CH-5200 Brugg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Re: Performance issue 6.5 versus 7.0

From
Alfred Perlstein
Date:
* Herbert Liechti <Herbert.Liechti@thinx.ch> [000526 14:32] wrote:
> Bryan White wrote:
> >
> > > Is there a way to force 7.0 using the defined indexes? Thanks
> > > for your help.
> >
> > Try set enable_seqscan=off before issuing your query.
>
> A little bit better (3 sec instead of 5 seconds). But fare away
> from the performance of 6.5.3. The explain now looks like this:
>
> EXPLAIN
> SELECT MailingListe_Id, MailingStatus, AbschlussDatum, Anschrift.Anzeige,
>        Anschrift.Postfach, Anschrift.Strasse, Anschrift.Plz, Anschrift.Ort,
>        Anschrift.TelDirekt, Person.Person_Id, Person.Name, Person.Vorname,
>        Person.Anzeige
>   FROM Anschrift, Person, MailingListe
>  WHERE Person.Person_Id = MailingListe.Person_Id
>    AND Anschrift.Anschrift_Id = MailingListe.Anschrift_Id
>    AND MailingListe_Id = 2104;
>
> NOTICE:  QUERY PLAN:
> Merge Join  (cost=3717.38..7842.55 rows=1988449 width=136)
>   ->  Sort  (cost=3717.38..3717.38 rows=8268 width=96)
>         ->  Merge Join  (cost=44.73..2941.79 rows=8268 width=96)
>               ->  Index Scan using anschrift_id_ix on anschrift  (cost=0.00..2725.55 rows=13661 width=76)
>               ->  Sort  (cost=44.73..44.73 rows=61 width=20)
>                     ->  Index Scan using mailingliste_id_ix on mailingliste  (cost=0.00..42.94 rows=61 width=20)
>   ->  Index Scan using person_id_ix on person  (cost=0.00..3721.19 rows=24051 width=40)
>
>
> An other advice?

Not really, try a "vacuum analyze tablename;" but otherwise i've hit
upon this regression in 7.0 as well :(

-Alfred

Re: Performance issue 6.5 versus 7.0

From
Herbert Liechti
Date:
Alfred Perlstein wrote:
>
> > Bryan White wrote:
> > >
> > > > Is there a way to force 7.0 using the defined indexes? Thanks
> > > > for your help.
> > >
> > > Try set enable_seqscan=off before issuing your query.
> >
> > A little bit better (3 sec instead of 5 seconds). But fare away
> > from the performance of 6.5.3. The explain now looks like this:
> >

> > An other advice?
>
> Not really, try a "vacuum analyze tablename;" but otherwise i've hit
> upon this regression in 7.0 as well :(

Magic. Vacuum analyze tablename; made the things running with the
expected performance. I didn't knew that feature of vacuum. Thanks

Herbie

Re: Performance issue 6.5 versus 7.0

From
Tom Lane
Date:
Herbert Liechti <Herbert.Liechti@thinx.ch> writes:
> I have one problem with a query which is joining 3 tables and which
> is returning one record. In 6.5.3 the results came up immediately.
> In 7.0 the same query takes about 5 seconds. Here the explain
> results  (Same database with the same amount of records and
> same indexes, vacuum was executed before)

Um, did you do a VACUUM ANALYZE, or just a VACUUM?

It looks to me like the major problem here is that 7.0 is estimating
that the query will return a lot of rows, and it's therefore producing
a heavy-duty plan that would do well with a large number of matching
rows.  But I'm guessing that in reality, Person.Person_Id and
Anschrift.Anschrift_Id are unique columns, so the actual number of rows
out should be just the same as the number of MailingListe rows selected
by "MailingListe_Id = 2104", not so?  And that number is probably only
a few dozen?  For a small number of rows, 6.5.3's simple nested-loop-
with-inner-indexscan plan is probably just the right thing.  The trick
is to get the planner to realize that it's only looking for a small
number of rows...

It would help to know how large each of these tables are, and how many
rows you expect to get from the query.  Also, if you have done a VACUUM
ANALYZE, I would like to see the stats produced by the analyze.  You can
get those with a query like

select attname,attdisbursion,s.*
from pg_statistic s, pg_attribute a, pg_class c
where starelid = c.oid and attrelid = c.oid and staattnum = attnum
and relname = 'anschrift';

and similarly for the other two tables.

            regards, tom lane