Thread: Massive performance differences

Massive performance differences

From
Andreas Hartmann
Date:
Hi all,

I'm running the same database on two systems:

A) Debian PostgreSQL 7.4.7
B) SuSE   PostgreSQL 7.3.4

Both machines have approx. 1GHz and 1GB RAM.
The amount of data is almost equal (+- 10%). But I'm facing
huge performance differences. For instance, a simple sequential
scan results in the following query plans:


explain analyze select * from veranstaltung_original order by semester;

A) ------------------------------------------------------------

  Sort  (cost=734.74..747.77 rows=5210 width=232) (actual time=89.935..92.730
rows=5210 loops=1)
    Sort Key: semester
    ->  Seq Scan on veranstaltung_original  (cost=0.00..413.10 rows=5210
width=232) (actual time=0.011..7.852 rows=5210 loops=1)
  Total runtime: 96.900 ms


B) ------------------------------------------------------------

  Sort  (cost=3054.08..3067.74 rows=5467 width=223) (actual
time=2568.10..2573.02 rows=5467 loops=1)
    Sort Key: semester
    ->  Seq Scan on veranstaltung_original  (cost=0.00..2714.67 rows=5467
width=223) (actual time=1936.68..2506.83 rows=5467 loops=1)
  Total runtime: 2579.08 msec

------------------------------------------------------------

Could this be due to the different PostgreSQL versions?
Is there a typical cause for such performance problems?
How can I find out what's wrong with installation (B)?

Thanks in advance!

-- Andreas


Re: Massive performance differences

From
Andreas Hartmann
Date:
Andreas Hartmann wrote:
> Hi all,
>
> I'm running the same database on two systems:
>
> A) Debian PostgreSQL 7.4.7
> B) SuSE   PostgreSQL 7.3.4
>
> Both machines have approx. 1GHz and 1GB RAM.
> The amount of data is almost equal (+- 10%). But I'm facing
> huge performance differences. For instance, a simple sequential
> scan results in the following query plans:

[...]

I just imported a dump of (B) into (A) to be sure to have the
same data set. When I executed the query on (A) the first time, it was
as slow as on (B). But the next time it was significantly faster:

vvz_dev=> explain analyze select * from veranstaltung_original order by semester;
                                                             QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------------
  Sort  (cost=587.08..600.74 rows=5467 width=229) (actual
time=3188.975..3192.020 rows=5467 loops=1)
    Sort Key: semester
    ->  Seq Scan on veranstaltung_original  (cost=0.00..247.67 rows=5467
width=229) (actual time=2.330..1587.832 rows=5467 loops=1)
  Total runtime: 3205.281 ms
(4 Zeilen)

vvz_dev=> explain analyze select * from veranstaltung_original order by semester;
                                                            QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------
  Sort  (cost=587.08..600.74 rows=5467 width=229) (actual time=95.061..98.143
rows=5467 loops=1)
    Sort Key: semester
    ->  Seq Scan on veranstaltung_original  (cost=0.00..247.67 rows=5467
width=229) (actual time=0.015..7.749 rows=5467 loops=1)
  Total runtime: 102.948 ms
(4 Zeilen)


This leads to the idea that it might be a caching problem.
How can I confirm this?

-- Andreas

Re: Massive performance differences

From
Scott Marlowe
Date:
On Tue, 2005-03-15 at 11:10, Andreas Hartmann wrote:
> Hi all,
>
> I'm running the same database on two systems:
>
> A) Debian PostgreSQL 7.4.7
> B) SuSE   PostgreSQL 7.3.4
>
> Both machines have approx. 1GHz and 1GB RAM.
> The amount of data is almost equal (+- 10%). But I'm facing
> huge performance differences. For instance, a simple sequential
> scan results in the following query plans:
>
>
> explain analyze select * from veranstaltung_original order by semester;
>
> A) ------------------------------------------------------------
>
>   Sort  (cost=734.74..747.77 rows=5210 width=232) (actual time=89.935..92.730
> rows=5210 loops=1)
>     Sort Key: semester
>     ->  Seq Scan on veranstaltung_original  (cost=0.00..413.10 rows=5210
> width=232) (actual time=0.011..7.852 rows=5210 loops=1)
>   Total runtime: 96.900 ms
>
>
> B) ------------------------------------------------------------
>
>   Sort  (cost=3054.08..3067.74 rows=5467 width=223) (actual
> time=2568.10..2573.02 rows=5467 loops=1)
>     Sort Key: semester
>     ->  Seq Scan on veranstaltung_original  (cost=0.00..2714.67 rows=5467
> width=223) (actual time=1936.68..2506.83 rows=5467 loops=1)
>   Total runtime: 2579.08 msec
>
> ------------------------------------------------------------
>
> Could this be due to the different PostgreSQL versions?
> Is there a typical cause for such performance problems?
> How can I find out what's wrong with installation (B)?

I would expect your I/O subsystem and or kernel revision are more likely
the cause of differences here than the pg version.  Note that the time
spent is in the seq scan, not the sort.  Otherwise I'd worry about what
version of sort was being used.

Re: Massive performance differences

From
Ragnar Hafstað
Date:
On Tue, 2005-03-15 at 18:10 +0100, Andreas Hartmann wrote:

> explain analyze select * from veranstaltung_original order by semester;
>
>   Sort  (cost=3054.08..3067.74 rows=5467 width=223) (actual
> time=2568.10..2573.02 rows=5467 loops=1)
>     Sort Key: semester
>     ->  Seq Scan on veranstaltung_original  (cost=0.00..2714.67 rows=5467
> width=223) (actual time=1936.68..2506.83 rows=5467 loops=1)
                          ^^^^^^^
isn't this value (1936.68) suspiscious for a seq scan ?
can a lot of dead tuples cause this?
maybe VACUUM FULL ANALYSE time ?

gnari



Re: Massive performance differences

From
Scott Marlowe
Date:
On Tue, 2005-03-15 at 12:18, Ragnar Hafstað wrote:
> On Tue, 2005-03-15 at 18:10 +0100, Andreas Hartmann wrote:
>
> > explain analyze select * from veranstaltung_original order by semester;
> >
> >   Sort  (cost=3054.08..3067.74 rows=5467 width=223) (actual
> > time=2568.10..2573.02 rows=5467 loops=1)
> >     Sort Key: semester
> >     ->  Seq Scan on veranstaltung_original  (cost=0.00..2714.67 rows=5467
> > width=223) (actual time=1936.68..2506.83 rows=5467 loops=1)
>                           ^^^^^^^
> isn't this value (1936.68) suspiscious for a seq scan ?
> can a lot of dead tuples cause this?
> maybe VACUUM FULL ANALYSE time ?

It's not unreasonable for the first run when the machine has to hit the
hard drives, but if it's that slow on subsequent reads, then there's
likely some problem.