Perfomance difference between 7.2 and 7.3 - Mailing list pgsql-general

From Paulo Jan
Subject Perfomance difference between 7.2 and 7.3
Date
Msg-id 3FB25221.9040008@digital.ddnet.es
Whole thread Raw
Responses Re: Perfomance difference between 7.2 and 7.3
Re: Perfomance difference between 7.2 and 7.3
List pgsql-general
Hi all:

    I have here a table with the following schema:

                        Table "todocinetv"
    Column    |            Type             |      Modifiers
-------------+-----------------------------+----------------------
  id          | integer                     | not null default '0'
  datestamp   | timestamp without time zone | not null
  thread      | integer                     | not null default '0'
  parent      | integer                     | not null default '0'
  author      | character(37)               | not null default ''
  subject     | character(255)              | not null default ''
  email       | character(200)              | not null default ''
  attachment  | character(64)               | default ''
  host        | character(50)               | not null default ''
  email_reply | character(1)                | not null default 'N'
  approved    | character(1)                | not null default 'N'
  msgid       | character(100)              | not null default ''
  modifystamp | integer                     | not null default '0'
  userid      | integer                     | not null default '0'
Indexes: todocinetv_approved,
          todocinetv_author,
          todocinetv_datestamp,
          todocinetv_modifystamp,
          todocinetv_msgid,
          todocinetv_parent,
          todocinetv_subject,
          todocinetv_thread,
          todocinetv_userid,
          todocinetvpri_key


    (It's actually a table created by the discussion board application
Phorum (version 3.3)).
    This table has about 28000 rows, and is running with Postgres 7.2.3
under Red Hat 8.0, in a 2.4 Ghz. Pentiun 4 with 512 Mb. of RAM.
    The problem I'm having is that, when you access the main page of the
discussion board, it takes forever to show you the list of posts. The
query that Phorum uses for doing so is:


phorum=# explain
phorum-#  SELECT thread, modifystamp, count(id) AS tcount,
datetime(modifystamp) AS latest, max(id) as maxid FROM todocinetv WHERE
approved='Y' GROUP BY thread, modifystamp ORDER BY modifystamp desc,
thread desc limit 30;
NOTICE:  QUERY PLAN:

Limit  (cost=40354.79..40354.79 rows=30 width=12)
   ->  Sort  (cost=40354.79..40354.79 rows=2879 width=12)
         ->  Aggregate  (cost=39901.43..40189.35 rows=2879 width=12)
               ->  Group  (cost=39901.43..40045.39 rows=28792 width=12)
                     ->  Sort  (cost=39901.43..39901.43 rows=28792 width=12)
                           ->  Seq Scan on todocinetv
(cost=0.00..37768.90 rows=28792 width=12)


    This query takes up to 3 minutes to execute. I have tried to strip it
down and leaving it in its most vanilla form (without "count(id)" and
such), and it's still almost as slow:


phorum=# explain
phorum-# SELECT thread, modifystamp,  datetime(modifystamp) AS latest
from todocinetv WHERE approved='Y'  ORDER BY modifystamp desc, thread
desc limit 30;
NOTICE:  QUERY PLAN:

Limit  (cost=39901.43..39901.43 rows=30 width=8)
   ->  Sort  (cost=39901.43..39901.43 rows=28792 width=8)
         ->  Seq Scan on todocinetv  (cost=0.00..37768.90 rows=28792
width=8)


    But here is the weird thing: I dump the table, export it into another
machine running Postgres 7.3.2 (Celeron 1.7 Ghz, 512 Mb. of memory), and
the query takes only 2 or 3 seconds to execute, even though the query
plan is almost the same:


provphorum=# explain
provphorum-#  SELECT thread, modifystamp, count(id) AS tcount,
modifystamp AS latest, max(id) as maxid FROM todocinetv WHERE
approved='Y' GROUP BY thread, modifystamp ORDER BY modifystamp desc,
thread desc limit 30 ;
                                            QUERY PLAN

------------------------------------------------------------------------------------------------
  Limit  (cost=5765.92..5765.99 rows=30 width=12)
    ->  Sort  (cost=5765.92..5772.96 rows=2817 width=12)
          Sort Key: modifystamp, thread
          ->  Aggregate  (cost=5252.34..5604.49 rows=2817 width=12)
                ->  Group  (cost=5252.34..5463.63 rows=28172 width=12)
                      ->  Sort  (cost=5252.34..5322.77 rows=28172 width=12)
                            Sort Key: thread, modifystamp
                            ->  Seq Scan on todocinetv
(cost=0.00..3170.15 rows=28172 width=12)
                                  Filter: (approved = 'Y'::bpchar)
(9 rows)


    (I took out the "datetime" function, since 7.3 didn't accept it and I
didn't think it was relevant to the performance problem (am I wrong?))

    So my question is: what causes such a big difference? (3 min. vs. 3
seconds) Does the version difference (7.2 vs. 7.3) account for all of
it? Or should I start looking at other factors? As I said, both machines
are almost equivalent hardware-wise, and as for the number of shared
buffers, the faster machine actually has less of them (the 7.3 machine
has "shared_buffers = 768", while the 7.2 one has "shared_buffers = 1024").



                    Paulo Jan.
                    DDnet.



pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: multibyte support
Next
From: Tom Lane
Date:
Subject: Re: Perfomance difference between 7.2 and 7.3