7.4 performance issue - Mailing list pgsql-general

From Christophe Musielak
Subject 7.4 performance issue
Date
Msg-id 1087376870.1861.60.camel@fangorn
Whole thread Raw
In response to Re: how does license work for non-profit companies?  (Andrew Kelly <akelly@transparency.org>)
Responses Re: 7.4 performance issue  (Richard Huxton <dev@archonet.com>)
Re: 7.4 performance issue  (Bill Moran <wmoran@potentialtech.com>)
Re: 7.4 performance issue  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi,

We're working since 4 years on Postgres for a e-crm web based
application that deals with emails.

As our customers bases keep growing (more than one million rows in
sereval tables) we started to face performance issue with 7.2 and
decided to upgrade to 7.4 since one month.

The upgrade process was ok.

But we still face performances problems.

We decided to import the database into SQLServer to make some benchmarks
between the two with same hardware and same data.

Please find included main query we're testing on, the explain plan and
indexes list.

Here is the result :

Total running time Postges: 10 - 11s
Total running time sql Server : < 1s

We must admit that we're quite disappointed with the result and
currently think there is something wrong in the way we're working with
Postgres.

What do you think? is there some wrong with the configuration? with the
way we're building the query?

Thanks a lot for your answers.

Christophe Musielak
Christine Bruzaud
Akio Software




PS : Concerning the seq_scan shown in the explain plan, as there is
indexes on the tables as show below, we think Postgres is choosing
seq_scan versus index_scan to improve performance. We tried to force
using index_scan with 'set enable_seqscan = false) but there is no
improvement in the total duration of the query.


------------------------------------------------------------------------

We're running our tests on :

Table threads rows number   : 125 000
Table emails rows number        : 650 000
Table operators rows number  : 50
Total nb rows returned by the select  : 116 000 (without the LIMIT)
explain SELECT t.*, substring(t.subject::text, 0, 30) AS subject,
e.email,
o.lastname AS "operator"
FROM threads t JOIN emails e ON (e.id = t.client_email_id)
JOIN operators o ON (o.id = t.operator_id)
WHERE t.mailbox_id IN ( 2,3,5,20,21,13,22,23,24,25,26,19 )
and t.desktop = 2
ORDER BY t.date_last asc, t.id asc
 LIMIT 16 OFFSET 0;

QUERY PLAN
------------------------------------------------------------------

Limit  (cost=93539.02..93539.06 rows=16 width=619)
   ->  Sort  (cost=93539.02..93745.69 rows=82669 width=619)
         Sort Key: t.date_last, t.id
         ->  Hash Join  (cost=26186.03..73789.00 rows=82669 width=619)
               Hash Cond: ("outer".operator_id = "inner".id)
               ->  Merge Join  (cost=26183.07..69487.26 rows=82668
width=609)
                     Merge Cond: ("outer".id = "inner".client_email_id)
                     ->  Index Scan using emails_pkey on emails e
(cost=0.00..37782.57 rows=654511 width=30)
                     ->  Sort  (cost=26183.07..26389.74 rows=82667
width=583)
                           Sort Key: t.client_email_id
                           ->  Seq Scan on threads t
(cost=0.00..19431.23
rows=82667 width=583)
                                 Filter: (((mailbox_id = 2) OR
(mailbox_id
= 3) OR (mailbox_id = 5) OR (mailbox_id = 20) OR (mailbox_id = 21) OR
(mailbox_id = 13) OR                (mailbox_id = 22) OR (mailbox_id =
23)
OR (mailbox_id = 24) OR (mailbox_id = 25) OR (mailbox_id = 26)
            OR (mailbox_id = 19)) AND (desktop = 2))
               ->  Hash  (cost=2.85..2.85 rows=44 width=14)
                     ->  Seq Scan on operators o  (cost=0.00..2.85
rows=44
width=14)


------------------------------------------------------------------

Indexes on threads :

threads_pkey PRIMARY KEY (id)
CREATE INDEX threads_date_last_idx   ON threads  (date_last);
CREATE INDEX threads_desktop_idx  ON threads  (desktop);
CREATE INDEX threads_operator_id_idx  ON threads  (operator_id);
CREATE UNIQUE INDEX threads_pri_dlast_id_idx ON threads (priority,
date_last, id);

Indexes on emails :

emails_pkey PRIMARY KEY (id)

Indexes on operators :

operators_pkey PRIMARY KEY (id)

-----------------------------------------------------------------
postgresql.conf modified parameters :

shared_buffers = 14000 # = 112 Mo with 10 Mo used by wal_buffers
sort_mem = 51200       # = 50 Mo
vacuum_mem = 102400    # = 100 Mo
# Recommended : 64 MB for 1 - 2 Go RAM
wal_buffers = 1280     # = 10240 Ko = 10 Mo
effective_cache_size = 65536  # Choice : 50% of RAM
# <=> 1 * 1024 * 1024 * .50 / 8 = 65536
random_page_cost = 2     # make the planner favor indexscans
cpu_tuple_cost = 0.042   # got the planner to choose the index

stats_command_string = true
stats_block_level = true
stats_row_level = true

datestyle = 'iso, dmy'
------------------------------------------------------------------







pgsql-general by date:

Previous
From: Frank van Vugt
Date:
Subject: Re: Why is a union of two null-results automatically casted to type text ?
Next
From: Brendan Jurd
Date:
Subject: Re: Can you help me with this query?