Home > mailing lists

Re: 7.4 performance issue - Mailing list pgsql-general

From	Bill Moran
Subject	Re: 7.4 performance issue
Date	June 16, 2004 08:09:10
Msg-id	20040616070854.57250ddb.wmoran@potentialtech.com Whole thread Raw
In response to	7.4 performance issue (Christophe Musielak <cmusielak@akio-software.com>)
List	pgsql-general

Tree view

Just to cover the ovbious solution ...

How often do you run "vacuum analyze" on this database?  If the data has been
around for a long time, have you considered using CLUSTER to physically reorder
the tables?

Christophe Musielak <cmusielak@akio-software.com> wrote:
> Hi,
>
> We're working since 4 years on Postgres for a e-crm web based
> application that deals with emails.
>
> As our customers bases keep growing (more than one million rows in
> sereval tables) we started to face performance issue with 7.2 and
> decided to upgrade to 7.4 since one month.
>
> The upgrade process was ok.
>
> But we still face performances problems.
>
> We decided to import the database into SQLServer to make some benchmarks
> between the two with same hardware and same data.
>
> Please find included main query we're testing on, the explain plan and
> indexes list.
>
> Here is the result :
>
> Total running time Postges: 10 - 11s
> Total running time sql Server : < 1s
>
> We must admit that we're quite disappointed with the result and
> currently think there is something wrong in the way we're working with
> Postgres.
>
> What do you think? is there some wrong with the configuration? with the
> way we're building the query?
>
> Thanks a lot for your answers.
>
> Christophe Musielak
> Christine Bruzaud
> Akio Software
>
>
>
>
> PS : Concerning the seq_scan shown in the explain plan, as there is
> indexes on the tables as show below, we think Postgres is choosing
> seq_scan versus index_scan to improve performance. We tried to force
> using index_scan with 'set enable_seqscan = false) but there is no
> improvement in the total duration of the query.
>
>
> ------------------------------------------------------------------------
>
> We're running our tests on :
>
> Table threads rows number   : 125 000
> Table emails rows number        : 650 000
> Table operators rows number  : 50
> Total nb rows returned by the select  : 116 000 (without the LIMIT)
> explain SELECT t.*, substring(t.subject::text, 0, 30) AS subject,
> e.email,
> o.lastname AS "operator"
> FROM threads t JOIN emails e ON (e.id = t.client_email_id)
> JOIN operators o ON (o.id = t.operator_id)
> WHERE t.mailbox_id IN ( 2,3,5,20,21,13,22,23,24,25,26,19 )
> and t.desktop = 2
> ORDER BY t.date_last asc, t.id asc
>  LIMIT 16 OFFSET 0;
>
> QUERY PLAN
> ------------------------------------------------------------------
>
> Limit  (cost=93539.02..93539.06 rows=16 width=619)
>    ->  Sort  (cost=93539.02..93745.69 rows=82669 width=619)
>          Sort Key: t.date_last, t.id
>          ->  Hash Join  (cost=26186.03..73789.00 rows=82669 width=619)
>                Hash Cond: ("outer".operator_id = "inner".id)
>                ->  Merge Join  (cost=26183.07..69487.26 rows=82668
> width=609)
>                      Merge Cond: ("outer".id = "inner".client_email_id)
>                      ->  Index Scan using emails_pkey on emails e
> (cost=0.00..37782.57 rows=654511 width=30)
>                      ->  Sort  (cost=26183.07..26389.74 rows=82667
> width=583)
>                            Sort Key: t.client_email_id
>                            ->  Seq Scan on threads t
> (cost=0.00..19431.23
> rows=82667 width=583)
>                                  Filter: (((mailbox_id = 2) OR
> (mailbox_id
> = 3) OR (mailbox_id = 5) OR (mailbox_id = 20) OR (mailbox_id = 21) OR
> (mailbox_id = 13) OR                (mailbox_id = 22) OR (mailbox_id =
> 23)
> OR (mailbox_id = 24) OR (mailbox_id = 25) OR (mailbox_id = 26)
>             OR (mailbox_id = 19)) AND (desktop = 2))
>                ->  Hash  (cost=2.85..2.85 rows=44 width=14)
>                      ->  Seq Scan on operators o  (cost=0.00..2.85
> rows=44
> width=14)
>
>
> ------------------------------------------------------------------
>
> Indexes on threads :
>
> threads_pkey PRIMARY KEY (id)
> CREATE INDEX threads_date_last_idx   ON threads  (date_last);
> CREATE INDEX threads_desktop_idx  ON threads  (desktop);
> CREATE INDEX threads_operator_id_idx  ON threads  (operator_id);
> CREATE UNIQUE INDEX threads_pri_dlast_id_idx ON threads (priority,
> date_last, id);
>
> Indexes on emails :
>
> emails_pkey PRIMARY KEY (id)
>
> Indexes on operators :
>
> operators_pkey PRIMARY KEY (id)
>
> -----------------------------------------------------------------
> postgresql.conf modified parameters :
>
> shared_buffers = 14000 # = 112 Mo with 10 Mo used by wal_buffers
> sort_mem = 51200       # = 50 Mo
> vacuum_mem = 102400    # = 100 Mo
> # Recommended : 64 MB for 1 - 2 Go RAM
> wal_buffers = 1280     # = 10240 Ko = 10 Mo
> effective_cache_size = 65536  # Choice : 50% of RAM
> # <=> 1 * 1024 * 1024 * .50 / 8 = 65536
> random_page_cost = 2     # make the planner favor indexscans
> cpu_tuple_cost = 0.042   # got the planner to choose the index
>
> stats_command_string = true
> stats_block_level = true
> stats_row_level = true
>
> datestyle = 'iso, dmy'
> ------------------------------------------------------------------
>
>
>
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend


--
Bill Moran
Potential Technologies
http://www.potentialtech.com

pgsql-general by date:

From: Richard Huxton
Date: 16 June 2004, 07:24:29
Subject: Re: 7.4 performance issue

From: Jernej Kos
Date: 16 June 2004, 10:39:45
Subject: Re: Multicolumn indexes and ORDER BY

Re: 7.4 performance issue - Mailing list pgsql-general

Previous

Next