Re: General performance/load issue - Mailing list pgsql-general

From Tomas Vondra
Subject Re: General performance/load issue
Date
Msg-id ffa9ad34a8bd6ea2c7869e3e305d458a.squirrel@sq.gransy.com
Whole thread Raw
In response to Re: General performance/load issue  (Gaëtan Allart <gaetan@nexylan.com>)
Responses Re: General performance/load issue  (Gaëtan Allart <gaetan@nexylan.com>)
List pgsql-general
On 26 Listopad 2011, 10:45, Gaëtan Allart wrote:
> A better view of iotop :
>
>   TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
> 31875 be/4 postgres    0.00 B/s   15.23 M/s  0.00 %  0.00 % postgres:
> database database 46.105.104.205(50228) SELECT
> 30985 be/4 postgres    0.00 B/s   10.55 M/s  0.00 %  0.00 % postgres:
> database database 46.105.104.205(47672) SELECT
>
> As you can see, SELECTS are writing very much on the disk.
>
> At this moment, pg_stat_activity show many many queries running at the
> same time.
> It looks like that all queries are suddenly writing on disk and not a
> particular one, making me think of a buffer issue or something.

No, I don't think this is an issue with the size of shared buffers. That'd
influence the pg_stat_bgwriter - the buffers_backend would grow much
faster, and it's not the case.

So I'm guessing it's either hint bits or sorting. If I had to, I'd
probably guess about the hint bits - see for example this thread

http://archives.postgresql.org/pgsql-performance/2008-05/msg00182.php

it more or less behaves like what you described so far.

> This is a example of 10 MB/s writing query :
>
> SELECT COUNT(*) FROM (SELECT DISTINCT "table"."id" AS "id",
> "table"."flux_id" AS "flux_id", "table"."locale_id" AS "locale_id",
> "table"."url_article" AS "url_article", "table"."original_url" AS
> "original_url", "table"."name" AS "name", "table"."description" AS
> "description", "table"."content" AS "content", "table"."permis" AS
> "permis", "table"."reviewed" AS "reviewed", "table"."author_id" AS
> "author_id", "table"."poster_id" AS "poster_id", "table"."post_date" AS
> "post_date", "table"."edit_date" AS "edit_date", "table"."add_date" AS
> "add_date", "table"."comments_open" AS "comments_open", "table"."site_id"
> AS "site_id", "table"."is_local" AS "is_local", "table"."status" AS
> "status", "table"."visits" AS "visits", "table"."votes" AS "votes",
> "table"."score" AS "score", "arti

Post EXPLAIN ANALYZE output of this query (use explain.depesz.com to post
it).

> Checkpoints logs still show very long write times :
>
> LOG:  checkpoint complete: wrote 92 buffers (0.0%); 0 transaction log
> file(s) added, 0 removed, 1 recycled; write=49.622 s, sync=6.510 s,
> total=63.625 s

No, that's fine - that's what spread checkpoints do. Once the
checkpoint_timeout expires, the system decides to perform a checkpoint,
i.e. it has to write all dirty (modified) shared buffers to the disk. But
it knows the next checkpoint will happen in checkpoint_timeout, so it has
about 5 minutes to write all the data.

So it says something like 'I need to write 540MB want to write that in 270
seconds (5 minutes * completion_target), so I'll write at 2MB/s'. So the
write phase is expected to take long. But it has to sync the data at the
end, and that's where the problems usually happen - so the important thing
is 'sync' and that improved significantly. 6 seconds is not great but it's
not a big issue I guess.

> Couldn't this be a hardware issue ?

I don't think so. The problem is that the queries write a lot of data, and
the SSD can't help in this case. The SSDs provide much better random
performance (compared to spinners), but the sequential performance is not
much better. So pushing 70MB/s to the SSD may be fully utilized.

Have you tested the basic performance (like bonnie++, dd etc) before going
into production? There are some common SSD-related mistakes (e.g. not
aligning the partitions properly) that may easily cause 30% performance
loss. But this won't fix the problem, although you could get a bit better
performance.

Tomas


pgsql-general by date:

Previous
From: Gaëtan Allart
Date:
Subject: Re: General performance/load issue
Next
From: Robert Treat
Date:
Subject: Re: General performance/load issue