Re: Slow queries on 9.3.1 despite use of index - Mailing list pgsql-performance

From Michael van Rooyen
Subject Re: Slow queries on 9.3.1 despite use of index
Date
Msg-id 535FB490.6020405@loot.co.za
In response to Re: Slow queries on 9.3.1 despite use of index  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
On 2014/04/28 07:52 PM, Jeff Janes wrote:
> On Mon, Apr 28, 2014 at 10:12 AM, Michael van Rooyen
> <michael@loot.co.za> wrote:
>
> It looks like something is causing your IO to seize up briefly.  It is
> common for the sync phase of the checkpoint to do that, but that would
> only explain 3 of the 4 reports above.
>
> Is this causing an actual problem for your users, or are you just
> trying to be proactive?
>
> You could change the kernel setting dirty_background_bytes to try to
> reduce this problem.
The problem is that this server runs background tasks very slowly
(about 10x slower than a similar server with the same DB but 3x more RAM).

I changed dirty_background_bytes to 16MB; previously,
dirty_background_ratio was 10%.  It made no real difference to DB
performance, but it seems a good change anyway.  Thanks for the tip.
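Roughly, on a Linux box that amounts to something like the following
(the value is 16M expressed in bytes, and the sysctl.conf line just
makes the setting persist across reboots):

    # set the dirty background limit to 16M, expressed in bytes
    sysctl -w vm.dirty_background_bytes=16777216
    # persist the setting across reboots
    echo 'vm.dirty_background_bytes = 16777216' >> /etc/sysctl.conf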
>
>
>     Overall the load on the server seems quite low, for example,
>     typical vmstat 1 output is:
>
>
>     procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
>      r  b   swpd   free   buff    cache   si   so    bi    bo    in    cs us sy id wa st
>      0  1    304  77740  11960 17709156    0    0    99    82     2     2  2  1 89  8  0
>      1  0    304  75228  11960 17711164    0    0  1256   635  1418  6498  0  0 94  6  0
>      0  1    304  73440  11968 17712036    0    0  1232   149  1253  6232  1  0 94  6  0
>      0  2    304  78472  11968 17706016    0    0  1760    89  1325  6361  1  0 94  5  0
>      0  1    304  75616  11968 17708480    0    0  2164    72  1371  6855  1  0 94  5  0
>      0  1    304  73292  11968 17710320    0    0  1760   112  1335  6673  1  0 94  5  0
>      0  2    304  77956  11964 17703528    0    0  1204  5614  1867  6712  0  0 94  6  0
>
>
> Is that typical for when the problem is not occurring, or typical for
> when it is occurring?  Without timestamps to correlate the vmstat
> output back to the log file, it is very hard to make use of this info.
> Some versions of vmstat have a -t flag.
>
It's fairly typical - and although the same underlying query will
sometimes complete faster or slower, the overall performance /
throughput is consistently (as opposed to sporadically) poor.
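Good point on the timestamps.  Timestamped samples can be captured with
something like the following (assuming a vmstat that supports -t), so
they can be lined up against the postgres log afterwards:

    # one sample per second, with a wall-clock timestamp column
    vmstat -t 1 | tee /tmp/vmstat.log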
>
>
>
>     I've tried to optimise postgresql.conf for performance:
>
>     max_connections = 1000                  # (change requires restart)
>
>
> 1000 is extremely high.  How many connections do you actually use at
> any one time?
>
>     shared_buffers = 2GB                    # min 128kB or
>     max_connections*16kB
>     work_mem = 100MB                                # min 64kB
>
>
> 100MB is also very high, at least in conjunction with the high
> max_connections.
Blush.  Thanks - I've reduced these to more reasonable values (200 /
10MB), but it didn't have any effect on performance.
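As a rough sanity check on the connection count (just a point-in-time
snapshot, not a peak measurement), something like:

    # number of backends currently connected
    psql -c "SELECT count(*) FROM pg_stat_activity;"

Sampling that every few minutes would give a better picture of the
actual peak before pinning down max_connections.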

>     maintenance_work_mem = 100MB            # min 1MB
>     synchronous_commit = off                # immediate fsync at commit
>     wal_buffers = 16MB                      # min 32kB
>     checkpoint_segments = 64                # in logfile segments, min
>     1, 16MB each
>     checkpoint_timeout = 10min              # range 30s-1h
>     effective_cache_size = 16GB
>     logging_collector = on                  # Enable capturing of
>     stderr and csvlog
>     log_directory = 'pg_log'                # directory where log
>     files are written,
>     log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name
>     pattern,
>     log_rotation_age = 1d                   # Automatic rotation of
>     logfiles will
>     log_min_duration_statement = 1000       # -1 is disabled, 0 logs
>     all statements
>
>
> I would lower this.  You can see that a few statements were just over
> 1000 ms, but you can't tell whether there are a lot at 800 ms, or whether
> you have a bimodal distribution with most at 1 ms and a few at 1200 ms.
I lowered it to 100ms, and looking at the same query from my original
post over the last few hours, the times vary across the spectrum from
100ms to just over 1s.  It looks like an exponential distribution with
most durations close to 100ms.  I am becoming increasingly sure that I'm
just up against the limitations of the SATA disks, given the load
profile on this particular server.  Maybe it's time to reassess the
load, or install an SSD or lots of RAM...
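For the curious, a quick way to get that rough distribution out of the
logs is a bucket count over the duration lines in pg_log, along these
lines (assumes the default text log format with 'duration: N ms' and is
run from the data directory):

    # count logged statements per 100ms duration bucket
    grep -h 'duration:' pg_log/postgresql-*.log \
      | sed 's/.*duration: \([0-9.]*\) ms.*/\1/' \
      | awk '{ b = int($1/100)*100; n[b]++ } END { for (b in n) print b" ms+", n[b] }' \
      | sort -n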
> Cheers,
>
> Jeff


