Thread: low perfomances migrating from 9.3 to 9.5
hello all,
We switched from PostgreSQL server 9.3 to 9.5. From the very beginning there was a noticeable drop in performances (for example : when injecting our SQL dumps into 9.5, COPY and CREATE INDEX were very slow). Our configuration file was the same for 9.3 and 9.5, except for the "chekpoint segment" that has been deprecated.
I haven't got (yet) some objective performance measurements to show, because the problem was so obvious (just by looking at the log feed) that we switched back quickly to 9.3 (production constraints).
I thought that, maybe, our system kernel is getting obsolete for a recent postgresql engine ?
OS is Centos 5.5 kernel 2.6.18-194.el5 #1 SMP x86_64
thanks for suggestions,
Tom
On Wed, Jul 27, 2016 at 5:54 PM, thomas veymont <thomas.veymont@gmail.com> wrote: > We switched from PostgreSQL server 9.3 to 9.5. From the very beginning there > was a noticeable drop in performances (for example : when injecting our SQL > dumps into 9.5, COPY and CREATE INDEX were very slow). > > Our configuration file was the same for 9.3 and 9.5, except for the > "chekpoint segment" that has been deprecated. Which value are you using for max_wal_size and min_wal_size? And which value of checkpoint_segments did you use previously? This could influence the checkpoint frequency. -- Michael
2016-07-27 11:06 GMT+02:00 Michael Paquier <michael.paquier@gmail.com>:
On Wed, Jul 27, 2016 at 5:54 PM, thomas veymont
<thomas.veymont@gmail.com> wrote:
> We switched from PostgreSQL server 9.3 to 9.5. From the very beginning there
> was a noticeable drop in performances (for example : when injecting our SQL
> dumps into 9.5, COPY and CREATE INDEX were very slow).
>
> Our configuration file was the same for 9.3 and 9.5, except for the
> "chekpoint segment" that has been deprecated.
Which value are you using for max_wal_size and min_wal_size? And which
value of checkpoint_segments did you use previously? This could
influence the checkpoint frequency.
--
Michael
in 9.3 we had :
checkpoint_segments = 128
#checkpoint_timeout = 5min # range 30s-1h
checkpoint_completion_target = 0.9 # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
as far as I remember these settings were for faster bulk loading.
in 9.5 we set:
max_wal_size = 6GB # https://www.postgresql.org/docs/9.5/static/release-9-5.html, max_wal_size = (3 * checkpoint_segments) * 16MB
#min_wal_size = 80MB
max_wal_size = 6GB # https://www.postgresql.org/docs/9.5/static/release-9-5.html, max_wal_size = (3 * checkpoint_segments) * 16MB
#min_wal_size = 80MB
though I tried max_wal_size = 3GB (default is 1GB) and did not notice any improvment.
cheers
tom
On Wed, Jul 27, 2016 at 6:22 PM, thomas veymont <thomas.veymont@gmail.com> wrote: > as far as I remember these settings were for faster bulk loading. > > in 9.5 we set: > max_wal_size = 6GB # > https://www.postgresql.org/docs/9.5/static/release-9-5.html, max_wal_size = > (3 * checkpoint_segments) * 16MB > #min_wal_size = 80MB > > though I tried max_wal_size = 3GB (default is 1GB) and did not notice any > improvment. And do you see changes if you increase min_wal_size? This will increase the number of WAL segments recycled instead of removed at each checkpoint. -- Michael
2016-07-27 14:11 GMT+02:00 Michael Paquier <michael.paquier@gmail.com>:
And do you see changes if you increase min_wal_size? This will
increase the number of WAL segments recycled instead of removed at
each checkpoint.
--
Michael
I have seen no improvment with the following parameters in 9.5:
max_wal_size = 3GB
min_wal_size = 512MB
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
max_wal_size = 3GB
min_wal_size = 512MB
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
while my 9.3 configuration is:
checkpoint_segments = 128 # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min # range 30s-1h
checkpoint_completion_target = 0.9 # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
checkpoint_segments = 128 # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min # range 30s-1h
checkpoint_completion_target = 0.9 # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
I have just run a quick pgbench test to get some objective numbers.
Both tests were run on the same machine (ie. production machine), same disk, same logical volume :
On 9.5 :
$ pgbench -c 4 -j 2 -T 600 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 70
query mode: simple
number of clients: 4
number of threads: 2
duration: 600 s
number of transactions actually processed: 77318
latency average: 31.041 ms
tps = 128.859708 (including connections establishing)
tps = 128.860447 (excluding connections establishing)
Both tests were run on the same machine (ie. production machine), same disk, same logical volume :
On 9.5 :
$ pgbench -c 4 -j 2 -T 600 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 70
query mode: simple
number of clients: 4
number of threads: 2
duration: 600 s
number of transactions actually processed: 77318
latency average: 31.041 ms
tps = 128.859708 (including connections establishing)
tps = 128.860447 (excluding connections establishing)
On 9.3 :
$ pgbench -c 4 -j 2 -T 600 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 70
query mode: simple
number of clients: 4
number of threads: 2
duration: 600 s
number of transactions actually processed: 1834436
latency average: 1.308 ms
tps = 3057.387254 (including connections establishing)
tps = 3057.398493 (excluding connections establishing)
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 70
query mode: simple
number of clients: 4
number of threads: 2
duration: 600 s
number of transactions actually processed: 1834436
latency average: 1.308 ms
tps = 3057.387254 (including connections establishing)
tps = 3057.398493 (excluding connections establishing)
Note that the 9.3 is handling others production requests in the same time.
Is a checkpoint_segment/WAL problem still to be suspected ?
cheers
Tom
cheers
Tom
On 07/27/2016 07:52 AM, thomas veymont wrote: > > 2016-07-27 14:11 GMT+02:00 Michael Paquier <michael.paquier@gmail.com > <mailto:michael.paquier@gmail.com>>: > > > > And do you see changes if you increase min_wal_size? This will > increase the number of WAL segments recycled instead of removed at > each checkpoint. > -- > Michael > > > I have seen no improvment with the following parameters in 9.5: > max_wal_size = 3GB > min_wal_size = 512MB > #checkpoint_completion_target = 0.5 # checkpoint target duration, > 0.0 - 1.0 > #checkpoint_warning = 30s # 0 disables > > while my 9.3 configuration is: > checkpoint_segments = 128 # in logfile segments, min 1, > 16MB each > #checkpoint_timeout = 5min # range 30s-1h > checkpoint_completion_target = 0.9 # checkpoint target duration, > 0.0 - 1.0 > #checkpoint_warning = 30s # 0 disables > > I have just run a quick pgbench test to get some objective numbers. > Both tests were run on the same machine (ie. production machine), same > disk, same logical volume : > > On 9.5 : > > $ pgbench -c 4 -j 2 -T 600 test > starting vacuum...end. > transaction type: TPC-B (sort of) > scaling factor: 70 > query mode: simple > number of clients: 4 > number of threads: 2 > duration: 600 s > number of transactions actually processed: 77318 > latency average: 31.041 ms > tps = 128.859708 (including connections establishing) > tps = 128.860447 (excluding connections establishing) > > On 9.3 : > > $ pgbench -c 4 -j 2 -T 600 test > starting vacuum...end. > transaction type: TPC-B (sort of) > scaling factor: 70 > query mode: simple > number of clients: 4 > number of threads: 2 > duration: 600 s > number of transactions actually processed: 1834436 > latency average: 1.308 ms > tps = 3057.387254 (including connections establishing) > tps = 3057.398493 (excluding connections establishing) > > Note that the 9.3 is handling others production requests in the same time. > > Is a checkpoint_segment/WAL problem still to be suspected ? Where did you get the respective versions of Postgres? Where they installed the same way? You mentioned the log feed showing obvious performance issues, can we see the relevant portions? > > cheers > Tom > -- Adrian Klaver adrian.klaver@aklaver.com
On Wed, Jul 27, 2016 at 11:01 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 07/27/2016 07:52 AM, thomas veymont wrote:
2016-07-27 14:11 GMT+02:00 Michael Paquier <michael.paquier@gmail.com
<mailto:michael.paquier@gmail.com>>:
And do you see changes if you increase min_wal_size? This will
increase the number of WAL segments recycled instead of removed at
each checkpoint.
--
Michael
I have seen no improvment with the following parameters in 9.5:
max_wal_size = 3GB
min_wal_size = 512MB
#checkpoint_completion_target = 0.5 # checkpoint target duration,
0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
while my 9.3 configuration is:
checkpoint_segments = 128 # in logfile segments, min 1,
16MB each
#checkpoint_timeout = 5min # range 30s-1h
checkpoint_completion_target = 0.9 # checkpoint target duration,
0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
I have just run a quick pgbench test to get some objective numbers.
Both tests were run on the same machine (ie. production machine), same
disk, same logical volume :
On 9.5 :
$ pgbench -c 4 -j 2 -T 600 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 70
query mode: simple
number of clients: 4
number of threads: 2
duration: 600 s
number of transactions actually processed: 77318
latency average: 31.041 ms
tps = 128.859708 (including connections establishing)
tps = 128.860447 (excluding connections establishing)
On 9.3 :
$ pgbench -c 4 -j 2 -T 600 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 70
query mode: simple
number of clients: 4
number of threads: 2
duration: 600 s
number of transactions actually processed: 1834436
latency average: 1.308 ms
tps = 3057.387254 (including connections establishing)
tps = 3057.398493 (excluding connections establishing)
Note that the 9.3 is handling others production requests in the same time.
Is a checkpoint_segment/WAL problem still to be suspected ?
Where did you get the respective versions of Postgres?
Where they installed the same way?
You mentioned the log feed showing obvious performance issues, can we see the relevant portions?
cheers
Tom
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
I have to ask,
was a vacuumdb -Z OR psql -U postgres -c ANALYZE ;
done after the migration?
Without accurate stats, performance goes down the drain.
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
hi Melvin, Adrian,
>
> Where did you get the respective versions of Postgres?
>
both were compiled from source, from the official website tar files.>
> Where did you get the respective versions of Postgres?
>
>
> Where they installed the same way?
>
yes, exactly the same. My configure command line is:
./configure --prefix=/usr/local/pgsqlX.XX --with-perl --with-python --with-tcl --with-openssl --with-pam --with-ldap --with-libxml --with-libxslt --with-system-tzdata=/usr/share/zoneinfo/>
./configure --prefix=/usr/local/pgsqlX.XX --with-perl --with-python --with-tcl --with-openssl --with-pam --with-ldap --with-libxml --with-libxslt --with-system-tzdata=/usr/share/zoneinfo/>
>
> You mentioned the log feed showing obvious performance issues, can we see the relevant portions?
>
I was meaning the log feed is obviously "slow" because you can almost "read" the log lines going through. You usually can't because it's too fast.
>
> I have to ask,
> was a vacuumdb -Z OR psql -U postgres -c ANALYZE ;
>
> I have to ask,
> was a vacuumdb -Z OR psql -U postgres -c ANALYZE ;
> done after the migration?
>
> Without accurate stats, performance goes down the drain
>>
> Without accurate stats, performance goes down the drain
You're right. I did not run ANALYZE in the first time (assuming autovacuum would do it when needed). But it should be noted that :
- even re-injecting the 9.3 dumps into the fresh 9.5 engine was much longer than expected (it is agreed that I cannot run ANALYZE before re-injecting the dumps ;)
- the pgbench run on both 9.3/9.5 systems were run without ANALYZE. And yet, the 9.3 test provided better results than 9.5.
- the pgbench run on both 9.3/9.5 systems were run without ANALYZE. And yet, the 9.3 test provided better results than 9.5.
To be clear in my mind about it, I think I need to re-run these tests and check whether it's machine/OS dependant or even I am doing my test the wrong way.
I will be back to you with more objective values by next week.
I will be back to you with more objective values by next week.
thanks for helping,
Tom
2016-07-27 17:14 GMT+02:00 Melvin Davidson <melvin6925@gmail.com>:
On Wed, Jul 27, 2016 at 11:01 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:On 07/27/2016 07:52 AM, thomas veymont wrote:
2016-07-27 14:11 GMT+02:00 Michael Paquier <michael.paquier@gmail.com
<mailto:michael.paquier@gmail.com>>:
And do you see changes if you increase min_wal_size? This will
increase the number of WAL segments recycled instead of removed at
each checkpoint.
--
Michael
I have seen no improvment with the following parameters in 9.5:
max_wal_size = 3GB
min_wal_size = 512MB
#checkpoint_completion_target = 0.5 # checkpoint target duration,
0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
while my 9.3 configuration is:
checkpoint_segments = 128 # in logfile segments, min 1,
16MB each
#checkpoint_timeout = 5min # range 30s-1h
checkpoint_completion_target = 0.9 # checkpoint target duration,
0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
I have just run a quick pgbench test to get some objective numbers.
Both tests were run on the same machine (ie. production machine), same
disk, same logical volume :
On 9.5 :
$ pgbench -c 4 -j 2 -T 600 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 70
query mode: simple
number of clients: 4
number of threads: 2
duration: 600 s
number of transactions actually processed: 77318
latency average: 31.041 ms
tps = 128.859708 (including connections establishing)
tps = 128.860447 (excluding connections establishing)
On 9.3 :
$ pgbench -c 4 -j 2 -T 600 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 70
query mode: simple
number of clients: 4
number of threads: 2
duration: 600 s
number of transactions actually processed: 1834436
latency average: 1.308 ms
tps = 3057.387254 (including connections establishing)
tps = 3057.398493 (excluding connections establishing)
Note that the 9.3 is handling others production requests in the same time.
Is a checkpoint_segment/WAL problem still to be suspected ?
Where did you get the respective versions of Postgres?
Where they installed the same way?
You mentioned the log feed showing obvious performance issues, can we see the relevant portions?
cheers
Tom
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
I have to ask,
was a vacuumdb -Z OR psql -U postgres -c ANALYZE ;done after the migration?Without accurate stats, performance goes down the drain.--Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.