Thread: Re: performance degredation after upgrade from 9.6 to 12

Re: performance degredation after upgrade from 9.6 to 12

From
Andrew Zakharov
Date:
Hi there -
I have same feelings. Try set max_parallel_workers_per_gather to zero. I don't think that comparison non-parallel and parallel versions is correct (don't say anything about parallel in 9.6 pls)
What explain says? I suppose you will have different exec plans. Optimizer stranges of 11 and 12 ver have been discussed. Look thru the archive, but I didn't remember the problem status - resolved or not.
Andrew.

24 ноября 2019 г. 15:53:19 GMT+03:00, Mariel Cherkassky <mariel.cherkassky@gmail.com> пишет:
Hey all,
I'm testing performance of two identical machines one in 9.6 and the second one is in 12. The second machine is a clone of the first one + db upgrade to 12 beta 3 (Yes I'm aware 12.1 was released).

machine stats : 
32gb ram
8 cpu
regular hd (not ssd)

my postgresql.confg settings: 

max_wal_size = 2GB
min_wal_size = 1GB
wal_buffers = 16MB
checkpoint_completion_target = 0.9
checkpoint_timeout = 30min
log_checkpoints = on
log_lock_waits = on
log_temp_files = 1024
log_min_duration_statement = 1000
log_autovacuum_min_duration = 5000
autovacuum_max_workers = 4
autovacuum_vacuum_cost_limit = 800
autovacuum_vacuum_cost_delay = 10ms
standard_conforming_strings = off
max_locks_per_transaction = 5000
max_connections = 500
log_line_prefix = '%t %d %p  '
random_page_cost = 2.0
deadlock_timeout = 5s
shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 32764
maintenance_work_mem = 250MB
work_mem = 32MB
shared_buffers = 8058MB
effective_cache_size = 16116MB

in 12v I also added the following settings : 
log_directory = 'pg_log'
enable_partitionwise_join = on
enable_partitionwise_aggregate = on
max_worker_processes = 8               # (change requires restart)
max_parallel_workers_per_gather = 4    # taken from max_parallel_workers
max_parallel_workers = 8               # maximum number of max_worker_pr

I tested a few applications flows and I saw that the 9.6 version is faster. I also did a few simple tests (enabled \timing) : 

12v : 
postgres=# create table test1 as select generate_series(1,10000);
SELECT 10000
Time: 35.099 ms

postgres=# select count(*) from test1;
 count
-------
 10000
(1 row)

Time: 4.819 ms

9.6v : 
postgres=# create table test1 as select generate_series(1,10000);
SELECT 10000
Time: 19.962 ms

postgres=# select count(*) from test1;
 count
-------
 10000
(1 row)

Time: 1.541 ms

Any idea what can cause it ? What can I check? 
This degredation is visible in many queries that we use ..

After the upgrade to 12v version I run analyze on all tables..

Thanks.

С уважением,
Андрей Захаров

Re: performance degredation after upgrade from 9.6 to 12

From
Mariel Cherkassky
Date:
Hey Andrew,
It seems that changing this parameter worked for me.
Setting it to zero means that there wont be any parallel workers for one query right ?
Is it something familiar this problem with the gatherers ? 

Re: performance degredation after upgrade from 9.6 to 12

From
Jeff Janes
Date:
On Sun, Nov 24, 2019 at 8:52 AM Mariel Cherkassky <mariel.cherkassky@gmail.com> wrote:
Hey Andrew,
It seems that changing this parameter worked for me.
Setting it to zero means that there wont be any parallel workers for one query right ?
Is it something familiar this problem with the gatherers ? 

Your example would not be using parallel workers anyway, regardless of the setting of max_parallel_workers_per_gather, so I don't see how changing this could have worked for you.  Unless you mean it worked in your full test, rather than in your test case. I doubt your test case benchmarking was very reliable to start with, you only show a single execution and didn't indicate you had more unshown ones.

If I do more credible benchmarking, I do get a performance regression but it closer is to 16% than to 3 fold.  And it doesn't depend on the setting of max_parallel_workers_per_gather.  I doubt a regression of this size is even worth investigating.

pgbench -T300 -P5 -f <(echo "select count(*) from test1") -p 9912 -n -M prepared

Cheers,

Jeff

Re: performance degredation after upgrade from 9.6 to 12

From
Mariel Cherkassky
Date:
Hey Jeff,
This example was only used to show that pg96  had better perfomance than pg12 in a very simple case.
 In all the tests that I run most of the queries took less time on 9.6`s version.  I dont know why, but as you can see after disabling the parameter the simple test that I did showed different results.  I intend to test this theory tomorrow. I'm going to disable the parameter and run the same application flows that I have on both machines (9.6 vs 12 with zero value for the param).

I didnt send this mail after doing just one simple test, I have more than 100 queries that work better on 9.6 . If u have any explanation I will be happy to hear.
I'll update tomorrow once I'll have the results..

Re: performance degredation after upgrade from 9.6 to 12

From
John Felix
Date:
Op 24-11-2019 om 19:05 schreef Mariel Cherkassky:
Hey Jeff,
This example was only used to show that pg96  had better perfomance than pg12 in a very simple case.
 In all the tests that I run most of the queries took less time on 9.6`s version.  I dont know why, but as you can see after disabling the parameter the simple test that I did showed different results.  I intend to test this theory tomorrow. I'm going to disable the parameter and run the same application flows that I have on both machines (9.6 vs 12 with zero value for the param).

I didnt send this mail after doing just one simple test, I have more than 100 queries that work better on 9.6 . If u have any explanation I will be happy to hear.
I'll update tomorrow once I'll have the results..

I've had the same experience with parallel query. By default parallel query is disabled in 9.6. When we upgraded from 9.6 to 10 is was significant slower till I disabled parallel query.

We have a very small database (40 tables and all together max 4GB) and we have no long running queries (the largest queries run max 2 to 3 seconds). That's when parallel query gives a performance degrade in my opinion.

Best regards,

John Felix


Re: performance degredation after upgrade from 9.6 to 12

From
Jeff Janes
Date:
On Sun, Nov 24, 2019 at 1:05 PM Mariel Cherkassky <mariel.cherkassky@gmail.com> wrote:
Hey Jeff,
This example was only used to show that pg96  had better perfomance than pg12 in a very simple case.

OK, but do you agree that a 15% slow down is more realistic than 3 fold one?  Or are you still getting 3 fold slow down with more careful testing and over a wide variety of queries?

I find that the main regression (about 15%) in your example occurs in major version 10, at the following commit:

commit b8d7f053c5c2bf2a7e8734fe3327f6a8bc711755
Author: Andres Freund <andres@anarazel.de>
Date:   Tue Mar 14 15:45:36 2017 -0700

    Faster expression evaluation and targetlist projection.  

It is disappointing that this made this case slower rather than faster, and that the "future work" alluded to either hasn't happened, or wasn't effective for this example.  I also tested the same example, only 100 times more rows, and still see the regression at about 16%.  This is a major infrastructure change patch which has been extensively built on since then, the chances of reverting it are very small.  It is making an omelette, and your example is one of the eggs that got broken.  

Performance changes in a large body of queries are usually not all due to the same thing.  Are you a position to custom compile your own PostgreSQL?  It would be nice to test this commit against the one before it, and see how much of the change in your real queries is explained by this one thing (or whether any of it is)
 
 In all the tests that I run most of the queries took less time on 9.6`s version.  I dont know why, but as you can see after disabling the parameter the simple test that I did showed different results.

I can't see--You didn't post results for that.  And running your test on my own system doesn't show that at all.  In your test case,  max_parallel_workers_per_gather makes no difference.  With 100 times more rows, setting it to 0 actually slows things down, as at that size parallelization is useful and disabling it hurts.

Of course parallel query might be hurting some of the other queries, but for the one example you show you will have to show something more convincing for me to believe that that is what caused it.

It is easy to benchmark with something like: 

PGOPTIONS="-c max_parallel_workers_per_gather=0" pgbench -T30 -f <(echo "select count(*) from test1") -p 9912 -n -M prepared

If it is other queries where mpwpg is making a difference, than one issue could be that your settings of parallel_setup_cost and/or parllel_tuple_cost are too low (although I usually find the default settings too high, not too low); or you are running your test concurrently already and so don't need parallel query to fully load the CPUs and trying to use parallel query just increases the overhead; or your machine doesn't have the number of truly effective CPUs you think it does.

Cheers,

Jeff

Re: performance degredation after upgrade from 9.6 to 12

From
Mariel Cherkassky
Date:
Hey Jeff,
First of all thank you again for the quick response. I really appreciate your comments.
Unfortunately I installed pg from rpms so I cant compile my current env but I can install it from source code and migrate the data via pg_dump. Can you explain how can I compile the sources without this commit ? 
I understand that my test such a good test but I thought that using something simple can show my issue. I was sure that I added the results of the same tests after disabling the param but it seems that I didnt do it. I'll send it tomorrow because right now im not inforont of my pc. The bottom line, after disabling this param the same query took less time in pg12(significly).
From this email chain and other peoples comments it seems that I'm not the only one who was facing this issue. One thing that might worth mentitoning, in all my tests I didnt use a big env, my db was at max 30GB and the max duration of a query was about 10s. It seems that this happens when we use queries with short duration (in your test you mentioned u used the same test but with 100 times more rows, maybe that is the reason it took more time ?) - maybe on "short" queries it creates degredation ? It is just an assumption...

I'll try to do most tests  tomorrow and update with results.

Re: performance degredation after upgrade from 9.6 to 12

From
Andres Freund
Date:
Hi,

On 2019-11-24 15:50:20 -0500, Jeff Janes wrote:
> OK, but do you agree that a 15% slow down is more realistic than 3 fold
> one?  Or are you still getting 3 fold slow down with more careful testing
> and over a wide variety of queries?
> 
> I find that the main regression (about 15%) in your example occurs in major
> version 10, at the following commit:

Huh, that's somewhat surprising. <5% I can see - there were some
tradeoffs to be made, and some performance issues to be worked around,
but 15% seems large. Is this with assertions enabled? Optimized?


> I also tested the same example, only 100 times
> more rows, and still see the regression at about 16%.  This is a major
> infrastructure change patch which has been extensively built on since then,
> the chances of reverting it are very small.  It is making an omelette, and
> your example is one of the eggs that got broken.

Yea, there's zero chance of a revert.


> Performance changes in a large body of queries are usually not all due to
> the same thing.  Are you a position to custom compile your own PostgreSQL?
> It would be nice to test this commit against the one before it, and see how
> much of the change in your real queries is explained by this one thing (or
> whether any of it is)

In particular, artificial queries will often show bottlenecks that are
not releveant in practice...



> commit b8d7f053c5c2bf2a7e8734fe3327f6a8bc711755
> Author: Andres Freund <andres@anarazel.de>
> Date:   Tue Mar 14 15:45:36 2017 -0700
> 
>     Faster expression evaluation and targetlist projection.
> 
> It is disappointing that this made this case slower rather than faster, and
> that the "future work" alluded to either hasn't happened, or wasn't
> effective for this example.

I wonder if the improvements in
https://www.postgresql.org/message-id/20191023163849.sosqbfs5yenocez3%40alap3.anarazel.de
would at least partially address this.

Greetings,

Andres Freund



Re: performance degredation after upgrade from 9.6 to 12

From
Mariel Cherkassky
Date:
Hey Jeff,Andrew,
I continued testing the 12version vs the 96 version and it seems that there is almost non diff and in some cases pg96 is faster than 12. I compared the content of pg_stat_statements after each test that I have done and it seems that the db time is almost the same and sometimes 96 is faster by 5%.

Any idea why there isnt any improvement even when I enabled the parallel params in 12 ?
I can add a few examples if needed..

RE: performance degredation after upgrade from 9.6 to 12

From
"Andrew Zakharov"
Date:

Hi there –

I have no idea why this happening. But I suspect the parallel requires more internal machine resources like CPU etc because you can faster retrieve the disk data from the one hand but you ought to spend more resources for maintaining several threads and theirs coordination (px coordinator process in Oracle terms) from another one. Thus there could be more serious hardware requirements even just to keep performance the same. I believe that the real benefit of the parallel will be shown when you have pair of large and wide tables (30M or more each) with hash join (typical task for mart construction) but such class of databases is supposed to be big and required enough resources initially.

 

 

From: Mariel Cherkassky <mariel.cherkassky@gmail.com>
Sent: Monday, December 16, 2019 2:48 PM
To: Jeff Janes <jeff.janes@gmail.com>
Cc: Andrew Zakharov <Andrew898@mail.ru>; pgsql-performance@lists.postgresql.org
Subject: Re: performance degredation after upgrade from 9.6 to 12

 

Hey Jeff,Andrew,

I continued testing the 12version vs the 96 version and it seems that there is almost non diff and in some cases pg96 is faster than 12. I compared the content of pg_stat_statements after each test that I have done and it seems that the db time is almost the same and sometimes 96 is faster by 5%.

 

Any idea why there isnt any improvement even when I enabled the parallel params in 12 ?

I can add a few examples if needed..

Re: performance degredation after upgrade from 9.6 to 12

From
Mariel Cherkassky
Date:
I see, thank u !
Maybe I didnt see big difference because most of my tables arent so big. My db`s size is 17GB and the largest table contains about 20M+ records.

Thanks again !

Re: performance degredation after upgrade from 9.6 to 12

From
Pavel Stehule
Date:


po 16. 12. 2019 v 14:02 odesílatel Mariel Cherkassky <mariel.cherkassky@gmail.com> napsal:
I see, thank u !
Maybe I didnt see big difference because most of my tables arent so big. My db`s size is 17GB and the largest table contains about 20M+ records.

Postgres 12 has enabled JIT by default.

Pavel


Thanks again !