Thread: BUG #17307: Performance deviation between the multiple iterations (NOPM & TPM values).

From: PG Bug reporting form

The following bug has been logged on the website:

Bug reference:      17307
Logged by:          HPC Researcher
Email address:      researcherhpc@gmail.com
PostgreSQL version: 14.0
Operating system:   RHEL 8.4
Description:

NOPM values were captured with the HammerDB v4.3 scripts (schema_tpcc.tcl and
test_tpcc.tcl) over multiple trials.
The expected performance deviation between trials is less than 2%.

Hardware configuration
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        8
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            16384K
OS:                  RHEL 8.4
RAM size:            512
SSD:                 1TB



postgresql.conf

autovacuum_max_workers = 16 
autovacuum_vacuum_cost_limit = 3000 
checkpoint_completion_target = 0.9 
checkpoint_timeout = '15min' 
cpu_tuple_cost = 0.03 
effective_cache_size = '350GB' 
listen_addresses = '*' 
maintenance_work_mem = '2GB' 
max_connections = 1000 
max_wal_size = '128GB' 
random_page_cost = 1.1 
shared_buffers = '128GB' 
wal_buffers = '1GB' 
work_mem = '128MB' 
random_page_cost = 1.1 
effective_io_concurrency = 200  

HammerDB Scripts 
    >>cat schema.tcl 
    #!/bin/tclsh 
    dbset db pg 
    diset connection pg_host localhost 
    diset connection pg_port 5432 
    diset tpcc pg_count_ware 400 
    diset tpcc pg_num_vu 50 
    print dict 
    buildschema 
    waittocomplete 

Run the test with an increasing number of virtual users (VU), i.e. start with 1 VU, then 2, 4, etc.

Virtual Users   Trial-1 (NOPM)   Trial-2 (NOPM)   %diff
12              99390            92913            6.516752
140             561429           525408           6.415949
192             636016           499574           21.4526
230             621644           701882           12.9074
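
The %diff column appears to be the absolute difference between the two trials, expressed as a percentage of the first trial. That is an inference from the numbers, not something stated in the report. A minimal Python sketch under that assumption (the names `results` and `percent_diff` are illustrative) reproduces the column and flags rows above the 2% expectation:

```python
# NOPM values copied from the table above: virtual users -> (first trial, second trial)
results = {
    12: (99390, 92913),
    140: (561429, 525408),
    192: (636016, 499574),
    230: (621644, 701882),
}


def percent_diff(first, second):
    """Absolute difference between two trials, as a percentage of the first trial."""
    return abs(first - second) / first * 100.0


# Assumed formula: |trial1 - trial2| / trial1 * 100
diffs = {vu: percent_diff(a, b) for vu, (a, b) in results.items()}

for vu, d in diffs.items():
    flag = "above 2%" if d > 2.0 else "within 2%"
    print(f"{vu:>4} VU: {d:.4f}% ({flag})")
```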


PG Bug reporting form <noreply@postgresql.org> writes:
> NOPM values captured with HammerDB-v4.3 scripts (schema_tpcc.tcl and
> test_tpcc.tcl) for multiple trials.
> The expected performance deviation between multiple trials should be less
> than 2% 

According to who?  Even if you'd provided an easily reproducible
example, I doubt we'd accept this as a bug.  Adding more sessions
does not have zero cost.

            regards, tom lane



According to the HammerDB documentation, running the same test for multiple iterations on the same hardware should give a small deviation (1%-2%).

We observed a TPC-C performance (NOPM/TPM) deviation ranging from above 2% up to 21% across virtual user counts (1 to 250 on a 2-socket system) when running multiple iterations (5-6 runs).

We checked the following configurations and system settings:

1. Reduced the maximum number of connections (for example, max_connections from 1700 to 200 in postgresql.conf).
2. Reduced the number of warehouses in the schema build, i.e. pg_count_ware from 800 to 400/200.
3. Rebuilt the schema for each run/iteration (after capturing the results of an iteration, deleted/dropped tpcc, restarted Postgres, and rebuilt the schema for the next iteration).
4. Unmounted and remounted the /data folder on the SSD for each iteration.
5. Tried NUMA settings such as taskset/core pinning, and SMT off/on.
6. Ran the test with different NUMA configurations, such as numactl --interleave=all or NUMA auto-balancing.
7. Used the default postgresql.conf with fewer virtual users (such as 1, 2, 4, 8, 12, 16, 20) and a small schema (20 warehouses, pg_num_vu 4).
8. Ran HammerDB on a client machine and PostgreSQL on the master machine.

Here are our questions:

1. What is the right way to test PostgreSQL with HammerDB over multiple iterations?
2. Is the performance deviation across multiple runs expected because of raw Postgres performance?
3. Can CPU usage, I/O volume, I/O latency, or HDD/SSD latency be the reason for the deviation?
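
When 5-6 runs are compared, a spread measure over all runs is easier to interpret than a single pairwise difference. The following is only a sketch of one way to summarize run-to-run deviation; it is not a HammerDB feature, and the function name `run_to_run_spread` is hypothetical. It computes the coefficient of variation and the max-min range relative to the mean, shown here on the two 192-VU trials from the original report because the remaining runs were not posted:

```python
from statistics import mean, stdev


def run_to_run_spread(nopm_values):
    """Return (coefficient of variation in %, max-min range as % of mean)."""
    if len(nopm_values) < 2:
        raise ValueError("need at least two runs")
    avg = mean(nopm_values)
    cv = stdev(nopm_values) / avg * 100.0  # sample standard deviation
    rng = (max(nopm_values) - min(nopm_values)) / avg * 100.0
    return cv, rng


# The two 192-VU trials from the original report; a real check would use all 5-6 runs.
cv, rng = run_to_run_spread([636016, 499574])
print(f"CV = {cv:.2f}%, range = {rng:.2f}% of mean")
```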


Thanks


On Fri, 3 Dec 2021 at 00:22, Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> NOPM values captured with HammerDB-v4.3 scripts (schema_tpcc.tcl and
> test_tpcc.tcl) for multiple trials.
> The expected performance deviation between multiple trials should be less
> than 2%

According to who?  Even if you'd provided an easily reproducible
example, I doubt we'd accept this as a bug.  Adding more sessions
does not have zero cost.

                        regards, tom lane