[PERFORM] Poor perfomance of update (and select) on partitioned tables - Mailing list pgsql-performance

From Konstantin Kivi
Subject [PERFORM] Poor perfomance of update (and select) on partitioned tables
Date
Msg-id CAPaixjd4i=LHLFfGh7_rQ6GMom55=p_fDLnPoFqCSsFuKqLF0Q@mail.gmail.com
Whole thread Raw
List pgsql-performance
Hello

Trying to setup table partitioning I noticed strange behavior that prevents me from using it.

I set up master and child tables via inheritance, with range CHECK by date and with
trigger on 'insert', as described in the documentation.

I was happy with insertion speed, it was about 30 megabytes per second that was more than I expected,
and server idle time was near 95 %. I used 100 parallel clients.

However, when it came to updates things turned very bad.
I set up a test with 30 running client making 10000 updates each in a random fashion.
updates via master table took 6 times longer and server idle time dropped to 15%, user CPU 75% with load average 15.

Test details below

300000 updates ( 30 processes 10000 selects each)

via master table 134 seconds
via child table  20 seconds

300000 updates via master table without "date1 >= '2017-09-06' and date1 < '2017-09-07'" clause
180 seconds
That means that constraint_exlusion works, however, the process of exclusion takes A LOT OF time.

I tried to repeat the test with selects

300000 selects ( 30 processes 10000 selects each)

via master table 50 seconds
via child table  8 seconds

This is very bad too.

The documentation says that it is not good to have 1000 partition, probably 100 is OK, but I have only 40 partitions
and have noticeable delays with only 5 partitions.

What I also cannot understand, why time increase for 'select'
is much higher (2.5 times) than time increase for 'update', considering that 'where' clause is identical
and assuming time is spent selecting relevant child tables.

Best regards, Konstantin

Environment description.


Postgres 9.5 on linux

db=> select version();
                                                 version                                                 
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit
(1 row)
db=>


16 CPU

vendor_id    : GenuineIntel
cpu family    : 6
model        : 45
model name    : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz

128GB ram

32GB shared_buffers


Table statistics

db=> select count(*) from my_log_daily;
 count 
--------
 408568
(1 row)

db=> select count(*) from my_log_daily_170906;
 count 
--------
 408568
(1 row)

db=>
              
explain (ANALYZE,BUFFERS) select stage+1 from  my_log_daily_170906  where  date1 >= '2017-09-06' and date1 < '2017-09-07' and msgid1=3414253 and msgid2=20756 and msgid3=1504712117 and instance='WS6';
                                                                   QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using my_log_daily_idx_170906 on my_log_daily_170906  (cost=0.42..8.46 rows=1 width=4) (actual time=0.013..0.014 rows=1 loops=1)
   Index Cond: ((msgid1 = 3414253) AND (msgid2 = 20756) AND (msgid3 = 1504712117) AND ((instance)::text = 'WS6'::text))
   Filter: ((date1 >= '2017-09-06 00:00:00'::timestamp without time zone) AND (date1 < '2017-09-07 00:00:00'::timestamp without time zone))
   Buffers: shared hit=4
 Planning time: 0.135 ms
 Execution time: 0.029 ms
(6 rows)

db=>                

explain (ANALYZE,BUFFERS) select stage+1 from  my_log_daily  where  date1 >= '2017-09-06' and date1 < '2017-09-07' and msgid1=3414253 and msgid2=20756 and msgid3=1504712117 and instance='WS6';
                                                                                                                            QUERY PLAN                                                                                                                            
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..8.46 rows=2 width=4) (actual time=0.016..0.017 rows=1 loops=1)
   Buffers: shared hit=4
   ->  Append  (cost=0.00..8.45 rows=2 width=4) (actual time=0.013..0.014 rows=1 loops=1)
         Buffers: shared hit=4
         ->  Seq Scan on my_log_daily  (cost=0.00..0.00 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
               Filter: ((date1 >= '2017-09-06 00:00:00'::timestamp without time zone) AND (date1 < '2017-09-07 00:00:00'::timestamp without time zone) AND (msgid1 = 3414253) AND (msgid2 = 20756) AND (msgid3 = 1504712117) AND ((instance)::text = 'WS6'::text))
         ->  Index Scan using my_log_daily_idx_170906 on my_log_daily_170906  (cost=0.42..8.45 rows=1 width=4) (actual time=0.012..0.013 rows=1 loops=1)
               Index Cond: ((msgid1 = 3414253) AND (msgid2 = 20756) AND (msgid3 = 1504712117) AND ((instance)::text = 'WS6'::text))
               Filter: ((date1 >= '2017-09-06 00:00:00'::timestamp without time zone) AND (date1 < '2017-09-07 00:00:00'::timestamp without time zone))
               Buffers: shared hit=4
 Planning time: 2.501 ms
 Execution time: 0.042 ms
(12 rows)

db=>

explain (ANALYZE,BUFFERS) update  my_log_daily_170906 set stage=stage+1 where  date1 >= '2017-09-06' and date1 < '2017-09-07' and msgid1=3414253 and msgid2=20756 and msgid3=1504712117 and instance='WS6';
                                                                       QUERY PLAN                                                                       
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on my_log_daily_170906  (cost=0.42..8.46 rows=1 width=186) (actual time=0.133..0.133 rows=0 loops=1)
   Buffers: shared hit=5 dirtied=1
   ->  Index Scan using my_log_daily_idx_170906 on my_log_daily_170906  (cost=0.42..8.46 rows=1 width=186) (actual time=0.014..0.015 rows=1 loops=1)
         Index Cond: ((msgid1 = 3414253) AND (msgid2 = 20756) AND (msgid3 = 1504712117) AND ((instance)::text = 'WS6'::text))
         Filter: ((date1 >= '2017-09-06 00:00:00'::timestamp without time zone) AND (date1 < '2017-09-07 00:00:00'::timestamp without time zone))
         Buffers: shared hit=4
 Planning time: 0.488 ms
 Execution time: 0.177 ms
(8 rows)

db=>
explain (ANALYZE,BUFFERS) update  my_log_daily set stage=stage+1 where  date1 >= '2017-09-06' and date1 < '2017-09-07' and msgid1=3414253 and msgid2=20756 and msgid3=1504712117 and instance='WS6';
                                                                                                                         QUERY PLAN                                                                                                                         
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on my_log_daily  (cost=0.00..8.46 rows=2 width=587) (actual time=0.052..0.052 rows=0 loops=1)
   Update on my_log_daily
   Update on my_log_daily_170906
   Buffers: shared hit=5
   ->  Seq Scan on my_log_daily  (cost=0.00..0.00 rows=1 width=988) (actual time=0.001..0.001 rows=0 loops=1)
         Filter: ((date1 >= '2017-09-06 00:00:00'::timestamp without time zone) AND (date1 < '2017-09-07 00:00:00'::timestamp without time zone) AND (msgid1 = 3414253) AND (msgid2 = 20756) AND (msgid3 = 1504712117) AND ((instance)::text = 'WS6'::text))
   ->  Index Scan using my_log_daily_idx_170906 on my_log_daily_170906  (cost=0.42..8.46 rows=1 width=186) (actual time=0.019..0.020 rows=1 loops=1)
         Index Cond: ((msgid1 = 3414253) AND (msgid2 = 20756) AND (msgid3 = 1504712117) AND ((instance)::text = 'WS6'::text))
         Filter: ((date1 >= '2017-09-06 00:00:00'::timestamp without time zone) AND (date1 < '2017-09-07 00:00:00'::timestamp without time zone))
         Buffers: shared hit=4
 Planning time: 4.639 ms
 Execution time: 0.147 ms
(12 rows)


db=> \d my_log_daily
                                 Table "public.my_log_daily"
   Column   |            Type             |                     Modifiers                     
------------+-----------------------------+----------------------------------------------------
 client_id  | integer                     | not null
 pult       | character varying(6)        | not null
 opr        | character varying(30)       | not null
 handler    | character varying(60)       |
 msgid      | integer                     |
 sclient_id | integer                     |
 stage      | integer                     | default 0
 msgid1     | integer                     | default 0
 msgid2     | integer                     | default 0
 msgid3     | integer                     | default 0
 ended      | smallint                    | default 0
 date1      | timestamp without time zone | default ('now'::text)::timestamp without time zone
 date2      | timestamp without time zone |
 reserved1  | character varying(100)      |
 reserved2  | character varying(100)      |
 reserved3  | character varying(100)      |
 atpco      | smallint                    | not null default 0
 rsrvdnum1  | integer                     |
 rsrvdnum2  | integer                     |
 rsrvdnum3  | integer                     |
 instance   | character varying(3)        |
 duration   | integer                     | default 0
 ip         | integer                     |
Triggers:
    insert_my_log_daily_trigger BEFORE INSERT ON my_log_daily FOR EACH ROW EXECUTE PROCEDURE my_log_daily_insert_trigger()
Number of child tables: 40 (Use \d+ to list them.)

db=>

Indexes:
    "my_log_daily_idx_170906" UNIQUE, btree (msgid1, msgid2, msgid3, instance)
    "my_log_daily_date_170906" btree (date1)
    "my_log_daily_handler_170906" btree (handler)
    "my_log_daily_pult_170906" btree (pult)
    "my_log_daily_reserved1_170906" btree (reserved1)
    "my_log_daily_src_170906" btree (client_id, date1)
Check constraints:
    "my_log_daily_170906_date1_check" CHECK (date1 >= '2017-09-06 00:00:00'::timestamp without time zone AND date1 < '2017-09-07 00:00:00'::timestamp without time zone)
Inherits: my_log_daily

db=>



a complete  list of child tables below.
table descriptions including CHECK and indexes ( as well as trigger function ) are autogenerated, so there is no human error.


-----------------
db=> \d+ my_log_daily
                                                     Table "public.my_log_daily"
   Column   |            Type             |                     Modifiers                      | Storage  | Stats target | Description
------------+-----------------------------+----------------------------------------------------+----------+--------------+-------------
 client_id  | integer                     | not null                                           | plain    |              |
 pult       | character varying(6)        | not null                                           | extended |              |
 opr        | character varying(30)       | not null                                           | extended |              |
 handler    | character varying(60)       |                                                    | extended |              |
 msgid      | integer                     |                                                    | plain    |              |
 sclient_id | integer                     |                                                    | plain    |              |
 stage      | integer                     | default 0                                          | plain    |              |
 msgid1     | integer                     | default 0                                          | plain    |              |
 msgid2     | integer                     | default 0                                          | plain    |              |
 msgid3     | integer                     | default 0                                          | plain    |              |
 ended      | smallint                    | default 0                                          | plain    |              |
 date1      | timestamp without time zone | default ('now'::text)::timestamp without time zone | plain    |              |
 date2      | timestamp without time zone |                                                    | plain    |              |
 reserved1  | character varying(100)      |                                                    | extended |              |
 reserved2  | character varying(100)      |                                                    | extended |              |
 reserved3  | character varying(100)      |                                                    | extended |              |
 atpco      | smallint                    | not null default 0                                 | plain    |              |
 rsrvdnum1  | integer                     |                                                    | plain    |              |
 rsrvdnum2  | integer                     |                                                    | plain    |              |
 rsrvdnum3  | integer                     |                                                    | plain    |              |
 instance   | character varying(3)        |                                                    | extended |              |
 duration   | integer                     | default 0                                          | plain    |              |
 ip         | integer                     |                                                    | plain    |              |
Triggers:
    insert_my_log_daily_trigger BEFORE INSERT ON my_log_daily FOR EACH ROW EXECUTE PROCEDURE my_log_daily_insert_trigger()
Child tables: my_log_daily_170901,
              my_log_daily_170902,
              my_log_daily_170903,
              my_log_daily_170904,
              my_log_daily_170905,
              my_log_daily_170906,
              my_log_daily_170907,
              my_log_daily_170908,
              my_log_daily_170909,
              my_log_daily_170910,
              my_log_daily_170911,
              my_log_daily_170912,
              my_log_daily_170913,
              my_log_daily_170914,
              my_log_daily_170915,
              my_log_daily_170916,
              my_log_daily_170917,
              my_log_daily_170918,
              my_log_daily_170919,
              my_log_daily_170920,
              my_log_daily_170921,
              my_log_daily_170922,
              my_log_daily_170923,
              my_log_daily_170924,
              my_log_daily_170925,
              my_log_daily_170926,
              my_log_daily_170927,
              my_log_daily_170928,
              my_log_daily_170929,
              my_log_daily_170930,
              my_log_daily_171001,
              my_log_daily_171002,
              my_log_daily_171003,
              my_log_daily_171004,
              my_log_daily_171005,
              my_log_daily_171006,
              my_log_daily_171007,
              my_log_daily_171008,
              my_log_daily_171009,
              my_log_daily_171010

db=>







pgsql-performance by date:

Previous
From: George Neuner
Date:
Subject: Re: [PERFORM] printing results of query to file in different times
Next
From: George Neuner
Date:
Subject: Re: [PERFORM] printing results of query to file in different times