Re: Logical replication CPU-bound with TRUNCATE/DROP/CREATE many tables - Mailing list pgsql-hackers

From Keisuke Kuroda
Subject Re: Logical replication CPU-bound with TRUNCATE/DROP/CREATE many tables
Date
Msg-id CANDwgg+DmVvk_dRmcpG4J+WS2wQmhJ-okikbH2oGF_Byg1_iSw@mail.gmail.com
Whole thread Raw
In response to Re: Logical replication CPU-bound with TRUNCATE/DROP/CREATE many tables  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Logical replication CPU-bound with TRUNCATE/DROP/CREATE many tables
List pgsql-hackers
> > I test the patch on the master HEAD(9796f455) and it works fine.
> > * make installcheck-world: passed
> > * walsender process continues to use 100% of the CPU 1core when
> > TRUNCATE 1000 partition: 1s or less
> > ** before patch : 230s
>
> Does this result indicate that it is still CPU bound but it does the
> actual decoding and completes in 1s instead of spending 230s mainly to
> execute unnecessary invalidations?

Okay, to check the decodeing and invalidation times,
I got the time to return the results to pg_recvlogical and perf.

Before the patch, most of the processing was done
by hash_seq_search in the ReferenodeMapInvalidateCallback.
After the patch, this problem has been resolved.

# test

(1) start pg_recvlogical
(2) INSERT to 2000 partition for creating RelfilenodemapHash
(3) TRUNCATE 1000 partition

(1)
pg_recvlogical --create-slot --start -f - --if-not-exists \
--plugin=test_decoding --slot=test1 --dbname=postgres --username=postgres \
--option=include-timestamp | gawk '{ print strftime("%Y-%m-%d
%H:%M:%S"), $0; fflush(); }'

(2)
DO $$
DECLARE
    i INT;
    j INT;
    schema TEXT;
    tablename TEXT;
BEGIN
    FOR i IN 1 .. 2 LOOP
        schema := 'nsp_' || to_char(i, 'FM000');
        tablename := 'tbl_' || to_char(i, 'FM000');
        EXECUTE 'INSERT INTO ' || schema || '.' || tablename || '
         SELECT i,i,i,0,0,0,0,0,0,0,0,0,''a''::bytea,''a''::bytea FROM
generate_series(0,9999,1) AS i';
    END LOOP;
END;
$$ LANGUAGE plpgsql;

(3)
TRUNCATE TABLE nsp_001.tbl_001;

# before

(3) TRUNCATE
decode + invalidation = 222s

2020-10-01 16:58:29 BEGIN 529
2020-10-01 17:02:11 COMMIT 529 (at 2020-10-01 16:58:29.281747+09)

(Before the patch was applied, I was concerned that
the "table nsp_001.tbl_001 ... TRUNCATE" was not outputting.)

Samples: 228K of event 'cpu-clock:uhH', Event count (approx.): 57080750000
  Overhead  Command   Shared Object       Symbol
-   96.00%  postgres  postgres            [.] hash_seq_search
     hash_seq_search
     - RelfilenodeMapInvalidateCallback
      - LocalExecuteInvalidationMessage
        ReorderBufferExecuteInvalidations
        ReorderBufferProcessTXN
        ReorderBufferCommit
        DecodeCommit
        DecodeXactOp
        LogicalDecodingProcessRecord
        XLogSendLogical
        WalSndLoop
        StartLogicalReplication
        exec_replication_command
        PostgresMain
        BackendRun
        BackendStartup
        ServerLoop
        PostmasterMain
        main
        __libc_start_main
        _start
+    0.77%  postgres  postgres            [.] LocalExecuteInvalidationMessage
     0.47%  postgres  postgres            [.] hash_bytes_uint32
     0.42%  postgres  postgres            [.] hash_search_with_hash_value
     0.37%  postgres  postgres            [.] RelfilenodeMapInvalidateCallback
     0.32%  postgres  postgres            [.] CatCacheInvalidate
     0.23%  postgres  postgres            [.] PlanCacheRelCallback
     0.23%  postgres  postgres            [.] ReorderBufferExecuteInvalidations

# after

(3) TRUNCATE
decode + invalidation = 1s or less

2020-10-01 17:09:43 BEGIN 537
2020-10-01 17:09:43 table nsp_001.tbl_001, nsp_001.part_0001,
nsp_001.part_0002 ... nsp_001.part_0999, nsp_001.part_1000: TRUNCATE:
(no-flags)
2020-10-01 17:09:43 COMMIT 537 (at 2020-10-01 17:09:43.192989+09)

Samples: 337  of event 'cpu-clock:uhH', Event count (approx.): 84250000
  Overhead  Command   Shared Object       Symbol
-   47.48%  postgres  postgres            [.] hash_seq_search
     hash_seq_search
     RelfilenodeMapInvalidateCallback
     LocalExecuteInvalidationMessage
     ReorderBufferExecuteInvalidations
     ReorderBufferProcessTXN
     ReorderBufferCommit
     DecodeCommit
     DecodeXactOp
     LogicalDecodingProcessRecord
     XLogSendLogical
     WalSndLoop
     StartLogicalReplication
     exec_replication_command
     PostgresMain
     BackendRun
     BackendStartup
     ServerLoop
     PostmasterMain
     main
     __libc_start_main
     _start
+    3.26%  postgres  postgres            [.] heap_hot_search_buffer
+    2.67%  postgres  postgres            [.] pg_comp_crc32c_sse42
+    2.08%  postgres  libc-2.17.so        [.] __memcpy_ssse3_back
+    2.08%  postgres  postgres            [.] DecodeXLogRecord
+    2.08%  postgres  postgres            [.] hash_search_with_hash_value
+    1.48%  postgres  libpthread-2.17.so  [.] __libc_recv
+    1.19%  postgres  libc-2.17.so        [.] __memset_sse2
+    1.19%  postgres  libpthread-2.17.so  [.] __errno_location
+    1.19%  postgres  postgres            [.] LocalExecuteInvalidationMessage
+    1.19%  postgres  postgres            [.] ReadPageInternal
+    1.19%  postgres  postgres            [.] XLogReadRecord
+    1.19%  postgres  postgres            [.] socket_is_send_pending
     hash_seq_search

Best Regards,
-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3862@gmail.com



pgsql-hackers by date:

Previous
From: "osumi.takamichi@fujitsu.com"
Date:
Subject: RE: Disable WAL logging to speed up data loading
Next
From: "Drouvot, Bertrand"
Date:
Subject: Re: Add information to rm_redo_error_callback()