Home > mailing lists

Re: Logical replication - DDL sub transactions for script executed insingle transaction? - Mailing list pgsql-general

From	Pavel Stehule
Subject	Re: Logical replication - DDL sub transactions for script executed insingle transaction?
Date	March 22, 2019 13:43:54
Msg-id	CAFj8pRCYgXdkxv1Ha7=-cwEPgnE9UfAaoLXWcUXrP7USw911rQ@mail.gmail.com Whole thread Raw
In response to	Logical replication - DDL sub transactions for script executed insingle transaction? (Aleš Zelený <zeleny.ales@gmail.com>)
List	pgsql-general

Tree view

pá 22. 3. 2019 v 10:49 odesílatel Aleš Zelený <zeleny.ales@gmail.com> napsal:

Hello,

I've learned that logical replication might have performance problem if there are lot of sub transactions within transaction (at least because it enforces spill files in pg_replslot and if there are many - like 80mio, EXT4 did not perform well - in my case it was caused bu misuse of exception handling in trigger function) and once we get rid of this usage pattern logical replication works like a charm.
Until we have applied schema migration patch creating 700+ tables by flyway, therefore whole file was applied in single transaction (like psql -1 -f xxxx) and it effectively blocked our replication for 22 hours. No failures, wal sender processed did not consumed huge amount of memory (what happened in past wiht 80mi sub-trans caused by trigger), all wal senders did IO very rarely, most of the time they dwell on CPU.

The spill files were not so big:
# du -sh xid-924361123-lsn-FBC-C7000000.snap
224K    xid-924361123-lsn-FBC-C7000000.snap

It takes about 1 minute (all the time 100% CPU used by was sender) to move to following file.

Version, OS:
PostgreSQL 10.6 (Ubuntu 10.6-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609, 64-bit

Perf:
Overhead Command   Shared Object      Symbol
57.79% postgres postgres           [.] hash_seq_search
14.70% postgres postgres           [.] hash_search_with_hash_value
   5.93% postgres postgres           [.] LocalExecuteInvalidationMessage
   3.60% postgres postgres           [.] RelfilenodeMapInvalidateCallback
   2.99% postgres postgres           [.] hash_uint32
   2.50% postgres postgres           [.] hash_search
   1.91% postgres postgres           [.] CatCacheInvalidate
   1.83% postgres postgres           [.] CallSyscacheCallbacks
   1.37% postgres postgres           [.] hash_seq_init
   1.24% postgres postgres           [.] hash_seq_term
   1.02% postgres postgres           [.] RelationCacheInvalidateEntry
   0.97% postgres postgres           [.] ReorderBufferCommit
   0.97% postgres pgoutput.so        [.] rel_sync_cache_relation_cb
   0.78% postgres postgres           [.] uint32_hash
   0.73% postgres postgres           [.] PlanCacheRelCallback
   0.60% postgres postgres           [.] InvalidateCatalogSnapshot
   0.58% postgres postgres           [.] SysCacheInvalidate
   0.35% postgres postgres           [.] GetCurrentTransactionNestLevel
   0.13% postgres pgoutput.so        [.] _init
   0.00% postgres [kernel.kallsyms] [k] __do_softirq
   0.00% postgres [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
   0.00% postgres postgres           [.] ResourceOwnerEnlargeRelationRefs

My suspect is that psql (of flyway) while processing a file with many create table and grant statements in single transaction mode might create sub transactions, but I haven't found it in docs.

Maybe you have disabled autocommit with ON_ERROR_ROLLBACK on

I don't know about another case, where psql can use subtransactions.

Pavel

Does you have similar experience? What is root cause for this issue?

Workaround is clear - do not create that large amount of DDL in single transaction, but it would be nice to know root cause to eliminate scripts/usage patterns causing this issue generally.

On the other hand, following transaction (100GB of spill files) , almost 150 mio rows deleted (and archived by statement trigger) behaves expected way - wal sender process was reading from disk (something like 130MB/s) and after ~20 minutes replication was in sync, all spill files were removed, so transaction changes volume itself does not seems to be any issue.

Thanks for any advice!

Kind regards Ales Zeleny
P.S.: PostgreSQL log facility is superb, having DDL statements logged and XID in log line prefix I was able to identify the issue source in this case.

pgsql-general by date:

From: Thomas Güttler
Date: 22 March 2019, 13:23:54
Subject: Script which shows performance of ByteA: ascii vs binary

From: Christian Henz
Date: 22 March 2019, 13:56:34
Subject: Forks of pgadmin3?

Re: Logical replication - DDL sub transactions for script executed insingle transaction? - Mailing list pgsql-general

Previous

Next