Thread: Optimization of vacuum for logical replication
Hi, hackers.

Right now, if wal_level is greater than or equal to "replica", vacuum of a relation copies all of its data to the WAL:

    /*
     * We need to log the copied data in WAL iff WAL archiving/streaming is
     * enabled AND it's a WAL-logged rel.
     */
    use_wal = XLogIsNeeded() && RelationNeedsWAL(NewHeap);

Obviously we have to do this for physical replication and WAL archiving. But why do we need such an expensive operation (actually copying all table data three times) if we use only logical replication? Logically, vacuum doesn't change the relation, so there is no need to write any data to the log and have the WAL sender process it.

I wonder if we can check that

1. wal_level is "logical"
2. There are no physical replication slots
3. WAL archiving is disabled

and in that case not write the cloned relation to the WAL?

A small patch implementing this behavior is attached to this mail. It significantly reduces WAL size when performing vacuum on multimaster, which uses logical replication between cluster nodes.

What can be wrong with such an optimization?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
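For readers following along, the three-part check proposed above can be modeled as a small predicate. This is only a sketch under stated assumptions, not the attached patch: the `Sketch*` types and the `sketch_*` functions are hypothetical stand-ins for server state, and only the shape of the existing `use_wal` test is taken from the quoted code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for server state; not PostgreSQL's real definitions. */
typedef enum
{
    SKETCH_WAL_MINIMAL,
    SKETCH_WAL_REPLICA,
    SKETCH_WAL_LOGICAL
} SketchWalLevel;

typedef struct
{
    SketchWalLevel wal_level;          /* configured WAL level */
    int            physical_slots;     /* number of physical replication slots */
    bool           archiving_enabled;  /* is WAL archiving turned on? */
} SketchServerState;

/* True only when all three conditions listed in the proposal hold. */
bool
sketch_can_skip_rewrite_wal(const SketchServerState *s)
{
    return s->wal_level == SKETCH_WAL_LOGICAL
        && s->physical_slots == 0
        && !s->archiving_enabled;
}

/*
 * Same shape as the existing use_wal test, with the proposed exception
 * added as a third term.
 */
bool
sketch_use_wal(const SketchServerState *s, bool rel_needs_wal)
{
    /* Plays the role of XLogIsNeeded() in this model. */
    bool wal_needed = s->wal_level >= SKETCH_WAL_REPLICA;

    return wal_needed && rel_needs_wal && !sketch_can_skip_rewrite_wal(s);
}
```

The model decides purely from configuration and the slot count; it knows nothing about which clients are actually connected.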
Attachment
On Wednesday, 21.08.2019, at 12:20 +0300, Konstantin Knizhnik wrote:
> I wonder if we can check that
>
> 1. wal_level is "logical"
> 2. There are no physical replication slots
> 3. WAL archiving is disabled

Not sure I get that correctly: I can still have a physical standby without replication slots connected to such an instance. How would your idea handle this situation?

Bernd
On 21.08.2019 12:34, Bernd Helmle wrote:
> On Wednesday, 21.08.2019, at 12:20 +0300, Konstantin Knizhnik wrote:
>> I wonder if we can check that
>>
>> 1. wal_level is "logical"
>> 2. There are no physical replication slots
>> 3. WAL archiving is disabled
> Not sure I get that correctly: I can still have a physical standby
> without replication slots connected to such an instance. How would your
> idea handle this situation?

Yes, it is possible to have a physical replica without a replication slot. But it is not safe, because there is always a risk that the lag between master and replica becomes larger than the amount of WAL kept on the master. Also, I can't believe that a DBA who explicitly sets wal_level to logical will use streaming replication without an associated replication slot.

And certainly it is possible to add a GUC that controls this optimization.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hello

> Also, I can't believe that a DBA who explicitly sets wal_level to
> logical will use streaming replication without an associated replication slot.

I am.

> Yes, it is possible to have a physical replica without a replication slot.
> But it is not safe, because there is always a risk that the lag between
> master and replica becomes larger than the amount of WAL kept on the master.

Just an example: a replica for manual queries, QA purposes, or something else that is not an important part of the system. If I use replication slots, my risk is running out of disk space on the primary and therefore a shutdown of the primary, with downtime for the application. If I use wal_keep_segments instead, I have a limited (and usually stable) amount of WAL, but I risk having an outdated replica. I prefer an outdated replica if it keeps the primary safer. It's OK for me to just take a fresh pg_basebackup from another replica. And the application wants to use logical replication, so wal_level = logical.

If we do not want to support such a use case, we need to explicitly forbid replication without replication slots.

regards, Sergei
On Wednesday, 21.08.2019, at 13:26 +0300, Konstantin Knizhnik wrote:
> Yes, it is possible to have a physical replica without a replication
> slot.
> But it is not safe, because there is always a risk that the lag between
> master and replica becomes larger than the amount of WAL kept on the master.

Sure, but that doesn't mean the use cases for this aren't real.

> Also, I can't believe that a DBA who explicitly sets wal_level
> to
> logical will use streaming replication without an associated replication
> slot.

Well, I know people doing exactly this, for various reasons (short-lived replicas, logically replicated table sets for reports, ...). The fact that they can have loosely coupled replicas with either physical or logical replication is a feature they'd really miss.

Bernd
On 21.08.2019 14:45, Bernd Helmle wrote:
> On Wednesday, 21.08.2019, at 13:26 +0300, Konstantin Knizhnik wrote:
>> Yes, it is possible to have a physical replica without a replication
>> slot.
>> But it is not safe, because there is always a risk that the lag between
>> master and replica becomes larger than the amount of WAL kept on the master.
> Sure, but that doesn't mean the use cases for this aren't real.
>
>> Also, I can't believe that a DBA who explicitly sets wal_level
>> to
>> logical will use streaming replication without an associated replication
>> slot.
> Well, I know people doing exactly this, for various reasons (short-lived
> replicas, logically replicated table sets for reports, ...). The
> fact that they can have loosely coupled replicas with either physical
> or logical replication is a feature they'd really miss.
>
> Bernd
>

OK, you convinced me that there are cases where people want to combine logical replication with streaming replication without a slot. But is it acceptable to have a GUC variable (disabled by default) that enables this optimization?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hello.

At Wed, 21 Aug 2019 18:06:52 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in <968fc591-51d3-fd74-8a55-40aa770baa3a@postgrespro.ru>
> OK, you convinced me that there are cases where people want to combine
> logical replication with streaming replication without a slot.
> But is it acceptable to have a GUC variable (disabled by default) that
> enables this optimization?

The odds are quite high. Couldn't we introduce a new wal_level value instead?

wal_level = logical_only

I think this thread shows that logical replication is no longer a superset(?) of physical replication. I thought that we might be able to change wal_level from a scalar to a bitmap, but that breaks backward compatibility.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
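The scalar-versus-bitmap point can be illustrated with a toy model. This is only a sketch under assumptions: every `Sketch*` and `sketch_*` name is hypothetical, and the only thing borrowed from PostgreSQL is the general idea that wal_level is an ordered setting whose checks compare against a level, as `XLogIsNeeded()` does. With ordered levels, "logical" necessarily implies everything "replica" provides, so a "logical but not physical" mode has no natural position in the ordering. A bitmap makes the two capabilities independent.

```c
#include <stdbool.h>

/* Ordered levels: each level implies everything below it. */
typedef enum
{
    SKETCH_LEVEL_MINIMAL = 0,
    SKETCH_LEVEL_REPLICA,
    SKETCH_LEVEL_LOGICAL
} SketchLevel;

/* With a scalar, capability checks are threshold comparisons. */
bool sketch_level_physical(SketchLevel l) { return l >= SKETCH_LEVEL_REPLICA; }
bool sketch_level_logical(SketchLevel l)  { return l >= SKETCH_LEVEL_LOGICAL; }

/* Bitmap alternative: capabilities are independent flags. */
enum
{
    SKETCH_CAP_PHYSICAL = 1 << 0,
    SKETCH_CAP_LOGICAL  = 1 << 1
};

/* With a bitmap, each capability is tested on its own bit. */
bool sketch_caps_physical(unsigned caps) { return (caps & SKETCH_CAP_PHYSICAL) != 0; }
bool sketch_caps_logical(unsigned caps)  { return (caps & SKETCH_CAP_LOGICAL) != 0; }
```

In this model, `sketch_level_physical(SKETCH_LEVEL_LOGICAL)` is always true, whereas `SKETCH_CAP_LOGICAL` alone expresses a logical-only configuration. The cost mentioned above is that existing settings and any code relying on ordered comparisons would have to change.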
On 22.08.2019 6:13, Kyotaro Horiguchi wrote:
> Hello.
>
> At Wed, 21 Aug 2019 18:06:52 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in <968fc591-51d3-fd74-8a55-40aa770baa3a@postgrespro.ru>
>> OK, you convinced me that there are cases where people want to combine
>> logical replication with streaming replication without a slot.
>> But is it acceptable to have a GUC variable (disabled by default) that
>> enables this optimization?
> The odds are quite high. Couldn't we introduce a new wal_level
> value instead?
>
> wal_level = logical_only
>
>
> I think this thread shows that logical replication is no longer a
> superset(?) of physical replication. I thought that we might be
> able to change wal_level from a scalar to a bitmap, but that breaks
> backward compatibility.
>
> regards.
>

I think that introducing a new wal_level is a good idea. There are a lot of other places (besides vacuum) where we write information to the log that is not needed for logical decoding. Instead of changing every place in the code where this information is inserted, we can filter it at the xlog level (xlog.c).

My only concern is how many incompatibilities will be caused by introducing a new WAL level.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 22.08.2019 6:13, Kyotaro Horiguchi wrote:
> Hello.
>
> At Wed, 21 Aug 2019 18:06:52 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in <968fc591-51d3-fd74-8a55-40aa770baa3a@postgrespro.ru>
>> OK, you convinced me that there are cases where people want to combine
>> logical replication with streaming replication without a slot.
>> But is it acceptable to have a GUC variable (disabled by default) that
>> enables this optimization?
> The odds are quite high. Couldn't we introduce a new wal_level
> value instead?
>
> wal_level = logical_only
>
>
> I think this thread shows that logical replication is no longer a
> superset(?) of physical replication. I thought that we might be
> able to change wal_level from a scalar to a bitmap, but that breaks
> backward compatibility.
>
> regards.
>

I can propose the following patch, which introduces the new level logical_only. I will be pleased to receive comments on adding a new wal_level and on possible problems it may cause.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company