Re: Disable WAL logging to speed up data loading - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Disable WAL logging to speed up data loading |
Date | |
Msg-id | 93135c87-eba4-f614-e124-e9ec254029fa@oss.nttdata.com Whole thread Raw |
In response to | Re: Disable WAL logging to speed up data loading (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: Disable WAL logging to speed up data loading
RE: Disable WAL logging to speed up data loading |
List | pgsql-hackers |
On 2020/10/02 10:06, Kyotaro Horiguchi wrote: > At Thu, 1 Oct 2020 08:14:42 +0000, "osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com> wrote in >> Hi, Horiguchi-San and Fujii-San. >> >> >> Thank you so much both of you. >>>> the table needs to be rewriitten. One idea for that is to improve that >>>> command so that it skips the table rewrite if wal_level=minimal. >>>> Of course, also you can change wal_level after marking the table as >>>> unlogged. >>> >>> tablecmd.c: >> The idea is really interesting. >> I didn't come up with getting rid of the whole copy of >> the ALTER TABLE UNLOGGED/LOGGED commands >> only when wal_level='minimal'. >> >>>> * There are two reasons for requiring a rewrite when changing >>>> * persistence: on one hand, we need to ensure that the buffers >>>> * belonging to each of the two relations are marked with or without >>>> * BM_PERMANENT properly. On the other hand, since rewriting creates >>>> * and assigns a new relfilenode, we automatically create or drop an >>>> * init fork for the relation as appropriate. >> Thanks for sharing concrete comments in the source code. >> >>> According to this comment, perhaps we can do that at least for >>> wal_level=minimal. >> When I compare the 2 ideas, >> one of the benefits of this ALTER TABLE 's improvement >> is that we can't avoid the downtime >> while that of wal_level='none' provides an easy and faster >> major version up via output file of pg_dumpall. > > The speedup has already been achieved with higher durability by > wal_level=minimal in that case. I was thinking the same, i.e., wal_level=minimal + wal_skip_threshold would speed up that initial data loading. > Or maybe you should consider using > pg_upgrade instead. Even inducing the time to take a backup copy of > the whole cluster, running pg_upgrade would be far faster than > pg_dumpall then loading. > >> Both ideas have good points. >> However, actually to modify ALTER TABLE's copy >> looks far more difficult than wal_level='none' and >> beyond my current ability. >> So, I'd like to go forward with the direction of wal_level='none'. >> Did you have strong objections for this direction ? No, I have no strong objection against your trial. But I was thinking that it's not so easy to design and implement wal_level=none. For example, there are some functions and commands depending on the existence of WAL, like pg_switch_wal(), PREPARE TRANSACTION and COMMIT PREPARED. Probably you need to define how they should work in wal_level=none, e.g., emit an error. > For fuel(?) of the discussion, I tried a very-quick PoC for in-place > ALTER TABLE SET LOGGED/UNLOGGED and resulted as attached. After some > trials of several ways, I drifted to the following way after poking > several ways. Nice! > 1. Flip BM_PERMANENT of active buffers > 2. adding/removing init fork > 3. sync files, > 4. Flip pg_class.relpersistence. > > It always skips table copy in the SET UNLOGGED case, Even in wal_level != minimal? What happens in the standby side when SET UNLOGGED is executed without the table rewrite in the primary? The table data should be truncated in the standby? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
pgsql-hackers by date: