Re: Disable WAL logging to speed up data loading - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Disable WAL logging to speed up data loading
Date
Msg-id 93135c87-eba4-f614-e124-e9ec254029fa@oss.nttdata.com
Whole thread Raw
In response to Re: Disable WAL logging to speed up data loading  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Disable WAL logging to speed up data loading
RE: Disable WAL logging to speed up data loading
List pgsql-hackers

On 2020/10/02 10:06, Kyotaro Horiguchi wrote:
> At Thu, 1 Oct 2020 08:14:42 +0000, "osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com> wrote in
>> Hi, Horiguchi-San and Fujii-San.
>>
>>
>> Thank you so much both of you.
>>>> the table needs to be rewriitten. One idea for that is to improve that
>>>> command so that it skips the table rewrite if wal_level=minimal.
>>>> Of course, also you can change wal_level after marking the table as
>>>> unlogged.
>>>
>>> tablecmd.c:
>> The idea is really interesting.
>> I didn't come up with getting rid of the whole copy of
>> the ALTER TABLE UNLOGGED/LOGGED commands
>> only when wal_level='minimal'.
>>
>>>> * There are two reasons for requiring a rewrite when changing
>>>> * persistence: on one hand, we need to ensure that the buffers
>>>> * belonging to each of the two relations are marked with or without
>>>> * BM_PERMANENT properly.  On the other hand, since rewriting creates
>>>> * and assigns a new relfilenode, we automatically create or drop an
>>>> * init fork for the relation as appropriate.
>> Thanks for sharing concrete comments in the source code.
>>
>>> According to this comment, perhaps we can do that at least for
>>> wal_level=minimal.
>> When I compare the 2 ideas,
>> one of the benefits of this ALTER TABLE 's improvement
>> is that we can't avoid the downtime
>> while that of wal_level='none' provides an easy and faster
>> major version up via output file of pg_dumpall.
> 
> The speedup has already been achieved with higher durability by
> wal_level=minimal in that case.

I was thinking the same, i.e., wal_level=minimal + wal_skip_threshold would
speed up that initial data loading.


>  Or maybe you should consider using
> pg_upgrade instead.  Even inducing the time to take a backup copy of
> the whole cluster, running pg_upgrade would be far faster than
> pg_dumpall then loading.
> 
>> Both ideas have good points.
>> However, actually to modify ALTER TABLE's copy
>> looks far more difficult than wal_level='none' and
>> beyond my current ability.
>> So, I'd like to go forward with the direction of wal_level='none'.
>> Did you have strong objections for this direction ?

No, I have no strong objection against your trial. But I was thinking
that it's not so easy to design and implement wal_level=none.
For example, there are some functions and commands depending on
the existence of WAL, like pg_switch_wal(), PREPARE TRANSACTION
and COMMIT PREPARED. Probably you need to define how they should
work in wal_level=none, e.g., emit an error.


> For fuel(?) of the discussion, I tried a very-quick PoC for in-place
> ALTER TABLE SET LOGGED/UNLOGGED and resulted as attached. After some
> trials of several ways, I drifted to the following way after poking
> several ways.

Nice!


> 1. Flip BM_PERMANENT of active buffers
> 2. adding/removing init fork
> 3. sync files,
> 4. Flip pg_class.relpersistence.
> 
> It always skips table copy in the SET UNLOGGED case,

Even in wal_level != minimal?
What happens in the standby side when SET UNLOGGED is executed without
the table rewrite in the primary? The table data should be truncated
in the standby?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: New statistics for tuning WAL buffer size
Next
From: Ian Barwick
Date:
Subject: Re: Improving connection scalability: GetSnapshotData()