Thread: Implement UNLOGGED clause for COPY FROM

Implement UNLOGGED clause for COPY FROM

From
"osumi.takamichi@fujitsu.com"
Date:
Hello, hackers.


I've been thinking to suggest
a peformance-oriented feature for COPY FROM.
It's UNLOGGED clause, which means data loading skipping WAL generation.

How to make it work is the following.
1. Aquire ACCESS EXCLUSIVE mode to lock the target table and its indexes.
2. Mark those relations 'unrecoverable' in pg_class/pg_index.
3. Issue one WAL to indicate when COPY UNLOGGED is executed.
4. Execute the data loading, bypassing WAL generation for data.
5. Sync the data to disk by performing checkpoint.

During the recovery,
I'd like to make postgres recognize both marked 'unrecoverable' flags of the second step
and when the command was issued by the third step
in order to recover data that the target table had before the execution of COPY UNLOGGED.

Oracle's SQL*Loader has
such a feature called UNRECOVERABLE to boost the loading speed
for severe time limit of workload.


Best,
    Takamichi Osumi



Re: Implement UNLOGGED clause for COPY FROM

From
Fujii Masao
Date:

On 2020/07/09 11:36, osumi.takamichi@fujitsu.com wrote:
> Hello, hackers.
> 
> 
> I've been thinking to suggest
> a peformance-oriented feature for COPY FROM.
> It's UNLOGGED clause, which means data loading skipping WAL generation.

This feature can work safely with wal_level=replica or logical?
Or it can work only with wal_level=minimal? If yes, what is the main
difference between this method and wal_skip_threshold?

> 
> How to make it work is the following.
> 1. Aquire ACCESS EXCLUSIVE mode to lock the target table and its indexes.
> 2. Mark those relations 'unrecoverable' in pg_class/pg_index.
> 3. Issue one WAL to indicate when COPY UNLOGGED is executed.
> 4. Execute the data loading, bypassing WAL generation for data.
> 5. Sync the data to disk by performing checkpoint.

What happens if the server crashes before #5? Since no WAL for
data-loading can be replayed, the target table should be truncated?

Regards,


-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Implement UNLOGGED clause for COPY FROM

From
"David G. Johnston"
Date:
On Wednesday, July 8, 2020, osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote:
5. Sync the data to disk by performing checkpoint.

This step seems to invalidate the idea outright.  The checkpoint command is superuser only and isn’t table specific.  This seems to require both those things to be changed.

Aside from that, though, how does this improve upon the existing capability to copy into an unlogged temporary table?

David J.

RE: Implement UNLOGGED clause for COPY FROM

From
"osumi.takamichi@fujitsu.com"
Date:
Fujii-san

Thank you for your interest in this idea.

> This feature can work safely with wal_level=replica or logical?
> Or it can work only with wal_level=minimal?
>If yes, what is the main difference
> between this method and wal_skip_threshold?
I'm thinking this feature can be used
when you set any parameters of wal_level.
Besides that, data loading into a table *with some data*
should be allowed. This means I don't want to limit
the usage of this feature only for initial load
for empty table or under condition of 'minimal' wal_level in other words.

Let me explain more detail of the background.

I got a report that one of my customers says that
multiple COPY from multiple sessions have a bottleneck to write WAL.
Her use case was DWH system using postgres mainly to load dozens of GB (or more) log data
from multiple data sources to execute night batch processing everyday.

Her scenario included both initial load to empty table
and load to table that already has records.

In passing, she also used our company's product of parallel loader,
to load data with dozens of, nearly 100, BGWs at the same time.
Through investigation of iostat,
they found the same problem that CPU worked for WAL write intensively.

This could happen after the implementation
of Parallel copy that is really hotly discussed and reviewed in the mailing lists now.
So I thought it's good to discuss this in advance.

> > 4. Execute the data loading, bypassing WAL generation for data.
> > 5. Sync the data to disk by performing checkpoint.
>
> What happens if the server crashes before #5? Since no WAL for data-loading can
> be replayed, the target table should be truncated?
My answer for this is just to load that COPY data again.
It's because the application itself knows what kind of data was loaded
from the command.

Lastly, let me add some functional specifications of this clause.
The syntax is "COPY tbl FROM ‘/path/to/input/file’ UNLOGGED".

In terms of streaming replication,
I'd like to ask for advice of other members in this community.
Now, I think this feature requires to re-create standby
immediately after the COPY UNLOGGED like Oracle's clause
but I wanna make postgres more attractive than Oracle to users.
Does someone have any ideas ?


Regards,
    Takamichi Osumi



RE: Implement UNLOGGED clause for COPY FROM

From
"tsunakawa.takay@fujitsu.com"
Date:

From: David G. Johnston <david.g.johnston@gmail.com>

> This step seems to invalidate the idea outright.  The checkpoint command is superuser only and isn’t table specific.  This seems to require both those things to be changed.

 

Perhaps FlushRelationBuffers() followed by smgrsync() can be used instead.  Or, depending on the assumed use case (e.g. the DBA adds data regularly for analytics), we may allow COPY UNLOGGED to be used only by superusers and some special pg_ roles, and COPY UNLOGGED performs checkpoints.  Anyway, I kind of feel that COPY UNLOGGED needs some special privileges, because it renders the table unrecoverable and not being replicated to the standby.

 

 

Regards

Takayuki Tsunakawa

 

Re: Implement UNLOGGED clause for COPY FROM

From
Amit Kapila
Date:
On Thu, Jul 9, 2020 at 11:47 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
>
> In terms of streaming replication,
> I'd like to ask for advice of other members in this community.
> Now, I think this feature requires to re-create standby
> immediately after the COPY UNLOGGED like Oracle's clause
>

This seems quite limiting to me and I think the same will be true for
subscribers that get data via logical replication, right?  I suspect
that the user will perform such an operation from time-to-time and
each time creating replica again could be really time-consuming and
maybe more than it will save by making this operation unlogged.

I wonder do they really need to replicate such a table and its data
because each time creating a replica from scratch after an operation
on one table doesn't sound advisable to me?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Implement UNLOGGED clause for COPY FROM

From
Fujii Masao
Date:

On 2020/07/09 15:17, osumi.takamichi@fujitsu.com wrote:
> Fujii-san
> 
> Thank you for your interest in this idea.
> 
>> This feature can work safely with wal_level=replica or logical?
>> Or it can work only with wal_level=minimal?
>> If yes, what is the main difference
>> between this method and wal_skip_threshold?
> I'm thinking this feature can be used
> when you set any parameters of wal_level.
> Besides that, data loading into a table *with some data*
> should be allowed. This means I don't want to limit
> the usage of this feature only for initial load
> for empty table or under condition of 'minimal' wal_level in other words.
> 
> Let me explain more detail of the background.
> 
> I got a report that one of my customers says that
> multiple COPY from multiple sessions have a bottleneck to write WAL.
> Her use case was DWH system using postgres mainly to load dozens of GB (or more) log data
> from multiple data sources to execute night batch processing everyday.
> 
> Her scenario included both initial load to empty table
> and load to table that already has records.

Yes, I understand this use case.


> 
> In passing, she also used our company's product of parallel loader,
> to load data with dozens of, nearly 100, BGWs at the same time.
> Through investigation of iostat,
> they found the same problem that CPU worked for WAL write intensively.
> 
> This could happen after the implementation
> of Parallel copy that is really hotly discussed and reviewed in the mailing lists now.
> So I thought it's good to discuss this in advance.
> 
>>> 4. Execute the data loading, bypassing WAL generation for data.
>>> 5. Sync the data to disk by performing checkpoint.
>>
>> What happens if the server crashes before #5? Since no WAL for data-loading can
>> be replayed, the target table should be truncated?
> My answer for this is just to load that COPY data again.
> It's because the application itself knows what kind of data was loaded
> from the command.

When the server crashes before  #5, some table and index pages that #4 loaded
the data into might have been partially synced to the disk because of bgwriter
or shared buffer replacement. So ISTM that the target table needs to be
truncated to empty during recovery and users need to load whole the data into
the table again. Is my understanding right? If yes, isn't this the same feature
as that UNLOGGED table provides?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



RE: Implement UNLOGGED clause for COPY FROM

From
"osumi.takamichi@fujitsu.com"
Date:

Hi David Johnston

 

Thank you for your comment.

Aside from that, though, how does this improve upon the existing capability to copy into an unlogged temporary table?

 

[>] unlogged temporary table can’t be inherited over sessions first of all.

And unlogged table needs to be recreated due to startup truncation of the table’s content

when the server crashes.

If you hold massive data in an unlogged table,

you’d forced to spend much time to recover it. This isn’t good.

 

So I’m thinking that COPY UNLOGGED’d provide a more flexible way for keeping data

for COPY FROM users.

 

I’m considering that the feature gives them a choice that during ordinary operation

you can keep WAL logging for a target table and when you need high-speed loading

you can bypass WAL generation for it.

 

To achieve this, we have to

consider a new idea like loaded data’d be added

at the end of the all other pages and detach those

if the server crashes during the UNLOGGED loading processing for example.

 

By the way, “ALTER TABLE tbl SET UNLOGGED” is supported by postgres.

You may think it’s OK to change LOGGED table to UNLOGGED table by this command.

But, it copies the whole relation once actually. (This isn’t written in the manual.)

So this command becomes slow if the table the command is applied to contains a lot of data.

Thus changing the table’s status of UNLOGGED/LOGGED also requires cost at the moment and I think this copy is an obstacle for switching that table’s status.

 

The discussion of the reason is written in the url below.

https://www.postgresql.org/message-id/flat/CAFcNs%2Bpeg3VPG2%3Dv6Lu3vfCDP8mt7cs6-RMMXxjxWNLREgSRVQ%40mail.gmail.com

 

Best

            Takamichi Osumi

 

Re: Implement UNLOGGED clause for COPY FROM

From
Tom Lane
Date:
"osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com> writes:
>> Aside from that, though, how does this improve upon the existing capability to copy into an unlogged temporary
table?

> [>] unlogged temporary table can’t be inherited over sessions first of all.

Unlogged tables don't have to be temporary.

> And unlogged table needs to be recreated due to startup truncation of the table’s content
> when the server crashes.

Indeed, and your proposed feature would extend that un-safety to tables
that are NOT marked unlogged, which is not good.

AFAICS, we can already accomplish basically the same thing as what you
want to do like this:

alter table foo set unlogged;
copy foo from ...;
alter table foo set logged;

The mechanics of that are already well worked out.  It's expensive,
no doubt about that, but I think you're just fooling yourself to
imagine that any shortcuts are possible.  A mix of unlogged and logged
data is never going to work safely.

> To achieve this, we have to
> consider a new idea like loaded data’d be added
> at the end of the all other pages and detach those
> if the server crashes during the UNLOGGED loading processing for example.

You keep on ignoring the indexes... not to mention replication.

            regards, tom lane



RE: Implement UNLOGGED clause for COPY FROM

From
"osumi.takamichi@fujitsu.com"
Date:
Hi,

> AFAICS, we can already accomplish basically the same thing as what you want to
> do like this:
> 
> alter table foo set unlogged;
> copy foo from ...;
> alter table foo set logged;
This didn't satisfy what I wanted.
In case that 'foo' has huge amount of rows at the beginning,
this example would spend much time to copy
the contents of 'foo' twice to swap relfilenodes atomically.
When that loaded data by COPY is big too, its execution time becomes much longer.

> You keep on ignoring the indexes... not to mention replication.
Sorry for having made you think like this.

When the server crash occurs during data loading of COPY UNLOGGED,
it's a must to keep index consistent of course.
I'm thinking that to rebuild the indexes on the target table would work.

In my opinion, UNLOGGED clause must be designed to guarantee that 
where the data loaded by this clause is written starts from the end of all other data blocks.
Plus, those blocks needs to be protected by any write of other transactions during the copy.
Apart from that, the server must be aware of which block is the first block,
or the range about where it started or ended in preparation for the crash.

During the crash recovery, those points are helpful to recognize and detach such blocks
in order to solve a situation that the loaded data is partially synced to the disk and the rest isn't.
In terms of index, we can recreate index based on the relation's data
protected by this mechanism above.

Another idea of index crash recovery was to
copy the indexes on the target table as a backup just before loading and
write new added indexes from loaded data into this temporary index files
in order to localize the new indexes. But, my purpose is to accelerate speed
of data loading under the condition that target table has huge amount of data initially.
Taking this purpose into an evaluation criterion, the initial copy of indexes
would make the execution slow down. Thus, I choose rebuilding index.

Another point I need to add for recovery would be
how the startup postgres knows the condition of COPY UNLOGGED clause.
My current idea is to utilize any other system file or
create a new system file in the cluster for this clause.
At least, it would become necessary for postgres to identify
which blocks should be detached at the beginning when the command is executed.
Therefore, we need add information for it.

Lastly, I have to admit that 
the status of target table where data is loaded by COPY UNLOGGED would be marked
as invalid and notified to standbys under replication environment
from the point in time when the operation takes place.
But, I'm going to allow users with special privileges (like DBA) to use this clause
and this kind of tables would be judged by them not to replicate.
Of course, I'm thinking better idea but now what I can say is like this for replication.

Best,
    Takamichi Osumi

Re: Implement UNLOGGED clause for COPY FROM

From
Amit Kapila
Date:
On Fri, Jul 17, 2020 at 9:53 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> Lastly, I have to admit that
> the status of target table where data is loaded by COPY UNLOGGED would be marked
> as invalid and notified to standbys under replication environment
> from the point in time when the operation takes place.
> But, I'm going to allow users with special privileges (like DBA) to use this clause
> and this kind of tables would be judged by them not to replicate.
> Of course, I'm thinking better idea but now what I can say is like this for replication.
>

If you are going to suggest users not to replicate such tables then
why can't you suggest them to create such tables as UNLOGGED in the
first place?  Another idea could be that you create an 'unlogged'
table, copy the data to it.  Then perform Alter Table .. SET Logged
and attach it to the main table.  I think for this you need the main
table to be partitioned but I think if that is possible then it might
be better than all the hacking you are proposing to do in the server
for this special operation.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



RE: Implement UNLOGGED clause for COPY FROM

From
"osumi.takamichi@fujitsu.com"
Date:
Hi. Amit-san


> If you are going to suggest users not to replicate such tables then why can't you
> suggest them to create such tables as UNLOGGED in the first place?  Another
> idea could be that you create an 'unlogged'
> table, copy the data to it.  Then perform Alter Table .. SET Logged and attach it to
> the main table.  I think for this you need the main table to be partitioned but I
> think if that is possible then it might be better than all the hacking you are
> proposing to do in the server for this special operation.
Thank you for your comment.

At the beginning, I should have mentioned this function was
for data warehouse, where you need to load large amounts of data
in the shortest amount of time. 
Sorry for my bad explanation. 

Based on the fact that data warehouse cannot be separated from
usage of applications like B.I. tool in general,
we cannot define unlogged table at the beginning easily.
Basically, such tools don't support to define unlogged table as far as I know.

And if you want to do so, you need *modification or fix of existing application*
which is implemented by a third party and commercially available for data analytics.
In other words, to make CREATE UNLOGGED TABLE available in that application,
you must revise the product's source code of the application directly,
which is an act to invalidate the warranty from the software company of B.I. tool.
In my opinion, it would be like unrealistic for everyone to do so.

Best,
    Takamichi Osumi

Re: Implement UNLOGGED clause for COPY FROM

From
Amit Kapila
Date:
On Wed, Jul 22, 2020 at 11:11 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> > If you are going to suggest users not to replicate such tables then why can't you
> > suggest them to create such tables as UNLOGGED in the first place?  Another
> > idea could be that you create an 'unlogged'
> > table, copy the data to it.  Then perform Alter Table .. SET Logged and attach it to
> > the main table.  I think for this you need the main table to be partitioned but I
> > think if that is possible then it might be better than all the hacking you are
> > proposing to do in the server for this special operation.
> Thank you for your comment.
>
> At the beginning, I should have mentioned this function was
> for data warehouse, where you need to load large amounts of data
> in the shortest amount of time.
> Sorry for my bad explanation.
>
> Based on the fact that data warehouse cannot be separated from
> usage of applications like B.I. tool in general,
> we cannot define unlogged table at the beginning easily.
> Basically, such tools don't support to define unlogged table as far as I know.
>
> And if you want to do so, you need *modification or fix of existing application*
> which is implemented by a third party and commercially available for data analytics.
> In other words, to make CREATE UNLOGGED TABLE available in that application,
> you must revise the product's source code of the application directly,
> which is an act to invalidate the warranty from the software company of B.I. tool.
> In my opinion, it would be like unrealistic for everyone to do so.
>

So, does this mean that the need for data warehouse application can be
satisfied if the table would have been an 'UNLOGGED'?  However, you
still need 'COPY UNLOGGED ..' syntax because they don't have control
over table definition.  I think if this is the case, then the
application user should find a way for this.  BTW, if the application
is anyway going to execute a PG native syntax like 'COPY UNLOGGED ..'
then why can't they simply set tables as UNLOGGED by using Alter
Table?

IIUC, I think here the case is that the applications are not allowed
to change the standard definitions of tables owned/created by B.I tool
and instead they use some Copy Unlogged sort of syntax provided by
other databases to load the data at a faster speed and now they expect
PG to also have a similar alternative.  If that is true, then it is
possible that those database doesn't have 'Create Unlogged Table ...'
types of syntax which PG has and that is why they have provided such
an alternative for Copy kind of commands.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Implement UNLOGGED clause for COPY FROM

From
Masahiko Sawada
Date:
On Fri, 17 Jul 2020 at 13:23, osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> Hi,
>
> > AFAICS, we can already accomplish basically the same thing as what you want to
> > do like this:
> >
> > alter table foo set unlogged;
> > copy foo from ...;
> > alter table foo set logged;
> This didn't satisfy what I wanted.
> In case that 'foo' has huge amount of rows at the beginning,
> this example would spend much time to copy
> the contents of 'foo' twice to swap relfilenodes atomically.
> When that loaded data by COPY is big too, its execution time becomes much longer.
>
> > You keep on ignoring the indexes... not to mention replication.
> Sorry for having made you think like this.
>
> When the server crash occurs during data loading of COPY UNLOGGED,
> it's a must to keep index consistent of course.
> I'm thinking that to rebuild the indexes on the target table would work.
>
> In my opinion, UNLOGGED clause must be designed to guarantee that
> where the data loaded by this clause is written starts from the end of all other data blocks.
> Plus, those blocks needs to be protected by any write of other transactions during the copy.
> Apart from that, the server must be aware of which block is the first block,
> or the range about where it started or ended in preparation for the crash.
>
> During the crash recovery, those points are helpful to recognize and detach such blocks
> in order to solve a situation that the loaded data is partially synced to the disk and the rest isn't.

How do online backup and archive recovery work?

Suppose that the user executes pg_basebackup during COPY UNLOGGED
running, the physical backup might have the portion of tuples loaded
by COPY UNLOGGED but these data are not recovered. It might not be a
problem because the operation is performed without WAL records. But
what if an insertion happens after COPY UNLOGGED but before
pg_stop_backup()? I think that a new tuple could be inserted at the
end of the table, following the data loaded by COPY UNLOGGED. With
your approach described above, the newly inserted tuple will be
recovered during archive recovery, but it either will be removed if we
replay the insertion WAL then truncate the table or won’t be inserted
due to missing block if we truncate the table then replay the
insertion WAL, resulting in losing the tuple although the user got
successful of insertion.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



RE: Implement UNLOGGED clause for COPY FROM

From
"osumi.takamichi@fujitsu.com"
Date:
Hello.

Apologies for the delay.
> > When the server crash occurs during data loading of COPY UNLOGGED,
> > it's a must to keep index consistent of course.
> > I'm thinking that to rebuild the indexes on the target table would work.
> >
> > In my opinion, UNLOGGED clause must be designed to guarantee that
> > where the data loaded by this clause is written starts from the end of all other
> data blocks.
> > Plus, those blocks needs to be protected by any write of other transactions
> during the copy.
> > Apart from that, the server must be aware of which block is the first
> > block, or the range about where it started or ended in preparation for the crash.
> >
> > During the crash recovery, those points are helpful to recognize and
> > detach such blocks in order to solve a situation that the loaded data is partially
> synced to the disk and the rest isn't.
> 
> How do online backup and archive recovery work ?
> 
> Suppose that the user executes pg_basebackup during COPY UNLOGGED running,
> the physical backup might have the portion of tuples loaded by COPY UNLOGGED
> but these data are not recovered. It might not be a problem because the operation
> is performed without WAL records. But what if an insertion happens after COPY
> UNLOGGED but before pg_stop_backup()? I think that a new tuple could be
> inserted at the end of the table, following the data loaded by COPY UNLOGGED.
> With your approach described above, the newly inserted tuple will be recovered
> during archive recovery, but it either will be removed if we replay the insertion
> WAL then truncate the table or won’t be inserted due to missing block if we
> truncate the table then replay the insertion WAL, resulting in losing the tuple
> although the user got successful of insertion.
I consider that from the point in time when COPY UNLOGGED is executed,
any subsequent operations to the data which comes from UNLOGGED operation
also cannot be recovered even if those issued WAL.

This is basically inevitable because subsequent operations 
after COPY UNLOGGED depend on blocks of loaded data without WAL,
which means we cannot replay exact operations.

Therefore, all I can do is to guarantee that 
when one recovery process ends, the target table returns to the state
immediately before the COPY UNLOGGED is executed.
This could be achieved by issuing and notifying the server of an invalidation WAL,
an indicator to stop WAL application toward one specific table after this new type of WAL.
I think I need to implement this mechanism as well for this feature.
Thus, I'll take a measure against your concern of confusing data loss.

For recovery of the loaded data itself, the user of this clause,
like DBA or administrator of data warehouse for instance, 
would need to make a backup just after the data loading.
For some developers, this behavior would seem incomplete because of the heavy user's burden.

On the other hand, I'm aware of a fact that Oracle Database has a feature of UNRECOVERABLE clause,
which is equivalent to what I'm suggesting now in this thread.

This data loading without REDO log by the clause is more convenient than what I said above,
because it's supported by a tool named Recovery Manager which enables users to make an incremental backup.
This works to back up only the changed blocks since the previous backup and
remove the manual burden from the user like above.
Here, I have to admit that I cannot design and implement 
this kind of synergistic pair of all features at once for data warehousing.
So I'd like to make COPY UNLOGGED as the first step.

This is the URL of how Oracle database for data warehouse achieves the backup of no log operation while acquiring high
speedof data loading.
 
https://docs.oracle.com/database/121/VLDBG/GUID-42825ED1-C4C5-449B-870F-D2C8627CBF86.htm#VLDBG1578

Best,
    Takamichi Osumi

Re: Implement UNLOGGED clause for COPY FROM

From
Kyotaro Horiguchi
Date:
At Thu, 20 Aug 2020 00:18:52 +0000, "osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com> wrote in 
> Hello.
> 
> Apologies for the delay.
> > > When the server crash occurs during data loading of COPY UNLOGGED,
> > > it's a must to keep index consistent of course.
> > > I'm thinking that to rebuild the indexes on the target table would work.
> > >
> > > In my opinion, UNLOGGED clause must be designed to guarantee that
> > > where the data loaded by this clause is written starts from the end of all other
> > data blocks.
> > > Plus, those blocks needs to be protected by any write of other transactions
> > during the copy.
> > > Apart from that, the server must be aware of which block is the first
> > > block, or the range about where it started or ended in preparation for the crash.
> > >
> > > During the crash recovery, those points are helpful to recognize and
> > > detach such blocks in order to solve a situation that the loaded data is partially
> > synced to the disk and the rest isn't.
> > 
> > How do online backup and archive recovery work ?
> > 
> > Suppose that the user executes pg_basebackup during COPY UNLOGGED running,
> > the physical backup might have the portion of tuples loaded by COPY UNLOGGED
> > but these data are not recovered. It might not be a problem because the operation
> > is performed without WAL records. But what if an insertion happens after COPY
> > UNLOGGED but before pg_stop_backup()? I think that a new tuple could be
> > inserted at the end of the table, following the data loaded by COPY UNLOGGED.
> > With your approach described above, the newly inserted tuple will be recovered
> > during archive recovery, but it either will be removed if we replay the insertion
> > WAL then truncate the table or won’t be inserted due to missing block if we
> > truncate the table then replay the insertion WAL, resulting in losing the tuple
> > although the user got successful of insertion.
> I consider that from the point in time when COPY UNLOGGED is executed,
> any subsequent operations to the data which comes from UNLOGGED operation
> also cannot be recovered even if those issued WAL.
> 
> This is basically inevitable because subsequent operations 
> after COPY UNLOGGED depend on blocks of loaded data without WAL,
> which means we cannot replay exact operations.
> 
> Therefore, all I can do is to guarantee that 
> when one recovery process ends, the target table returns to the state
> immediately before the COPY UNLOGGED is executed.
> This could be achieved by issuing and notifying the server of an invalidation WAL,
> an indicator to stop WAL application toward one specific table after this new type of WAL.
> I think I need to implement this mechanism as well for this feature.
> Thus, I'll take a measure against your concern of confusing data loss.
> 
> For recovery of the loaded data itself, the user of this clause,
> like DBA or administrator of data warehouse for instance, 
> would need to make a backup just after the data loading.
> For some developers, this behavior would seem incomplete because of the heavy user's burden.
> 
> On the other hand, I'm aware of a fact that Oracle Database has a feature of UNRECOVERABLE clause,
> which is equivalent to what I'm suggesting now in this thread.
> 
> This data loading without REDO log by the clause is more convenient than what I said above,
> because it's supported by a tool named Recovery Manager which enables users to make an incremental backup.
> This works to back up only the changed blocks since the previous backup and
> remove the manual burden from the user like above.
> Here, I have to admit that I cannot design and implement 
> this kind of synergistic pair of all features at once for data warehousing.
> So I'd like to make COPY UNLOGGED as the first step.
> 
> This is the URL of how Oracle database for data warehouse achieves the backup of no log operation while acquiring
highspeed of data loading.
 
> https://docs.oracle.com/database/121/VLDBG/GUID-42825ED1-C4C5-449B-870F-D2C8627CBF86.htm#VLDBG1578

Anyway, if the target table is turned back to LOGGED, the succeedeing
WAL stream can be polluted by the logs on the table. So any operations
relying on WAL records is assumed not to be continuable. Not only
useless, it is harmful. I think we don't accept emitting an
insconsistent WAL stream intentionally while wal_level > minimal.

You assert that we could prevent WAL from redoed by the "invalidation"
but it is equivalent to turning the table into UNLOGGED. Why do you
insist that a table should be labeled as "LOGGED" whereas it is
virtually UNLOGGED? That costs nothing (if the table is almost empty).

If you want to get the table back to LOGGED without emitting WAL,
wal_level=minimal works. That requires restart twice, and table
copying, though. It seems like that we can skip copying in this case
but I'm not sure.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Implement UNLOGGED clause for COPY FROM

From
Amit Kapila
Date:
On Thu, Aug 20, 2020 at 5:49 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> Hello.
> > > During the crash recovery, those points are helpful to recognize and
> > > detach such blocks in order to solve a situation that the loaded data is partially
> > synced to the disk and the rest isn't.
> >
> > How do online backup and archive recovery work ?
> >
> > Suppose that the user executes pg_basebackup during COPY UNLOGGED running,
> > the physical backup might have the portion of tuples loaded by COPY UNLOGGED
> > but these data are not recovered. It might not be a problem because the operation
> > is performed without WAL records. But what if an insertion happens after COPY
> > UNLOGGED but before pg_stop_backup()? I think that a new tuple could be
> > inserted at the end of the table, following the data loaded by COPY UNLOGGED.
> > With your approach described above, the newly inserted tuple will be recovered
> > during archive recovery, but it either will be removed if we replay the insertion
> > WAL then truncate the table or won’t be inserted due to missing block if we
> > truncate the table then replay the insertion WAL, resulting in losing the tuple
> > although the user got successful of insertion.
> I consider that from the point in time when COPY UNLOGGED is executed,
> any subsequent operations to the data which comes from UNLOGGED operation
> also cannot be recovered even if those issued WAL.
>
> This is basically inevitable because subsequent operations
> after COPY UNLOGGED depend on blocks of loaded data without WAL,
> which means we cannot replay exact operations.
>
> Therefore, all I can do is to guarantee that
> when one recovery process ends, the target table returns to the state
> immediately before the COPY UNLOGGED is executed.
> This could be achieved by issuing and notifying the server of an invalidation WAL,
> an indicator to stop WAL application toward one specific table after this new type of WAL.
>

I don't think we can achieve what you want by one special invalidation
WAL. Consider a case where an update has happened on the page which
exists before 'Copy Unlogged' operation and while writing that page to
disk, the system crashed and the page is half-written. Without the
special WAL mechanism you are proposing to introduce, during recovery,
we can replay the full-page-image from WAL of such a page and then
perform the required update, so after recovery, the page won't be torn
anymore.

Basically, the idea is that to protect from such torn-writes
(half-written pages), we have a concept called full-page writes which
protects the data from such writes after recovery. Before writing to
any page after a checkpoint, we write its full-page-image in WAL which
helps us in recovering from such situations but with your proposed
mechanism it won't work.

Another concern I have with this idea is that you want to keep writing
WAL for such a relation but don't want to replay in recovery which
sounds like a good idea.

The idea to keep part of the table as logged and other as unlogged
sounds scary to me. Now, IIUC, you are trying to come up with these
ideas because to use Alter Table .. Set Unlogged, one has to rewrite
the entire table and if such a table is large, it will be a very
time-consuming operation. You might want to explore whether we can
avoid rewriting the table for such an operation but I don't think that
is easy either. The two problems I could see immediately are (a) we
have to change BM_PERMANENT marking of exiting buffers of such a
relation which again can be a time-consuming operation especially for
a large value of shread_buffers, (b) create an _init fork of such a
relation in-sync with the commit. You might want to read the archives
to see why at the first place we have decided to re-write the table
for SET UNLOGGED operation, see email [1].

[1] - https://www.postgresql.org/message-id/CAFcNs%2Bpeg3VPG2%3Dv6Lu3vfCDP8mt7cs6-RMMXxjxWNLREgSRVQ%40mail.gmail.com

--
With Regards,
Amit Kapila.



RE: Implement UNLOGGED clause for COPY FROM

From
"tsunakawa.takay@fujitsu.com"
Date:
Hello,


I think it's worth thinking about a sophisticated feature like Oracle's UNRECOVERABLE data loading (because SQL
Server'sBCP load utility also has such a feature, but for an empty table), how about an easier approach like MySQL?  I
expectthis won't complicate Postgres code much.
 

The customer is using Oracle RAC for high availability of a data warehouse.  Then, I think they can use the traditional
shareddisk-based HA clustering, not the streaming replication when they migrate to Postgres.
 

They load data into the data warehouse with the nightly ETL or ELT.  The loading window is limited, so they run
multipleconcurrent loading sessions, with the transaction logging off.  They probably use all resources for the data
loadingduring that period.
 

Then, you might think "How about turning fsync and full_page_writes to off?"  But the customer doesn't like to be
worriedabout the massive amount of WAL generated during the loading.
 


OTOH, the latest MySQL 8.0.21 introduced the following feature.  This is for the initial data loading into a new
databaseinstance, though.
 


https://dev.mysql.com/doc/refman/8.0/en/innodb-redo-log.html#innodb-disable-redo-logging
--------------------------------------------------
Disabling Redo Logging
As of MySQL 8.0.21, you can disable redo logging using the ALTER INSTANCE DISABLE INNODB REDO_LOG statement. This
functionalityis intended for loading data into a new MySQL instance. Disabling redo logging speeds up data loading by
avoidingredo log writes and doublewrite buffering.
 

Warning
This feature is intended only for loading data into a new MySQL instance. Do not disable redo logging on a production
system.It is permitted to shutdown and restart the server while redo logging is disabled, but an unexpected server
stoppagewhile redo logging is disabled can cause data loss and instance corruption.
 

Attempting to restart the server after an unexpected server stoppage while redo logging is disabled is refused with the
followingerror:
 

[ERROR] [MY-013578] [InnoDB] Server was killed when Innodb Redo 
logging was disabled. Data files could be corrupt. You can try 
to restart the database with innodb_force_recovery=6
In this case, initialize a new MySQL instance and start the data loading procedure again.
--------------------------------------------------


Following this idea, what do you think about adding a new value "none" to wal_level, where no WAL is generated?  The
settingof wal_level is recorded in pg_control.  The startup process can see the value and reject recovery after
abnormalshutdown, emitting a message similar to MySQL's.
 

Just a quick idea.  I hope no devil will appear in the details.


Regards
Takayuki Tsunakawa


Re: Implement UNLOGGED clause for COPY FROM

From
Amit Kapila
Date:
On Wed, Aug 26, 2020 at 12:54 PM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
>
>
> Following this idea, what do you think about adding a new value "none" to wal_level, where no WAL is generated?  The
settingof wal_level is recorded in pg_control.  The startup process can see the value and reject recovery after
abnormalshutdown, emitting a message similar to MySQL's. 
>

So you want your users to shutdown and restart the server before Copy
because that would be required if you want to change the wal_level.
However, even if we do that, users who are running the server
previously with wal_level as 'replica' won't be happy after doing this
change. Because if they change the wal_level to 'none' for certain
operations like bulk load and then again change back the mode to
'replica' they need to back up the database again to setup 'replica'
as they can't continue replication from the previous point (consider
update on a page for which previously WAL was not written).

--
With Regards,
Amit Kapila.



RE: Implement UNLOGGED clause for COPY FROM

From
"tsunakawa.takay@fujitsu.com"
Date:
From: Amit Kapila <amit.kapila16@gmail.com>
> So you want your users to shutdown and restart the server before Copy
> because that would be required if you want to change the wal_level.

Yes.  They seem to be fine with it, as far as I heard from a person who is involved in the system design.


> However, even if we do that, users who are running the server
> previously with wal_level as 'replica' won't be happy after doing this
> change. Because if they change the wal_level to 'none' for certain
> operations like bulk load and then again change back the mode to
> 'replica' they need to back up the database again to setup 'replica'
> as they can't continue replication from the previous point (consider
> update on a page for which previously WAL was not written).

Yes, it requires the database backup.  The database backup should be a daily task anyway, so I expect it wouldn't
imposeextra maintenance burdon on the user.  Plus, not all users use the streaming replication for HA.  I think it's
helpfulfor the maturing Postgres to provide some kind of solution for some context so that user can scratch their
itches.


Regards
Takayuki Tsunakawa


Re: Implement UNLOGGED clause for COPY FROM

From
Amit Kapila
Date:
On Thu, Aug 27, 2020 at 7:04 AM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
>
> From: Amit Kapila <amit.kapila16@gmail.com>
> > So you want your users to shutdown and restart the server before Copy
> > because that would be required if you want to change the wal_level.
>
> Yes.  They seem to be fine with it, as far as I heard from a person who is involved in the system design.
>
>
> > However, even if we do that, users who are running the server
> > previously with wal_level as 'replica' won't be happy after doing this
> > change. Because if they change the wal_level to 'none' for certain
> > operations like bulk load and then again change back the mode to
> > 'replica' they need to back up the database again to setup 'replica'
> > as they can't continue replication from the previous point (consider
> > update on a page for which previously WAL was not written).
>
> Yes, it requires the database backup.  The database backup should be a daily task anyway, so I expect it wouldn't
imposeextra maintenance burdon on the user.
 
>

Sure, but on a daily basis, one requires only incremental WAL to
complete the backup but in this case, it would require the entire
database back up unless we have some form of block-level incremental
backup method. OTOH, I don't deny that there is some use case with
wal_level = 'none' for initial data loading as MySQL provides but I
think that is a separate feature than what is proposed here (Copy
Unlogged). It might be better if you can start a separate thread for
that with some more details on the implementation side as well.

-- 
With Regards,
Amit Kapila.



RE: Implement UNLOGGED clause for COPY FROM

From
"tsunakawa.takay@fujitsu.com"
Date:
From: Amit Kapila <amit.kapila16@gmail.com>
> Sure, but on a daily basis, one requires only incremental WAL to
> complete the backup but in this case, it would require the entire
> database back up unless we have some form of block-level incremental
> backup method. 

Regarding the backup time, I think users can shorten it by using the storage device's snapshoting (or split
mirroring?),filesystem's snapshot feature.
 


> OTOH, I don't deny that there is some use case with
> wal_level = 'none' for initial data loading as MySQL provides but I
> think that is a separate feature than what is proposed here (Copy
> Unlogged). It might be better if you can start a separate thread for
> that with some more details on the implementation side as well.

Yeah, the feature doesn't match the title of this thread and could confuse other readers that join later.
Nevertheless,I think MySQL's feature could be used for additional data loading as well if the user understands what
he/sheis doing.  So, we can discuss it in another thread just in case the ongoing discussion gets stuck.
 


Regards
Takayuki Tsunakawa




Re: Implement UNLOGGED clause for COPY FROM

From
Peter Smith
Date:
Hi.

I expect I have some basic misunderstanding because IMO now this
thread seems to have come full circle.

Earlier, Osumi-san was rejecting the idea of using ALTER TABLE tbl SET
UNLOGGED on basis that it is too time consuming for large data to
switch the table modes [1].

Now the latest idea is to introduce a wal_level=none. But now
apparently full daily backups are OK, and daily restarting the server
before the copies is also OK [2].

~

Doesn't wal_level=none essentially just behave as if every table was
UNLOGGED; not just the ones we are loading?

Doesn't wal_level=none come with all the same limitations/requirements
(full daily backups/restarts etc) that the UNLOGGED TABLE would also
have?

So I don't recognise the difference?

If wal_level=none is judged OK as a fast loading solution, then why
wasn't an initially UNLOGGED table also judged OK by the same
criteria? And if there is no real difference, then why is it necessary
to introduce wal_level=none (instead of using the existing UNLOGGED
feature) in the first place?

Or, if all this problem is simply due to a quirk that the BI tool
referred to does not support the CREATE UNLOGGED TABLE syntax [3],
then surely there is some other workaround could be written to handle
that.

What part am I missing?

--
[1] -
https://www.postgresql.org/message-id/OSBPR01MB48884832932F93DAA953CEB9ED650%40OSBPR01MB4888.jpnprd01.prod.outlook.com
[2] -
https://www.postgresql.org/message-id/TYAPR01MB299005FC543C43348A4993FDFE550%40TYAPR01MB2990.jpnprd01.prod.outlook.com
[3] -
https://www.postgresql.org/message-id/OSBPR01MB4888CBD08DDF73721C18D2C0ED790%40OSBPR01MB4888.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Implement UNLOGGED clause for COPY FROM

From
"tsunakawa.takay@fujitsu.com"
Date:
From: Peter Smith <smithpb2250@gmail.com>
> Earlier, Osumi-san was rejecting the idea of using ALTER TABLE tbl SET
> UNLOGGED on basis that it is too time consuming for large data to
> switch the table modes [1].

> Doesn't wal_level=none essentially just behave as if every table was
> UNLOGGED; not just the ones we are loading?
> 
> Doesn't wal_level=none come with all the same limitations/requirements
> (full daily backups/restarts etc) that the UNLOGGED TABLE would also
> have?

ALTER TABLE takes long time proportional to the amount of existing data, while wal_level = none doesn't.


Regards
Takayuki Tsunakawa


Re: Implement UNLOGGED clause for COPY FROM

From
Peter Smith
Date:
On Thu, Sep 10, 2020 at 7:16 PM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:

> ALTER TABLE takes long time proportional to the amount of existing data, while wal_level = none doesn't.

Right, but if wal_level=none is considered OK for that table with
existing data, then why not just create the table UNLOGGED in the
first place? (or ALTER it to set UNLOGGED just one time and then leave
it as UNLOGGED).

Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Implement UNLOGGED clause for COPY FROM

From
"tsunakawa.takay@fujitsu.com"
Date:
From: Peter Smith <smithpb2250@gmail.com>
On Thu, Sep 10, 2020 at 7:16 PM tsunakawa.takay@fujitsu.com
> <tsunakawa.takay@fujitsu.com> wrote:
> > ALTER TABLE takes long time proportional to the amount of existing data,
> while wal_level = none doesn't.
> 
> Right, but if wal_level=none is considered OK for that table with
> existing data, then why not just create the table UNLOGGED in the
> first place? (or ALTER it to set UNLOGGED just one time and then leave
> it as UNLOGGED).

The target tables sometimes receive updates (for data maintenance and/or correction).  They don't want those updates to
belost due to the database server crash.  Unlogged tables lose their entire contents during crash recovery.
 

Please think like this: logging is is the norm, and unlogged operations are exceptions/hacks for some requirement of
whichthe user wants to minimize the use.
 


Regards
Takayuki Tsunakawa





Re: Implement UNLOGGED clause for COPY FROM

From
Kyotaro Horiguchi
Date:
At Fri, 11 Sep 2020 05:15:32 +0000, "tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> wrote in 
> From: Peter Smith <smithpb2250@gmail.com>
> On Thu, Sep 10, 2020 at 7:16 PM tsunakawa.takay@fujitsu.com
> > <tsunakawa.takay@fujitsu.com> wrote:
> > > ALTER TABLE takes long time proportional to the amount of existing data,
> > while wal_level = none doesn't.
> > 
> > Right, but if wal_level=none is considered OK for that table with
> > existing data, then why not just create the table UNLOGGED in the
> > first place? (or ALTER it to set UNLOGGED just one time and then leave
> > it as UNLOGGED).
> 
> The target tables sometimes receive updates (for data maintenance and/or correction).  They don't want those updates
tobe lost due to the database server crash.  Unlogged tables lose their entire contents during crash recovery.
 
> 
> Please think like this: logging is is the norm, and unlogged operations are exceptions/hacks for some requirement of
whichthe user wants to minimize the use.
 

I suspect that wal_level=none is a bit too toxic.

"ALTER TABLE SET UNLOGGED" doesn't dump large amount of WAL so I don't
think it can be a problem. "ALTER TABLE SET LOGGED" also doesn't issue
WAL while wal_level=minimal but runs a table copy. I think the only
problem of the UNLOGGED table method is that table copy.

If we can skip the table-copy when ALTER TABLE SET LOGGED on
wal_level=minimal, is your objective achived?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Implement UNLOGGED clause for COPY FROM

From
Kyotaro Horiguchi
Date:
At Fri, 11 Sep 2020 17:36:19 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> At Fri, 11 Sep 2020 05:15:32 +0000, "tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> wrote in 
> > From: Peter Smith <smithpb2250@gmail.com>
> > On Thu, Sep 10, 2020 at 7:16 PM tsunakawa.takay@fujitsu.com
> > > <tsunakawa.takay@fujitsu.com> wrote:
> > > > ALTER TABLE takes long time proportional to the amount of existing data,
> > > while wal_level = none doesn't.
> > > 
> > > Right, but if wal_level=none is considered OK for that table with
> > > existing data, then why not just create the table UNLOGGED in the
> > > first place? (or ALTER it to set UNLOGGED just one time and then leave
> > > it as UNLOGGED).
> > 
> > The target tables sometimes receive updates (for data maintenance and/or correction).  They don't want those
updatesto be lost due to the database server crash.  Unlogged tables lose their entire contents during crash recovery.
 
> > 
> > Please think like this: logging is is the norm, and unlogged operations are exceptions/hacks for some requirement
ofwhich the user wants to minimize the use.
 
> 
> I suspect that wal_level=none is a bit too toxic.
> 
> "ALTER TABLE SET UNLOGGED" doesn't dump large amount of WAL so I don't
> think it can be a problem. "ALTER TABLE SET LOGGED" also doesn't issue

(Oops! this runs a table copy)

> WAL while wal_level=minimal but runs a table copy. I think the only
> problem of the UNLOGGED table method is that table copy.
> 
> If we can skip the table-copy when ALTER TABLE SET LOGGED on
> wal_level=minimal, is your objective achived?

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



RE: Implement UNLOGGED clause for COPY FROM

From
"tsunakawa.takay@fujitsu.com"
Date:
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> If we can skip the table-copy when ALTER TABLE SET LOGGED on
> wal_level=minimal, is your objective achived?

I expect so, if we can skip the table copy during ALTER TABLE SET LOGGED/UNLOGGED.  On the other hand, both approaches
havedifferent pros and cons.  It's nice that ALTER TABLE doesn't require database restart, but the user has to specify
tables. wal_level = none is vice versa.  Anyway, wal_level = none would be useful for initial data loading after
creatinga new database cluster, as MySQL suggests. 


Regards
Takayuki Tsunakawa