Re: In-placre persistance change of a relation - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: In-placre persistance change of a relation
Msg-id 20201113.172312.1767546251154140847.horikyota.ntt@gmail.com
In response to RE: In-placre persistance change of a relation  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
Responses RE: In-placre persistance change of a relation  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
List pgsql-hackers
At Fri, 13 Nov 2020 07:15:41 +0000, "osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com> wrote in 
> Hello, Tsunakawa-San
> 

Thanks for sharing it!

> > Do you know the reason why data copy was done before?  And, it may be
> > odd for me to ask this, but I think I saw someone referred to the past
> > discussion that eliminating data copy is difficult due to some processing at
> > commit.  I can't find it.
> I can share 2 sources why to eliminate the data copy is difficult in hackers thread.
> 
> Tom's remark and the context to copy relation's data.
> https://www.postgresql.org/message-id/flat/31724.1394163360%40sss.pgh.pa.us#31724.1394163360@sss.pgh.pa.us

https://www.postgresql.org/message-id/CA+Tgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0+aLg=x2Q@mail.gmail.com

> No, not really.  The issue is more around what happens if we crash
> part way through.  At crash recovery time, the system catalogs are not
> available, because the database isn't consistent yet and, anyway, the
> startup process can't be bound to a database, let alone every database
> that might contain unlogged tables.  So the sentinel that's used to
> decide whether to flush the contents of a table or index is the
> presence or absence of an _init fork, which the startup process
> obviously can see just fine.  The _init fork also tells us what to
> stick in the relation when we reset it; for a table, we can just reset
> to an empty file, but that's not legal for indexes, so the _init fork
> contains a pre-initialized empty index that we can just copy over.
> 
> Now, to make an unlogged table logged, you've got to at some stage
> remove those _init forks.  But this is not a transactional operation.
> If you remove the _init forks and then the transaction rolls back,
> you've left the system an inconsistent state.  If you postpone the
> removal until commit time, then you have a problem if it fails,

That's true. Those are the cause of the headache.

> particularly if it works for the first file but fails for the second.
> And if you crash at any point before you've fsync'd the containing
> directory, you have no idea which files will still be on disk after a
> hard reboot.

This is not an issue in this patch *except* in the case where removing
the init fork fails but the subsequent removal of the inittmp fork
succeeds.  Another idea is to add a "not-yet-committed" property to a
fork.  I added a new fork type to keep the patch simple, but I could
go that way if that is an issue.
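
For reference, the init-fork sentinel rule described in the quoted text
can be sketched roughly as follows.  This is only a simplified
illustration in Python, not the actual startup-process code: the
function name, the flat directory layout, and the fork suffix list are
assumptions made for the example, and segment files are ignored.

```python
import os
import shutil

# Assumed fork suffixes for this sketch; segment files such as "16384.1" are ignored.
INIT_SUFFIX = "_init"
OTHER_SUFFIXES = ("", "_fsm", "_vm")


def reset_unlogged_relations(db_dir):
    """Reset every relation in db_dir that has an _init fork.

    Hypothetical helper: it uses only the file system, never the
    catalogs, which is the point of using the _init fork as a sentinel.
    Returns the sorted list of relation names that were reset.
    """
    reset = []
    for name in sorted(os.listdir(db_dir)):
        if not name.endswith(INIT_SUFFIX):
            continue  # no sentinel: leave the relation untouched
        rel = name[: -len(INIT_SUFFIX)]
        # Discard whatever contents survived the crash.
        for suffix in OTHER_SUFFIXES:
            path = os.path.join(db_dir, rel + suffix)
            if os.path.exists(path):
                os.remove(path)
        # Copy the init fork over the main fork.  For a table the init
        # fork is empty, so this yields an empty file; for an index it
        # yields the pre-initialized empty index.
        shutil.copyfile(os.path.join(db_dir, name), os.path.join(db_dir, rel))
        reset.append(rel)
    return reset
```

The sketch also shows why removing the init fork is the delicate step:
once that file is gone, nothing visible at recovery time marks the
relation as one that needs to be reset.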

> Amit-San quoted this thread and mentioned that point in another thread.
> https://www.postgresql.org/message-id/CAA4eK1%2BHDqS%2B1fhs5Jf9o4ZujQT%3DXBZ6sU0kOuEh2hqQAC%2Bt%3Dw%40mail.gmail.com

This sounds like a slightly different discussion. Making part of a
table UNLOGGED looks far more difficult to me.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


