
From: Dilip Kumar
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Msg-id: CAFiTN-v=U58by_BeiZruNhykxk1q9XUxF+qLzD2LZAsEn2EBkg@mail.gmail.com
In response to: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints (John Naylor <john.naylor@enterprisedb.com>)
List: pgsql-hackers
On Tue, Nov 23, 2021 at 10:29 PM John Naylor
<john.naylor@enterprisedb.com> wrote:
>
> Hi,
>
> I've looked over this patch set and email thread a couple times, and I don't see anything amiss, but I'm also not
> terribly familiar with the subsystems this part of the code relies on. I haven't yet tried to stress test with a large
> database, but it seems like a good idea to do so.

Thanks, John, for looking into the patches.  Yeah, that makes sense;
next week I will try to test with a large database, and maybe with
multiple tablespaces as well, to see how this behaves.

> I have a couple comments and questions:
>
> 0006:
>
> + * XXX We can optimize RelationMapOidToFilenodeForDatabase API
> + * so that instead of reading the relmap file every time, it can
> + * save it in a temporary variable and use it for subsequent
> + * calls.  Then later reset it once we're done or at the
> + * transaction end.
>
> Do we really need to consider optimizing here? Only a handful of relations will be found in the relmap, right?

You are right, it is actually not required; I will remove this comment.

>
> + * Once we start copying files from the source database, we need to be able
> + * to clean 'em up if we fail.  Use an ENSURE block to make sure this
> + * happens.  (This is not a 100% solution, because of the possibility of
> + * failure during transaction commit after we leave this routine, but it
> + * should handle most scenarios.)
>
> This comment in master started with
>
> - * Once we start copying subdirectories, we need to be able to clean 'em
>
> Is the distinction important enough to change this comment? Also, is "most scenarios" still true with the patch? I
> haven't read into how ENSURE works.

Actually, it is like a PG_TRY()/PG_CATCH() block, with the extra
assurance that the cleanup also runs on shmem exit.  In the cleanup
function, we go through all the tablespaces and remove the new
database's directory that we were trying to create.  And you are
right, we actually don't need to change the comments.
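To illustrate what the ENSURE block buys us (this is a self-contained
analogy, not the actual PostgreSQL code: PG_ENSURE_ERROR_CLEANUP in
storage/ipc.h registers a real before_shmem_exit callback, while the
setjmp/longjmp machinery, names, and "base/99999" path below are all
illustrative stand-ins), the pattern is an idempotent cleanup hooked
to both the error path and process exit:

```c
#include <setjmp.h>
#include <string.h>

/* Illustrative stand-in for the new database directory being created. */
static char pending_dir[64];
static int cleanups_run;

/* Cleanup callback: in PostgreSQL this would be registered both for the
 * error path and as a shmem-exit hook, so the half-copied directory is
 * removed even if the backend dies.  It is idempotent by design. */
static void remove_pending_dir(void)
{
    if (pending_dir[0] != '\0')
    {
        /* the real code would rmtree() the directory; here we just forget it */
        pending_dir[0] = '\0';
        cleanups_run++;
    }
}

static jmp_buf err_env;

/* Simulate CREATE DATABASE failing midway through the file copy;
 * returns how many times the cleanup actually ran (should be once). */
int simulate_createdb_failure(void)
{
    cleanups_run = 0;
    strcpy(pending_dir, "base/99999");   /* hypothetical new DB directory */

    if (setjmp(err_env) == 0)
    {
        /* ... copy files from the source database here ... */
        longjmp(err_env, 1);             /* simulated mid-copy error */
    }
    else
        remove_pending_dir();            /* error-path cleanup (PG_CATCH analogue) */

    remove_pending_dir();                /* the exit hook would fire too: a no-op now */
    return cleanups_run;
}
```

The point of the idempotence is visible in the last two calls: whichever
path reaches the cleanup first does the work, and the other becomes a no-op.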

> Same with this comment change, seems fine the way it was:

Correct.

> - * Use an ENSURE block to make sure we remove the debris if the copy fails
> - * (eg, due to out-of-disk-space).  This is not a 100% solution, because
> - * of the possibility of failure during transaction commit, but it should
> - * handle most scenarios.
> + * Use an ENSURE block to make sure we remove the debris if the copy fails.
> + * This is not a 100% solution, because of the possibility of failure
> + * during transaction commit, but it should handle most scenarios.
>
> And do we need additional tests? Maybe we don't, but it seems good to make sure.
>
> I haven't looked at 0007, and I have no opinion on which approach is better.

Okay, I like the 0006 approach, mainly for two reasons: 1) it does not
scan the raw file directly to identify which files to copy, which
seems cleaner to me; 2) with 0007, if we scan the directory directly
we do not know the relation OID, so there is no way to take the
relation lock before acquiring the buffer lock.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


