Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date
Msg-id CAFiTN-veqQXM3uUh2AXxzFJ74ToTaEBqMM9ht8CuJ6=wAyqjUQ@mail.gmail.com
Whole thread Raw
In response to Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Thu, Sep 2, 2021 at 8:52 PM Robert Haas <robertmhaas@gmail.com> wrote:

PFA, updated version of the patch, where I have fixed the issues
reported by you and also done some more refactoring and patch split,
next I am planning to post the patch with another approach where we
scan the directory instead of scanning the pg_class for identifying
the relfilenodes.  For specific comments please find my response
inline,


> Andres pointed out that this approach ends up accessing relations
> without taking a lock on them. It doesn't look like you did anything
> about that.

Now I have acquired a lock before scanning the pg_class as well as
other relfilenode.

>
> + /* Built-in oids are mapped directly */
> + if (classForm->oid < FirstGenbkiObjectId)
> + relfilenode = classForm->oid;
> + else if (OidIsValid(classForm->relfilenode))
> + relfilenode = classForm->relfilenode;
> + else
> + continue;
>
> Am I missing something, or is this totally busted?

Handled the mapped relation using relmapper.

>   /*
> + * Now drop all buffers holding data of the target database; they should
> + * no longer be dirty so DropDatabaseBuffers is safe.
>
> The way things worked before, this was true, but now AFAICS it's
> false. I'm not sure whether that means that DropDatabaseBuffers() here
> is actually unsafe or whether it just means that you haven't updated
> the comment to explain the reason.

Now we can only drop the buffer specific to old tablespace not the new
tablespace so can not directly use the dboid, so extended the
DropDatabaseBuffers interface to take tablespace oid as and input and
updated the comments accordingly.

> + * Since we copy the file directly without looking at the shared buffers,
> + * we'd better first flush out any pages of the source relation that are
> + * in shared buffers.  We assume no new changes will be made while we are
> + * holding exclusive lock on the rel.
>
> Ditto.

I think these comments is related to index_copy_data() and this is
still valid, it is showing in the patch due to some refactoring so I
have separated out this refactoring patch as 0003 to avoid confusion.

>
> + /* As always, WAL must hit the disk before the data update does. */
>
> Actually, the way it's coded now, part of the on-disk changes are done
> before WAL is issued, and part are done after. I doubt that's the
> right idea. There's nothing special about writing the actual payload
> bytes vs. the other on-disk changes (creating directories and files).
> In any case the ordering deserves a better-considered comment than
> this one.

Changed, now WAL first and then disk change.


Open question:
- Scan pg_class vs scan directories
- Whether to retain the old created database mechanism as option or not.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: Added schema level support for publication.
Next
From: tushar
Date:
Subject: two_phase commit parameter used in subscription for a publication which is on < 15.