Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints |
Date | |
Msg-id | CAFiTN-uiY_pBV38ciZCmta_qDgV9b5rGOmLR07XQz2-wrg+7Mg@mail.gmail.com Whole thread Raw |
In response to | Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
|
List | pgsql-hackers |
On Thu, Sep 2, 2021 at 8:52 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Sep 2, 2021 at 2:06 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> 0003- The main patch for WAL logging the created database operation.
Andres pointed out that this approach ends up accessing relations
without taking a lock on them. It doesn't look like you did anything
about that.
I missed that, I have shared my opinion about this in my last email [1]
+ /* Built-in oids are mapped directly */
+ if (classForm->oid < FirstGenbkiObjectId)
+ relfilenode = classForm->oid;
+ else if (OidIsValid(classForm->relfilenode))
+ relfilenode = classForm->relfilenode;
+ else
+ continue;
Am I missing something, or is this totally busted?
Oops, I think the condition should be like below, but I will think carefully before posting the next version if there is something else I am missing.
if (OidIsValid(classForm->relfilenode))
relfilenode = classForm->relfilenode;
else if if (classForm->oid < FirstGenbkiObjectId)
relfilenode = classForm->oid;
else
continue
/*
+ * Now drop all buffers holding data of the target database; they should
+ * no longer be dirty so DropDatabaseBuffers is safe.
The way things worked before, this was true, but now AFAICS it's
false. I'm not sure whether that means that DropDatabaseBuffers() here
is actually unsafe or whether it just means that you haven't updated
the comment to explain the reason.
I think DropDatabaseBuffers(), itself is unsafe, we just copied pages using bufmgr and dirtied the buffers so dropping buffers is definitely unsafe here.
+ * Since we copy the file directly without looking at the shared buffers,
+ * we'd better first flush out any pages of the source relation that are
+ * in shared buffers. We assume no new changes will be made while we are
+ * holding exclusive lock on the rel.
Ditto.
Yeah this comment no longer makes sense now.
+ /* As always, WAL must hit the disk before the data update does. */
Actually, the way it's coded now, part of the on-disk changes are done
before WAL is issued, and part are done after. I doubt that's the
right idea.
There's nothing special about writing the actual payload
bytes vs. the other on-disk changes (creating directories and files).
In any case the ordering deserves a better-considered comment than
this one.
Agreed to all, but In general, I think WAL hitting the disk before data is more applicable for the shared buffers, where we want to flush the WAL first before writing the shared buffer so that in case of torn page we have an option to recover the page from previous FPI. But in such cases where we are creating a directory or file there is no such requirement. Anyways, I agreed with the comments that it should be more uniform and the comment should be correct.
+ XLogRegisterData((char *) PG_MAJORVERSION, nbytes);
Surely this is utterly pointless.
Yeah it is. During the WAL replay also we know the PG_MAJORVERSION :)
+ CopyDatabase(src_dboid, dboid, src_deftablespace, dst_deftablespace);
PG_END_ENSURE_ERROR_CLEANUP(createdb_failure_callback,
PointerGetDatum(&fparms));
I'd leave braces around the code for which we're ensuring error
cleanup even if it's just one line.
Okay
+ if (info == XLOG_DBASEDIR_CREATE)
{
xl_dbase_create_rec *xlrec = (xl_dbase_create_rec *) XLogRecGetData(record);
Seems odd to rename the record but not change the name of the struct.
I think I would be inclined to keep the existing record name, but if
we're going to change it we should be more thorough.
Right, I think we can leave the record name as it is.
pgsql-hackers by date: