Re: [HACKERS] some review comments on logical rep code - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: [HACKERS] some review comments on logical rep code |
Date | |
Msg-id | CAHGQGwFQ1mARau+0h1_3EeO+3hmfiN8Z28WqEdN2M0S8Jtu5oQ@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] some review comments on logical rep code (Petr Jelinek <petr.jelinek@2ndquadrant.com>) |
Responses |
Re: [HACKERS] some review comments on logical rep code
|
List | pgsql-hackers |
On Wed, Apr 19, 2017 at 1:35 AM, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote: > Thank you for working on this! > > On 18/04/17 10:16, Masahiko Sawada wrote: >> On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI >> <horiguchi.kyotaro@lab.ntt.co.jp> wrote: >>> >>>>>>> 3. >>>>>>>> >>>>>>>> ApplyLauncherWakeup() should be static function. >>>>>>> >>>>>>> Attached 003 patch fixes it (and also fixes #5 review comment). >>>>>>> >>>>>>> 4. >>>>>>>> >>>>>>>> Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's >>>>>>>> subcommands (e.g., ENABLED one) should wake the launcher up on commit. >>>>>>> >>>>>>> This is also reported by Horiguchi-san on another thread[1], and patch exists. >>>>>>> >>>>>>> 5. >>>>>>>> >>>>>>>> Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? >>>>>>> >>>>>>> I also guess it's necessary. This change is included in #3 patch. >>>>>> >>>>>> IsBackendPid() is not free, I suppose that just ignoring failure >>>>>> with ESRCH is enough. >>>>> >>>>> After thought, since LogicalRepCtx->launcher_pid could be 0 should, we >>>>> check if launcher_pid != 0? >>> >>> Yes. For example, code to send a signal to autovacuum launcher >>> looks as the follows. I don't think there's a reason to do >>> different thing here. >>> >>> | avlauncher_needs_signal = false; >>> | if (AutoVacPID != 0) >>> | kill(AutoVacPID, SIGUSR2); >>> >> >> Fixed. > > Yes the IsBackendPid was needed only because we didn't reset > launcher_pid and it was causing issues otherwise obviously (and I didn't > realize we are not resetting it...) > >> >>> >>>>>>> 6. >>>>>>>> >>>>>>>> Normal statements that the walsender for logical rep runs are logged >>>>>>>> by log_replication_commands. So if log_statement is also set, >>>>>>>> those commands are logged twice. >>>>>>> >>>>>>> Attached 006 patch fixes it by adding "-c log_statement = none" to >>>>>>> connection parameter of libpqrcv if logical = true. >>>>>> >>>>>> The control by log_replication_commands is performed on >>>>>> walsender, so this also shuld be done on the same place. Anyway >>>>>> log_statement is irrelevant to walsender. >>>>> >>>>> Patch 006 emits all logs executed by the apply worker as a log of >>>>> log_replication_command but perhaps you're right. It would be better >>>>> to leave the log of log_statement if the command executed by the apply >>>>> worker is a SQL command. Will fix. >>> >>> Here, we can choose whether a SQL command is a part of >>> replication commands or not. The previous fix thought it is and >>> this doesn't. (They are emitted in different forms) Is this ok? >> >> Yes, v2 patch logs other than T_SQLCmd as a replication command, and >> T_SQLCmd is logged later in exec_simple_query. The >> log_replication_command logs only replication commands[1], it doesn't >> mean to log commands executed on replication connection. I think such >> behavior is right. >> >>> I'm not sure why you have moved the location of the code, but if >>> it should show all received commands, it might be better to be >>> placed before CHECK_FOR_INTERRUPTS.. >> >> Hmm, the reason why I've moved it is that we cannot know whether given >> cmd_string is a SQL command or a replication command before parsing. >> > > Well the issue is that then the query is not logged if there was parsing > issue even when logging was specifically requested. I don't know what's > good solution here, maybe teaching exec_simple_query to not log the > query if it came from walsender. > > BTW reading that function in walsender, there is : >> /* >> * CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot until the next >> * command arrives. Clean up the old stuff if there's anything. >> */ >> SnapBuildClearExportedSnapshot(); > > and then > >> /* >> * CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot. If it was >> * called outside of transaction the snapshot should be cleared here. >> */ >> if (!IsTransactionBlock()) >> SnapBuildClearExportedSnapshot(); > > The first one should not be there, it looks like a result of some merge > conflict being solved wrongly. Maybe your patch could fix that too? > >>>>> >>>>>>> 10. >>>>>>>> >>>>>>>> SpinLockAcquire(&MyLogicalRepWorker->relmutex); >>>>>>>> MyLogicalRepWorker->relstate = >>>>>>>> GetSubscriptionRelState(MyLogicalRepWorker->subid, >>>>>>>> MyLogicalRepWorker->relid, >>>>>>>> &MyLogicalRepWorker->relstate_lsn, >>>>>>>> false); >>>>>>>> SpinLockRelease(&MyLogicalRepWorker->relmutex); >>>>>>>> >>>>>>>> Non-"short-term" function like GetSubscriptionRelState() should not >>>>>>>> be called while spinlock is being held. >>>>>>>> >>>>>>> >>>>>>> One option is adding new LWLock for the relation state change but this >>>>>>> lock is used at many locations where the operation actually doesn't >>>>>>> take time. I think that the discussion would be needed to get >>>>>>> consensus, so patch for it is not attached. >>>>>> >>>>>> From the point of view of time, I agree that it doesn't seem to >>>>>> harm. Bt I thing it as more significant problem that >>>>>> potentially-throwable function is called within the mutex >>>>>> region. It potentially causes a kind of dead lock. (That said, >>>>>> theoretically dosn't occur and I'm not sure what happens by the >>>>>> dead lock..) >>>>>> > > Hmm, I think doing what I attached should be fine here. Thanks for the patch! > We don't need to > hold spinlock for table read, just for changing the > MyLogicalRepWorker->relstate. Yes, but the update of MyLogicalRepWorker->relstate_lsn also should be protected with the spinlock. Regards, -- Fujii Masao
pgsql-hackers by date: