Re: global / super barriers (for checksums) - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: global / super barriers (for checksums) |
Date | |
Msg-id | CA+TgmobRhjD+NPxjcJSjwpGOxt_w6cObsYS4EtZntUc7S1Oe3w@mail.gmail.com Whole thread Raw |
In response to | global / super barriers (for checksums) (Andres Freund <andres@anarazel.de>) |
Responses |
Re: global / super barriers (for checksums)
Re: global / super barriers (for checksums) |
List | pgsql-hackers |
On Tue, Oct 30, 2018 at 1:17 AM Andres Freund <andres@anarazel.de> wrote: > The patch definitely is in a prototype stage. At the very least it needs > a high-level comment somewhere, and some of the lower-level code needs > to be cleaned up. > > One thing I wasn't happy about is how checksum internals have to absorb > barrier requests - that seems unavoidable, but I'd hope for something > more global than just BufferSync(). Hi, TL;DR: I'm not sure that we need 0001; I propose to commit 0002; and I have some concerns about 0003 and am interested in working further on it. 0001 changes the StartBackgroundWorker so that the SIGINT handler is contingent on BGWORKER_SHMEM_ACCESS rather than BGWORKER_BACKEND_DATABASE_CONNECTION. It seems to me that the goal here should be to use procsignal_sigusr1_handler(), or something that calls it, in any process where ProcSignalInit() is called, but a backend that only requests BGWORKER_SHMEM_ACCESS probably won't, because the normal way for a background worker to call ProcSignalInit() would be to indirectly call InitPostgres() by means of BackgroundWorkerInitializeConnection() or BackgroundWorkerInitializeConnectionByOid(), and a worker that didn't ask for a database connection presumably shouldn't be doing that. So I'm not sure that I understand why we need this. (Is it legal for a worker to call one of these functions as long as it passes InvalidOid for the database OID, or something? Do we have examples of workers that do that?) On the other hand, 0002 seems like it's pretty clearly a good idea. It makes a whole bunch of auxiliary processes use procsignal_sigusr1_handler() and those things all get called from AuxiliaryProcessMain(), which does ProcSignalInit(), and it seems like clearly the right idea that processes which register to participate in the procsignal mechanism should also register to get notified if they receive a procsignal. I think that the reason we haven't bothered with this up until now is because I think that it's presently impossible for any of the kind of procsignals that we have to get sent to any of those processes. But, global barriers would require us to do so, so it seems like it's time to tighten that up, and it doesn't really cost anything. So I propose to commit this part soon, unless somebody objects. Regarding 0003: - The proc->pid == 0 check in EmitGlobalBarrier() doesn't really do what it appears to do, because regular backends don't clear proc->pid when they exit; only auxiliary processes do. (We could, and perhaps should, change that.) - It seems to me that it would be generally better to insert CHECK_FOR_INTERRUPTS() in places where the patch just inserts an ad-hoc if (GlobalBarrierInterruptPending) ProcessGlobalBarrierIntterupt(). There might be someplace where we have to do it the latter way because we unavoidably hold an LWLock or something, but I think we should avoid that whenever possible. If it's a good place to check for one kind of interrupt, it's probably a good place to check for all of them. - I don't think BufferSync() is doing this in the right place. Unless I am confused, it's doing it inside a loop that just allocates stuff and copies data around, which probably does not need this kind of decoration. It wouldn't hurt to check here, and it might be a good idea for safety, but I think the place that really needs this treatment is the following loop where we are actually doing I/O and sleeping. Possibly we should even be doctoring CheckpointWriteDelay to use a latch-wait loop that can CHECK_FOR_INTERRUPTS() before re-sleeping, although for 100ms (curiously described as a Big Sleep) it might be overkill. - I think it would be nice to make this system more extensible. IIUC, the idea is that ProcessGlobalBarrierIntterupt() will call bespoke code for each GLOBBAR_* flag that set, and that code will then update process-local state from global state before returning. But, instead of hard-coding constants, how about just having a list of callbacks, and we call them all here, and each one is responsible for figuring out whether anything's been changed that it cares about, and if so updating the appropriate local state? Then GlobalBarrierKind goes away altogether. Granted, this could be a bit expensive if we were using global barriers for lots of different things and at least some of those things occurred frequently, but I don't think we're very near the point where we need that kind of optimization. As long as the callbacks have tests that exit quickly if there's nothing to do, I think it should be fine. (As an intermediate position, we could consider keeping barrierFlags but assign the bits dynamically; but unless and until we get more users of the facility, I think it might be simplest to just forget about barrierFlags altogether. Then even extensions can use this, if they want.) - The patch needs some general tidying up, like comments and naming consistency and stuff like that. Andres, Magnus, if neither of you are planning to work on this soon, I might like to jump in and run with this. Please let me know your thoughts. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: