Thread: [PATCH] Slight improvement of worker_spi.c example
Hi, Currently the example uses the following order of calls: StartTransactionCommand(); SPI_connect(); PushActiveSnapshot(...); ... SPI_finish(); PopActiveSnapshot(); CommitTransactionCommand(); This could be somewhat misleading. Typically one expects something to be freed in a reverse order compared to initialization. This creates a false impression that PushActiveSnapshot(...) _should_ be called after SPI_connect(). The patch changes the order to: StartTransactionCommand(); PushActiveSnapshot(...); SPI_connect(); ... SPI_finish(); PopActiveSnapshot(); CommitTransactionCommand(); ... and also clarifies that the order of PushActiveSnapshot(...) and SPI_connect() is not important. -- Best regards, Aleksander Alekseev
Attachment
Hi, > The patch changes the order to: > > StartTransactionCommand(); > PushActiveSnapshot(...); > SPI_connect(); > > ... > > SPI_finish(); > PopActiveSnapshot(); > CommitTransactionCommand(); > > ... and also clarifies that the order of PushActiveSnapshot(...) and > SPI_connect() is not important. Additionally I noticed that the check: ``` if (!process_shared_preload_libraries_in_progress) return; ``` ... was misplaced in _PG_init(). Here is the patch v2 which fixes this too. -- Best regards, Aleksander Alekseev
Attachment
On Sat, Jun 03, 2023 at 02:09:26PM +0300, Aleksander Alekseev wrote: > > Additionally I noticed that the check: > > ``` > if (!process_shared_preload_libraries_in_progress) > return; > ``` > > ... was misplaced in _PG_init(). Here is the patch v2 which fixes this too. I'm pretty sure that this is intentional. The worker can be launched dynamically and in that case it still needs a GUC for the naptime.
Hi Julien, > I'm pretty sure that this is intentional. The worker can be launched > dynamically and in that case it still needs a GUC for the naptime. The dynamic worker also is going to need worker_spi_database, however the corresponding GUC declaration is placed below the check. Perhaps we should just say that the extension shouldn't be used without shared_preload_libraies. We are not testing whether it works in such a case anyway. -- Best regards, Aleksander Alekseev
On Sat, Jun 03, 2023 at 02:38:26PM +0300, Aleksander Alekseev wrote: > Hi Julien, > > > I'm pretty sure that this is intentional. The worker can be launched > > dynamically and in that case it still needs a GUC for the naptime. > > The dynamic worker also is going to need worker_spi_database, however > the corresponding GUC declaration is placed below the check. Yes, and that's because that GUC is declared as PGC_POSTMASTER so you can't declare such a GUC dynamically. > > Perhaps we should just say that the extension shouldn't be used > without shared_preload_libraies. We are not testing whether it works > in such a case anyway. The patch that added the database name clearly was committed without bothering testing that scenario, but it would be better to fix it and add some coverage rather than remove some example of how to use dynamic bgworkers. Maybe with a assign hook to make sure it's never changed once assigned or something along those lines. That being said this module is really naive and has so many problems that I don't think it's actually helpful as coding guidelines for anyone who wants to create a non toy extension using bgworkers.
Hi, > That being said this module is really naive and has so many problems that I > don't think it's actually helpful as coding guidelines for anyone who wants to > create a non toy extension using bgworkers. Agree. It is a simple example and I don't think it's going to be useful to make a complicated one out of it. The order of the calls it currently uses however may be extremely confusing for newcomers. It creates an impression that this particular order is extremely important while in fact it's not and it takes time to figure this out. -- Best regards, Aleksander Alekseev
On Sat, Jun 03, 2023 at 03:34:30PM +0300, Aleksander Alekseev wrote: > Agree. It is a simple example and I don't think it's going to be > useful to make a complicated one out of it. It does not have to be complicated, but I definitely agree that we'd better spend some efforts in improving it as a whole especially knowing that this is mentioned on the docs as an example that one could rely on. > The order of the calls it currently uses however may be extremely > confusing for newcomers. It creates an impression that this particular > order is extremely important while in fact it's not and it takes time > to figure this out. + * The order of PushActiveSnapshot() and SPI_connect() is not really + * important. FWIW, looking at the patch, I don't think that this is particularly useful. -- Michael
Attachment
Hi Michael, Thanks for your feedback. > + * The order of PushActiveSnapshot() and SPI_connect() is not really > + * important. > > FWIW, looking at the patch, I don't think that this is particularly > useful. Fair enough, here is the corrected patch. -- Best regards, Aleksander Alekseev
Attachment
On Sat, Jun 03, 2023 at 06:35:00PM -0400, Michael Paquier wrote: > It does not have to be complicated, but I definitely agree that we'd > better spend some efforts in improving it as a whole especially > knowing that this is mentioned on the docs as an example that one > could rely on. +1. I know I've used worker_spi as a reference for writing background workers before. IMHO it'd be better if the patch documented the places where the ordering really does matter instead of hoping extension authors will understand the reasoning behind the proposed reordering. I agree that the current code could lead folks to think that PushActiveSnapshot must go after SPI_connect, but wouldn't the reverse ordering just give folks the opposite impression? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Hi, > On Sat, Jun 03, 2023 at 06:35:00PM -0400, Michael Paquier wrote: > > It does not have to be complicated, but I definitely agree that we'd > > better spend some efforts in improving it as a whole especially > > knowing that this is mentioned on the docs as an example that one > > could rely on. > > +1. I know I've used worker_spi as a reference for writing background > workers before. Thanks for the feedback. > I agree that the current code > could lead folks to think that PushActiveSnapshot must go after > SPI_connect, but wouldn't the reverse ordering just give folks the opposite > impression? This is the exact reason why the original patch had an explicit comment that the ordering is not important in this case. It was argued however that the comment is redundant and thus it was removed. -- Best regards, Aleksander Alekseev
On Tue, Jun 13, 2023 at 12:34:09PM +0300, Aleksander Alekseev wrote: > > > I agree that the current code > > could lead folks to think that PushActiveSnapshot must go after > > SPI_connect, but wouldn't the reverse ordering just give folks the opposite > > impression? > > This is the exact reason why the original patch had an explicit > comment that the ordering is not important in this case. It was argued > however that the comment is redundant and thus it was removed. I also don't think that a comment is really worthwhile. If there were any hard dependency, it should be mentioned in the various functions comments as that's the first place one should look at when using a function they're not familiar with. That being said, I still don't understand why you focus on this tiny and not really important detail while the module itself is actually broken (for dynamic bgworker without s_p_l) and also has some broken behaviors with regards to the naptime that are way more likely to hurt third party code that was written using this module as an example.
On Tue, Jun 13, 2023 at 07:58:02PM +0800, Julien Rouhaud wrote: > That being said, I still don't understand why you focus on this tiny and not > really important detail while the module itself is actually broken (for dynamic > bgworker without s_p_l) and also has some broken behaviors with regards to the > naptime that are way more likely to hurt third party code that was written > using this module as an example. Are you or Aleksander interested in helping improve this module? I'm happy to help review and/or write patches. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Hi Nathan, > > That being said, I still don't understand why you focus on this tiny and not > > really important detail while the module itself is actually broken (for dynamic > > bgworker without s_p_l) and also has some broken behaviors with regards to the > > naptime that are way more likely to hurt third party code that was written > > using this module as an example. > > Are you or Aleksander interested in helping improve this module? I'm happy > to help review and/or write patches. Unfortunately I'm not familiar with the problem in respect of naptime Julien is referring to. If you know what this problem is and how to fix it, go for it. I'll review and test the code then. I can write the part of the patch that fixes the part regarding dynamic workers if necessary. -- Best regards, Aleksander Alekseev
On Wed, Jun 14, 2023 at 02:08:03PM +0300, Aleksander Alekseev wrote: > > Unfortunately I'm not familiar with the problem in respect of naptime > Julien is referring to. If you know what this problem is and how to > fix it, go for it. I'll review and test the code then. I can write the > part of the patch that fixes the part regarding dynamic workers if > necessary. Oh, sorry I thought it was somewhat evident. The naptime GUC description says: > Duration between each check (in seconds). and the associated code does a single WaitLatch(..., WL_LATCH_SET | WL_TIMEOUT, ...) So unless I'm missing something nothing prevents the check being run way more often than expected if the latch keeps being set. Similarly, my understanding of "duration between checks" is that a naptime of 1 min means that the check should be run a minute apart, assuming it's possible. As is, the checks are run naptime + query execution time apart, which doesn't seem right. Obviously there's isn't much you can do if the query execution lasts for more than naptime, apart from detecting it and raising a warning to let users know that their configuration isn't adequate (or that there's some other problem like some lock contention or something), similarly to e.g. checkpoint_warning. Note I haven't looked closely at this module otherwise, so I can't say if there are some other problems around.
> On 14 Jun 2023, at 13:08, Aleksander Alekseev <aleksander@timescale.com> wrote: >> Are you or Aleksander interested in helping improve this module? I'm happy >> to help review and/or write patches. > > Unfortunately I'm not familiar with the problem in respect of naptime > Julien is referring to. If you know what this problem is and how to > fix it, go for it. I'll review and test the code then. I can write the > part of the patch that fixes the part regarding dynamic workers if > necessary. Have you had a chance to look at these suggestions, and Juliens reply downthread, in order to produce a new version of the patch? -- Daniel Gustafsson
Hi, > Have you had a chance to look at these suggestions, and Juliens reply > downthread, in order to produce a new version of the patch? Thanks for the reminder. No I haven't. Please feel free marking this CF entry as RwF if this will be helpful. We may reopen it if and when necessary. -- Best regards, Aleksander Alekseev
> On 10 Jul 2023, at 14:40, Aleksander Alekseev <aleksander@timescale.com> wrote: >> Have you had a chance to look at these suggestions, and Juliens reply >> downthread, in order to produce a new version of the patch? > > Thanks for the reminder. No I haven't. Please feel free marking this > CF entry as RwF if this will be helpful. We may reopen it if and when > necessary. Ok, will do. Please feel free to resubmit to a future CF when there is a new version for consideration. -- Daniel Gustafsson