Thread: [PATCH] Extend the length of BackgroundWorker.bgw_library_name
Hi,
I want to suggest a patch against master (it may also be worth backporting it) that makes it possible to use longer filenames (such as those with absolute paths) in `BackgroundWorker.bgw_library_name`.
`BackgroundWorker.bgw_library_name` currently allows names up to BGW_MAXLEN-1, which is generally sufficient if `$libdir` expansion is used.
However, there are use cases where [potentially] longer names are expected/desired; for example, test benches (where library files may not [or can not] be copied to Postgres installation) or alternative library installation methods that do not put them into $libdir.
--
The patch is backwards-compatible and ensures that bgw_library_name stays *at least* as long as BGW_MAXLEN. Existing external code that uses BGW_MAXLEN is a length boundary (for example, in `strncpy`) will continue to work as expected.
The trade-off of this patch is that the `BackgroundWorker` structure becomes larger. From my perspective, this is a reasonable cost (less than a kilobyte of extra space per worker).
The patch builds and `make check` succeeds.
Any feedback is welcome!
Attachment
On Mon, Mar 13, 2023 at 07:57:47AM -0700, Yurii Rashkovskii wrote: > However, there are use cases where [potentially] longer names are > expected/desired; for example, test benches (where library files may not > [or can not] be copied to Postgres installation) or alternative library > installation methods that do not put them into $libdir. > > The patch is backwards-compatible and ensures that bgw_library_name stays > *at least* as long as BGW_MAXLEN. Existing external code that uses > BGW_MAXLEN is a length boundary (for example, in `strncpy`) will continue > to work as expected. I see that BGW_MAXLEN was originally set to 64 in 2013 (7f7485a) [0], but was increased to 96 in 2018 (3a4b891) [1]. It seems generally reasonable to me to increase the length of bgw_library_name further for the use-case you describe, but I wonder if it'd be better to simply increase BGW_MAXLEN again. However, IIUC bgw_library_name is the only field that is likely to be used for absolute paths, so only increasing that one to MAXPGPATH makes sense. [0] https://postgr.es/m/CA%2BTgmoYtQQ-JqAJPxZg3Mjg7EqugzqQ%2BZBrpnXo95chWMCZsXw%40mail.gmail.com [1] https://postgr.es/m/304a21ab-a9d6-264a-f688-912869c0d7c6%402ndquadrant.com -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Nathan,
Thank you for your review.
Indeed, my motivation for doing the change the way I did it was that only bgw_library_name is expected to be longer, whereas it is much less of a concern for other fields. If we have increased BGW_MAXLEN, it would have increased the size of BackgroundWorker for little to no benefit.
On Mon, Mar 13, 2023 at 10:35 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Mar 13, 2023 at 07:57:47AM -0700, Yurii Rashkovskii wrote:
> However, there are use cases where [potentially] longer names are
> expected/desired; for example, test benches (where library files may not
> [or can not] be copied to Postgres installation) or alternative library
> installation methods that do not put them into $libdir.
>
> The patch is backwards-compatible and ensures that bgw_library_name stays
> *at least* as long as BGW_MAXLEN. Existing external code that uses
> BGW_MAXLEN is a length boundary (for example, in `strncpy`) will continue
> to work as expected.
I see that BGW_MAXLEN was originally set to 64 in 2013 (7f7485a) [0], but
was increased to 96 in 2018 (3a4b891) [1]. It seems generally reasonable
to me to increase the length of bgw_library_name further for the use-case
you describe, but I wonder if it'd be better to simply increase BGW_MAXLEN
again. However, IIUC bgw_library_name is the only field that is likely to
be used for absolute paths, so only increasing that one to MAXPGPATH makes
sense.
[0] https://postgr.es/m/CA%2BTgmoYtQQ-JqAJPxZg3Mjg7EqugzqQ%2BZBrpnXo95chWMCZsXw%40mail.gmail.com
[1] https://postgr.es/m/304a21ab-a9d6-264a-f688-912869c0d7c6%402ndquadrant.com
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
> On 13 Mar 2023, at 18:35, Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Mon, Mar 13, 2023 at 07:57:47AM -0700, Yurii Rashkovskii wrote: >> However, there are use cases where [potentially] longer names are >> expected/desired; for example, test benches (where library files may not >> [or can not] be copied to Postgres installation) or alternative library >> installation methods that do not put them into $libdir. >> >> The patch is backwards-compatible and ensures that bgw_library_name stays >> *at least* as long as BGW_MAXLEN. Existing external code that uses >> BGW_MAXLEN is a length boundary (for example, in `strncpy`) will continue >> to work as expected. > > I see that BGW_MAXLEN was originally set to 64 in 2013 (7f7485a) [0], but > was increased to 96 in 2018 (3a4b891) [1]. It seems generally reasonable > to me to increase the length of bgw_library_name further for the use-case > you describe, but I wonder if it'd be better to simply increase BGW_MAXLEN > again. However, IIUC bgw_library_name is the only field that is likely to > be used for absolute paths, so only increasing that one to MAXPGPATH makes > sense. Yeah, raising just bgw_library_name to MAXPGPATH seems reasonable here. While the memory usage does grow it's still quite modest, and has an upper limit in max_worker_processes. While here, I wonder if we should document what BGW_MAXLEN is defined as in bgworker.sgml? -- Daniel Gustafsson
On Wed, Mar 15, 2023 at 10:38:34AM +0100, Daniel Gustafsson wrote: > While here, I wonder if we should document what BGW_MAXLEN is defined as in > bgworker.sgml? I am -0.5 for this. If you are writing a new background worker, it's probably reasonable to expect that you can locate the definition of BGW_MAXLEN. Also, I think there's a good chance that we'd forget to update such documentation the next time we adjust it. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
> On 21 Apr 2023, at 01:32, Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Wed, Mar 15, 2023 at 10:38:34AM +0100, Daniel Gustafsson wrote: >> While here, I wonder if we should document what BGW_MAXLEN is defined as in >> bgworker.sgml? > > I am -0.5 for this. If you are writing a new background worker, it's > probably reasonable to expect that you can locate the definition of > BGW_MAXLEN. Of course. The question is if it's a helpful addition for someone who is reading the documentation section on implementing background workers where we explicitly mention BGW_MAXLEN without saying what it is. > Also, I think there's a good chance that we'd forget to update > such documentation the next time we adjust it. There is that, but once set to MAXPGPATH it seems unlikely to change particularly often so it seems the wrong thing to optimize for. -- Daniel Gustafsson
On Fri, Apr 21, 2023 at 10:49:48AM +0200, Daniel Gustafsson wrote: > On 21 Apr 2023, at 01:32, Nathan Bossart <nathandbossart@gmail.com> wrote: >> I am -0.5 for this. If you are writing a new background worker, it's >> probably reasonable to expect that you can locate the definition of >> BGW_MAXLEN. > > Of course. The question is if it's a helpful addition for someone who is > reading the documentation section on implementing background workers where we > explicitly mention BGW_MAXLEN without saying what it is. IMHO it's better to have folks use the macro so that their calls to snprintf(), etc. are updated when BGW_MAXLEN is changed. But I can't say I'm strongly opposed to adding the value to the docs if you think it is helpful. >> Also, I think there's a good chance that we'd forget to update >> such documentation the next time we adjust it. > > There is that, but once set to MAXPGPATH it seems unlikely to change > particularly often so it seems the wrong thing to optimize for. True. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Hi, > The trade-off of this patch is that the `BackgroundWorker` structure becomes larger. From my perspective, this is a reasonablecost (less than a kilobyte of extra space per worker). Agree. > The patch is backwards-compatible and ensures that bgw_library_name stays *at least* as long as BGW_MAXLEN. Existing externalcode that uses BGW_MAXLEN is a length boundary (for example, in `strncpy`) will continue to work as expected. There is a mistake in the comment though: ``` +/* + * Ensure bgw_function_name's size is backwards-compatible and sensible + */ +StaticAssertDecl(MAXPGPATH >= BGW_MAXLEN, "MAXPGPATH must be at least equal to BGW_MAXLEN"); ``` library_name, not function_name. Also I think the comment should be more detailed, something like "prior to PG17 we used ... but since PG17 ... which may cause problems if ...". -- Best regards, Aleksander Alekseev
Aleksander,
On Mon, Apr 24, 2023 at 1:01 PM Aleksander Alekseev <aleksander@timescale.com> wrote:
> The patch is backwards-compatible and ensures that bgw_library_name stays *at least* as long as BGW_MAXLEN. Existing external code that uses BGW_MAXLEN is a length boundary (for example, in `strncpy`) will continue to work as expected.
There is a mistake in the comment though:
```
+/*
+ * Ensure bgw_function_name's size is backwards-compatible and sensible
+ */
+StaticAssertDecl(MAXPGPATH >= BGW_MAXLEN, "MAXPGPATH must be at least
equal to BGW_MAXLEN");
```
library_name, not function_name. Also I think the comment should be
more detailed, something like "prior to PG17 we used ... but since
PG17 ... which may cause problems if ...".
This is a very good catch and a suggestion. I've slightly reworked the patch, and I also made this static assertion to have less indirection and, therefore, make it easier to understand the premise.
Version 2 is attached.
Y.
Attachment
Hi, > This is a very good catch and a suggestion. I've slightly reworked the patch, and I also made this static assertion tohave less indirection and, therefore, make it easier to understand the premise. > Version 2 is attached. ``` sizeof((BackgroundWorker *)NULL)->bgw_library_name ``` I'm pretty confident something went wrong with the parentheses in v2. -- Best regards, Aleksander Alekseev
You're absolutely right. Here's v3.
On Mon, Apr 24, 2023 at 6:30 PM Aleksander Alekseev <aleksander@timescale.com> wrote:
Hi,
> This is a very good catch and a suggestion. I've slightly reworked the patch, and I also made this static assertion to have less indirection and, therefore, make it easier to understand the premise.
> Version 2 is attached.
```
sizeof((BackgroundWorker *)NULL)->bgw_library_name
```
I'm pretty confident something went wrong with the parentheses in v2.
--
Best regards,
Aleksander Alekseev
Y.
Attachment
Hi, > You're absolutely right. Here's v3. Please avoid using top posting [1]. The commit message may require a bit of tweaking by the committer but other than that the patch seems to be fine. I'm going to mark it as RfC in a bit unless anyone objects. [1]: https://wiki.postgresql.org/wiki/Mailing_Lists#Email_etiquette_mechanics -- Best regards, Aleksander Alekseev
On Wed, Apr 26, 2023 at 03:07:18PM +0300, Aleksander Alekseev wrote: > The commit message may require a bit of tweaking by the committer but > other than that the patch seems to be fine. I'm going to mark it as > RfC in a bit unless anyone objects. In v4, I've introduced a new BGW_LIBLEN macro and set it to the default value of MAXPGPATH (1024). This way, the value can live in bgworker.h like the other BGW_* macros do. Plus, this should make the assertion that checks for backward compatibility unnecessary. Since bgw_library_name is essentially a path, I can see the argument that we should just set BGW_LIBLEN to MAXPGPATH directly. I'm curious what folks think about this. I also changed the added sizeofs to use the macro for consistency with the surrounding code. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
Hi Nathan,
On Fri, Jun 30, 2023 at 2:39 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
In v4, I've introduced a new BGW_LIBLEN macro and set it to the default
value of MAXPGPATH (1024). This way, the value can live in bgworker.h like
the other BGW_* macros do. Plus, this should make the assertion that
checks for backward compatibility unnecessary. Since bgw_library_name is
essentially a path, I can see the argument that we should just set
BGW_LIBLEN to MAXPGPATH directly. I'm curious what folks think about this.
Thank you for revising the patch. While this is relatively minor, I think it should be set to MAXPGPATH directly to clarify their relationship.
Y.
On Sun, Jul 02, 2023 at 04:37:52PM -0700, Yurii Rashkovskii wrote: > Thank you for revising the patch. While this is relatively minor, I think > it should be set to MAXPGPATH directly to clarify their relationship. Committed. I set the size to MAXPGPATH directly instead of inventing a new macro with the same value. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Nathan,
On Mon, Jul 3, 2023 at 3:08 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Sun, Jul 02, 2023 at 04:37:52PM -0700, Yurii Rashkovskii wrote:
> Thank you for revising the patch. While this is relatively minor, I think
> it should be set to MAXPGPATH directly to clarify their relationship.
Committed. I set the size to MAXPGPATH directly instead of inventing a new
macro with the same value.
Great, thank you! The reason I was leaving the other constant in place to make upgrading extensions trivial (so that they don't need to adjust for this), but if you think this is a better way, I am fine with it.
Y.
On Mon, Jul 03, 2023 at 06:00:12PM -0700, Yurii Rashkovskii wrote: > Great, thank you! The reason I was leaving the other constant in place to > make upgrading extensions trivial (so that they don't need to adjust for > this), but if you think this is a better way, I am fine with it. Sorry, I'm not following. Which constant are you referring to? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Nathan,
On Mon, Jul 3, 2023 at 8:12 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Jul 03, 2023 at 06:00:12PM -0700, Yurii Rashkovskii wrote:
> Great, thank you! The reason I was leaving the other constant in place to
> make upgrading extensions trivial (so that they don't need to adjust for
> this), but if you think this is a better way, I am fine with it.
Sorry, I'm not following. Which constant are you referring to?
Apologies, I misread the final patch. All good!
Y.