Thread: dfmgr additional ABI version fields
When producing a forked version of PostgreSQL, there is no straightforward way to enforce that users don't accidentally load modules built for the non-forked (standard, community) version. You can only distinguish by PostgreSQL major version and a few compile-time settings. (see internal_load_library(), Pg_magic_struct) Depending on the details, mixing and matching might even work, until it doesn't, so this is a bad experience. I'm thinking about adding two more int fields to Pg_magic_struct: a product or vendor magic number, and an ABI version that can be used freely within a product/vendor. Would anyone else have use for this? Any thoughts?
čt 7. 10. 2021 v 11:28 odesílatel Peter Eisentraut <peter.eisentraut@enterprisedb.com> napsal:
When producing a forked version of PostgreSQL, there is no
straightforward way to enforce that users don't accidentally load
modules built for the non-forked (standard, community) version. You can
only distinguish by PostgreSQL major version and a few compile-time
settings. (see internal_load_library(), Pg_magic_struct) Depending on
the details, mixing and matching might even work, until it doesn't, so
this is a bad experience.
I'm thinking about adding two more int fields to Pg_magic_struct: a
product or vendor magic number, and an ABI version that can be used
freely within a product/vendor.
Would anyone else have use for this? Any thoughts?
+1
Pavel
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > I'm thinking about adding two more int fields to Pg_magic_struct: a > product or vendor magic number, and an ABI version that can be used > freely within a product/vendor. Who would hand out these magic numbers? If the answer is "choose a random one, it probably won't collide" then I'm not sure why we need two fields. You can choose a new random number for each ABI version, if you're changing it faster than once per PG major version. I'm also kind of unclear on why we need to do anything about this in the community version. If someone has forked PG and changed APIs to the extent that extensions are unlikely to work, there's not much stopping them from also making the two-line change to fmgr.h that would be needed to guarantee that different magic struct contents are needed. regards, tom lane
Hi, On October 7, 2021 8:49:57 AM PDT, Tom Lane >I'm also kind of unclear on why we need to do anything about this >in the community version. If someone has forked PG and changed >APIs to the extent that extensions are unlikely to work, there's >not much stopping them from also making the two-line change >to fmgr.h that would be needed to guarantee that different magic >struct contents are needed. I can see two reasons. First, it'd probably allow stock pg to generate a better error message when confronted with such amodule. Second, there's some value in signaling forks that they should change (or think about changing), that field. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Andres Freund <andres@anarazel.de> writes: > On October 7, 2021 8:49:57 AM PDT, Tom Lane >> I'm also kind of unclear on why we need to do anything about this >> in the community version. If someone has forked PG and changed >> APIs to the extent that extensions are unlikely to work, there's >> not much stopping them from also making the two-line change >> to fmgr.h that would be needed to guarantee that different magic >> struct contents are needed. > I can see two reasons. First, it'd probably allow stock pg to generate a better error message when confronted with sucha module. Second, there's some value in signaling forks that they should change (or think about changing), that field. Hmm, ok, I can buy the first of those arguments. Less sure about the second, but the first is reason enough. Can we make the addition be a string not a number, so that we could include something more useful than "1234" in the error message? Something like "Module is built for EDB v1234.56" seems like it'd be a lot more on-point to the average user, and it gets us out of having to design the ABI versioning scheme that a fork should use. regards, tom lane
On 10/07/21 12:42, Tom Lane wrote: > Can we make the addition be a string not a number, so that we > could include something more useful than "1234" in the error > message? I was wondering the same thing, just to sidestep the "who hands out IDs" question. Just using a string like "EDB v" + something would probably rule out collisions in practice. To be more formal about it, something like the tag URI scheme [0] could be recommended. Nothing at runtime would have to know or care about tag URI syntax; it would just match a string with a fixed opaque prefix and some suffix. The scheme gives the developer an easy way to construct a meaningful and reliably non-colliding string. Surely loading libraries isn't a hot enough operation to begrudge a strcmp. Regards, -Chap [0] https://datatracker.ietf.org/doc/html/rfc4151
Chapman Flack <chap@anastigmatix.net> writes: > On 10/07/21 12:42, Tom Lane wrote: >> Can we make the addition be a string not a number, so that we >> could include something more useful than "1234" in the error >> message? > Just using a string like "EDB v" + something would probably rule out > collisions in practice. To be more formal about it, something like > the tag URI scheme [0] could be recommended. Hmm. Personally I'm more interested in the string being comprehensible to end users than in whether there's any formal rule guaranteeing uniqueness. I really doubt that we will have any practical problem with collisions, so I'd rather go with something like "EnterpriseDB v1.2.3" than with something like "tag:enterprisedb.com,2021:1.2.3". Conceivably we could have two strings, or a printable string and a chosen-at-random unique number (the latter not meant to be shown to users). Not sure it's worth the trouble though. regards, tom lane
On 07.10.21 21:15, Tom Lane wrote: > Chapman Flack <chap@anastigmatix.net> writes: >> On 10/07/21 12:42, Tom Lane wrote: >>> Can we make the addition be a string not a number, so that we >>> could include something more useful than "1234" in the error >>> message? > >> Just using a string like "EDB v" + something would probably rule out >> collisions in practice. To be more formal about it, something like >> the tag URI scheme [0] could be recommended. > > Hmm. Personally I'm more interested in the string being comprehensible to > end users than in whether there's any formal rule guaranteeing uniqueness. > I really doubt that we will have any practical problem with collisions, > so I'd rather go with something like "EnterpriseDB v1.2.3" than with > something like "tag:enterprisedb.com,2021:1.2.3". Yeah, just a string should be fine.
So here is a patch. This does what I had in mind as a use case. Obviously, the naming and wording can be tuned. Input from other vendors is welcome.
Attachment
On Tue, Oct 12, 2021 at 8:13 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > So here is a patch. This does what I had in mind as a use case. > Obviously, the naming and wording can be tuned. Input from other > vendors is welcome. I'm not a different vendor, but I do work on different code than you do, and I like this. Advanced Server accidentally dodges this problem at present by shipping with a different FUNC_MAX_ARGS value, but this is much cleaner. Would it be reasonable to consider something similar for the control file, for the benefit of distributions that are not the same on disk? -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Oct 13, 2021 at 12:50:38PM -0400, Robert Haas wrote: > I'm not a different vendor, but I do work on different code than you > do, and I like this. Advanced Server accidentally dodges this problem > at present by shipping with a different FUNC_MAX_ARGS value, but this > is much cleaner. I am pretty sure that Greenplum could benefit from something like that. As a whole, using a string looks like a good idea for that. > Would it be reasonable to consider something similar for the control > file, for the benefit of distributions that are not the same on disk? Hmm. Wouldn't that cause more harm than actual benefits? -- Michael
Attachment
On 19.11.21 08:58, Michael Paquier wrote: > On Wed, Oct 13, 2021 at 12:50:38PM -0400, Robert Haas wrote: >> I'm not a different vendor, but I do work on different code than you >> do, and I like this. Advanced Server accidentally dodges this problem >> at present by shipping with a different FUNC_MAX_ARGS value, but this >> is much cleaner. > > I am pretty sure that Greenplum could benefit from something like > that. As a whole, using a string looks like a good idea for that. > >> Would it be reasonable to consider something similar for the control >> file, for the benefit of distributions that are not the same on disk? > > Hmm. Wouldn't that cause more harm than actual benefits? The catalog version already serves this purpose.
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: >> On Wed, Oct 13, 2021 at 12:50:38PM -0400, Robert Haas wrote: >>> Would it be reasonable to consider something similar for the control >>> file, for the benefit of distributions that are not the same on disk? > The catalog version already serves this purpose. We already have fields in pg_control for that, and fields to check endianness, maxalign, etc, ie the things that matter for data storage. Perhaps there is a need for more such fields, but I don't see that extension ABI questions are directly related. regards, tom lane
I have committed this patch as posted.