Thread: Mark all GUC variable as PGDLLIMPORT
Hi, I've been pinged many time over the years to either fix Windows compatibility or provide DLL for multiple extensions I'm maintaining. I've finally taken some time to setup a Windows build environment so I could take care of most of the problem, but not all (at least not in a satisfactory way). I've also been looking a bit around other extensions and I see that the #1 problem with compiling extensions on Windows is the lack of PGDLLIMPORT annotations, which is 99% of the time for a GUC. This topic has been raised multiple time over the years, and I don't see any objection to add such an annotation at least for all GUC variables (either the direct variables or the indirect variables set during the hook execution), so PFA a patch that takes care of all the GUC. I don't now if that's still an option at that point, but backporting to at least pg14 if that patch is accepted would be quite helpful.
Attachment
On Sun, Aug 22, 2021 at 04:10:33PM +0800, Julien Rouhaud wrote: > This topic has been raised multiple time over the years, and I don't see any > objection to add such an annotation at least for all GUC variables (either the > direct variables or the indirect variables set during the hook execution), so > PFA a patch that takes care of all the GUC. > > I don't now if that's still an option at that point, but backporting to at > least pg14 if that patch is accepted would be quite helpful. These are usually just applied on HEAD, and on a parameter-basis based on requests from extension authors. If you wish to make your extensions able to work on Windows, that's a good idea, but I would recommend to limit this exercise to what's really necessary for your purpose. -- Michael
Attachment
On Sun, Aug 22, 2021 at 08:51:26PM +0900, Michael Paquier wrote: > On Sun, Aug 22, 2021 at 04:10:33PM +0800, Julien Rouhaud wrote: > > This topic has been raised multiple time over the years, and I don't see any > > objection to add such an annotation at least for all GUC variables (either the > > direct variables or the indirect variables set during the hook execution), so > > PFA a patch that takes care of all the GUC. > > > > I don't now if that's still an option at that point, but backporting to at > > least pg14 if that patch is accepted would be quite helpful. > > These are usually just applied on HEAD Yeah but 14 isn't released yet, and this is a really low risk change. > , and on a parameter-basis based > on requests from extension authors. If you wish to make your > extensions able to work on Windows, that's a good idea, but I would > recommend to limit this exercise to what's really necessary for your > purpose. I disagree. For random global variables I agree that we shouldn't mark them all blindly, but for GUCs it's pretty clear that they're intended to be accessible from any caller, including extensions. Why treating Windows as a second-class citizen, especially when any change can only be used a year after someone complained?
ne 22. 8. 2021 v 14:08 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
On Sun, Aug 22, 2021 at 08:51:26PM +0900, Michael Paquier wrote:
> On Sun, Aug 22, 2021 at 04:10:33PM +0800, Julien Rouhaud wrote:
> > This topic has been raised multiple time over the years, and I don't see any
> > objection to add such an annotation at least for all GUC variables (either the
> > direct variables or the indirect variables set during the hook execution), so
> > PFA a patch that takes care of all the GUC.
> >
> > I don't now if that's still an option at that point, but backporting to at
> > least pg14 if that patch is accepted would be quite helpful.
>
> These are usually just applied on HEAD
Yeah but 14 isn't released yet, and this is a really low risk change.
> , and on a parameter-basis based
> on requests from extension authors. If you wish to make your
> extensions able to work on Windows, that's a good idea, but I would
> recommend to limit this exercise to what's really necessary for your
> purpose.
I disagree. For random global variables I agree that we shouldn't mark them
all blindly, but for GUCs it's pretty clear that they're intended to be
accessible from any caller, including extensions. Why treating Windows as a
second-class citizen, especially when any change can only be used a year after
someone complained?
I had few problems with access with these variables too when I worked on orafce.
Is true, so it increases differences between Windows and Unix, and fixing these issues is not fun work. On the other hand, maybe direct access to these variables from extensions is an antipattern, and we should use a direct function call API and functions current_setting and set_config. The overhead is usually not too big.
On Sun, Aug 22, 2021 at 02:17:16PM +0200, Pavel Stehule wrote: > > Is true, so it increases differences between Windows and Unix, and fixing > these issues is not fun work. On the other hand, maybe direct access to > these variables from extensions is an antipattern, and we should use a > direct function call API and functions current_setting and set_config. The > overhead is usually not too big. Yes, and that's what I did for one of my extensions. But that's still a bit of overhead, and extra burden only seen when trying to have Windows compatiblity, and I hope I can get rid of that at some point. If direct variable access shouldn't be possible, then we should explicitly tag those with __attribute__ ((visibility ("hidden"))) or something like that to have a more consistent behavior.
Julien Rouhaud <rjuju123@gmail.com> writes: > On Sun, Aug 22, 2021 at 08:51:26PM +0900, Michael Paquier wrote: >> ... and on a parameter-basis based >> on requests from extension authors. If you wish to make your >> extensions able to work on Windows, that's a good idea, but I would >> recommend to limit this exercise to what's really necessary for your >> purpose. > I disagree. For random global variables I agree that we shouldn't mark them > all blindly, but for GUCs it's pretty clear that they're intended to be > accessible from any caller, including extensions. Uh, no, it's exactly *not* clear. There are a lot of GUCs that are only of interest to particular subsystems. I do not see why being a GUC makes something automatically more interesting than any other global variable. Usually, the fact that one is global is only so the GUC machinery itself can get at it, otherwise it'd be static in the owning module. As for "extensions should be able to get at the values", the GUC machinery already provides uniform mechanisms for doing that safely. Direct access to the variable's internal value would be unsafe in many cases. regards, tom lane
Bruce Momjian <bruce@momjian.us> writes: > On Sun, Aug 22, 2021 at 09:29:01PM +0800, Julien Rouhaud wrote: >> Then shouldn't we try to prevent direct access on all platforms rather than >> only one? > Agreed. If Julian says 99% of the non-export problems are GUCs, and we > can just export them all, why not do it? We already export every global > variable on Unix-like systems, and we have seen no downsides. By that argument, *every* globally-visible variable should be marked PGDLLIMPORT. But the mere fact that two backend .c files need to access some variable doesn't mean that we want any random bit of code doing so. And yes, I absolutely would prohibit extensions from accessing many of them, if there were a reasonable way to do it. It would be a good start towards establishing a defined API for extensions. regards, tom lane
Bruce Momjian <bruce@momjian.us> writes: > On Mon, Aug 23, 2021 at 10:15:04AM -0400, Tom Lane wrote: >> By that argument, *every* globally-visible variable should be marked >> PGDLLIMPORT. But the mere fact that two backend .c files need to access > No, Julien says 99% need only the GUCs, so that is not the argument I am > making. That's a claim unbacked by any evidence that I've seen. More to the point, we already have a mechanism that extensions can/should use to read and write GUC settings, and it's not direct access. regards, tom lane
On Mon, Aug 23, 2021 at 10:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Bruce Momjian <bruce@momjian.us> writes: > > On Mon, Aug 23, 2021 at 10:15:04AM -0400, Tom Lane wrote: > >> By that argument, *every* globally-visible variable should be marked > >> PGDLLIMPORT. But the mere fact that two backend .c files need to access > > > No, Julien says 99% need only the GUCs, so that is not the argument I am > > making. > > That's a claim unbacked by any evidence that I've seen. More to > the point, we already have a mechanism that extensions can/should > use to read and write GUC settings, and it's not direct access. I clearly didn't try all single extension available out there. It's excessively annoying to compile extensions on Windows, and also I don't have a lot of dependencies installed so there are some that I wasn't able to test since I'm lacking some other lib and given my absolute lack of knowledge of that platform I didn't spent time trying to install those. I think I tested a dozen of "small" extensions (I'm assuming that the big one like postgis would require too much effort to build, and they probably already take care of Windows compatibility), and I only faced this problem. That's maybe not a representative set, but I also doubt that I was unlucky enough to find the few only exceptions.
On Mon, Aug 23, 2021 at 10:36 PM Bruce Momjian <bruce@momjian.us> wrote: > > So the problem is that extensions only _need_ to use that API on > Windows, so many initially don't, or that the API is too limited? The inconvenience with that API is that it's only returning c strings, so you gave to convert it back to the original datatype. That's probably why most of the extensions simply read from the original exposed variable rather than using the API, because they're usually written on Linux or similar, not because they want to mess up the stored value.
On Mon, Aug 23, 2021 at 10:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > And yes, I absolutely would prohibit extensions from accessing many > of them, if there were a reasonable way to do it. It would be a good > start towards establishing a defined API for extensions. The v2 patch I sent does that, at least when compiling with GCC. I didn't find something similar for clang, but I only checked quickly. I'm assuming that the unreasonable part is having to add some extra attribute to the variable? Would it be acceptable if wrapped into some other macro, as I proposed?
On Mon, Aug 23, 2021 at 10:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > And yes, I absolutely would prohibit extensions from accessing many > of them, if there were a reasonable way to do it. It would be a good > start towards establishing a defined API for extensions. Mostly, it would make extension development more difficult for no discernable benefit to the project. You've made this argument many times over the years ... but we're no closer to having an extension API than we ever were, and we continue to get complaints about breaking stuff on Windows on a pretty regular basis. Honestly, it seems unimaginable that an API is ever really going to be possible. It would be a ton of work to maintain, and we'd just end up breaking it every time we discover that there's a new feature we want to implement which doesn't fit into the defined API now. That's what we do *now* with functions that third-party extensions actually call, and with variables that they access, and it's not something that, in my experience, is any great problem in maintaining an extension. You're running code that is running inside somebody else's executable and sometimes you have to adjust it for upstream changes. That's life, and I don't think that complaints about that topic are nearly as frequent as complaints about extensions breaking on Windows because of missing PGDLLIMPORT markings. And more than that, I'm pretty sure that you've previously taken the view that we shouldn't document all the hook functions that only exist in the backend for the purpose of extension use. I think you would argue against a patch to go and document all the variables that are marked PGDLLIMPORT now. So it seems to me that you're for an API when it means that we don't have to change anything, and against an API when it means that we don't have to change anything, which doesn't really seem like a consistent position. I think we should be responding to the real, expressed needs of extension developers, and the lack of PGDLLIMPORT markings on various global variables is surely top of the list. It's also a bit unfair to say, well we have APIs for accessing GUC values. It's true that we do. But if the GUC variable is, say, a Boolean, you do not want your extension to call some function that does a bunch of shenanigans and returns a string so that you can then turn around and parse the string to recover the Boolean value. Even moreso if the value is an integer or a comma-separated list. You want to access the value as the system represents it internally, not duplicate the parsing logic in a way that is inefficient and bug-prone. In short, +1 from me for the original proposal of marking all GUCs as PGDLLIMPORT. And, heck, +1 for marking all the other global variables that way, too. We're not solving any problem here. We're just annoying people, mostly people who are loyal community members and steady contributors to the project. -- Robert Haas EDB: http://www.enterprisedb.com
On 08/23/21 10:36, Bruce Momjian wrote: > So the problem is that extensions only _need_ to use that API on > Windows, so many initially don't, or that the API is too limited? I think there can be cases where it's too limited, such as when significant computation or validation is needed between the form of the setting known to the GUC machinery and the form that would otherwise be available in the global. I'm thinking, for instance, of the old example before session_timezone was PGDLLIMPORTed, and you'd have to GETCONFIGOPTION("timezone") and repeat the work done by pg_tzset to validate and map the timezone name through the timezone database, to reconstruct the value that was otherwise already available in session_timezone. Maybe those cases aren't very numerous ... and maybe they're distinctive enough to recognize when creating one ("hmm, I am creating a check or assign hook that does significant work here, will it be worth exposing a getter API for the product of the work?"). Regards, -Chap
On 2021-Aug-23, Robert Haas wrote: > It's also a bit unfair to say, well we have APIs for accessing GUC > values. It's true that we do. But if the GUC variable is, say, a > Boolean, you do not want your extension to call some function that > does a bunch of shenanigans and returns a string so that you can then > turn around and parse the string to recover the Boolean value. Even > moreso if the value is an integer or a comma-separated list. You want > to access the value as the system represents it internally, not > duplicate the parsing logic in a way that is inefficient and > bug-prone. In that case, why not improve the API with functions that return the values in some native datatype? For scalars with native C types (int, floats, Boolean etc) this is easy enough; I bet it'll solve 99% of the problems or more. -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
On 08/23/21 10:57, Chapman Flack wrote: > Maybe those cases aren't very numerous ... and maybe they're distinctive > enough to recognize when creating one ("hmm, I am creating a check or > assign hook that does significant work here, will it be worth exposing > a getter API for the product of the work?"). How about a generic GetTypedConfigValue(char const *name, void *into) ? By default the type of *into would correspond to the type of the GUC (int for an enum) and would be returned directly from *conf->variable by a getter hook installed by default when the GUC is defined. But the definition could also supply a getter hook that would store a value of a different type, or computed a different way, for a GUC where that's appropriate. The function return could be boolean, true if such a variable exists and isn't a placeholder. Pro: provides read access to the typed value of any GUC, without exposing it to overwriting. Con: has to bsearch the GUCs every time the value is wanted. What I'd really like: ObserveTypedConfigValue(char const *name, void *into, int *flags, int flag) This would register an observer of the named GUC: whenever its typed value changes, it is written at *into and flag is ORed into *flags. This is more restrictive than allowing arbitrary code to be hooked into GUC changes (as changes can happen at delicate times such as error recovery), but allows an extension to always have the current typed value available and to cheaply know when it has changed. Observers would be updated in the normal course of processing a GUC change, eliminating the bsearch-per-lookup cost of the first approach. Thinkable? Regards, -Chap
On Mon, Aug 23, 2021 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > In that case, why not improve the API with functions that return the > values in some native datatype? For scalars with native C types (int, > floats, Boolean etc) this is easy enough; I bet it'll solve 99% of the > problems or more. Sure, but ... why bother? The entire argument rests on the presumption that there is some harm being done by people accessing the values directly, but I don't think that's true. And, if it were true, it seems likely that this proposed API would have the exact same problem, because it would let people do exactly the same thing. And, going through this proposed API would still be significantly more expensive than just accessing the bare variables, because you'd at least have to do some kind of lookup based on the GUC name to find the corresponding variable. It's just a solution in search of a problem. Nothing bad happens when people write extensions that access GUC variables directly. It works totally, completely fine. Except on Windows. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Aug 23, 2021 at 7:57 AM Robert Haas <robertmhaas@gmail.com> wrote: > In short, +1 from me for the original proposal of marking all GUCs as > PGDLLIMPORT. +1 > And, heck, +1 for marking all the other global variables > that way, too. We're not solving any problem here. We're just annoying > people, mostly people who are loyal community members and steady > contributors to the project. I'm +0.5 on this aspect -- the result might be a lot of verbosity for no possible benefit. I'm not sure how many global variables there are. Hopefully not that many. Maybe making adding new global variables annoying would be a useful disincentive -- sometimes we use global variables when it isn't particularly natural (it's natural with GUCs, but not other things). That might tip the scales, at least for me. Unnecessary use of global variables are why Postgres 13 went through several point releases before I accidentally found out that parallel VACUUM doesn't respect cost limits -- a very simple bug concerning how we propagate (or fail to propagate) state to parallel workers. I bet it would have taken far longer for the bug to be discovered if it wasn't for my Postgres 14 VACUUM refactoring work. -- Peter Geoghegan
On Mon, Aug 23, 2021 at 10:56:52AM -0400, Robert Haas wrote: > On Mon, Aug 23, 2021 at 10:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > And yes, I absolutely would prohibit extensions from accessing many > > of them, if there were a reasonable way to do it. It would be a good > > start towards establishing a defined API for extensions. > > Mostly, it would make extension development more difficult for no > discernable benefit to the project. > > You've made this argument many times over the years ... but we're no > closer to having an extension API than we ever were, and we continue > to get complaints about breaking stuff on Windows on a pretty regular > basis. > > Honestly, it seems unimaginable that an API is ever really going to be > possible. It would be a ton of work to maintain, and we'd just end up > breaking it every time we discover that there's a new feature we want > to implement which doesn't fit into the defined API now. That's what > we do *now* with functions that third-party extensions actually call, > and with variables that they access, and it's not something that, in > my experience, is any great problem in maintaining an extension. > You're running code that is running inside somebody else's executable > and sometimes you have to adjust it for upstream changes. That's life, > and I don't think that complaints about that topic are nearly as > frequent as complaints about extensions breaking on Windows because of > missing PGDLLIMPORT markings. > > And more than that, I'm pretty sure that you've previously taken the > view that we shouldn't document all the hook functions that only exist > in the backend for the purpose of extension use. As the person on the receiving end of that one, I was nonplussed, so I took a step back to think it over. I recognized at that time that I didn't have a great answer for a legitimate concern, namely that any change, however trivial, that goes into the core and doesn't go directly to core database functionality, represents a long-term maintenance burden. The thing I'm coming to is that the key architectural feature PostgreSQL has that other RDBMSs don't is its extensibility. Because that's been a stable feature over time, I'm pretty sure we actually need to see documenting that as a thing that does actually go to core database functionality. Yes, there are resources involved with doing a thing like this, but I don't think that they require constant or even frequent attention from committers or even from seasoned DB hackers. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 08/23/21 14:30, Robert Haas wrote: > it seems likely that this proposed > API would have the exact same problem, because it would let people do > exactly the same thing. And, going through this proposed API would > still be significantly more expensive than just accessing the bare > variables, because you'd at least have to do some kind of lookup based > on the GUC name I think the API ideas in [0] would not let people do exactly the same thing. They would avoid exposing the bare variables to overwrite. Not that there has been any plague of extensions going and overwriting GUCs, but I think in some messages on this thread I detected a sense that in principle it's better if an API precludes it, and that makes sense to me. The second idea also avoids the expense of name-based lookup (except once at extension initialization), and in fact minimizes the cost of obtaining the current value when needed, by slightly increasing the infrequent cost of updating values. Regards, -Chap [0] https://www.postgresql.org/message-id/6123C425.3080409%40anastigmatix.net
po 23. 8. 2021 v 23:08 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 08/23/21 14:30, Robert Haas wrote:
> it seems likely that this proposed
> API would have the exact same problem, because it would let people do
> exactly the same thing. And, going through this proposed API would
> still be significantly more expensive than just accessing the bare
> variables, because you'd at least have to do some kind of lookup based
> on the GUC name
I think the API ideas in [0] would not let people do exactly the same thing.
They would avoid exposing the bare variables to overwrite. Not that
there has been any plague of extensions going and overwriting GUCs,
but I think in some messages on this thread I detected a sense that
in principle it's better if an API precludes it, and that makes sense
to me.
The second idea also avoids the expense of name-based lookup (except once
at extension initialization), and in fact minimizes the cost of obtaining
the current value when needed, by slightly increasing the infrequent cost
of updating values.
There are few GUC variables, where we need extra fast access - work_mem, encoding settings, and maybe an application name. For others I hadn't needed to access it for over 20 years. But I understand that more complex extensions like timescaledb will use more internal GUC.
Maybe a new light API based on enum identifiers instead of string identifiers that ensure relatively fast access (and safe and secure) can be a good solution. And for special variables, there can be envelope functions for very fast access. But although the special interface, the responsibility of the extension's author will not be less, and the C extensions will not be more trusted, so it is hard to say something about possible benefits. I am inclined to Robert's opinion, so the current state is not too bad, and maybe the best thing that is possible now is the decreasing difference between supported platforms.
Regards
Pavel
On Tue, Aug 24, 2021 at 12:36 PM Pavel Stehule <pavel.stehule@gmail.com> wrote: > > There are few GUC variables, where we need extra fast access - work_mem, encoding settings, and maybe an application name.For others I hadn't needed to access it for over 20 years. But I understand that more complex extensions like timescaledbwill use more internal GUC. Note that even trivial extensions require some other GUC access. For instance any single extension that wants to aggregate data based on the query_id has to store an array of query_id in shmem to know what a parallel worker is working on. On top of wasting memory and CPU, it also means that computing the maximum number of backends in required. So you need to recompute MaxBackends, which means access to multiple GUCs. Unfortunately the patch to expose the query_id didn't fix that problem as it only passes the top level query_id to the pararllel workers, not the current one.
On Mon, 23 Aug 2021 at 22:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bruce Momjian <bruce@momjian.us> writes:
> On Sun, Aug 22, 2021 at 09:29:01PM +0800, Julien Rouhaud wrote:
>> Then shouldn't we try to prevent direct access on all platforms rather than
>> only one?
> Agreed. If Julian says 99% of the non-export problems are GUCs, and we
> can just export them all, why not do it? We already export every global
> variable on Unix-like systems, and we have seen no downsides.
By that argument, *every* globally-visible variable should be marked
PGDLLIMPORT. But the mere fact that two backend .c files need to access
some variable doesn't mean that we want any random bit of code doing so.
And yes, I absolutely would prohibit extensions from accessing many
of them, if there were a reasonable way to do it. It would be a good
start towards establishing a defined API for extensions.
There is: -fvisibility=hidden and __attribute__((visibility("default"))) . Or if you prefer to explicitly mark private symbols, use __attribute__((visiblity("hidden"))) .
In addition to API surface control this also gives you a smaller export symbol table for faster dynamic linking, and it improves link-time optimization.
I could've sworn I proposed its use in the past but I can't find a relevant list thread except quite a recent one. All I can find is [1] . But that is where we should start: switch from a linker script for libpq to using PGDLLIMPORT (actually it'd be LIBPQDLLIMPORT) in libpq. When -DBUILDING_LIBPQ this would expand to __declspec("dllexport") on Windows and __attribute__((visibility("default"))) on gcc/clang. Otherwise it expands to __declspec("dllimport") on Windows and empty on other targets. This would also be a good time to rename the confusingly named BUILDING_DLL macro to BUILDING_POSTGRES .
The next step would be to have PGDLLIMPORT expand to __attribute__((visibility("default"))) on gcc/clang when building the server itself. This won't do anything by itself since all symbols are already default visibility. But once the "public API" of both function and data symbol is so-annotated, we could switch to building Pg with -fvisibility=hidden by default, and on Windows, we'd disable the linker script that exports all functions using a generated .DEF file.
On Tue, 24 Aug 2021 at 02:31, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Aug 23, 2021 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> In that case, why not improve the API with functions that return the
> values in some native datatype? For scalars with native C types (int,
> floats, Boolean etc) this is easy enough; I bet it'll solve 99% of the
> problems or more.
Sure, but ... why bother?
The entire argument rests on the presumption that there is some harm
being done by people accessing the values directly, but I don't think
that's true. And, if it were true, it seems likely that this proposed
API would have the exact same problem, because it would let people do
exactly the same thing. And, going through this proposed API would
still be significantly more expensive than just accessing the bare
variables, because you'd at least have to do some kind of lookup based
on the GUC name to find the corresponding variable. It's just a
solution in search of a problem.
Nothing bad happens when people write extensions that access GUC
variables directly. It works totally, completely fine. Except on
Windows.
Not only that, but postgres already exports every non-static function symbol on both *nix and Windows, and every data symbol on *nix. A lot of those function symbols are very internal and give anything that can call them the ability to wreck absolute havoc on the server's state.
There is not even the thinnest pretense of Pg having a dedicated extension API or any sort of internal vs public API separation. This ongoing pain with PGDLLIMPORT on Windows is hard to see as anything but an excuse to make working with and supporting Windows harder because some of us don't like it. I happen to rather dislike working with Windows myself, but I get to do it anyway, and I'd be very happy to remove this particular source of pain.
On Tue, 24 Aug 2021 at 13:21, Craig Ringer <craig.ringer@enterprisedb.com> wrote:
There is not even the thinnest pretense of Pg having a dedicated extension API or any sort of internal vs public API separation.
Oh, and if we do want such a separation, we'll need to introduce a MUCH lower-pain-and-overhead way to get related patches in. Otherwise it'll take decades to add any necessary function wrappers for currently exported data symbols, add necessary hooks, wrap or hide internal symbols and state, etc.
But ... what is the actual goal and expected outcome of such a hypothetical public/private API separation?
It won't help meaningfully with server maintenance: We already break extensions freely in major releases, and sometimes even minor releases. We don't make any stable API promise at all. So any argument that it would make maintenance of the core server easier is weak at best.
It won't help protect server runtime stability: The server is written in C, and makes heavy use of non-opaque / non-hidden types. Many of which would not be practical to hide without enormous refactoring if at all. Writing any sort of "safe" C API is very difficult even when the exposed functionality is very narrow and well defined. Even then such an API can only help prevent inadvertent mistakes, since C programs are free to grovel around in memory. Look at the mess with EXPORT_SYMBOL_GPL in the kernel for just how ugly that can get. So I don't think there's any realistic way to claim that narrowing the exposed API surface would make it safer to load and run extensions that the user has not separately checked and reviewed or obtained from a trusted source with robust testing practices. Certainly it offers no benefit at all against a bad actor.
It won't make it safer to use untrusted extensions.
What will it do? Not much, in the short term, except cripple existing extensions or add a pile of questionably useful code annotations. The only real benefits I see are some improvement in link-time optimization and export symbol table size. Both of which are nice, but IMO not worth the pain by themselves for a pure C program. In C++, with its enormous symbol tables it's absolutely worth it. But not really for Pg.
To be clear, I actually love the idea of starting to define a solid public API, with API, ABI and semantic promises and associated tests. But to say it's a nontrivial amount of work is an enormous understatement. And unless done by an existing committer who is trusted to largely define a provisional API without bike-shedding and arguing the merits of every single part of it, it's nearly impossible to do with the way Pg is currently developed.
It's completely beyond me why it's OK to export all function symbols on Windows, but not all data symbols. Or why it's OK to export all data symbols on *nix, but not on Windows.
On Tue, 24 Aug 2021 at 05:08, Chapman Flack <chap@anastigmatix.net> wrote:
On 08/23/21 14:30, Robert Haas wrote:
> it seems likely that this proposed
> API would have the exact same problem, because it would let people do
> exactly the same thing. And, going through this proposed API would
> still be significantly more expensive than just accessing the bare
> variables, because you'd at least have to do some kind of lookup based
> on the GUC name
I think the API ideas in [0] would not let people do exactly the same thing.
They would avoid exposing the bare variables to overwrite. Not that
there has been any plague of extensions going and overwriting GUCs,
but I think in some messages on this thread I detected a sense that
in principle it's better if an API precludes it, and that makes sense
to me.
The second idea also avoids the expense of name-based lookup (except once
at extension initialization), and in fact minimizes the cost of obtaining
the current value when needed, by slightly increasing the infrequent cost
of updating values.
I'd be generally in favour of something that reduced our reliance on the current chaotic and inconsistent jumble of globals which are a semi-random combination of compilation-unit-scoped, globally-scoped-except-on-Windows and globally scoped.
Tackling GUCs would be a good start. Especially given the number of GUCs where the actual GUC value isn't the state that anyone should be interacting with directly since a hook maintains the "real" state derived from the GUC storage.
And preventing direct writes to GUCs seems like a clearly correct thing to do.
Some consideration of performance would be important for some of the hotter GUCs, of course, but most extensions won't be hammering most GUC access a lot.
On Mon, 23 Aug 2021 at 22:45, Julien Rouhaud <rjuju123@gmail.com> wrote:
On Mon, Aug 23, 2021 at 10:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > On Mon, Aug 23, 2021 at 10:15:04AM -0400, Tom Lane wrote:
> >> By that argument, *every* globally-visible variable should be marked
> >> PGDLLIMPORT. But the mere fact that two backend .c files need to access
>
> > No, Julien says 99% need only the GUCs, so that is not the argument I am
> > making.
>
> That's a claim unbacked by any evidence that I've seen. More to
> the point, we already have a mechanism that extensions can/should
> use to read and write GUC settings, and it's not direct access.
I clearly didn't try all single extension available out there. It's
excessively annoying to compile extensions on Windows, and also I
don't have a lot of dependencies installed so there are some that I
wasn't able to test since I'm lacking some other lib and given my
absolute lack of knowledge of that platform I didn't spent time trying
to install those.
Plenty of them are closed too.
While that's not the Pg community's problem, as such, it'd be nice not to arbitrarily break them all for no actual benefit.
On 23.08.21 16:47, Julien Rouhaud wrote: > On Mon, Aug 23, 2021 at 10:36 PM Bruce Momjian <bruce@momjian.us> wrote: >> >> So the problem is that extensions only _need_ to use that API on >> Windows, so many initially don't, or that the API is too limited? > > The inconvenience with that API is that it's only returning c strings, > so you gave to convert it back to the original datatype. That's > probably why most of the extensions simply read from the original > exposed variable rather than using the API, because they're usually > written on Linux or similar, not because they want to mess up the > stored value. If there were an API, then in-core code should use it as well. If, for example, an extension wanted to define a "float16" type, then it should be able to access extra_float_digits in the *same way* as float4out() and float8out() can access it. This is clearly not possible today.
On Tue, Aug 24, 2021 at 4:28 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > If there were an API, then in-core code should use it as well. ...which is presumably never going to happen, because the performance cost would, I think, be quite terrible. If you really had to force everything through an API, I think what you'd want to do is define an API where code can look up a handle object for a GUC using the name of the GUC, and then hold onto a pointer to the handle and use that for future accesses, so that you don't have to keep incurring the expense of a hash table hit on every access. But even if you did that, preventing "unauthorized" writes to GUC variables would require a function call for every access. That function would be pretty trivial, but it would have to exist, and... > If, for example, an extension wanted to define a "float16" type, then it > should be able to access extra_float_digits in the *same way* as > float4out() and float8out() can access it. This is clearly not possible > today. ...if you used it for something like this, it would probably show up in the profile, and we would get demands to remove that API and allow direct access to the variables, which is all anybody is asking for here anyway, and which is what pretty much everyone, whether they develop for core or some extension, does and wants to do. This also brings me to another point, which is that I do not think there is anyone who actually wants an API like this. I believe that extension developers find it rather convenient that they can make use of all of the backend functions and variables that PostgreSQL exposes, and that a lot of them would be deeply unhappy if we removed that access. As an occasional extension maintainer myself, I know I would be. And, as Craig quite rightly points out upthread, we would not get anything in return for making those people unhappy. I don't know why we would even consider doing something that would benefit nobody, greatly inconvenience some people, and generally stifle innovation in the PostgreSQL ecosystem. Adding PGDLLIMPORT markings, on the other hand, would hurt nobody, be very convenient for some people, and encourage innovation in the PostgreSQL ecosystem. -- Robert Haas EDB: http://www.enterprisedb.com
On 08/24/21 14:28, Robert Haas wrote: > cost would, I think, be quite terrible. If you really had to force > everything through an API, I think what you'd want to do is define an > API where code can look up a handle object for a GUC using the name of > the GUC, and then hold onto a pointer to the handle and use that for > future accesses, so that you don't have to keep incurring the expense > of a hash table hit on every access. But even if you did that, > preventing "unauthorized" writes to GUC variables would require a > function call for every access. I don't think that's true of the second proposal in [0]. I don't foresee a noticeable runtime cost unless there is a plausible workload that involves very frequent updates to GUC settings that are also of interest to a bunch of extensions. Maybe I'll take a stab at a POC. Regards, -Chap [0] https://www.postgresql.org/message-id/6123C425.3080409%40anastigmatix.net
On Tue, Aug 24, 2021 at 2:52 PM Chapman Flack <chap@anastigmatix.net> wrote: > I don't think that's true of the second proposal in [0]. I don't foresee > a noticeable runtime cost unless there is a plausible workload that > involves very frequent updates to GUC settings that are also of interest > to a bunch of extensions. Maybe I'll take a stab at a POC. I'm not sure I fully understand that proposal, but I find it hard to believe that we would seriously consider replacing every direct GUC reference in the backend with something that goes through an API. Even if didn't hurt performance, I think it would uglify the code a whole lot. And as Peter says, if we're not going to do that, then it's not clear why extensions should have to. -- Robert Haas EDB: http://www.enterprisedb.com
On 08/24/21 15:12, Robert Haas wrote: > I find it hard to > believe that we would seriously consider replacing every direct GUC > reference in the backend with something that goes through an API. Peter may have advocated for that kind of across-the-board adoption; my leaning is more to add an API that /can/ be adopted, initially with separately-linked extensions as the audience. Nothing would stop it being used in core as well, but no reason to change any site where it did not offer an advantage. I generally tend to be an incrementalist. Regards, -Chap
On Tue, Aug 24, 2021 at 3:36 PM Chapman Flack <chap@anastigmatix.net> wrote: > Peter may have advocated for that kind of across-the-board adoption; > my leaning is more to add an API that /can/ be adopted, initially with > separately-linked extensions as the audience. Nothing would stop it being > used in core as well, but no reason to change any site where it did not > offer an advantage. > > I generally tend to be an incrementalist. Sure, me too, but the point for me is that there doesn't seem to be a shred of a reason to go this way at all. We've turned a discussion about adding PGDLLIMPORT, which ought to be totally uncontroversial, into some kind of a discussion about adding an API layer that no one wants to prevent a hypothetical failure mode not in evidence. -- Robert Haas EDB: http://www.enterprisedb.com
On 08/24/21 16:31, Robert Haas wrote: > about adding PGDLLIMPORT, which ought to be totally uncontroversial, The thing is, I think I have somewhere a list of all the threads on this topic that I've read through since the first time I had to come with my own hat in hand asking for a PGDLLIMPORT on something, years ago now, and I don't think I have ever seen one where it was as uncontroversial as you suggest. In each iteration, I think I've also seen a countervailing view expressed in favor of looking into whether globals visibility could be further /reduced/. Maybe that view is slowly losing adherents? I don't know; it did get advanced again by some in this thread. The positions seem to routinely fall into two boxes: * Let's look into that, it would be a step toward having a more defined API * No, what we have now is so far from a defined API that there's nothing worth stepping toward. The situation seems intermediate to me. The current condition is clearly something that organically grew without a strong emphasis on defining API. On the other hand, there are many corners of it that show evidence of thought about encapsulation and API by the developers of those corners. It seems to me like the kind of setting where incrementalists can find room for small moves. Regards, -Chap
On Tue, Aug 24, 2021 at 04:31:32PM -0400, Robert Haas wrote: > On Tue, Aug 24, 2021 at 3:36 PM Chapman Flack <chap@anastigmatix.net> wrote: > > Peter may have advocated for that kind of across-the-board adoption; > > my leaning is more to add an API that /can/ be adopted, initially with > > separately-linked extensions as the audience. Nothing would stop it being > > used in core as well, but no reason to change any site where it did not > > offer an advantage. > > > > I generally tend to be an incrementalist. > > Sure, me too, but the point for me is that there doesn't seem to be a > shred of a reason to go this way at all. We've turned a discussion > about adding PGDLLIMPORT, which ought to be totally uncontroversial, > into some kind of a discussion about adding an API layer that no one > wants to prevent a hypothetical failure mode not in evidence. +1 -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
On Tue, Aug 24, 2021 at 05:06:54PM -0400, Chapman Flack wrote: > On 08/24/21 16:31, Robert Haas wrote: > > about adding PGDLLIMPORT, which ought to be totally uncontroversial, > > The thing is, I think I have somewhere a list of all the threads on this > topic that I've read through since the first time I had to come with my own > hat in hand asking for a PGDLLIMPORT on something, years ago now, and > I don't think I have ever seen one where it was as uncontroversial > as you suggest. The "ought" above is a load-bearing word. Nonetheless, here's a case, also involving GUCs, where it was uncontroversial: https://postgr.es/m/flat/20171120200230.iwcmptwznbvl6y4c%40alap3.anarazel.de
On Tue, Aug 24, 2021 at 5:06 PM Chapman Flack <chap@anastigmatix.net> wrote: > The thing is, I think I have somewhere a list of all the threads on this > topic that I've read through since the first time I had to come with my own > hat in hand asking for a PGDLLIMPORT on something, years ago now, and > I don't think I have ever seen one where it was as uncontroversial > as you suggest. It does tend to be controversial, but I think that's basically only because Tom Lane has reservations about it. I think if Tom dropped his opposition to this, nobody else would really care. And I think that would be a good thing for the project. > In each iteration, I think I've also seen a countervailing view expressed > in favor of looking into whether globals visibility could be further > /reduced/. But, like I say, that's only a view that gets advanced as a reason not to mark things PGDLLIMPORT. Nobody ever wants that thing for its own sake. I think it's a complete red herring. If and when somebody wants to make a serious proposal to do something like that, it can be considered on its own merits. But between now and then, refusing to make things work on Windows as they do on Linux does not benefit anyone. A ton of work has been made into making PostgreSQL portable over the years, and abandoning it in just this one case is unreasonable. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Aug 25, 2021 at 4:06 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Aug 24, 2021 at 5:06 PM Chapman Flack <chap@anastigmatix.net> wrote: > > The thing is, I think I have somewhere a list of all the threads on this > > topic that I've read through since the first time I had to come with my own > > hat in hand asking for a PGDLLIMPORT on something, years ago now, and > > I don't think I have ever seen one where it was as uncontroversial > > as you suggest. > > It does tend to be controversial, but I think that's basically only > because Tom Lane has reservations about it. I think if Tom dropped his > opposition to this, nobody else would really care. And I think that > would be a good thing for the project. I have only one consideration about it, and that's a technical one :) Does this in some way have an effect on the size of the binary and/or the time it takes to load it? I ask, because IIRC back in the prehistoric days, adding all the *functions* to the list of exports had a very significant impact on the size of the binary, and some (but not very large) impact on the loading time. Of course, we had to do that so that even our own libraries would probably load. And at the time it was decided that we definitely wanted to export all functions and not just the ones that we would somehow define as an API. Now, I'm pretty sure that the GUCs are few enough that this should have no measurable effect on size/load time, but it's something that should be verified. But in particular, both on that argument, and on the general maintenance argument, I have a very hard time seeing how exporting the GUC variables would be any worse than exporting the many hundreds of functions we already export. -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > On Wed, Aug 25, 2021 at 4:06 PM Robert Haas <robertmhaas@gmail.com> wrote: >> It does tend to be controversial, but I think that's basically only >> because Tom Lane has reservations about it. I think if Tom dropped his >> opposition to this, nobody else would really care. And I think that >> would be a good thing for the project. > But in particular, both on that argument, and on the general > maintenance argument, I have a very hard time seeing how exporting the > GUC variables would be any worse than exporting the many hundreds of > functions we already export. My beef about it has nothing to do with binary-size concerns, although that is an interesting angle. (I wonder whether marking a variable PGDLLIMPORT has any negative effect on the cost of accessing it from within the core code?) Rather, I'm unhappy with spreading even more Microsoft-droppings all over our source. If there were some way to just do this automatically for all global variables without any source changes, I'd be content with that. That would *really* make the platforms more nearly equivalent. regards, tom lane
On Wed, Aug 25, 2021 at 4:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Magnus Hagander <magnus@hagander.net> writes: > > On Wed, Aug 25, 2021 at 4:06 PM Robert Haas <robertmhaas@gmail.com> wrote: > >> It does tend to be controversial, but I think that's basically only > >> because Tom Lane has reservations about it. I think if Tom dropped his > >> opposition to this, nobody else would really care. And I think that > >> would be a good thing for the project. > > > But in particular, both on that argument, and on the general > > maintenance argument, I have a very hard time seeing how exporting the > > GUC variables would be any worse than exporting the many hundreds of > > functions we already export. > > My beef about it has nothing to do with binary-size concerns, although > that is an interesting angle. (I wonder whether marking a variable > PGDLLIMPORT has any negative effect on the cost of accessing it from > within the core code?) It should have no effect on local code. PGDLLIMPORT turns into "__declspec (dllexport)" when building the backend, which should have no effect on imports (it turns into __declspec (dllimport) when building a frontend only, but that's why we need it in the headers, iirc) The only overhead I've seen discussions about int he docs around that is the overhead of exporting by name vs exporting by ordinal. > Rather, I'm unhappy with spreading even more > Microsoft-droppings all over our source. If there were some way to > just do this automatically for all global variables without any source > changes, I'd be content with that. That would *really* make the > platforms more nearly equivalent. Actually, ti's clearly been a while since I dug into this AFAICT we *do* actually export all the data symbols as well? At least in the MSVC build where we use gendef.pl? Specifically, see a5eed4d770. The thing we need the PGDLLIMPORT definition for is to *import* them on the other end? -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
On 2021-Aug-25, Magnus Hagander wrote: > The thing we need the PGDLLIMPORT definition for is to *import* them > on the other end? Oh ... so modules that are willing to cheat can include their own declarations of the variables they need, and mark them __declspec (dllimport)? -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "The Gord often wonders why people threaten never to come back after they've been told never to return" (www.actsofgord.com)
On Thu, Aug 26, 2021 at 1:51 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2021-Aug-25, Magnus Hagander wrote: > > > The thing we need the PGDLLIMPORT definition for is to *import* them > > on the other end? > > Oh ... so modules that are willing to cheat can include their own > declarations of the variables they need, and mark them __declspec > (dllimport)? I just tried and msvc doesn't like it. It errors out with a C2370 error "redefinition; different storage class". According to https://docs.microsoft.com/en-us/cpp/error-messages/compiler-errors-1/compiler-error-c2370 changing __declspec() on the other side is not possible.
On Wed, Aug 25, 2021 at 10:41:14AM -0400, Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: > > On Wed, Aug 25, 2021 at 4:06 PM Robert Haas <robertmhaas@gmail.com> wrote: > >> It does tend to be controversial, but I think that's basically only > >> because Tom Lane has reservations about it. I think if Tom dropped his > >> opposition to this, nobody else would really care. And I think that > >> would be a good thing for the project. > > > But in particular, both on that argument, and on the general > > maintenance argument, I have a very hard time seeing how exporting the > > GUC variables would be any worse than exporting the many hundreds of > > functions we already export. > > My beef about it has nothing to do with binary-size concerns, although > that is an interesting angle. (I wonder whether marking a variable > PGDLLIMPORT has any negative effect on the cost of accessing it from > within the core code?) Rather, I'm unhappy with spreading even more > Microsoft-droppings all over our source. If there were some way to > just do this automatically for all global variables without any source > changes, I'd be content with that. That would *really* make the > platforms more nearly equivalent. OK, so the big question is how can we minimize the code impact of this feature? Can we add some kind of checking so we don't forget to mark anything? Can we automate this somehow? -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
On Thu, Aug 26, 2021 at 3:38 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Thu, Aug 26, 2021 at 1:51 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > On 2021-Aug-25, Magnus Hagander wrote: > > > > > The thing we need the PGDLLIMPORT definition for is to *import* them > > > on the other end? > > > > Oh ... so modules that are willing to cheat can include their own > > declarations of the variables they need, and mark them __declspec > > (dllimport)? > > I just tried and msvc doesn't like it. It errors out with a C2370 > error "redefinition; different storage class". According to > https://docs.microsoft.com/en-us/cpp/error-messages/compiler-errors-1/compiler-error-c2370 > changing __declspec() on the other side is not possible. Yeah, but that does move the problem to the other side doesn't it? So if you (as a pure test of course) were to remove the variable completely from the included header and just declare it manually with PGDLLSPEC in your file, it should work? Ugly as it is, I wonder if there's a chance we could just process all the headers at install times and inject the PGDLLIMPORT. We know which symvols it is on account of what we're getting in the DEF file. Not saying that's not a very ugly solution, but it might work? -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
On Thu, Aug 26, 2021 at 3:42 PM Magnus Hagander <magnus@hagander.net> wrote: > Ugly as it is, I wonder if there's a chance we could just process all > the headers at install times and inject the PGDLLIMPORT. We know which > symvols it is on account of what we're getting in the DEF file. > > Not saying that's not a very ugly solution, but it might work? If it's ugly, that might mean it's a bad idea and we shouldn't do it ... but if it can be made not-too-ugly, it would certainly be nice to be able to stop worrying about this. -- Robert Haas EDB: http://www.enterprisedb.com
On 8/26/21 3:57 PM, Robert Haas wrote: > On Thu, Aug 26, 2021 at 3:42 PM Magnus Hagander <magnus@hagander.net> wrote: >> Ugly as it is, I wonder if there's a chance we could just process all >> the headers at install times and inject the PGDLLIMPORT. We know which >> symvols it is on account of what we're getting in the DEF file. >> >> Not saying that's not a very ugly solution, but it might work? > If it's ugly, that might mean it's a bad idea and we shouldn't do it > ... but if it can be made not-too-ugly, it would certainly be nice to > be able to stop worrying about this. > How is this going to affect msys builds? No gendef there IIRC. I guess some similar procedure might be possible ... cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Thu, Aug 26, 2021 at 05:10:39PM -0400, Andrew Dunstan wrote: > On 8/26/21 3:57 PM, Robert Haas wrote: >> On Thu, Aug 26, 2021 at 3:42 PM Magnus Hagander <magnus@hagander.net> wrote: >>> Ugly as it is, I wonder if there's a chance we could just process all >>> the headers at install times and inject the PGDLLIMPORT. We know which >>> symvols it is on account of what we're getting in the DEF file. >>> Not saying that's not a very ugly solution, but it might work? I missed the word "install" first here :) >> If it's ugly, that might mean it's a bad idea and we shouldn't do it >> ... but if it can be made not-too-ugly, it would certainly be nice to >> be able to stop worrying about this. > > How is this going to affect msys builds? No gendef there IIRC. I guess > some similar procedure might be possible ... Wouldn't that be needed for cygwin as well? If we go down to enable that for a maximum number of parameters, I would really agree for doing things so as this never gets forgotten for new parameters and we don't have to discuss the matter anymore. With all that in mind, that would mean a new perl script that does the job, callable by both MSVC and normal make builds. But we have nothing that does a manipulation of the installation contents. And couldn't it be a problem if an installation is overwritten, updated or upgradedd, where there may be contents not coming from the in-core build process but from some extension? -- Michael
Attachment
On Fri, Aug 27, 2021 at 3:42 AM Magnus Hagander <magnus@hagander.net> wrote: > > Yeah, but that does move the problem to the other side doesn't it? So > if you (as a pure test of course) were to remove the variable > completely from the included header and just declare it manually with > PGDLLSPEC in your file, it should work? > > Ugly as it is, I wonder if there's a chance we could just process all > the headers at install times and inject the PGDLLIMPORT. We know which > symvols it is on account of what we're getting in the DEF file. > > Not saying that's not a very ugly solution, but it might work? It's apparently not enough. I tried with autovacuum_max_workers GUC, and it still errors out. If I add a PGDLLIMPORT, there's a link error when trying to access the variable: error LNK2019: unresolved external symbol __imp_autovacuum_max_workers referenced in function... I think that it means that msvc tries to link that to a DLL while it's expected to be in postgres.lib as the __imp_ prefix is added in that case. If I use PGDLLEXPORT I simply get: error LNK2001: unresolved external symbol aytovacuum_max_workers
On Wed, 25 Aug 2021 at 03:13, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Aug 24, 2021 at 2:52 PM Chapman Flack <chap@anastigmatix.net> wrote:
> I don't think that's true of the second proposal in [0]. I don't foresee
> a noticeable runtime cost unless there is a plausible workload that
> involves very frequent updates to GUC settings that are also of interest
> to a bunch of extensions. Maybe I'll take a stab at a POC.
I'm not sure I fully understand that proposal, but I find it hard to
believe that we would seriously consider replacing every direct GUC
reference in the backend with something that goes through an API. Even
if didn't hurt performance, I think it would uglify the code a whole
lot.
It'd probably have to be something that resolves the GUC storage addresses at compile-time or once at runtime, if it's going to be used by core code. While some code doesn't hit a lot of GUCs, some *really* hammers some common GUCs.
There are various issues with cache lines and pointer chasing that are beyond my low-level fu at work here. Adding a level of pointer indirection can be very expensive in the wrong situations.
So you're probably looking at some kind of mess with token pasting, macros and static inlines. Ew.
On Wed, 25 Aug 2021 at 22:36, Magnus Hagander <magnus@hagander.net> wrote:
On Wed, Aug 25, 2021 at 4:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 5:06 PM Chapman Flack <chap@anastigmatix.net> wrote:
> > The thing is, I think I have somewhere a list of all the threads on this
> > topic that I've read through since the first time I had to come with my own
> > hat in hand asking for a PGDLLIMPORT on something, years ago now, and
> > I don't think I have ever seen one where it was as uncontroversial
> > as you suggest.
>
> It does tend to be controversial, but I think that's basically only
> because Tom Lane has reservations about it. I think if Tom dropped his
> opposition to this, nobody else would really care. And I think that
> would be a good thing for the project.
I have only one consideration about it, and that's a technical one :)
Does this in some way have an effect on the size of the binary and/or
the time it takes to load it?
On *nix, no.
On Windows, very, very minimally.
But doing so is quite orthogonal to the matter of fixing a linkage issue on Windows. By making select symbols hidden we'd be *reducing* the exposed set of functions and data symbols in a disciplined and progressive way on all platforms. Useful but different.
On Thu, 26 Aug 2021 at 01:51, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Aug-25, Magnus Hagander wrote:
> The thing we need the PGDLLIMPORT definition for is to *import* them
> on the other end?
Oh ... so modules that are willing to cheat can include their own
declarations of the variables they need, and mark them __declspec
(dllimport)?
Damn. I was hoping nobody would notice that.
I do exactly that in some extensions to work around some of this mess, but it is quite awkward and has its limitations.
On Fri, 27 Aug 2021 at 08:59, Julien Rouhaud <rjuju123@gmail.com> wrote:
On Fri, Aug 27, 2021 at 3:42 AM Magnus Hagander <magnus@hagander.net> wrote:
>
> Yeah, but that does move the problem to the other side doesn't it? So
> if you (as a pure test of course) were to remove the variable
> completely from the included header and just declare it manually with
> PGDLLSPEC in your file, it should work?
>
> Ugly as it is, I wonder if there's a chance we could just process all
> the headers at install times and inject the PGDLLIMPORT. We know which
> symvols it is on account of what we're getting in the DEF file.
>
> Not saying that's not a very ugly solution, but it might work?
It's apparently not enough. I tried with autovacuum_max_workers GUC,
and it still errors out.
If I add a PGDLLIMPORT, there's a link error when trying to access the variable:
error LNK2019: unresolved external symbol __imp_autovacuum_max_workers
referenced in function...
If I use PGDLLEXPORT I simply get:
error LNK2001: unresolved external symbol aytovacuum_max_workers
It works, but you can't use PGDLLIMPORT, you have to implement the import yourself without the help of __declspec(dllimport) .
Where you want
autovacuum_max_workers
you must instead write
*((int*)__imp_autovacuum_max_workers)
Here's the comment I wrote on the topic in something I was working on:
/*
* On Windows, a symbol is not accessible outside the executable or shared
* library (PE object) that it is defined in unless explicitly exported in
* the DLL interface.
*
* It must then be explicitly imported by objects that use it; Windows doesn't
* do ELF-style fixups.
*
* The export part is usually accomplished by a __declspec(dllexport)
* annotation on the symbol or a .DEF file supplied as linker input. Postgres
* uses the .DEF approach, auto-exporting all symbols using
* src\tools\msvc\gendef.pl . Internally this hides "symname" from the DLL
* interface and instead generates an export symbol "__imp_symname" that is a
* pointer to the value of "symname".
*
* The import part is usually done with the __declspec(dllimport) annotation on
* the symbol. The PGDLLIMPORT macro expands to __declspec(dllimport) when
* postgres headers are included during extension compilation. But not all the
* symbols that pglogical needs are annotated with PGDLLIMPORT. Attempting to
* access a symbol that is not so-annotated will fail at link time with an
* error like
*
* error LNK2001: unresolved external symbol criticalSharedRelcachesBuilt
*
* Because of gendefs.pl, postgres still exports the symbol even if it isn't
* annotated PGDLLIMPORT. So we can just do the shorthand that
* __declspec(dllimport) does for us in the preprocessor instead: replace each
* symbol with its __imp_symbol indirection and dereference it.
*
* There's one wrinkle in this though. MSVC doesn't generate a definition for a
* global data symbol that is neither initialized nor explicitly marked
* __declspec(dllexport). gendefs.pl will think such symbols are references to
* a symbol defined in another object file and will skip them without emitting
* a DATA entry for them in the DEF file, so no __imp_ stub is generated in the
* DLL interface. We can't use (*__imp_symbolname) if there's no import stub.
*
* If they're GUCs, we can round-trip them through their text values
* to read them. Nothing should ever be assigning to GUC storage and there's no
* reason to take the address of GUC storage, so this should work fine, albeit
* slower. If we find any that aren't GUCs we're in trouble but so far there
* haven't been any.
* On Windows, a symbol is not accessible outside the executable or shared
* library (PE object) that it is defined in unless explicitly exported in
* the DLL interface.
*
* It must then be explicitly imported by objects that use it; Windows doesn't
* do ELF-style fixups.
*
* The export part is usually accomplished by a __declspec(dllexport)
* annotation on the symbol or a .DEF file supplied as linker input. Postgres
* uses the .DEF approach, auto-exporting all symbols using
* src\tools\msvc\gendef.pl . Internally this hides "symname" from the DLL
* interface and instead generates an export symbol "__imp_symname" that is a
* pointer to the value of "symname".
*
* The import part is usually done with the __declspec(dllimport) annotation on
* the symbol. The PGDLLIMPORT macro expands to __declspec(dllimport) when
* postgres headers are included during extension compilation. But not all the
* symbols that pglogical needs are annotated with PGDLLIMPORT. Attempting to
* access a symbol that is not so-annotated will fail at link time with an
* error like
*
* error LNK2001: unresolved external symbol criticalSharedRelcachesBuilt
*
* Because of gendefs.pl, postgres still exports the symbol even if it isn't
* annotated PGDLLIMPORT. So we can just do the shorthand that
* __declspec(dllimport) does for us in the preprocessor instead: replace each
* symbol with its __imp_symbol indirection and dereference it.
*
* There's one wrinkle in this though. MSVC doesn't generate a definition for a
* global data symbol that is neither initialized nor explicitly marked
* __declspec(dllexport). gendefs.pl will think such symbols are references to
* a symbol defined in another object file and will skip them without emitting
* a DATA entry for them in the DEF file, so no __imp_ stub is generated in the
* DLL interface. We can't use (*__imp_symbolname) if there's no import stub.
*
* If they're GUCs, we can round-trip them through their text values
* to read them. Nothing should ever be assigning to GUC storage and there's no
* reason to take the address of GUC storage, so this should work fine, albeit
* slower. If we find any that aren't GUCs we're in trouble but so far there
* haven't been any.
*
* See also:
*
* - https://docs.microsoft.com/en-us/cpp/build/importing-data-using-declspec-dllimport
* - https://docs.microsoft.com/en-us/cpp/build/importing-using-def-files
* - https://docs.microsoft.com/en-us/cpp/build/exporting-from-a-dll-using-def-files
* - https://docs.microsoft.com/en-us/cpp/build/determining-which-exporting-method-to-use
*/
*
* - https://docs.microsoft.com/en-us/cpp/build/importing-data-using-declspec-dllimport
* - https://docs.microsoft.com/en-us/cpp/build/importing-using-def-files
* - https://docs.microsoft.com/en-us/cpp/build/exporting-from-a-dll-using-def-files
* - https://docs.microsoft.com/en-us/cpp/build/determining-which-exporting-method-to-use
*/
This is gruesome and I hadn't planned to mention it, but now someone noticed the .DEF file exports the symbols I guess it does no harm.
So can we just fix PGDLLIMPORT now?
Hi, On 02/13/22 02:29, Julien Rouhaud wrote: > Maybe we could have an actually usable GUC API to retrieve values in their > native format rather than C string for instance, that we could make sure also > works for cases like max_backend? I proposed a sketch of such an API for discussion back in [0] (the second idea in that email, the "what I'd really like" one). In that scheme, some extension code that was interested in (say, for some reason) log_statement_sample_rate could say: static double samprate; static int gucs_changed = 0; #define SAMPRATE_CHANGED 1 ... ObserveTypedConfigValue("log_statement_sample_rate", &samprate, &gucs_changed, SAMPRATE_CHANGED); ... and will be subscribed to have the native-format value stored into samprate, and SAMPRATE_CHANGED ORed into gucs_changed, whenever the value changes. The considerations leading me to that design were: - avoid subscribing as a 'callback' sort of listener. GUCs can get set (or, especially, reset) in delicate situations like error recovery where calling out to arbitrary extra code might best be avoided. - instead, just dump the value in a subscribed location. A copy, of course, so no modification there affects the real value. - but at the same time, OR a flag into a bit set, so subscribing code can very cheaply poll for when a value of interest (or any of a bunch of values of interest) has changed since last checked. - do the variable lookup by name once only, and pay no further search cost when the subscribing code wants the value. I never pictured that proposal as the last word on the question, and different proposals could result from putting different weights on those objectives, or adding other objectives, but I thought it might serve as a discussion-starter. Regards, -Chap [0] https://www.postgresql.org/message-id/6123C425.3080409%40anastigmatix.net
Chapman Flack <chap@anastigmatix.net> writes: > On 02/13/22 02:29, Julien Rouhaud wrote: >> Maybe we could have an actually usable GUC API to retrieve values in their >> native format rather than C string for instance, that we could make sure also >> works for cases like max_backend? > I proposed a sketch of such an API for discussion back in [0] (the second > idea in that email, the "what I'd really like" one). > In that scheme, some extension code that was interested in (say, > for some reason) log_statement_sample_rate could say: > static double samprate; > static int gucs_changed = 0; > #define SAMPRATE_CHANGED 1 > ... > ObserveTypedConfigValue("log_statement_sample_rate", > &samprate, &gucs_changed, SAMPRATE_CHANGED); > ... > and will be subscribed to have the native-format value stored into samprate, > and SAMPRATE_CHANGED ORed into gucs_changed, whenever the value changes. That seems like about 10X more complexity than is warranted, not only in terms of the infrastructure required, but also in the intellectual complexity around "just when could that value change?" Why not just provide equivalents to GetConfigOption() that can deliver int, float8, etc values instead of strings? (In any case we'd need to rethink the GUC "show_hook" APIs, which currently need only deal in string output.) regards, tom lane
On 02/13/22 15:16, Tom Lane wrote: > That seems like about 10X more complexity than is warranted, > not only in terms of the infrastructure required, but also in > the intellectual complexity around "just when could that value > change?" > > Why not just provide equivalents to GetConfigOption() that can > deliver int, float8, etc values instead of strings? And repeat the bsearch to find the option every time the interested extension wants to check the value, even when what it learns is that the value has not changed? And make it the job of every piece of interested extension code to save the last-retrieved value and compare if it wants to detect that it's changed? When the GUC machinery already has code that executes exactly when a value is being supplied for an option? Clearly I'm not thinking here of the GUCs that are read-only views of values that are determined some other way. How many of those are there that are mutable, and could have their values changed without going through the GUC mechanisms? Also, I think there are some options that are only represented by an int, float8, etc., when shown, but whose native internal form is something else, like a struct. I was definitely contemplating that you could 'subscribe' to one of those too, by passing the address of an appropriate struct. But of course a GetConfigOption() flavor could work that way too. Regards, -Chap
Chapman Flack <chap@anastigmatix.net> writes: > On 02/13/22 15:16, Tom Lane wrote: >> Why not just provide equivalents to GetConfigOption() that can >> deliver int, float8, etc values instead of strings? > And repeat the bsearch to find the option every time the interested > extension wants to check the value, even when what it learns is that > the value has not changed? And make it the job of every piece of > interested extension code to save the last-retrieved value and compare > if it wants to detect that it's changed? When the GUC machinery already > has code that executes exactly when a value is being supplied for > an option? (shrug) You could argue the performance aspect either way. If the core GUC code updates a value 1000 times while the extension consults the result once, you've probably lost money on the deal. As for the bsearch, we could improve on that when and if it's shown to be a bottleneck --- converting to a hash table ought to pretty much fix any worries there. Or we could provide APIs that let an extension look up a "struct config_generic *" once and then fetch directly using that pointer. (But I'd prefer not to, since it'd constrain the internals more than I think is wise.) regards, tom lane
Andres Freund <andres@anarazel.de> writes: > On 2022-02-13 15:36:16 -0500, Chapman Flack wrote: >> Also, I think there are some options that are only represented by >> an int, float8, etc., when shown, but whose native internal form >> is something else, like a struct. I was definitely contemplating >> that you could 'subscribe' to one of those too, by passing the >> address of an appropriate struct. But of course a GetConfigOption() >> flavor could work that way too. > I have a very hard time seeing a use-case for this. Nor how it'd even work > with a struct - you can't just copy the struct contents, because of pointers > to objects etc. I don't think there really are options like this anyway. There are a couple of legacy cases like "datestyle" where something that the user sees as one GUC translates to multiple variables under the hood, so you'd have to invent a struct if you wanted to pass them through a mechanism like this. I don't have a big problem with leaving those out of any such solution, though. (I see that datestyle's underlying variables DateStyle and DateOrder are already marked PGDLLIMPORT, and that's fine with me.) A possibly more interesting case is something like search_path --- but again, there are already special-purpose APIs for accessing the interpreted value of that. regards, tom lane
On 02/13/22 15:48, Tom Lane wrote: > much fix any worries there. Or we could provide APIs that let an > extension look up a "struct config_generic *" once and then fetch > directly using that pointer. (But I'd prefer not to, since it'd > constrain the internals more than I think is wise.) There is GetConfigOptionByNum already ... but I'm not sure there's an easy way to ask "what's the num for option foo?". GetNumConfigOptions exists, so there is a brute-force way. Regards, -Chap
On Sun, Feb 13, 2022 at 3:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > ... > > ObserveTypedConfigValue("log_statement_sample_rate", > > &samprate, &gucs_changed, SAMPRATE_CHANGED); > > ... > > > and will be subscribed to have the native-format value stored into samprate, > > and SAMPRATE_CHANGED ORed into gucs_changed, whenever the value changes. > > > That seems like about 10X more complexity than is warranted, > not only in terms of the infrastructure required, but also in > the intellectual complexity around "just when could that value > change?" I agree in the sense that I believe we should $SUBJECT rather than fooling around with this. It is completely understandable that extensions want to know the value of GUCs, and not just as strings, and doing $SUBJECT would be by far the easiest way of accomplishing that. I'm sure Andres is right when he says that there are cases where not exposing symbols could improve the generated machine code, but I'm also pretty sure that we just have to live with the cost when it's core GUCs that we're talking about. It's just unrealistic to suppose that extensions are not going to care. But if we're not going to do that, then I don't see why Chapman's proposal is overly complex. It seems completely understandable that if (some_guc_var != last_observed_some_guc_var) { ...adapt accordingly... } feels like an OK thing to do but if you have to make a function call every time then it seems too expensive. Imagine a background worker that has to do a bunch of extra setup every time some GUC changes, and every iteration of the main loop just wants to check whether it's changed, and the main loop could iterate very quickly in some cases. I wouldn't want to worry about the cost of a function call on every trip through the loop. Maybe it would be trivial in practice, but who knows? I don't particularly like Chapman's solution, but given that you've repeatedly blocked every effort to just apply PGDLLIMPORT markings across the board, I'm not sure what the realistic alternative is. It doesn't seem fair to argue, on the one hand, that we can't just do what I believe literally every other hacker on the project wants, and that on the other hand, it's also unacceptable to add complexity to work around the problem you've created by blocking that proposal every time it's been raised year after year. It's really pretty frustrating to me that we haven't just done the obvious thing here years ago. The time we've spent arguing about it could have been better spent on just about anything else, with absolutely zero harm to the project. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > I don't particularly like Chapman's solution, but given that you've > repeatedly blocked every effort to just apply PGDLLIMPORT markings > across the board, I'm not sure what the realistic alternative is. You do realize that I just have one vote in these matters? If I'm outvoted then so be it. The impression I have though is that a number of other people don't like the extra notational cruft either. regards, tom lane
On Mon, Feb 14, 2022 at 10:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > I don't particularly like Chapman's solution, but given that you've > > repeatedly blocked every effort to just apply PGDLLIMPORT markings > > across the board, I'm not sure what the realistic alternative is. > > You do realize that I just have one vote in these matters? If I'm > outvoted then so be it. The impression I have though is that a > number of other people don't like the extra notational cruft either. Hmm, I guess I'd need to know who those people are in order to be able to review their comments. I don't *like* the extra notational cruft, but applying it inconsistently isn't better than being consistent. As I see it, we have four choices: (1) apply PGDLLIMPORT markings relatively broadly so that people can get extensions to work on Windows, (2) continue to apply them inconsistently, thus slightly reducing notational clutter at the cost of breaking lots of extensions on Windows, (3) put some complex system in place like what Chapman proposes and get all extension authors to adopt it, and (4) remove the Windows port. To the best of my current knowledge, everyone other than you prefers (1), you prefer (2) or (4), and (3) is an attempt at compromise that is nobody's first choice. If that is correct, then I think we should do (1). If it's incorrect then I think we should do our best to find a choice other than (2) that does attract a consensus. The current situation, which is (2), must be the worst of all possible options because it manages to bother the people who dislike the clutter AND ALSO bother the people who want to have their extensions work on Windows. Any other choice has a chance of reaching a state where only one of those groups of people is annoyed. -- Robert Haas EDB: http://www.enterprisedb.com
On 02/14/22 11:43, Robert Haas wrote: > On Mon, Feb 14, 2022 at 10:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> I don't particularly like Chapman's solution, > ... and (3) is an attempt at compromise that is nobody's first choice. Ok, I guess that's )sniffle( a pretty fair way of putting it. > ... (3) put some complex system in place like what Chapman > proposes and get all extension authors to adopt it, By the same token, I don't think it would entail anything like a big flag day for extension authors. Provided no one proposes immediately to /remove/ PGDLLIMPORT from things that now have it, the effect would be more that the next time an extension author comes hat-in-hand asking for a new PGDLLIMPORT because a thing won't build on Windows, the answer can be "here, we have this API you can use for that now." My reading of past discussions on the list suggest that stronger encapsulation and API delineation have advocates in some quarters, so I tried to accommodate that in what I proposed. It does, for example, avoid exposing the 'real' value of a GUC to writing by a buggy or compromised extension. I'll be first to agree what I proposed isn't beautiful, but it might be that a round or two of improvement by somebody smarter than me could lead to something possibly preferable to PGDLLIMPORT-everywhere /when measured against certain objectives/, like encapsulation. So maybe this is in part a discussion about what the weights should be on those various objectives. Regards, -Chap
Robert Haas <robertmhaas@gmail.com> writes: > Hmm, I guess I'd need to know who those people are in order to be able > to review their comments. I don't *like* the extra notational cruft, > but applying it inconsistently isn't better than being consistent. As > I see it, we have four choices: (1) apply PGDLLIMPORT markings > relatively broadly so that people can get extensions to work on > Windows, (2) continue to apply them inconsistently, thus slightly > reducing notational clutter at the cost of breaking lots of extensions > on Windows, (3) put some complex system in place like what Chapman > proposes and get all extension authors to adopt it, and (4) remove the > Windows port. To the best of my current knowledge, everyone other than > you prefers (1), you prefer (2) or (4), and (3) is an attempt at > compromise that is nobody's first choice. I think you are attributing straw-man positions to me. What I'd actually *like* is some solution that has the effect of (1) without having to mark up our code with a bunch of Microsoft-isms. However I don't know how to do that, and I do agree that (2), (3), and (4) are not better than (1). There are some technical issues though: * AFAICS, the patch of record doesn't actually do (1), it does something else that adds yet more new notation. The cfbot says it fails, too. * If we institute a policy that all GUCs should be PGDLLIMPORT'd, how will we enforce that new patches follow that? I don't want to be endlessly going back and adding forgotten PGDLLIMPORT markers. * There's a moderately sizable subset of GUCs where the underlying variable is not visible at all because it's static in guc.c. Typically this is because that variable is only used for display and there's an assign hook that stores the real data somewhere else. I suppose what we want in such cases is for the "somewhere else" to be PGDLLIMPORT'd, but in a lot of cases those variables are also static in some other module. Does this proposal include exporting variables that currently aren't visible to extensions at all? I'm a little resistant to that. I can buy making sure that Windows has a level playing field, but that's as far as I want to go. regards, tom lane
On Mon, Feb 14, 2022 at 12:25 PM Chapman Flack <chap@anastigmatix.net> wrote: > On 02/14/22 11:43, Robert Haas wrote: > > On Mon, Feb 14, 2022 at 10:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Robert Haas <robertmhaas@gmail.com> writes: > >>> I don't particularly like Chapman's solution, > > ... and (3) is an attempt at compromise that is nobody's first choice. > > Ok, I guess that's )sniffle( a pretty fair way of putting it. I mean I like it better than Tom does ... I just think it's a complicated work around for a problem we don't really need to have. > My reading of past discussions on the list suggest that stronger > encapsulation and API delineation have advocates in some quarters, > so I tried to accommodate that in what I proposed. It does, for example, > avoid exposing the 'real' value of a GUC to writing by a buggy > or compromised extension. > > I'll be first to agree what I proposed isn't beautiful, but it might be > that a round or two of improvement by somebody smarter than me could lead > to something possibly preferable to PGDLLIMPORT-everywhere /when measured > against certain objectives/, like encapsulation. > > So maybe this is in part a discussion about what the weights should be > on those various objectives. Keep in mind that there's nothing being set in stone here. Applying PGDLLIMPORT to all GUCs across the board does not foreclose putting some other system that restricts symbol visibility in the future. But in the present, there is clearly no approach to the symbol visibility problem that is fully baked or enjoys consensus. Yet, there is clearly consensus that not being able to access GUCs in Windows extensions is a problem. There must be like six different threads with people complaining about that, and in every case the suggested solution is to just add PGDLLIMPORT for crying out loud. If in the future there is a consensus to restrict symbol visibility in order to achieve some agreed-upon amount of encapsulation, then we can do that then. Extension authors may not like it much, but every serious extension author understands that the PostgreSQL code base needs to keep moving forward, and that this will sometimes require them to adjust for new major versions. This would be a bigger adjustment than many, but I have confidence that if the benefits are worthwhile, people will cope with it and adjust their extensions to adhere to whatever new rules we put in place. But between now and then, refusing to apply PGDLLIMPORT to a basically random subset of GUCs is just annoying people without any compensating benefit. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Feb 14, 2022 at 12:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think you are attributing straw-man positions to me. What I'd actually > *like* is some solution that has the effect of (1) without having to mark > up our code with a bunch of Microsoft-isms. However I don't know how to > do that, and I do agree that (2), (3), and (4) are not better than (1). OK, that sounds like we might be making some progress toward a useful agreement. > There are some technical issues though: > > * AFAICS, the patch of record doesn't actually do (1), it does something > else that adds yet more new notation. The cfbot says it fails, too. That sounds like a problem. Should be fixable. > * If we institute a policy that all GUCs should be PGDLLIMPORT'd, > how will we enforce that new patches follow that? I don't want to be > endlessly going back and adding forgotten PGDLLIMPORT markers. It seems to me that it would be no different from bumping catversion or updating nodefuncs.c: yes, it will get forgotten sometimes, but hopefully people will mostly get used to it just like they do with dozens of other PG-specific coding rules. I don't see that it's likely to generate more than a handful of commits per year, so it doesn't really seem worth worrying about to me, but YMMV. Maybe it's possible to write a perl script to verify it, although it seems like it might be complicated to code that up. > * There's a moderately sizable subset of GUCs where the underlying > variable is not visible at all because it's static in guc.c. > Typically this is because that variable is only used for display > and there's an assign hook that stores the real data somewhere else. > I suppose what we want in such cases is for the "somewhere else" > to be PGDLLIMPORT'd, but in a lot of cases those variables are also > static in some other module. Does this proposal include exporting > variables that currently aren't visible to extensions at all? > I'm a little resistant to that. I can buy making sure that Windows > has a level playing field, but that's as far as I want to go. I can live with that. If someone complains about those variables being static-to-file instead of globally visible, we can address that complaint on its merits when it is presented. An alternative rule which would dodge that particular issue would be to just slap PGDLLIMPORT on every global variable in every header file. That would arguably be a simpler rule, though it means even more PGDLLIMPORT declarations floating around. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > An alternative rule which would dodge that particular issue would be > to just slap PGDLLIMPORT on every global variable in every header > file. That would arguably be a simpler rule, though it means even more > PGDLLIMPORT declarations floating around. Yeah, if the objective is "level playing field for Windows", then it's hard to avoid the conclusion that we should just do that. Again, I've never had an objection to that as the end result --- I just wish that we could get the toolchain to do it for us. But if we can't, we can't. regards, tom lane
On Mon, Feb 14, 2022 at 12:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > An alternative rule which would dodge that particular issue would be > > to just slap PGDLLIMPORT on every global variable in every header > > file. That would arguably be a simpler rule, though it means even more > > PGDLLIMPORT declarations floating around. > > Yeah, if the objective is "level playing field for Windows", > then it's hard to avoid the conclusion that we should just do that. > Again, I've never had an objection to that as the end result --- > I just wish that we could get the toolchain to do it for us. > But if we can't, we can't. 100% agreed. -- Robert Haas EDB: http://www.enterprisedb.com
Hi, On 2022-02-14 12:45:08 -0500, Robert Haas wrote: > On Mon, Feb 14, 2022 at 12:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > * If we institute a policy that all GUCs should be PGDLLIMPORT'd, > > how will we enforce that new patches follow that? I don't want to be > > endlessly going back and adding forgotten PGDLLIMPORT markers. > > It seems to me that it would be no different from bumping catversion > or updating nodefuncs.c: yes, it will get forgotten sometimes, but > hopefully people will mostly get used to it just like they do with > dozens of other PG-specific coding rules. I don't see that it's likely > to generate more than a handful of commits per year, so it doesn't > really seem worth worrying about to me, but YMMV. Maybe it's possible > to write a perl script to verify it, although it seems like it might > be complicated to code that up. I think it'd be better if we had scripts ensuring this stays sane. Several of your examples do have that. Whereas I don't see what would cause us to missing PGDLLIMPORTs for GUCs, given it's a windows only issue and that you need an extension using the GUC with omitted PGDLLIMPORT. > An alternative rule which would dodge that particular issue would be > to just slap PGDLLIMPORT on every global variable in every header > file. That would arguably be a simpler rule, though it means even more > PGDLLIMPORT declarations floating around. That would have the advantage of being comparatively easy to check in an automated way. Might even be cheap enough to just make it part of the build. But it seems like it'd be a fair amount of work and cause a lot of patch rebasing pain? If we end up going that way, we should schedule this to happen just after the feature freeze, I think. If we consider doing this for all extern variables, we should think about doing this for headers *and* functions. That'd allow us to get rid of the fairly complicated logic to generate the .def file for the postgres binary on windows (src/tools/gendef.pl). And maybe also the related thing on AIX (src/backend/port/aix/mkldexport.sh) I'm kind of partial to the "add explicit visibility information to everything" approach because the windows export file generation is a decent chunk of the meson patchset. And what's missing to make it work on AIX... Greetings, Andres Freund
On Mon, Feb 14, 2022 at 8:53 PM Andres Freund <andres@anarazel.de> wrote: > > An alternative rule which would dodge that particular issue would be > > to just slap PGDLLIMPORT on every global variable in every header > > file. That would arguably be a simpler rule, though it means even more > > PGDLLIMPORT declarations floating around. > > That would have the advantage of being comparatively easy to check in an > automated way. Might even be cheap enough to just make it part of the build. I wasn't able to quickly write something that was smart enough to use as part of the build, but I wrote something dumb that I think works well enough to use with a little bit of human intelligence alongside. See attached. > But it seems like it'd be a fair amount of work and cause a lot of patch > rebasing pain? If we end up going that way, we should schedule this to happen > just after the feature freeze, I think. We could do that. I'd sort of rather get it done. We still have two weeks before the last CommitFest officially starts, and it's not as if there won't be tons of uncommitted patches floating around after that. > If we consider doing this for all extern variables, we should think about > doing this for headers *and* functions. That'd allow us to get rid of the > fairly complicated logic to generate the .def file for the postgres binary on > windows (src/tools/gendef.pl). And maybe also the related thing on AIX > (src/backend/port/aix/mkldexport.sh) I don't know what you mean by this. -- Robert Haas EDB: http://www.enterprisedb.com
Attachment
Hi, On 2022-02-15 08:58:05 -0500, Robert Haas wrote: > On Mon, Feb 14, 2022 at 8:53 PM Andres Freund <andres@anarazel.de> wrote: > > > An alternative rule which would dodge that particular issue would be > > > to just slap PGDLLIMPORT on every global variable in every header > > > file. That would arguably be a simpler rule, though it means even more > > > PGDLLIMPORT declarations floating around. > > > > That would have the advantage of being comparatively easy to check in an > > automated way. Might even be cheap enough to just make it part of the build. > > I wasn't able to quickly write something that was smart enough to use > as part of the build, but I wrote something dumb that I think works > well enough to use with a little bit of human intelligence alongside. > See attached. Cool. > > But it seems like it'd be a fair amount of work and cause a lot of patch > > rebasing pain? If we end up going that way, we should schedule this to happen > > just after the feature freeze, I think. > > We could do that. I'd sort of rather get it done. We still have two > weeks before the last CommitFest officially starts, and it's not as if > there won't be tons of uncommitted patches floating around after that. Breaking close to every patch 6-7 weeks before freeze doesn't seem great. Particularly because this is a mostly automatically generated patch, so I don't really see a forcing function to do this now, rather than at a more opportune moment. The more I think about it the more I'm convinced that if we want to do this, we should do it for variables and functions. > > If we consider doing this for all extern variables, we should think about > > doing this for headers *and* functions. That'd allow us to get rid of the > > fairly complicated logic to generate the .def file for the postgres binary on > > windows (src/tools/gendef.pl). And maybe also the related thing on AIX > > (src/backend/port/aix/mkldexport.sh) > > I don't know what you mean by this. We only need gendef.pl because we don't mark functions with visibility information. If we attach PGDLLIMPORT to functions and variables we can a fair bit of complexity. And I think not just for windows, but also AIX (there's a whole section in src/backend/Makefile about AIX oddity), but I'm not sure about it. > diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h > index a4b5dc853b..13849a3790 100644 > --- a/src/include/common/relpath.h > +++ b/src/include/common/relpath.h > @@ -56,7 +56,7 @@ typedef enum ForkNumber > > #define FORKNAMECHARS 4 /* max chars for a fork name */ > > -extern const char *const forkNames[]; > +extern PGDLLIMPORT const char *const forkNames[]; > > extern ForkNumber forkname_to_number(const char *forkName); > extern int forkname_chars(const char *str, ForkNumber *fork); I think we might need a new form of PGDLLIMPORT to mark these correctly - I don't think they should be marked PGDLLIMPORT in frontend environment. > diff --git a/src/include/fe_utils/print.h b/src/include/fe_utils/print.h > index 836b4e29a8..5caf9e2d08 100644 > --- a/src/include/fe_utils/print.h > +++ b/src/include/fe_utils/print.h > @@ -177,10 +177,10 @@ typedef struct printQueryOpt > } printQueryOpt; > > > -extern volatile sig_atomic_t cancel_pressed; > +extern PGDLLIMPORT volatile sig_atomic_t cancel_pressed; > > -extern const printTextFormat pg_asciiformat; > -extern const printTextFormat pg_asciiformat_old; > +extern PGDLLIMPORT const printTextFormat pg_asciiformat; > +extern PGDLLIMPORT const printTextFormat pg_asciiformat_old; > extern printTextFormat pg_utf8format; /* ideally would be const, but... */ Are these right? I don't think we should make frontend stuff exported without a very clear reason. > extern void jit_reset_after_error(void); > diff --git a/src/include/jit/llvmjit.h b/src/include/jit/llvmjit.h > index 66143afccc..4541f9a2c4 100644 > --- a/src/include/jit/llvmjit.h > +++ b/src/include/jit/llvmjit.h > @@ -56,30 +56,30 @@ typedef struct LLVMJitContext > } LLVMJitContext; > > /* llvm module containing information about types */ > -extern LLVMModuleRef llvm_types_module; > +extern PGDLLIMPORT LLVMModuleRef llvm_types_module; These are wrong I think, this is a shared library. Greetings, Andres Freund
Hi, On Mon, Feb 14, 2022 at 12:45:08PM -0500, Robert Haas wrote: > On Mon, Feb 14, 2022 at 12:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > * There's a moderately sizable subset of GUCs where the underlying > > variable is not visible at all because it's static in guc.c. > > Typically this is because that variable is only used for display > > and there's an assign hook that stores the real data somewhere else. > > I suppose what we want in such cases is for the "somewhere else" > > to be PGDLLIMPORT'd, but in a lot of cases those variables are also > > static in some other module. Does this proposal include exporting > > variables that currently aren't visible to extensions at all? > > I'm a little resistant to that. I can buy making sure that Windows > > has a level playing field, but that's as far as I want to go. > > I can live with that. If someone complains about those variables being > static-to-file instead of globally visible, we can address that > complaint on its merits when it is presented. Same here, if any third-party project had any use of such variable, they would have sent some patch for that already so I don't see any reason to change it now.
On Wed, Feb 16, 2022 at 01:10:50AM +0800, Julien Rouhaud wrote: > Hi, > > On Mon, Feb 14, 2022 at 12:45:08PM -0500, Robert Haas wrote: > > On Mon, Feb 14, 2022 at 12:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > > * There's a moderately sizable subset of GUCs where the underlying > > > variable is not visible at all because it's static in guc.c. > > > Typically this is because that variable is only used for display > > > and there's an assign hook that stores the real data somewhere else. > > > I suppose what we want in such cases is for the "somewhere else" > > > to be PGDLLIMPORT'd, but in a lot of cases those variables are also > > > static in some other module. Does this proposal include exporting > > > variables that currently aren't visible to extensions at all? > > > I'm a little resistant to that. I can buy making sure that Windows > > > has a level playing field, but that's as far as I want to go. > > > > I can live with that. If someone complains about those variables being > > static-to-file instead of globally visible, we can address that > > complaint on its merits when it is presented. > > Same here, if any third-party project had any use of such variable, they would > have sent some patch for that already so I don't see any reason to change it > now. Agreed. I think the big goal here is to make Windows have the same GUC variable visibility as Unix --- when we change things in a way that breaks extensions, we hear about Unix breakage quickly and adjust for it. It is Windows being different and only getting the problem reports later that is the real problem. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Hi, Sorry for taking so much time to answer here. I definitely wanted to work on that but I was under the impression that although we now had an agreement to apply PGDLLIMPORT globally a patch itself wasn't really rushed, so I spent the last two days trying to setup a new Windows environment as my previous one isn't working anymore. That's taking much longer than I planned. On Tue, Feb 15, 2022 at 08:06:58AM -0800, Andres Freund wrote: > Hi, > > On 2022-02-15 08:58:05 -0500, Robert Haas wrote: > > On Mon, Feb 14, 2022 at 8:53 PM Andres Freund <andres@anarazel.de> wrote: > > > > An alternative rule which would dodge that particular issue would be > > > > to just slap PGDLLIMPORT on every global variable in every header > > > > file. That would arguably be a simpler rule, though it means even more > > > > PGDLLIMPORT declarations floating around. > > > > > > That would have the advantage of being comparatively easy to check in an > > > automated way. Might even be cheap enough to just make it part of the build. > > > > I wasn't able to quickly write something that was smart enough to use > > as part of the build, but I wrote something dumb that I think works > > well enough to use with a little bit of human intelligence alongside. > > See attached. > > Cool. +1 > > > But it seems like it'd be a fair amount of work and cause a lot of patch > > > rebasing pain? If we end up going that way, we should schedule this to happen > > > just after the feature freeze, I think. > > > > We could do that. I'd sort of rather get it done. We still have two > > weeks before the last CommitFest officially starts, and it's not as if > > there won't be tons of uncommitted patches floating around after that. > > Breaking close to every patch 6-7 weeks before freeze doesn't seem > great. Particularly because this is a mostly automatically generated patch, so > I don't really see a forcing function to do this now, rather than at a more > opportune moment. Agreed, especially if we can do some cleanup in gendef.pl. > The more I think about it the more I'm convinced that if we want to do this, > we should do it for variables and functions. > > > > > If we consider doing this for all extern variables, we should think about > > > doing this for headers *and* functions. That'd allow us to get rid of the > > > fairly complicated logic to generate the .def file for the postgres binary on > > > windows (src/tools/gendef.pl). And maybe also the related thing on AIX > > > (src/backend/port/aix/mkldexport.sh) > > > > I don't know what you mean by this. > > We only need gendef.pl because we don't mark functions with visibility > information. If we attach PGDLLIMPORT to functions and variables we can a fair > bit of complexity. And I think not just for windows, but also AIX (there's a > whole section in src/backend/Makefile about AIX oddity), but I'm not sure > about it. I will try to have a look at it once my build environment is ready.
On Tue, Feb 15, 2022 at 11:07 PM Andres Freund <andres@anarazel.de> wrote: > > diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h > > index a4b5dc853b..13849a3790 100644 > > --- a/src/include/common/relpath.h > > +++ b/src/include/common/relpath.h > > @@ -56,7 +56,7 @@ typedef enum ForkNumber > > > > #define FORKNAMECHARS 4 /* max chars for a fork name */ > > > > -extern const char *const forkNames[]; > > +extern PGDLLIMPORT const char *const forkNames[]; > > > > extern ForkNumber forkname_to_number(const char *forkName); > > extern int forkname_chars(const char *str, ForkNumber *fork); > > I think we might need a new form of PGDLLIMPORT to mark these correctly - I > don't think they should be marked PGDLLIMPORT in frontend environment. That has been taken care of already, IIUC: commit e04a8059a74c125a8af94acdcb7b15b92188470a Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Mon Nov 29 11:00:00 2021 -0500 Simplify declaring variables exported from libpgcommon and libpgport. This reverts commits c2d1eea9e and 11b500072, as well as similar hacks elsewhere, in favor of setting up the PGDLLIMPORT macro so that it can just be used unconditionally. That can work because in frontend code, we need no marking in either the defining or consuming files for a variable exported from these libraries; and frontend code has no need to access variables exported from the core backend, either. -- John Naylor EDB: http://www.enterprisedb.com
Hi, On 2022-02-15 08:06:58 -0800, Andres Freund wrote: > The more I think about it the more I'm convinced that if we want to do this, > we should do it for variables and functions. Btw, if we were to do this, we should just use -fvisibility=hidden everywhere and would see the same set of failures on unixoid systems as on windows. Of course only in in-core extensions, but it'd still be better than nothing. I proposed using -fvisibility=hidden in extensions: https://postgr.es/m/20211101020311.av6hphdl6xbjbuif@alap3.anarazel.de Mostly because using explicit symbol exports makes it a lot easier to build extensions libraries on windows (with meson, but also everything built outside of postgres with msvc). But also because it makes function calls *inside* an extension have noticeably lower overhead. And it makes the set of symbols exported smaller. One problem I encountered was that it's actually kind of hard to set build flags only for extension libraries: https://postgr.es/m/20220111025328.iq5g6uck53j5qtin%40alap3.anarazel.de But if we added explicit exports to everything we export, we'd not need to restrict the use of -fvisibility=hidden to extension libraries anymore. Would get decent error messages on all platforms. Subsequently we could yield a bit of performance from critical paths by marking selected symbols as *not* exported. That'd make them a tad cheaper to call and allow the optimizer more leeway. Greetings, Andres Freund
On Fri, Feb 18, 2022 at 7:02 PM Andres Freund <andres@anarazel.de> wrote: > On 2022-02-15 08:06:58 -0800, Andres Freund wrote: > > The more I think about it the more I'm convinced that if we want to do this, > > we should do it for variables and functions. > > Btw, if we were to do this, we should just use -fvisibility=hidden everywhere > and would see the same set of failures on unixoid systems as on windows. Of > course only in in-core extensions, but it'd still be better than nothing. Let's be less ambitious for this release, and just get the variables marked with PGDLLIMPORT. We seem to have consensus to create parity between Windows and non-Windows builds, which means precisely applying PGDLLIMPORT to variables marked in header files, and nothing more. The merits of -fvisibility=hidden or PGDLLIMPORT on functions are a separate question that can be debated on its own merits, but I don't want that larger discussion to bog down this effort. Here are updated patches for that. @RMT: Andres proposed upthread that we should plan to do this just after feature freeze. Accordingly I propose to commit at least 0002 and perhaps 0001 if people want it just after feature freeze. I therefore ask that the RMT either (a) regard this change as not being a feature (and thus not subject to the freeze) or (b) give it a 1-day extension. The reason for committing it just after freeze is to minimize the number of conflicts that it creates for other patches. The reason why that's probably an OK thing to do is that applying PGDLLIMPORT markings is low-risk. Thanks, -- Robert Haas EDB: http://www.enterprisedb.com
Attachment
On 3/30/22 14:37, Robert Haas wrote: > On Fri, Feb 18, 2022 at 7:02 PM Andres Freund <andres@anarazel.de> wrote: >> On 2022-02-15 08:06:58 -0800, Andres Freund wrote: >>> The more I think about it the more I'm convinced that if we want to do this, >>> we should do it for variables and functions. >> Btw, if we were to do this, we should just use -fvisibility=hidden everywhere >> and would see the same set of failures on unixoid systems as on windows. Of >> course only in in-core extensions, but it'd still be better than nothing. > Let's be less ambitious for this release, and just get the variables > marked with PGDLLIMPORT. We seem to have consensus to create parity > between Windows and non-Windows builds, which means precisely applying > PGDLLIMPORT to variables marked in header files, and nothing more. The > merits of -fvisibility=hidden or PGDLLIMPORT on functions are a > separate question that can be debated on its own merits, but I don't > want that larger discussion to bog down this effort. Here are updated > patches for that. > > @RMT: Andres proposed upthread that we should plan to do this just > after feature freeze. Accordingly I propose to commit at least 0002 > and perhaps 0001 if people want it just after feature freeze. I > therefore ask that the RMT either (a) regard this change as not being > a feature (and thus not subject to the freeze) or (b) give it a 1-day > extension. The reason for committing it just after freeze is to > minimize the number of conflicts that it creates for other patches. > The reason why that's probably an OK thing to do is that applying > PGDLLIMPORT markings is low-risk. > > Thanks, > WFM. I think Tom also has an item he wants to do right at the end of feature freeze. The script looks fine, needs a copyright notice and a comment at the top describing what it does. It seems like something we might need to do from time to time, as it will be easy to forget to mark variables and we should periodically run this as a check. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > On 3/30/22 14:37, Robert Haas wrote: >> @RMT: Andres proposed upthread that we should plan to do this just >> after feature freeze. Accordingly I propose to commit at least 0002 >> and perhaps 0001 if people want it just after feature freeze. I >> therefore ask that the RMT either (a) regard this change as not being >> a feature (and thus not subject to the freeze) or (b) give it a 1-day >> extension. The reason for committing it just after freeze is to >> minimize the number of conflicts that it creates for other patches. >> The reason why that's probably an OK thing to do is that applying >> PGDLLIMPORT markings is low-risk. > WFM. I think Tom also has an item he wants to do right at the end of > feature freeze. Yeah, the frontend error message rework in [1]. That has exactly the same constraint that it's likely to break other open patches, so it'd be better to do it after the CF cutoff. I think that doing that concurrently with Robert's thing shouldn't be too risky, because it only affects frontend code while his patch should touch only backend. regards, tom lane [1] https://commitfest.postgresql.org/37/3574/
On Tue, Apr 5, 2022 at 10:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yeah, the frontend error message rework in [1]. That has exactly > the same constraint that it's likely to break other open patches, > so it'd be better to do it after the CF cutoff. I think that doing > that concurrently with Robert's thing shouldn't be too risky, because > it only affects frontend code while his patch should touch only backend. So when *exactly* do we want to land these patches? None of the calendar programs I use support "anywhere on earth" as a time zone, but I think that feature freeze is 8am on Friday on the East coast of the United States. If I commit the PGDLLIMPORT change within a few hours after that time, is that good? Should I try to do it earlier, before we technically hit 8am? Should I do it the night before, last thing before I go to bed on Thursday? Do you care whether your commit or mine goes in first? -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, Apr 5, 2022 at 10:06 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Apr 5, 2022 at 10:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Yeah, the frontend error message rework in [1]. That has exactly > > the same constraint that it's likely to break other open patches, > > so it'd be better to do it after the CF cutoff. I think that doing > > that concurrently with Robert's thing shouldn't be too risky, because > > it only affects frontend code while his patch should touch only backend. > > So when *exactly* do we want to land these patches? None of the > calendar programs I use support "anywhere on earth" as a time zone, > but I think that feature freeze is 8am on Friday on the East coast of > the United States. I understand it to be noon UTC on Friday. > If I commit the PGDLLIMPORT change within a few > hours after that time, is that good? Should I try to do it earlier, > before we technically hit 8am? Should I do it the night before, last > thing before I go to bed on Thursday? Do you care whether your commit > or mine goes in first? For these two patches, I'd say a day or two after feature freeze is a reasonable goal. -- John Naylor EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > Do you care whether your commit > or mine goes in first? I do not. If they're not independent, at least one of us has messed up. I have family commitments on Saturday, so if I don't get mine in on Friday it'll likely not happen before Sunday. regards, tom lane
On Wed, Apr 06, 2022 at 12:57:29AM +0700, John Naylor wrote: > For these two patches, I'd say a day or two after feature freeze is a > reasonable goal. Yeah. For patches as invasive as the PGDLLIMPORT business and the frontend error refactoring, I am also fine to have two exceptions with the freeze deadline. -- Michael
Attachment
On Wed, Apr 6, 2022 at 7:56 PM Michael Paquier <michael@paquier.xyz> wrote: > On Wed, Apr 06, 2022 at 12:57:29AM +0700, John Naylor wrote: > > For these two patches, I'd say a day or two after feature freeze is a > > reasonable goal. > > Yeah. For patches as invasive as the PGDLLIMPORT business and the > frontend error refactoring, I am also fine to have two exceptions with > the freeze deadline. Done now. -- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Apr 8, 2022 at 2:42 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Apr 6, 2022 at 7:56 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Wed, Apr 06, 2022 at 12:57:29AM +0700, John Naylor wrote:
> > For these two patches, I'd say a day or two after feature freeze is a
> > reasonable goal.
>
> Yeah. For patches as invasive as the PGDLLIMPORT business and the
> frontend error refactoring, I am also fine to have two exceptions with
> the freeze deadline.
Done now.
\o/
On Fri, Apr 08, 2022 at 03:04:18PM +0200, Magnus Hagander wrote: > On Fri, Apr 8, 2022 at 2:42 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > On Wed, Apr 6, 2022 at 7:56 PM Michael Paquier <michael@paquier.xyz> > > wrote: > > > On Wed, Apr 06, 2022 at 12:57:29AM +0700, John Naylor wrote: > > > > For these two patches, I'd say a day or two after feature freeze is a > > > > reasonable goal. > > > > > > Yeah. For patches as invasive as the PGDLLIMPORT business and the > > > frontend error refactoring, I am also fine to have two exceptions with > > > the freeze deadline. > > > > Done now. > > > > \o/ Woohoo! Thanks a lot!
Hi, On 2022-04-08 08:42:38 -0400, Robert Haas wrote: > On Wed, Apr 6, 2022 at 7:56 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Apr 06, 2022 at 12:57:29AM +0700, John Naylor wrote: > > > For these two patches, I'd say a day or two after feature freeze is a > > > reasonable goal. > > > > Yeah. For patches as invasive as the PGDLLIMPORT business and the > > frontend error refactoring, I am also fine to have two exceptions with > > the freeze deadline. > > Done now. Just noticed that extern sigset_t UnBlockSig, BlockSig, StartupBlockSig; are unadorned. There's also a number of variables that are only declared in .c files that !windows can still access. Some likely aren't worth caring about, but some others are underlying GUCs, so we probably do? E.g. CommitDelay CommitSiblings default_tablespace ignore_checksum_failure ignore_invalid_pages Log_disconnections ssl_renegotiation_limit temp_tablespaces Unix_socket_group Unix_socket_permissions wal_level_options Likely don't care: dynamic_shared_memory_options gistBufferingOptValues recovery_target_action_options ssl_protocol_versions_info Also noticed that the bbsink_ops variables are const, instead of static const, was that intentional? Greetings, Andres Freund
On Fri, May 06, 2022 at 04:49:24PM -0700, Andres Freund wrote: > Just noticed that > extern sigset_t UnBlockSig, > BlockSig, > StartupBlockSig; > are unadorned. mark_pgdllimport.pl is not able to capture this part of the change because of this logic, where we assume that the header line has to finish with a semicolon: # Variable declarations should end in a semicolon. If we see an # opening parenthesis, it's probably a function declaration. $needs_pgdllimport = 0 if $stripped_line !~ /;$/ || $stripped_line =~ /\(/; It is quite common to define one variable per line, so I would not put the blame on the script and just update pqsignal.h. And it is common to finish lines with a comma for function declarations.. > There's also a number of variables that are only declared in .c files that > !windows can still access. Some likely aren't worth caring about, but some > others are underlying GUCs, so we probably do? E.g. > CommitDelay > CommitSiblings > default_tablespace > ignore_checksum_failure > ignore_invalid_pages > Log_disconnections > ssl_renegotiation_limit > temp_tablespaces > wal_level_options These are indeed declared in .c files. So you mean that we'd better declare them in headers and mark them as PGDLLIMPORT? I am not sure if that's worth the addition, nobody has asked for these to be available yet, AFAIK. > Unix_socket_group > Unix_socket_permissions Already marked with PGDLLIMPORT. > Also noticed that the bbsink_ops variables are const, instead of static const, > was that intentional? Yep, that looks like an error. While on it, I think that it would be a good idea to document in the script that we need pass down a list of header files as arguments to rewrite them. -- Michael
Attachment
On Mon, May 9, 2022 at 3:15 AM Michael Paquier <michael@paquier.xyz> wrote: > On Fri, May 06, 2022 at 04:49:24PM -0700, Andres Freund wrote: > > Just noticed that > > extern sigset_t UnBlockSig, > > BlockSig, > > StartupBlockSig; > > are unadorned. > > mark_pgdllimport.pl is not able to capture this part of the change > because of this logic, where we assume that the header line has to > finish with a semicolon: > # Variable declarations should end in a semicolon. If we see an > # opening parenthesis, it's probably a function declaration. > $needs_pgdllimport = 0 if $stripped_line !~ /;$/ || > $stripped_line =~ /\(/; > > It is quite common to define one variable per line, so I would not put > the blame on the script and just update pqsignal.h. And it is common > to finish lines with a comma for function declarations.. Right. I didn't try to equip the script with a real C parser. Just enough to catch our typical style - which those declarations aren't. > > There's also a number of variables that are only declared in .c files that > > !windows can still access. Some likely aren't worth caring about, but some > > others are underlying GUCs, so we probably do? E.g. > > CommitDelay > > CommitSiblings > > default_tablespace > > ignore_checksum_failure > > ignore_invalid_pages > > Log_disconnections > > ssl_renegotiation_limit > > temp_tablespaces > > wal_level_options > > These are indeed declared in .c files. So you mean that we'd better > declare them in headers and mark them as PGDLLIMPORT? I am not sure > if that's worth the addition, nobody has asked for these to be > available yet, AFAIK. Completely agree. The agreed-upon charter was to adjust the header files, not move any declarations into header files. I'm not against doing that; I think it's good for extensions to have access to GUC values. But it wasn't the mission. > > Also noticed that the bbsink_ops variables are const, instead of static const, > > was that intentional? > > Yep, that looks like an error. > > While on it, I think that it would be a good idea to document in the > script that we need pass down a list of header files as arguments to > rewrite them. Either of you please feel free to change these things, at least as far as I'm concerned. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, May 09, 2022 at 09:23:47AM -0400, Robert Haas wrote: > Either of you please feel free to change these things, at least as far > as I'm concerned. Well, what about the attached then? While looking at all the headers in the tree, I have noticed that __pg_log_level is not marked, but we'd better paint a PGDLLIMPORT also for it? This is getting used by pgbench for some unlikely() business, as one example. -- Michael
Attachment
On Tue, May 10, 2022 at 04:09:47PM +0900, Michael Paquier wrote: > Well, what about the attached then? While looking at all the headers > in the tree, I have noticed that __pg_log_level is not marked, but > we'd better paint a PGDLLIMPORT also for it? This is getting used by > pgbench for some unlikely() business, as one example. After an extra look, PGDLLIMPORT missing from __pg_log_level looks like an imbroglio between 8ec5694, that has added the marking, and 9a374b77 that has removed it the same day. All that has been fixed in 5edeb57. -- Michael
Attachment
Hi, On 2022-05-12 15:15:10 +0900, Michael Paquier wrote: > On Tue, May 10, 2022 at 04:09:47PM +0900, Michael Paquier wrote: > > Well, what about the attached then? While looking at all the headers > > in the tree, I have noticed that __pg_log_level is not marked, but > > we'd better paint a PGDLLIMPORT also for it? This is getting used by > > pgbench for some unlikely() business, as one example. > > After an extra look, PGDLLIMPORT missing from __pg_log_level looks > like an imbroglio between 8ec5694, that has added the marking, and > 9a374b77 that has removed it the same day. All that has been fixed in > 5edeb57. It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers / variables. What is that supposed to mean? Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2022-05-12 15:15:10 +0900, Michael Paquier wrote: >> After an extra look, PGDLLIMPORT missing from __pg_log_level looks >> like an imbroglio between 8ec5694, that has added the marking, and >> 9a374b77 that has removed it the same day. All that has been fixed in >> 5edeb57. > It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers / > variables. What is that supposed to mean? Yeah, that's why I removed it in 9a374b77. regards, tom lane
On Thu, May 12, 2022 at 11:59:49AM -0400, Tom Lane wrote: >> It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers / >> variables. What is that supposed to mean? > > Yeah, that's why I removed it in 9a374b77. Perhaps we should try to remove it from the header itself in the long run, even if that's used in a couple of macros? pgbench relies on it to avoid building a debug string for a meta-command, and logging.h has it in those compat macros.. I won't fight on that, though. Anyway, I'll go remove the marking. My apologies for the noise. -- Michael