Thread: Mark all GUC variable as PGDLLIMPORT

Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
Hi,

I've been pinged many time over the years to either fix Windows compatibility
or provide DLL for multiple extensions I'm maintaining.  I've finally taken
some time to setup a Windows build environment so I could take care of most of
the problem, but not all (at least not in a satisfactory way).

I've also been looking a bit around other extensions and I see that the #1
problem with compiling extensions on Windows is the lack of PGDLLIMPORT
annotations, which is 99% of the time for a GUC.

This topic has been raised multiple time over the years, and I don't see any
objection to add such an annotation at least for all GUC variables (either the
direct variables or the indirect variables set during the hook execution), so
PFA a patch that takes care of all the GUC.

I don't now if that's still an option at that point, but backporting to at
least pg14 if that patch is accepted would be quite helpful.

Attachment

Re: Mark all GUC variable as PGDLLIMPORT

From
Michael Paquier
Date:
On Sun, Aug 22, 2021 at 04:10:33PM +0800, Julien Rouhaud wrote:
> This topic has been raised multiple time over the years, and I don't see any
> objection to add such an annotation at least for all GUC variables (either the
> direct variables or the indirect variables set during the hook execution), so
> PFA a patch that takes care of all the GUC.
>
> I don't now if that's still an option at that point, but backporting to at
> least pg14 if that patch is accepted would be quite helpful.

These are usually just applied on HEAD, and on a parameter-basis based
on requests from extension authors.  If you wish to make your
extensions able to work on Windows, that's a good idea, but I would
recommend to limit this exercise to what's really necessary for your
purpose.
--
Michael

Attachment

Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
On Sun, Aug 22, 2021 at 08:51:26PM +0900, Michael Paquier wrote:
> On Sun, Aug 22, 2021 at 04:10:33PM +0800, Julien Rouhaud wrote:
> > This topic has been raised multiple time over the years, and I don't see any
> > objection to add such an annotation at least for all GUC variables (either the
> > direct variables or the indirect variables set during the hook execution), so
> > PFA a patch that takes care of all the GUC.
> > 
> > I don't now if that's still an option at that point, but backporting to at
> > least pg14 if that patch is accepted would be quite helpful.
> 
> These are usually just applied on HEAD

Yeah but 14 isn't released yet, and this is a really low risk change.

> , and on a parameter-basis based
> on requests from extension authors.  If you wish to make your
> extensions able to work on Windows, that's a good idea, but I would
> recommend to limit this exercise to what's really necessary for your
> purpose.

I disagree.  For random global variables I agree that we shouldn't mark them
all blindly, but for GUCs it's pretty clear that they're intended to be
accessible from any caller, including extensions.  Why treating Windows as a
second-class citizen, especially when any change can only be used a year after
someone complained?



Re: Mark all GUC variable as PGDLLIMPORT

From
Pavel Stehule
Date:


ne 22. 8. 2021 v 14:08 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
On Sun, Aug 22, 2021 at 08:51:26PM +0900, Michael Paquier wrote:
> On Sun, Aug 22, 2021 at 04:10:33PM +0800, Julien Rouhaud wrote:
> > This topic has been raised multiple time over the years, and I don't see any
> > objection to add such an annotation at least for all GUC variables (either the
> > direct variables or the indirect variables set during the hook execution), so
> > PFA a patch that takes care of all the GUC.
> >
> > I don't now if that's still an option at that point, but backporting to at
> > least pg14 if that patch is accepted would be quite helpful.
>
> These are usually just applied on HEAD

Yeah but 14 isn't released yet, and this is a really low risk change.

> , and on a parameter-basis based
> on requests from extension authors.  If you wish to make your
> extensions able to work on Windows, that's a good idea, but I would
> recommend to limit this exercise to what's really necessary for your
> purpose.

I disagree.  For random global variables I agree that we shouldn't mark them
all blindly, but for GUCs it's pretty clear that they're intended to be
accessible from any caller, including extensions.  Why treating Windows as a
second-class citizen, especially when any change can only be used a year after
someone complained?

I had few problems with access with these variables too when I worked on orafce.

Is true, so it increases differences between Windows and Unix, and fixing these issues is not fun work. On the other hand, maybe direct access to these variables from extensions is an antipattern, and we should use a direct function call API and functions current_setting and set_config. The overhead is usually not too big.



Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
On Sun, Aug 22, 2021 at 02:17:16PM +0200, Pavel Stehule wrote:
> 
> Is true, so it increases differences between Windows and Unix, and fixing
> these issues is not fun work. On the other hand, maybe direct access to
> these variables from extensions is an antipattern, and we should use a
> direct function call API and functions current_setting and set_config. The
> overhead is usually not too big.

Yes, and that's what I did for one of my extensions.  But that's still a bit of
overhead, and extra burden only seen when trying to have Windows compatiblity,
and I hope I can get rid of that at some point.

If direct variable access shouldn't be possible, then we should explicitly tag
those with __attribute__ ((visibility ("hidden"))) or something like that to
have a more consistent behavior.



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Sun, Aug 22, 2021 at 08:51:26PM +0900, Michael Paquier wrote:
>> ... and on a parameter-basis based
>> on requests from extension authors.  If you wish to make your
>> extensions able to work on Windows, that's a good idea, but I would
>> recommend to limit this exercise to what's really necessary for your
>> purpose.

> I disagree.  For random global variables I agree that we shouldn't mark them
> all blindly, but for GUCs it's pretty clear that they're intended to be
> accessible from any caller, including extensions.

Uh, no, it's exactly *not* clear.  There are a lot of GUCs that are only
of interest to particular subsystems.  I do not see why being a GUC makes
something automatically more interesting than any other global variable.
Usually, the fact that one is global is only so the GUC machinery itself
can get at it, otherwise it'd be static in the owning module.

As for "extensions should be able to get at the values", the GUC machinery
already provides uniform mechanisms for doing that safely.  Direct access
to the variable's internal value would be unsafe in many cases.

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Sun, Aug 22, 2021 at 09:29:01PM +0800, Julien Rouhaud wrote:
>> Then shouldn't we try to prevent direct access on all platforms rather than
>> only one?

> Agreed.  If Julian says 99% of the non-export problems are GUCs, and we
> can just export them all, why not do it?  We already export every global
> variable on Unix-like systems, and we have seen no downsides.

By that argument, *every* globally-visible variable should be marked
PGDLLIMPORT.  But the mere fact that two backend .c files need to access
some variable doesn't mean that we want any random bit of code doing so.

And yes, I absolutely would prohibit extensions from accessing many
of them, if there were a reasonable way to do it.  It would be a good
start towards establishing a defined API for extensions.

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Mon, Aug 23, 2021 at 10:15:04AM -0400, Tom Lane wrote:
>> By that argument, *every* globally-visible variable should be marked
>> PGDLLIMPORT.  But the mere fact that two backend .c files need to access

> No, Julien says 99% need only the GUCs, so that is not the argument I am
> making.

That's a claim unbacked by any evidence that I've seen.  More to
the point, we already have a mechanism that extensions can/should
use to read and write GUC settings, and it's not direct access.

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
On Mon, Aug 23, 2021 at 10:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > On Mon, Aug 23, 2021 at 10:15:04AM -0400, Tom Lane wrote:
> >> By that argument, *every* globally-visible variable should be marked
> >> PGDLLIMPORT.  But the mere fact that two backend .c files need to access
>
> > No, Julien says 99% need only the GUCs, so that is not the argument I am
> > making.
>
> That's a claim unbacked by any evidence that I've seen.  More to
> the point, we already have a mechanism that extensions can/should
> use to read and write GUC settings, and it's not direct access.

I clearly didn't try all single extension available out there.  It's
excessively annoying to compile extensions on Windows, and also I
don't have a lot of dependencies installed so there are some that I
wasn't able to test since I'm lacking some other lib and given my
absolute lack of knowledge of that platform I didn't spent time trying
to install those.

I think I tested a dozen of "small" extensions (I'm assuming that the
big one like postgis would require too much effort to build, and they
probably already take care of Windows compatibility), and I only faced
this problem.  That's maybe not a representative set, but I also doubt
that I was unlucky enough to find the few only exceptions.



Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
On Mon, Aug 23, 2021 at 10:36 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> So the problem is that extensions only _need_ to use that API on
> Windows, so many initially don't, or that the API is too limited?

The inconvenience with that API is that it's only returning c strings,
so you gave to convert it back to the original datatype.  That's
probably why most of the extensions simply read from the original
exposed variable rather than using the API, because they're usually
written on Linux or similar, not because they want to mess up the
stored value.



Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
On Mon, Aug 23, 2021 at 10:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> And yes, I absolutely would prohibit extensions from accessing many
> of them, if there were a reasonable way to do it.  It would be a good
> start towards establishing a defined API for extensions.

The v2 patch I sent does that, at least when compiling with GCC.  I
didn't find something similar for clang, but I only checked quickly.

I'm assuming that the unreasonable part is having to add some extra
attribute to the variable?  Would it be acceptable if wrapped into
some other macro, as I proposed?



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Mon, Aug 23, 2021 at 10:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> And yes, I absolutely would prohibit extensions from accessing many
> of them, if there were a reasonable way to do it.  It would be a good
> start towards establishing a defined API for extensions.

Mostly, it would make extension development more difficult for no
discernable benefit to the project.

You've made this argument many times over the years ... but we're no
closer to having an extension API than we ever were, and we continue
to get complaints about breaking stuff on Windows on a pretty regular
basis.

Honestly, it seems unimaginable that an API is ever really going to be
possible. It would be a ton of work to maintain, and we'd just end up
breaking it every time we discover that there's a new feature we want
to implement which doesn't fit into the defined API now. That's what
we do *now* with functions that third-party extensions actually call,
and with variables that they access, and it's not something that, in
my experience, is any great problem in maintaining an extension.
You're running code that is running inside somebody else's executable
and sometimes you have to adjust it for upstream changes. That's life,
and I don't think that complaints about that topic are nearly as
frequent as complaints about extensions breaking on Windows because of
missing PGDLLIMPORT markings.

And more than that, I'm pretty sure that you've previously taken the
view that we shouldn't document all the hook functions that only exist
in the backend for the purpose of extension use. I think you would
argue against a patch to go and document all the variables that are
marked PGDLLIMPORT now. So it seems to me that you're for an API when
it means that we don't have to change anything, and against an API
when it means that we don't have to change anything, which doesn't
really seem like a consistent position. I think we should be
responding to the real, expressed needs of extension developers, and
the lack of PGDLLIMPORT markings on various global variables is surely
top of the list.

It's also a bit unfair to say, well we have APIs for accessing GUC
values. It's true that we do. But if the GUC variable is, say, a
Boolean, you do not want your extension to call some function that
does a bunch of shenanigans and returns a string so that you can then
turn around and parse the string to recover the Boolean value. Even
moreso if the value is an integer or a comma-separated list. You want
to access the value as the system represents it internally, not
duplicate the parsing logic in a way that is inefficient and
bug-prone.

In short, +1 from me for the original proposal of marking all GUCs as
PGDLLIMPORT. And, heck, +1 for marking all the other global variables
that way, too. We're not solving any problem here. We're just annoying
people, mostly people who are loyal community members and steady
contributors to the project.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Chapman Flack
Date:
On 08/23/21 10:36, Bruce Momjian wrote:

> So the problem is that extensions only _need_ to use that API on
> Windows, so many initially don't, or that the API is too limited?

I think there can be cases where it's too limited, such as when significant
computation or validation is needed between the form of the setting known
to the GUC machinery and the form that would otherwise be available in
the global.

I'm thinking, for instance, of the old example before session_timezone
was PGDLLIMPORTed, and you'd have to GETCONFIGOPTION("timezone") and
repeat the work done by pg_tzset to validate and map the timezone name
through the timezone database, to reconstruct the value that was
otherwise already available in session_timezone.

Maybe those cases aren't very numerous ... and maybe they're distinctive
enough to recognize when creating one ("hmm, I am creating a check or
assign hook that does significant work here, will it be worth exposing
a getter API for the product of the work?").

Regards,
-Chap



Re: Mark all GUC variable as PGDLLIMPORT

From
Alvaro Herrera
Date:
On 2021-Aug-23, Robert Haas wrote:

> It's also a bit unfair to say, well we have APIs for accessing GUC
> values. It's true that we do. But if the GUC variable is, say, a
> Boolean, you do not want your extension to call some function that
> does a bunch of shenanigans and returns a string so that you can then
> turn around and parse the string to recover the Boolean value. Even
> moreso if the value is an integer or a comma-separated list. You want
> to access the value as the system represents it internally, not
> duplicate the parsing logic in a way that is inefficient and
> bug-prone.

In that case, why not improve the API with functions that return the
values in some native datatype?  For scalars with native C types (int,
floats, Boolean etc) this is easy enough; I bet it'll solve 99% of the
problems or more.

-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/



Re: Mark all GUC variable as PGDLLIMPORT

From
Chapman Flack
Date:
On 08/23/21 10:57, Chapman Flack wrote:
> Maybe those cases aren't very numerous ... and maybe they're distinctive
> enough to recognize when creating one ("hmm, I am creating a check or
> assign hook that does significant work here, will it be worth exposing
> a getter API for the product of the work?").

How about a generic GetTypedConfigValue(char const *name, void *into) ?

By default the type of *into would correspond to the type of the GUC
(int for an enum) and would be returned directly from *conf->variable
by a getter hook installed by default when the GUC is defined. But the
definition could also supply a getter hook that would store a value of
a different type, or computed a different way, for a GUC where that's
appropriate.

The function return could be boolean, true if such a variable exists
and isn't a placeholder.

Pro: provides read access to the typed value of any GUC, without exposing
it to overwriting. Con: has to bsearch the GUCs every time the value
is wanted.

What I'd really like:

ObserveTypedConfigValue(char const *name, void *into, int *flags, int flag)

This would register an observer of the named GUC: whenever its typed value
changes, it is written at *into and flag is ORed into *flags.

This is more restrictive than allowing arbitrary code to be hooked into
GUC changes (as changes can happen at delicate times such as error
recovery), but allows an extension to always have the current typed
value available and to cheaply know when it has changed. Observers would
be updated in the normal course of processing a GUC change, eliminating
the bsearch-per-lookup cost of the first approach.

Thinkable?

Regards,
-Chap



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Mon, Aug 23, 2021 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> In that case, why not improve the API with functions that return the
> values in some native datatype?  For scalars with native C types (int,
> floats, Boolean etc) this is easy enough; I bet it'll solve 99% of the
> problems or more.

Sure, but ... why bother?

The entire argument rests on the presumption that there is some harm
being done by people accessing the values directly, but I don't think
that's true. And, if it were true, it seems likely that this proposed
API would have the exact same problem, because it would let people do
exactly the same thing. And, going through this proposed API would
still be significantly more expensive than just accessing the bare
variables, because you'd at least have to do some kind of lookup based
on the GUC name to find the corresponding variable. It's just a
solution in search of a problem.

Nothing bad happens when people write extensions that access GUC
variables directly. It works totally, completely fine. Except on
Windows.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Peter Geoghegan
Date:
On Mon, Aug 23, 2021 at 7:57 AM Robert Haas <robertmhaas@gmail.com> wrote:
> In short, +1 from me for the original proposal of marking all GUCs as
> PGDLLIMPORT.

+1

> And, heck, +1 for marking all the other global variables
> that way, too. We're not solving any problem here. We're just annoying
> people, mostly people who are loyal community members and steady
> contributors to the project.

I'm +0.5 on this aspect -- the result might be a lot of verbosity for
no possible benefit.

I'm not sure how many global variables there are. Hopefully not that
many. Maybe making adding new global variables annoying would be a
useful disincentive -- sometimes we use global variables when it isn't
particularly natural (it's natural with GUCs, but not other things).
That might tip the scales, at least for me.

Unnecessary use of global variables are why Postgres 13 went through
several point releases before I accidentally found out that parallel
VACUUM doesn't respect cost limits -- a very simple bug concerning how
we propagate (or fail to propagate) state to parallel workers. I bet
it would have taken far longer for the bug to be discovered if it
wasn't for my Postgres 14 VACUUM refactoring work.

-- 
Peter Geoghegan



Re: Mark all GUC variable as PGDLLIMPORT

From
David Fetter
Date:
On Mon, Aug 23, 2021 at 10:56:52AM -0400, Robert Haas wrote:
> On Mon, Aug 23, 2021 at 10:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > And yes, I absolutely would prohibit extensions from accessing many
> > of them, if there were a reasonable way to do it.  It would be a good
> > start towards establishing a defined API for extensions.
> 
> Mostly, it would make extension development more difficult for no
> discernable benefit to the project.
> 
> You've made this argument many times over the years ... but we're no
> closer to having an extension API than we ever were, and we continue
> to get complaints about breaking stuff on Windows on a pretty regular
> basis.
> 
> Honestly, it seems unimaginable that an API is ever really going to be
> possible. It would be a ton of work to maintain, and we'd just end up
> breaking it every time we discover that there's a new feature we want
> to implement which doesn't fit into the defined API now. That's what
> we do *now* with functions that third-party extensions actually call,
> and with variables that they access, and it's not something that, in
> my experience, is any great problem in maintaining an extension.
> You're running code that is running inside somebody else's executable
> and sometimes you have to adjust it for upstream changes. That's life,
> and I don't think that complaints about that topic are nearly as
> frequent as complaints about extensions breaking on Windows because of
> missing PGDLLIMPORT markings.
> 
> And more than that, I'm pretty sure that you've previously taken the
> view that we shouldn't document all the hook functions that only exist
> in the backend for the purpose of extension use.

As the person on the receiving end of that one, I was nonplussed, so I
took a step back to think it over.  I recognized at that time that I
didn't have a great answer for a legitimate concern, namely that any
change, however trivial, that goes into the core and doesn't go
directly to core database functionality, represents a long-term
maintenance burden.

The thing I'm coming to is that the key architectural feature
PostgreSQL has that other RDBMSs don't is its extensibility. Because
that's been a stable feature over time, I'm pretty sure we actually
need to see documenting that as a thing that does actually go to core
database functionality. Yes, there are resources involved with doing a
thing like this, but I don't think that they require constant or even
frequent attention from committers or even from seasoned DB hackers.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Mark all GUC variable as PGDLLIMPORT

From
Chapman Flack
Date:
On 08/23/21 14:30, Robert Haas wrote:
> it seems likely that this proposed
> API would have the exact same problem, because it would let people do
> exactly the same thing. And, going through this proposed API would
> still be significantly more expensive than just accessing the bare
> variables, because you'd at least have to do some kind of lookup based
> on the GUC name

I think the API ideas in [0] would not let people do exactly the same thing.

They would avoid exposing the bare variables to overwrite. Not that
there has been any plague of extensions going and overwriting GUCs,
but I think in some messages on this thread I detected a sense that
in principle it's better if an API precludes it, and that makes sense
to me.

The second idea also avoids the expense of name-based lookup (except once
at extension initialization), and in fact minimizes the cost of obtaining
the current value when needed, by slightly increasing the infrequent cost
of updating values.

Regards,
-Chap


[0] https://www.postgresql.org/message-id/6123C425.3080409%40anastigmatix.net



Re: Mark all GUC variable as PGDLLIMPORT

From
Pavel Stehule
Date:


po 23. 8. 2021 v 23:08 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 08/23/21 14:30, Robert Haas wrote:
> it seems likely that this proposed
> API would have the exact same problem, because it would let people do
> exactly the same thing. And, going through this proposed API would
> still be significantly more expensive than just accessing the bare
> variables, because you'd at least have to do some kind of lookup based
> on the GUC name

I think the API ideas in [0] would not let people do exactly the same thing.

They would avoid exposing the bare variables to overwrite. Not that
there has been any plague of extensions going and overwriting GUCs,
but I think in some messages on this thread I detected a sense that
in principle it's better if an API precludes it, and that makes sense
to me.

The second idea also avoids the expense of name-based lookup (except once
at extension initialization), and in fact minimizes the cost of obtaining
the current value when needed, by slightly increasing the infrequent cost
of updating values.


There are few GUC variables, where we need extra fast access - work_mem, encoding settings, and maybe an application name. For others I hadn't needed to access it for over 20 years. But I understand that more complex extensions like timescaledb will use more internal GUC.

Maybe a new light API based on enum identifiers instead of string identifiers that ensure relatively fast access (and safe and secure) can be a good solution. And for special variables, there can be envelope functions for very fast access. But although the special interface, the responsibility of the extension's author will not be less, and the C extensions will not be more trusted, so it is hard to say something about possible benefits. I am inclined to Robert's opinion, so the current state is not too bad, and maybe the best thing that is possible now is the decreasing difference between supported platforms.

Regards

Pavel


Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
On Tue, Aug 24, 2021 at 12:36 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> There are few GUC variables, where we need extra fast access - work_mem, encoding settings, and maybe an application
name.For others I hadn't needed to access it for over 20 years. But I understand that more complex extensions like
timescaledbwill use more internal GUC. 

Note that even trivial extensions require some other GUC access.  For
instance any single extension that wants to aggregate data based on
the query_id has to store an array of query_id in shmem to know what a
parallel worker is working on.  On top of wasting memory and CPU, it
also means that computing the maximum number of backends in required.
So you need to recompute MaxBackends, which means access to multiple
GUCs.

Unfortunately the patch to expose the query_id didn't fix that problem
as it only passes the top level query_id to the pararllel workers, not
the current one.



Re: Mark all GUC variable as PGDLLIMPORT

From
Craig Ringer
Date:
On Mon, 23 Aug 2021 at 22:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bruce Momjian <bruce@momjian.us> writes:
> On Sun, Aug 22, 2021 at 09:29:01PM +0800, Julien Rouhaud wrote:
>> Then shouldn't we try to prevent direct access on all platforms rather than
>> only one?

> Agreed.  If Julian says 99% of the non-export problems are GUCs, and we
> can just export them all, why not do it?  We already export every global
> variable on Unix-like systems, and we have seen no downsides.

By that argument, *every* globally-visible variable should be marked
PGDLLIMPORT.  But the mere fact that two backend .c files need to access
some variable doesn't mean that we want any random bit of code doing so.

And yes, I absolutely would prohibit extensions from accessing many
of them, if there were a reasonable way to do it.  It would be a good
start towards establishing a defined API for extensions.

There is: -fvisibility=hidden and __attribute__((visibility("default"))) . Or if you prefer to explicitly mark private symbols, use __attribute__((visiblity("hidden"))) .

In addition to API surface control this also gives you a smaller export symbol table for faster dynamic linking, and it improves link-time optimization.

I could've sworn I proposed its use in the past but I can't find a relevant list thread except quite a recent one. All I can find is [1] . But that is where we should start: switch from a linker script for libpq to using PGDLLIMPORT (actually  it'd be LIBPQDLLIMPORT) in libpq. When -DBUILDING_LIBPQ this would expand to __declspec("dllexport") on Windows and __attribute__((visibility("default"))) on gcc/clang. Otherwise it expands to __declspec("dllimport") on Windows and empty on other targets. This would also be a good time to rename the confusingly named BUILDING_DLL macro to BUILDING_POSTGRES .

The next step would be to have PGDLLIMPORT expand to __attribute__((visibility("default"))) on gcc/clang when building the server itself. This won't do anything by itself since all symbols are already default visibility. But once the "public API" of both function and data symbol is so-annotated, we could switch to building Pg with -fvisibility=hidden by default, and on Windows, we'd disable the linker script that exports all functions using a generated .DEF file.

Re: Mark all GUC variable as PGDLLIMPORT

From
Craig Ringer
Date:
On Tue, 24 Aug 2021 at 02:31, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Aug 23, 2021 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> In that case, why not improve the API with functions that return the
> values in some native datatype?  For scalars with native C types (int,
> floats, Boolean etc) this is easy enough; I bet it'll solve 99% of the
> problems or more.

Sure, but ... why bother?

The entire argument rests on the presumption that there is some harm
being done by people accessing the values directly, but I don't think
that's true. And, if it were true, it seems likely that this proposed
API would have the exact same problem, because it would let people do
exactly the same thing. And, going through this proposed API would
still be significantly more expensive than just accessing the bare
variables, because you'd at least have to do some kind of lookup based
on the GUC name to find the corresponding variable. It's just a
solution in search of a problem.

Nothing bad happens when people write extensions that access GUC
variables directly. It works totally, completely fine. Except on
Windows.

Not only that, but postgres already exports every non-static function symbol on both *nix and Windows, and every data symbol on *nix. A lot of those function symbols are very internal and give anything that can call them the ability to wreck absolute havoc on the server's state.

There is not even the thinnest pretense of Pg having a dedicated extension API or any sort of internal vs public API separation. This ongoing pain with PGDLLIMPORT on Windows is hard to see as anything but an excuse to make working with and supporting Windows harder because some of us don't like it. I happen to rather dislike working with Windows myself, but I get to do it anyway, and I'd be very happy to remove this particular source of pain.

Re: Mark all GUC variable as PGDLLIMPORT

From
Craig Ringer
Date:
On Tue, 24 Aug 2021 at 13:21, Craig Ringer <craig.ringer@enterprisedb.com> wrote:

There is not even the thinnest pretense of Pg having a dedicated extension API or any sort of internal vs public API separation.

Oh, and if we do want such a separation, we'll need to introduce a MUCH lower-pain-and-overhead way to get related patches in. Otherwise it'll take decades to add any necessary function wrappers for currently exported data symbols, add necessary hooks, wrap or hide internal symbols and state, etc.

But ... what is the actual goal and expected outcome of such a hypothetical public/private API separation?

It won't help meaningfully with server maintenance: We already break extensions freely in major releases, and sometimes even minor releases. We don't make any stable API promise at all. So any argument that it would make maintenance of the core server easier is weak at best.

It won't help protect server runtime stability: The server is written in C, and makes heavy use of non-opaque / non-hidden types. Many of which would not be practical to hide without enormous refactoring if at all. Writing any sort of "safe" C API is very difficult even when the exposed functionality is very narrow and well defined. Even then such an API can only help prevent inadvertent mistakes, since C programs are free to grovel around in memory. Look at the mess with EXPORT_SYMBOL_GPL in the kernel for just how ugly that can get. So I don't think there's any realistic way to claim that narrowing the exposed API surface would make it safer to load and run extensions that the user has not separately checked and reviewed or obtained from a trusted source with robust testing practices. Certainly it offers no benefit at all against a bad actor.

It won't make it safer to use untrusted extensions.

What will it do? Not much, in the short term, except cripple existing extensions or add a pile of questionably useful code annotations. The only real benefits I see are some improvement in link-time optimization and export symbol table size. Both of which are nice, but IMO not worth the pain by themselves for a pure C program. In C++, with its enormous symbol tables it's absolutely worth it. But not really for Pg.

To be clear, I actually love the idea of starting to define a solid public API, with API, ABI and semantic promises and associated tests. But to say it's a nontrivial amount of work is an enormous understatement. And unless done by an existing committer who is trusted to largely define a provisional API without bike-shedding and arguing the merits of every single part of it, it's nearly impossible to do with the way Pg is currently developed.

It's completely beyond me why it's OK to export all function symbols on Windows, but not all data symbols. Or why it's OK to export all data symbols on *nix, but not on Windows.

Re: Mark all GUC variable as PGDLLIMPORT

From
Craig Ringer
Date:
On Tue, 24 Aug 2021 at 05:08, Chapman Flack <chap@anastigmatix.net> wrote:
On 08/23/21 14:30, Robert Haas wrote:
> it seems likely that this proposed
> API would have the exact same problem, because it would let people do
> exactly the same thing. And, going through this proposed API would
> still be significantly more expensive than just accessing the bare
> variables, because you'd at least have to do some kind of lookup based
> on the GUC name

I think the API ideas in [0] would not let people do exactly the same thing.

They would avoid exposing the bare variables to overwrite. Not that
there has been any plague of extensions going and overwriting GUCs,
but I think in some messages on this thread I detected a sense that
in principle it's better if an API precludes it, and that makes sense
to me.

The second idea also avoids the expense of name-based lookup (except once
at extension initialization), and in fact minimizes the cost of obtaining
the current value when needed, by slightly increasing the infrequent cost
of updating values.

I'd be generally in favour of something that reduced our reliance on the current chaotic and inconsistent jumble of globals which are a semi-random combination of compilation-unit-scoped, globally-scoped-except-on-Windows and globally scoped.

Tackling GUCs would be a good start. Especially given the number of GUCs where the actual GUC value isn't the state that anyone should be interacting with directly since a hook maintains the "real" state derived from the GUC storage.

And preventing direct writes to GUCs seems like a clearly correct thing to do.

Some consideration of performance would be important for some of the hotter GUCs, of course, but most extensions won't be hammering most GUC access a lot.

Re: Mark all GUC variable as PGDLLIMPORT

From
Craig Ringer
Date:
On Mon, 23 Aug 2021 at 22:45, Julien Rouhaud <rjuju123@gmail.com> wrote:
On Mon, Aug 23, 2021 at 10:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > On Mon, Aug 23, 2021 at 10:15:04AM -0400, Tom Lane wrote:
> >> By that argument, *every* globally-visible variable should be marked
> >> PGDLLIMPORT.  But the mere fact that two backend .c files need to access
>
> > No, Julien says 99% need only the GUCs, so that is not the argument I am
> > making.
>
> That's a claim unbacked by any evidence that I've seen.  More to
> the point, we already have a mechanism that extensions can/should
> use to read and write GUC settings, and it's not direct access.

I clearly didn't try all single extension available out there.  It's
excessively annoying to compile extensions on Windows, and also I
don't have a lot of dependencies installed so there are some that I
wasn't able to test since I'm lacking some other lib and given my
absolute lack of knowledge of that platform I didn't spent time trying
to install those.


Plenty of them are closed too.

While that's not the Pg community's problem, as such, it'd be nice not to arbitrarily break them all for no actual benefit.

Re: Mark all GUC variable as PGDLLIMPORT

From
Peter Eisentraut
Date:
On 23.08.21 16:47, Julien Rouhaud wrote:
> On Mon, Aug 23, 2021 at 10:36 PM Bruce Momjian <bruce@momjian.us> wrote:
>>
>> So the problem is that extensions only _need_ to use that API on
>> Windows, so many initially don't, or that the API is too limited?
> 
> The inconvenience with that API is that it's only returning c strings,
> so you gave to convert it back to the original datatype.  That's
> probably why most of the extensions simply read from the original
> exposed variable rather than using the API, because they're usually
> written on Linux or similar, not because they want to mess up the
> stored value.

If there were an API, then in-core code should use it as well.

If, for example, an extension wanted to define a "float16" type, then it 
should be able to access extra_float_digits in the *same way* as 
float4out() and float8out() can access it.  This is clearly not possible 
today.



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Tue, Aug 24, 2021 at 4:28 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> If there were an API, then in-core code should use it as well.

...which is presumably never going to happen, because the performance
cost would, I think, be quite terrible. If you really had to force
everything through an API, I think what you'd want to do is define an
API where code can look up a handle object for a GUC using the name of
the GUC, and then hold onto a pointer to the handle and use that for
future accesses, so that you don't have to keep incurring the expense
of a hash table hit on every access. But even if you did that,
preventing "unauthorized" writes to GUC variables would require a
function call for every access. That function would be pretty trivial,
but it would have to exist, and...

> If, for example, an extension wanted to define a "float16" type, then it
> should be able to access extra_float_digits in the *same way* as
> float4out() and float8out() can access it.  This is clearly not possible
> today.

...if you used it for something like this, it would probably show up
in the profile, and we would get demands to remove that API and allow
direct access to the variables, which is all anybody is asking for
here anyway, and which is what pretty much everyone, whether they
develop for core or some extension, does and wants to do.

This also brings me to another point, which is that I do not think
there is anyone who actually wants an API like this. I believe that
extension developers find it rather convenient that they can make use
of all of the backend functions and variables that PostgreSQL exposes,
and that a lot of them would be deeply unhappy if we removed that
access. As an occasional extension maintainer myself, I know I would
be. And, as Craig quite rightly points out upthread, we would not get
anything in return for making those people unhappy. I don't know why
we would even consider doing something that would benefit nobody,
greatly inconvenience some people, and generally stifle innovation in
the PostgreSQL ecosystem.

Adding PGDLLIMPORT markings, on the other hand, would hurt nobody, be
very convenient for some people, and encourage innovation in the
PostgreSQL ecosystem.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Chapman Flack
Date:
On 08/24/21 14:28, Robert Haas wrote:

> cost would, I think, be quite terrible. If you really had to force
> everything through an API, I think what you'd want to do is define an
> API where code can look up a handle object for a GUC using the name of
> the GUC, and then hold onto a pointer to the handle and use that for
> future accesses, so that you don't have to keep incurring the expense
> of a hash table hit on every access. But even if you did that,
> preventing "unauthorized" writes to GUC variables would require a
> function call for every access.

I don't think that's true of the second proposal in [0]. I don't foresee
a noticeable runtime cost unless there is a plausible workload that
involves very frequent updates to GUC settings that are also of interest
to a bunch of extensions. Maybe I'll take a stab at a POC.

Regards,
-Chap

[0] https://www.postgresql.org/message-id/6123C425.3080409%40anastigmatix.net



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Tue, Aug 24, 2021 at 2:52 PM Chapman Flack <chap@anastigmatix.net> wrote:
> I don't think that's true of the second proposal in [0]. I don't foresee
> a noticeable runtime cost unless there is a plausible workload that
> involves very frequent updates to GUC settings that are also of interest
> to a bunch of extensions. Maybe I'll take a stab at a POC.

I'm not sure I fully understand that proposal, but I find it hard to
believe that we would seriously consider replacing every direct GUC
reference in the backend with something that goes through an API. Even
if didn't hurt performance, I think it would uglify the code a whole
lot.

And as Peter says, if we're not going to do that, then it's not clear
why extensions should have to.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Chapman Flack
Date:
On 08/24/21 15:12, Robert Haas wrote:
> I find it hard to
> believe that we would seriously consider replacing every direct GUC
> reference in the backend with something that goes through an API.

Peter may have advocated for that kind of across-the-board adoption;
my leaning is more to add an API that /can/ be adopted, initially with
separately-linked extensions as the audience. Nothing would stop it being
used in core as well, but no reason to change any site where it did not
offer an advantage.

I generally tend to be an incrementalist.

Regards,
-Chap



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Tue, Aug 24, 2021 at 3:36 PM Chapman Flack <chap@anastigmatix.net> wrote:
> Peter may have advocated for that kind of across-the-board adoption;
> my leaning is more to add an API that /can/ be adopted, initially with
> separately-linked extensions as the audience. Nothing would stop it being
> used in core as well, but no reason to change any site where it did not
> offer an advantage.
>
> I generally tend to be an incrementalist.

Sure, me too, but the point for me is that there doesn't seem to be a
shred of a reason to go this way at all. We've turned a discussion
about adding PGDLLIMPORT, which ought to be totally uncontroversial,
into some kind of a discussion about adding an API layer that no one
wants to prevent a hypothetical failure mode not in evidence.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Chapman Flack
Date:
On 08/24/21 16:31, Robert Haas wrote:

> about adding PGDLLIMPORT, which ought to be totally uncontroversial,

The thing is, I think I have somewhere a list of all the threads on this
topic that I've read through since the first time I had to come with my own
hat in hand asking for a PGDLLIMPORT on something, years ago now, and
I don't think I have ever seen one where it was as uncontroversial
as you suggest.

In each iteration, I think I've also seen a countervailing view expressed
in favor of looking into whether globals visibility could be further
/reduced/. Maybe that view is slowly losing adherents? I don't know; it
did get advanced again by some in this thread.

The positions seem to routinely fall into two boxes:

* Let's look into that, it would be a step toward having a more defined API

* No, what we have now is so far from a defined API that there's nothing
  worth stepping toward.

The situation seems intermediate to me. The current condition is clearly
something that organically grew without a strong emphasis on defining API.

On the other hand, there are many corners of it that show evidence of
thought about encapsulation and API by the developers of those corners.

It seems to me like the kind of setting where incrementalists can find
room for small moves.

Regards,
-Chap



Re: Mark all GUC variable as PGDLLIMPORT

From
Bruce Momjian
Date:
On Tue, Aug 24, 2021 at 04:31:32PM -0400, Robert Haas wrote:
> On Tue, Aug 24, 2021 at 3:36 PM Chapman Flack <chap@anastigmatix.net> wrote:
> > Peter may have advocated for that kind of across-the-board adoption;
> > my leaning is more to add an API that /can/ be adopted, initially with
> > separately-linked extensions as the audience. Nothing would stop it being
> > used in core as well, but no reason to change any site where it did not
> > offer an advantage.
> >
> > I generally tend to be an incrementalist.
> 
> Sure, me too, but the point for me is that there doesn't seem to be a
> shred of a reason to go this way at all. We've turned a discussion
> about adding PGDLLIMPORT, which ought to be totally uncontroversial,
> into some kind of a discussion about adding an API layer that no one
> wants to prevent a hypothetical failure mode not in evidence.

+1

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Mark all GUC variable as PGDLLIMPORT

From
Noah Misch
Date:
On Tue, Aug 24, 2021 at 05:06:54PM -0400, Chapman Flack wrote:
> On 08/24/21 16:31, Robert Haas wrote:
> > about adding PGDLLIMPORT, which ought to be totally uncontroversial,
> 
> The thing is, I think I have somewhere a list of all the threads on this
> topic that I've read through since the first time I had to come with my own
> hat in hand asking for a PGDLLIMPORT on something, years ago now, and
> I don't think I have ever seen one where it was as uncontroversial
> as you suggest.

The "ought" above is a load-bearing word.  Nonetheless, here's a case, also
involving GUCs, where it was uncontroversial:
https://postgr.es/m/flat/20171120200230.iwcmptwznbvl6y4c%40alap3.anarazel.de



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Tue, Aug 24, 2021 at 5:06 PM Chapman Flack <chap@anastigmatix.net> wrote:
> The thing is, I think I have somewhere a list of all the threads on this
> topic that I've read through since the first time I had to come with my own
> hat in hand asking for a PGDLLIMPORT on something, years ago now, and
> I don't think I have ever seen one where it was as uncontroversial
> as you suggest.

It does tend to be controversial, but I think that's basically only
because Tom Lane has reservations about it. I think if Tom dropped his
opposition to this, nobody else would really care. And I think that
would be a good thing for the project.

> In each iteration, I think I've also seen a countervailing view expressed
> in favor of looking into whether globals visibility could be further
> /reduced/.

But, like I say, that's only a view that gets advanced as a reason not
to mark things PGDLLIMPORT. Nobody ever wants that thing for its own
sake. I think it's a complete red herring.

If and when somebody wants to make a serious proposal to do something
like that, it can be considered on its own merits. But between now and
then, refusing to make things work on Windows as they do on Linux does
not benefit anyone. A ton of work has been made into making PostgreSQL
portable over the years, and abandoning it in just this one case is
unreasonable.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Magnus Hagander
Date:
On Wed, Aug 25, 2021 at 4:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 5:06 PM Chapman Flack <chap@anastigmatix.net> wrote:
> > The thing is, I think I have somewhere a list of all the threads on this
> > topic that I've read through since the first time I had to come with my own
> > hat in hand asking for a PGDLLIMPORT on something, years ago now, and
> > I don't think I have ever seen one where it was as uncontroversial
> > as you suggest.
>
> It does tend to be controversial, but I think that's basically only
> because Tom Lane has reservations about it. I think if Tom dropped his
> opposition to this, nobody else would really care. And I think that
> would be a good thing for the project.


I have only one consideration about it, and that's a technical one :)

Does this in some way have an effect on the size of the binary and/or
the time it takes to load it?

I ask, because IIRC back in the prehistoric days, adding all the
*functions* to the list of exports had a very significant impact on
the size of the binary, and some (but not very large) impact on the
loading time. Of course, we had to do that so that even our own
libraries would probably load. And at the time it was decided that we
definitely wanted to export all functions and not just the ones that
we would somehow define as an API.

Now, I'm pretty sure that the GUCs are few enough that this should
have no measurable effect on size/load time, but it's something that
should be verified.

But in particular, both on that argument, and on the general
maintenance argument, I have a very hard time seeing how exporting the
GUC variables would be any worse than exporting the many hundreds of
functions we already export.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Wed, Aug 25, 2021 at 4:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
>> It does tend to be controversial, but I think that's basically only
>> because Tom Lane has reservations about it. I think if Tom dropped his
>> opposition to this, nobody else would really care. And I think that
>> would be a good thing for the project.

> But in particular, both on that argument, and on the general
> maintenance argument, I have a very hard time seeing how exporting the
> GUC variables would be any worse than exporting the many hundreds of
> functions we already export.

My beef about it has nothing to do with binary-size concerns, although
that is an interesting angle.  (I wonder whether marking a variable
PGDLLIMPORT has any negative effect on the cost of accessing it from
within the core code?)  Rather, I'm unhappy with spreading even more
Microsoft-droppings all over our source.  If there were some way to
just do this automatically for all global variables without any source
changes, I'd be content with that.  That would *really* make the
platforms more nearly equivalent.

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Magnus Hagander
Date:
On Wed, Aug 25, 2021 at 4:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Magnus Hagander <magnus@hagander.net> writes:
> > On Wed, Aug 25, 2021 at 4:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >> It does tend to be controversial, but I think that's basically only
> >> because Tom Lane has reservations about it. I think if Tom dropped his
> >> opposition to this, nobody else would really care. And I think that
> >> would be a good thing for the project.
>
> > But in particular, both on that argument, and on the general
> > maintenance argument, I have a very hard time seeing how exporting the
> > GUC variables would be any worse than exporting the many hundreds of
> > functions we already export.
>
> My beef about it has nothing to do with binary-size concerns, although
> that is an interesting angle.  (I wonder whether marking a variable
> PGDLLIMPORT has any negative effect on the cost of accessing it from
> within the core code?)

It should have no effect on local code.

PGDLLIMPORT turns into "__declspec (dllexport)" when building the
backend, which should have no effect on imports

(it turns into __declspec (dllimport) when building a frontend only,
but that's why we need it in the headers, iirc)

The only overhead I've seen discussions about int he docs around that
is the overhead of exporting by name vs exporting by ordinal.


>  Rather, I'm unhappy with spreading even more
> Microsoft-droppings all over our source.  If there were some way to
> just do this automatically for all global variables without any source
> changes, I'd be content with that.  That would *really* make the
> platforms more nearly equivalent.

Actually, ti's clearly been a while since I dug into this

AFAICT we *do* actually export all the data symbols as well? At least
in the MSVC build where we use gendef.pl? Specifically, see
a5eed4d770.

The thing we need the PGDLLIMPORT definition for is to *import* them
on the other end?

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Mark all GUC variable as PGDLLIMPORT

From
Alvaro Herrera
Date:
On 2021-Aug-25, Magnus Hagander wrote:

> The thing we need the PGDLLIMPORT definition for is to *import* them
> on the other end?

Oh ... so modules that are willing to cheat can include their own
declarations of the variables they need, and mark them __declspec
(dllimport)?

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"The Gord often wonders why people threaten never to come back after they've
been told never to return" (www.actsofgord.com)



Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
On Thu, Aug 26, 2021 at 1:51 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Aug-25, Magnus Hagander wrote:
>
> > The thing we need the PGDLLIMPORT definition for is to *import* them
> > on the other end?
>
> Oh ... so modules that are willing to cheat can include their own
> declarations of the variables they need, and mark them __declspec
> (dllimport)?

I just tried and msvc doesn't like it.  It errors out with a C2370
error "redefinition; different storage class".  According to
https://docs.microsoft.com/en-us/cpp/error-messages/compiler-errors-1/compiler-error-c2370
changing __declspec() on the other side is not possible.



Re: Mark all GUC variable as PGDLLIMPORT

From
Bruce Momjian
Date:
On Wed, Aug 25, 2021 at 10:41:14AM -0400, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > On Wed, Aug 25, 2021 at 4:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >> It does tend to be controversial, but I think that's basically only
> >> because Tom Lane has reservations about it. I think if Tom dropped his
> >> opposition to this, nobody else would really care. And I think that
> >> would be a good thing for the project.
> 
> > But in particular, both on that argument, and on the general
> > maintenance argument, I have a very hard time seeing how exporting the
> > GUC variables would be any worse than exporting the many hundreds of
> > functions we already export.
> 
> My beef about it has nothing to do with binary-size concerns, although
> that is an interesting angle.  (I wonder whether marking a variable
> PGDLLIMPORT has any negative effect on the cost of accessing it from
> within the core code?)  Rather, I'm unhappy with spreading even more
> Microsoft-droppings all over our source.  If there were some way to
> just do this automatically for all global variables without any source
> changes, I'd be content with that.  That would *really* make the
> platforms more nearly equivalent.

OK, so the big question is how can we minimize the code impact of this
feature?  Can we add some kind of checking so we don't forget to mark
anything?  Can we automate this somehow?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Mark all GUC variable as PGDLLIMPORT

From
Magnus Hagander
Date:
On Thu, Aug 26, 2021 at 3:38 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 1:51 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2021-Aug-25, Magnus Hagander wrote:
> >
> > > The thing we need the PGDLLIMPORT definition for is to *import* them
> > > on the other end?
> >
> > Oh ... so modules that are willing to cheat can include their own
> > declarations of the variables they need, and mark them __declspec
> > (dllimport)?
>
> I just tried and msvc doesn't like it.  It errors out with a C2370
> error "redefinition; different storage class".  According to
> https://docs.microsoft.com/en-us/cpp/error-messages/compiler-errors-1/compiler-error-c2370
> changing __declspec() on the other side is not possible.

Yeah, but that does move the problem to the other side doesn't it? So
if you (as a pure test of course) were to remove the variable
completely from the included header and just declare it manually with
PGDLLSPEC in your file, it should work?

Ugly as it is, I wonder if there's a chance we could just process all
the headers at install times and inject the PGDLLIMPORT. We know which
symvols it is on account of what we're getting in the DEF file.

Not saying that's not a very ugly solution, but it might work?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Thu, Aug 26, 2021 at 3:42 PM Magnus Hagander <magnus@hagander.net> wrote:
> Ugly as it is, I wonder if there's a chance we could just process all
> the headers at install times and inject the PGDLLIMPORT. We know which
> symvols it is on account of what we're getting in the DEF file.
>
> Not saying that's not a very ugly solution, but it might work?

If it's ugly, that might mean it's a bad idea and we shouldn't do it
... but if it can be made not-too-ugly, it would certainly be nice to
be able to stop worrying about this.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Andrew Dunstan
Date:
On 8/26/21 3:57 PM, Robert Haas wrote:
> On Thu, Aug 26, 2021 at 3:42 PM Magnus Hagander <magnus@hagander.net> wrote:
>> Ugly as it is, I wonder if there's a chance we could just process all
>> the headers at install times and inject the PGDLLIMPORT. We know which
>> symvols it is on account of what we're getting in the DEF file.
>>
>> Not saying that's not a very ugly solution, but it might work?
> If it's ugly, that might mean it's a bad idea and we shouldn't do it
> ... but if it can be made not-too-ugly, it would certainly be nice to
> be able to stop worrying about this.
>

How is this going to affect msys builds? No gendef there IIRC. I guess
some similar procedure might be possible ...


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Mark all GUC variable as PGDLLIMPORT

From
Michael Paquier
Date:
On Thu, Aug 26, 2021 at 05:10:39PM -0400, Andrew Dunstan wrote:
> On 8/26/21 3:57 PM, Robert Haas wrote:
>> On Thu, Aug 26, 2021 at 3:42 PM Magnus Hagander <magnus@hagander.net> wrote:
>>> Ugly as it is, I wonder if there's a chance we could just process all
>>> the headers at install times and inject the PGDLLIMPORT. We know which
>>> symvols it is on account of what we're getting in the DEF file.
>>> Not saying that's not a very ugly solution, but it might work?

I missed the word "install" first here :)

>> If it's ugly, that might mean it's a bad idea and we shouldn't do it
>> ... but if it can be made not-too-ugly, it would certainly be nice to
>> be able to stop worrying about this.
>
> How is this going to affect msys builds? No gendef there IIRC. I guess
> some similar procedure might be possible ...

Wouldn't that be needed for cygwin as well?  If we go down to enable
that for a maximum number of parameters, I would really agree for
doing things so as this never gets forgotten for new parameters and we
don't have to discuss the matter anymore.  With all that in mind, that
would mean a new perl script that does the job, callable by both MSVC
and normal make builds.  But we have nothing that does a manipulation
of the installation contents.  And couldn't it be a problem if an
installation is overwritten, updated or upgradedd, where there may be
contents not coming from the in-core build process but from some
extension?
--
Michael

Attachment

Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
On Fri, Aug 27, 2021 at 3:42 AM Magnus Hagander <magnus@hagander.net> wrote:
>
> Yeah, but that does move the problem to the other side doesn't it? So
> if you (as a pure test of course) were to remove the variable
> completely from the included header and just declare it manually with
> PGDLLSPEC in your file, it should work?
>
> Ugly as it is, I wonder if there's a chance we could just process all
> the headers at install times and inject the PGDLLIMPORT. We know which
> symvols it is on account of what we're getting in the DEF file.
>
> Not saying that's not a very ugly solution, but it might work?

It's apparently not enough.  I tried with autovacuum_max_workers GUC,
and it still errors out.
If I add a PGDLLIMPORT, there's a link error when trying to access the variable:
error LNK2019: unresolved external symbol __imp_autovacuum_max_workers
referenced in function...

I think that it means that msvc tries to link that to a DLL while it's
expected to be in postgres.lib as the __imp_ prefix is added in that
case.

If I use PGDLLEXPORT I simply get:
error LNK2001: unresolved external symbol aytovacuum_max_workers



Re: Mark all GUC variable as PGDLLIMPORT

From
Craig Ringer
Date:
On Wed, 25 Aug 2021 at 03:13, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Aug 24, 2021 at 2:52 PM Chapman Flack <chap@anastigmatix.net> wrote:
> I don't think that's true of the second proposal in [0]. I don't foresee
> a noticeable runtime cost unless there is a plausible workload that
> involves very frequent updates to GUC settings that are also of interest
> to a bunch of extensions. Maybe I'll take a stab at a POC.

I'm not sure I fully understand that proposal, but I find it hard to
believe that we would seriously consider replacing every direct GUC
reference in the backend with something that goes through an API. Even
if didn't hurt performance, I think it would uglify the code a whole
lot.

It'd probably have to be something that resolves the GUC storage addresses at compile-time or once at runtime, if it's going to be used by core code. While some code doesn't hit a lot of GUCs, some *really* hammers some common GUCs.

There are various issues with cache lines and pointer chasing that are beyond my low-level fu at work here. Adding a level of pointer indirection can be very expensive in the wrong situations.

So you're probably looking at some kind of mess with token pasting, macros and static inlines. Ew.

Re: Mark all GUC variable as PGDLLIMPORT

From
Craig Ringer
Date:
On Wed, 25 Aug 2021 at 22:36, Magnus Hagander <magnus@hagander.net> wrote:
On Wed, Aug 25, 2021 at 4:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 5:06 PM Chapman Flack <chap@anastigmatix.net> wrote:
> > The thing is, I think I have somewhere a list of all the threads on this
> > topic that I've read through since the first time I had to come with my own
> > hat in hand asking for a PGDLLIMPORT on something, years ago now, and
> > I don't think I have ever seen one where it was as uncontroversial
> > as you suggest.
>
> It does tend to be controversial, but I think that's basically only
> because Tom Lane has reservations about it. I think if Tom dropped his
> opposition to this, nobody else would really care. And I think that
> would be a good thing for the project.


I have only one consideration about it, and that's a technical one :)

Does this in some way have an effect on the size of the binary and/or
the time it takes to load it?

On *nix, no.

On Windows, very, very minimally.

We *should* be looking into making  private symbols we can't make non-static have hidden visibility at link time, i.e. be DSO-private.  This can have a huge impact on link-time optimisation and inlining.

But doing so is quite orthogonal to the matter of fixing a linkage issue on Windows. By making select symbols hidden we'd be *reducing* the exposed set of functions and data symbols in a disciplined and progressive way on all platforms. Useful but different.

Re: Mark all GUC variable as PGDLLIMPORT

From
Craig Ringer
Date:
On Thu, 26 Aug 2021 at 01:51, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Aug-25, Magnus Hagander wrote:

> The thing we need the PGDLLIMPORT definition for is to *import* them
> on the other end?

Oh ... so modules that are willing to cheat can include their own
declarations of the variables they need, and mark them __declspec
(dllimport)?


Damn. I was hoping nobody would notice that.

I do exactly that in some extensions to work around some of this mess, but it is quite awkward and has its limitations.

Re: Mark all GUC variable as PGDLLIMPORT

From
Craig Ringer
Date:
On Fri, 27 Aug 2021 at 08:59, Julien Rouhaud <rjuju123@gmail.com> wrote:
On Fri, Aug 27, 2021 at 3:42 AM Magnus Hagander <magnus@hagander.net> wrote:
>
> Yeah, but that does move the problem to the other side doesn't it? So
> if you (as a pure test of course) were to remove the variable
> completely from the included header and just declare it manually with
> PGDLLSPEC in your file, it should work?
>
> Ugly as it is, I wonder if there's a chance we could just process all
> the headers at install times and inject the PGDLLIMPORT. We know which
> symvols it is on account of what we're getting in the DEF file.
>
> Not saying that's not a very ugly solution, but it might work?

It's apparently not enough.  I tried with autovacuum_max_workers GUC,
and it still errors out.
If I add a PGDLLIMPORT, there's a link error when trying to access the variable:
error LNK2019: unresolved external symbol __imp_autovacuum_max_workers
referenced in function...

If I use PGDLLEXPORT I simply get:
error LNK2001: unresolved external symbol aytovacuum_max_workers

It works, but you can't use PGDLLIMPORT, you have to implement the import yourself without the help of __declspec(dllimport) .

Where you want

    autovacuum_max_workers

you must instead write

    *((int*)__imp_autovacuum_max_workers)

Here's the comment I wrote on the topic in something I was working on:

/*
 * On Windows, a symbol is not accessible outside the executable or shared
 * library (PE object) that it is defined in unless explicitly exported in
 * the DLL interface.
 *
 * It must then be explicitly imported by objects that use it; Windows doesn't
 * do ELF-style fixups.
 *
 * The export part is usually accomplished by a __declspec(dllexport)
 * annotation on the symbol or a .DEF file supplied as linker input. Postgres
 * uses the .DEF approach, auto-exporting all symbols using
 * src\tools\msvc\gendef.pl . Internally this hides "symname" from the DLL
 * interface and instead generates an export symbol "__imp_symname" that is a
 * pointer to the value of "symname".
 *
 * The import part is usually done with the __declspec(dllimport) annotation on
 * the symbol. The PGDLLIMPORT macro expands to __declspec(dllimport) when
 * postgres headers are included during extension compilation. But not all the
 * symbols that pglogical needs are annotated with PGDLLIMPORT. Attempting to
 * access a symbol that is not so-annotated will fail at link time with an
 * error like
 *
 *     error LNK2001: unresolved external symbol criticalSharedRelcachesBuilt
 *
 * Because of gendefs.pl, postgres still exports the symbol even if it isn't
 * annotated PGDLLIMPORT. So we can just do the shorthand that
 * __declspec(dllimport) does for us in the preprocessor instead: replace each
 * symbol with its __imp_symbol indirection and dereference it.
 *
 * There's one wrinkle in this though. MSVC doesn't generate a definition for a
 * global data symbol that is neither initialized nor explicitly marked
 * __declspec(dllexport). gendefs.pl will think such symbols are references to
 * a symbol defined in another object file and will skip them without emitting
 * a DATA entry for them in the DEF file, so no __imp_ stub is generated in the
 * DLL interface. We can't use (*__imp_symbolname) if there's no import stub.
 *
 * If they're GUCs, we can round-trip them through their text values
 * to read them. Nothing should ever be assigning to GUC storage and there's no
 * reason to take the address of GUC storage, so this should work fine, albeit
 * slower. If we find any that aren't GUCs we're in trouble but so far there
 * haven't been any.
 *


This is gruesome and I hadn't planned to mention it, but now someone noticed the .DEF file exports the symbols I guess it does no harm.

So can we just fix PGDLLIMPORT now?

Re: Mark all GUC variable as PGDLLIMPORT

From
Chapman Flack
Date:
Hi,

On 02/13/22 02:29, Julien Rouhaud wrote:
> Maybe we could have an actually usable GUC API to retrieve values in their
> native format rather than C string for instance, that we could make sure also
> works for cases like max_backend?

I proposed a sketch of such an API for discussion back in [0] (the second
idea in that email, the "what I'd really like" one).

In that scheme, some extension code that was interested in (say,
for some reason) log_statement_sample_rate could say:

static double samprate;
static int gucs_changed = 0;

#define SAMPRATE_CHANGED 1

...
  ObserveTypedConfigValue("log_statement_sample_rate",
    &samprate, &gucs_changed, SAMPRATE_CHANGED);
...

and will be subscribed to have the native-format value stored into samprate,
and SAMPRATE_CHANGED ORed into gucs_changed, whenever the value changes.

The considerations leading me to that design were:

- avoid subscribing as a 'callback' sort of listener. GUCs can get set
  (or, especially, reset) in delicate situations like error recovery
  where calling out to arbitrary extra code might best be avoided.

- instead, just dump the value in a subscribed location. A copy,
  of course, so no modification there affects the real value.

- but at the same time, OR a flag into a bit set, so subscribing code can
  very cheaply poll for when a value of interest (or any of a bunch of
  values of interest) has changed since last checked.

- do the variable lookup by name once only, and pay no further search cost
  when the subscribing code wants the value.


I never pictured that proposal as the last word on the question, and
different proposals could result from putting different weights on those
objectives, or adding other objectives, but I thought it might serve
as a discussion-starter.

Regards,
-Chap


[0] https://www.postgresql.org/message-id/6123C425.3080409%40anastigmatix.net



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Chapman Flack <chap@anastigmatix.net> writes:
> On 02/13/22 02:29, Julien Rouhaud wrote:
>> Maybe we could have an actually usable GUC API to retrieve values in their
>> native format rather than C string for instance, that we could make sure also
>> works for cases like max_backend?

> I proposed a sketch of such an API for discussion back in [0] (the second
> idea in that email, the "what I'd really like" one).

> In that scheme, some extension code that was interested in (say,
> for some reason) log_statement_sample_rate could say:

> static double samprate;
> static int gucs_changed = 0;

> #define SAMPRATE_CHANGED 1

> ...
>   ObserveTypedConfigValue("log_statement_sample_rate",
>     &samprate, &gucs_changed, SAMPRATE_CHANGED);
> ...

> and will be subscribed to have the native-format value stored into samprate,
> and SAMPRATE_CHANGED ORed into gucs_changed, whenever the value changes.


That seems like about 10X more complexity than is warranted,
not only in terms of the infrastructure required, but also in
the intellectual complexity around "just when could that value
change?"

Why not just provide equivalents to GetConfigOption() that can
deliver int, float8, etc values instead of strings?

(In any case we'd need to rethink the GUC "show_hook" APIs, which
currently need only deal in string output.)

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Chapman Flack
Date:
On 02/13/22 15:16, Tom Lane wrote:
> That seems like about 10X more complexity than is warranted,
> not only in terms of the infrastructure required, but also in
> the intellectual complexity around "just when could that value
> change?"
> 
> Why not just provide equivalents to GetConfigOption() that can
> deliver int, float8, etc values instead of strings?

And repeat the bsearch to find the option every time the interested
extension wants to check the value, even when what it learns is that
the value has not changed? And make it the job of every piece of
interested extension code to save the last-retrieved value and compare
if it wants to detect that it's changed? When the GUC machinery already
has code that executes exactly when a value is being supplied for
an option?

Clearly I'm not thinking here of the GUCs that are read-only views of
values that are determined some other way. How many of those are there
that are mutable, and could have their values changed without going
through the GUC mechanisms?

Also, I think there are some options that are only represented by
an int, float8, etc., when shown, but whose native internal form
is something else, like a struct. I was definitely contemplating
that you could 'subscribe' to one of those too, by passing the
address of an appropriate struct. But of course a GetConfigOption()
flavor could work that way too.

Regards,
-Chap



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Chapman Flack <chap@anastigmatix.net> writes:
> On 02/13/22 15:16, Tom Lane wrote:
>> Why not just provide equivalents to GetConfigOption() that can
>> deliver int, float8, etc values instead of strings?

> And repeat the bsearch to find the option every time the interested
> extension wants to check the value, even when what it learns is that
> the value has not changed? And make it the job of every piece of
> interested extension code to save the last-retrieved value and compare
> if it wants to detect that it's changed? When the GUC machinery already
> has code that executes exactly when a value is being supplied for
> an option?

(shrug) You could argue the performance aspect either way.  If the
core GUC code updates a value 1000 times while the extension consults
the result once, you've probably lost money on the deal.

As for the bsearch, we could improve on that when and if it's shown
to be a bottleneck --- converting to a hash table ought to pretty
much fix any worries there.  Or we could provide APIs that let an
extension look up a "struct config_generic *" once and then fetch
directly using that pointer.  (But I'd prefer not to, since it'd
constrain the internals more than I think is wise.)

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-02-13 15:36:16 -0500, Chapman Flack wrote:
>> Also, I think there are some options that are only represented by
>> an int, float8, etc., when shown, but whose native internal form
>> is something else, like a struct. I was definitely contemplating
>> that you could 'subscribe' to one of those too, by passing the
>> address of an appropriate struct. But of course a GetConfigOption()
>> flavor could work that way too.

> I have a very hard time seeing a use-case for this. Nor how it'd even work
> with a struct - you can't just copy the struct contents, because of pointers
> to objects etc.  I don't think there really are options like this anyway.

There are a couple of legacy cases like "datestyle" where something
that the user sees as one GUC translates to multiple variables under
the hood, so you'd have to invent a struct if you wanted to pass
them through a mechanism like this.  I don't have a big problem
with leaving those out of any such solution, though.  (I see that
datestyle's underlying variables DateStyle and DateOrder are already
marked PGDLLIMPORT, and that's fine with me.)  A possibly more
interesting case is something like search_path --- but again,
there are already special-purpose APIs for accessing the interpreted
value of that.

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Chapman Flack
Date:
On 02/13/22 15:48, Tom Lane wrote:
> much fix any worries there.  Or we could provide APIs that let an
> extension look up a "struct config_generic *" once and then fetch
> directly using that pointer.  (But I'd prefer not to, since it'd
> constrain the internals more than I think is wise.)

There is GetConfigOptionByNum already ... but I'm not sure there's
an easy way to ask "what's the num for option foo?".

GetNumConfigOptions exists, so there is a brute-force way.

Regards,
-Chap



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Sun, Feb 13, 2022 at 3:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > ...
> >   ObserveTypedConfigValue("log_statement_sample_rate",
> >     &samprate, &gucs_changed, SAMPRATE_CHANGED);
> > ...
>
> > and will be subscribed to have the native-format value stored into samprate,
> > and SAMPRATE_CHANGED ORed into gucs_changed, whenever the value changes.
>
>
> That seems like about 10X more complexity than is warranted,
> not only in terms of the infrastructure required, but also in
> the intellectual complexity around "just when could that value
> change?"

I agree in the sense that I believe we should $SUBJECT rather than
fooling around with this. It is completely understandable that
extensions want to know the value of GUCs, and not just as strings,
and doing $SUBJECT would be by far the easiest way of accomplishing
that. I'm sure Andres is right when he says that there are cases where
not exposing symbols could improve the generated machine code, but I'm
also pretty sure that we just have to live with the cost when it's
core GUCs that we're talking about. It's just unrealistic to suppose
that extensions are not going to care.

But if we're not going to do that, then I don't see why Chapman's
proposal is overly complex. It seems completely understandable that if
(some_guc_var != last_observed_some_guc_var) { ...adapt accordingly...
} feels like an OK thing to do but if you have to make a function call
every time then it seems too expensive. Imagine a background worker
that has to do a bunch of extra setup every time some GUC changes, and
every iteration of the main loop just wants to check whether it's
changed, and the main loop could iterate very quickly in some cases. I
wouldn't want to worry about the cost of a function call on every trip
through the loop. Maybe it would be trivial in practice, but who
knows?

I don't particularly like Chapman's solution, but given that you've
repeatedly blocked every effort to just apply PGDLLIMPORT markings
across the board, I'm not sure what the realistic alternative is. It
doesn't seem fair to argue, on the one hand, that we can't just do
what I believe literally every other hacker on the project wants, and
that on the other hand, it's also unacceptable to add complexity to
work around the problem you've created by blocking that proposal every
time it's been raised year after year. It's really pretty frustrating
to me that we haven't just done the obvious thing here years ago. The
time we've spent arguing about it could have been better spent on just
about anything else, with absolutely zero harm to the project.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I don't particularly like Chapman's solution, but given that you've
> repeatedly blocked every effort to just apply PGDLLIMPORT markings
> across the board, I'm not sure what the realistic alternative is.

You do realize that I just have one vote in these matters?  If I'm
outvoted then so be it.  The impression I have though is that a
number of other people don't like the extra notational cruft either.

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Mon, Feb 14, 2022 at 10:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > I don't particularly like Chapman's solution, but given that you've
> > repeatedly blocked every effort to just apply PGDLLIMPORT markings
> > across the board, I'm not sure what the realistic alternative is.
>
> You do realize that I just have one vote in these matters?  If I'm
> outvoted then so be it.  The impression I have though is that a
> number of other people don't like the extra notational cruft either.

Hmm, I guess I'd need to know who those people are in order to be able
to review their comments. I don't *like* the extra notational cruft,
but applying it inconsistently isn't better than being consistent. As
I see it, we have four choices: (1) apply PGDLLIMPORT markings
relatively broadly so that people can get extensions to work on
Windows, (2) continue to apply them inconsistently, thus slightly
reducing notational clutter at the cost of breaking lots of extensions
on Windows, (3) put some complex system in place like what Chapman
proposes and get all extension authors to adopt it, and (4) remove the
Windows port. To the best of my current knowledge, everyone other than
you prefers (1), you prefer (2) or (4), and (3) is an attempt at
compromise that is nobody's first choice.

If that is correct, then I think we should do (1). If it's incorrect
then I think we should do our best to find a choice other than (2)
that does attract a consensus. The current situation, which is (2),
must be the worst of all possible options because it manages to bother
the people who dislike the clutter AND ALSO bother the people who want
to have their extensions work on Windows. Any other choice has a
chance of reaching a state where only one of those groups of people is
annoyed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Chapman Flack
Date:
On 02/14/22 11:43, Robert Haas wrote:
> On Mon, Feb 14, 2022 at 10:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> I don't particularly like Chapman's solution,
> ... and (3) is an attempt at compromise that is nobody's first choice.

Ok, I guess that's )sniffle( a pretty fair way of putting it.

> ... (3) put some complex system in place like what Chapman
> proposes and get all extension authors to adopt it,

By the same token, I don't think it would entail anything like a big
flag day for extension authors. Provided no one proposes immediately
to /remove/ PGDLLIMPORT from things that now have it, the effect would
be more that the next time an extension author comes hat-in-hand
asking for a new PGDLLIMPORT because a thing won't build on Windows,
the answer can be "here, we have this API you can use for that now."

My reading of past discussions on the list suggest that stronger
encapsulation and API delineation have advocates in some quarters,
so I tried to accommodate that in what I proposed. It does, for example,
avoid exposing the 'real' value of a GUC to writing by a buggy
or compromised extension.

I'll be first to agree what I proposed isn't beautiful, but it might be
that a round or two of improvement by somebody smarter than me could lead
to something possibly preferable to PGDLLIMPORT-everywhere /when measured
against certain objectives/, like encapsulation.

So maybe this is in part a discussion about what the weights should be
on those various objectives.

Regards,
-Chap



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Hmm, I guess I'd need to know who those people are in order to be able
> to review their comments. I don't *like* the extra notational cruft,
> but applying it inconsistently isn't better than being consistent. As
> I see it, we have four choices: (1) apply PGDLLIMPORT markings
> relatively broadly so that people can get extensions to work on
> Windows, (2) continue to apply them inconsistently, thus slightly
> reducing notational clutter at the cost of breaking lots of extensions
> on Windows, (3) put some complex system in place like what Chapman
> proposes and get all extension authors to adopt it, and (4) remove the
> Windows port. To the best of my current knowledge, everyone other than
> you prefers (1), you prefer (2) or (4), and (3) is an attempt at
> compromise that is nobody's first choice.

I think you are attributing straw-man positions to me.  What I'd actually
*like* is some solution that has the effect of (1) without having to mark
up our code with a bunch of Microsoft-isms.  However I don't know how to
do that, and I do agree that (2), (3), and (4) are not better than (1).

There are some technical issues though:

* AFAICS, the patch of record doesn't actually do (1), it does something
else that adds yet more new notation.  The cfbot says it fails, too.

* If we institute a policy that all GUCs should be PGDLLIMPORT'd,
how will we enforce that new patches follow that?  I don't want to be
endlessly going back and adding forgotten PGDLLIMPORT markers.

* There's a moderately sizable subset of GUCs where the underlying
variable is not visible at all because it's static in guc.c.
Typically this is because that variable is only used for display
and there's an assign hook that stores the real data somewhere else.
I suppose what we want in such cases is for the "somewhere else"
to be PGDLLIMPORT'd, but in a lot of cases those variables are also
static in some other module.  Does this proposal include exporting
variables that currently aren't visible to extensions at all?
I'm a little resistant to that.  I can buy making sure that Windows
has a level playing field, but that's as far as I want to go.

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Mon, Feb 14, 2022 at 12:25 PM Chapman Flack <chap@anastigmatix.net> wrote:
> On 02/14/22 11:43, Robert Haas wrote:
> > On Mon, Feb 14, 2022 at 10:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Robert Haas <robertmhaas@gmail.com> writes:
> >>> I don't particularly like Chapman's solution,
> > ... and (3) is an attempt at compromise that is nobody's first choice.
>
> Ok, I guess that's )sniffle( a pretty fair way of putting it.

I mean I like it better than Tom does ... I just think it's a
complicated work around for a problem we don't really need to have.

> My reading of past discussions on the list suggest that stronger
> encapsulation and API delineation have advocates in some quarters,
> so I tried to accommodate that in what I proposed. It does, for example,
> avoid exposing the 'real' value of a GUC to writing by a buggy
> or compromised extension.
>
> I'll be first to agree what I proposed isn't beautiful, but it might be
> that a round or two of improvement by somebody smarter than me could lead
> to something possibly preferable to PGDLLIMPORT-everywhere /when measured
> against certain objectives/, like encapsulation.
>
> So maybe this is in part a discussion about what the weights should be
> on those various objectives.

Keep in mind that there's nothing being set in stone here. Applying
PGDLLIMPORT to all GUCs across the board does not foreclose putting
some other system that restricts symbol visibility in the future. But
in the present, there is clearly no approach to the symbol visibility
problem that is fully baked or enjoys consensus. Yet, there is clearly
consensus that not being able to access GUCs in Windows extensions is
a problem. There must be like six different threads with people
complaining about that, and in every case the suggested solution is to
just add PGDLLIMPORT for crying out loud.

If in the future there is a consensus to restrict symbol visibility in
order to achieve some agreed-upon amount of encapsulation, then we can
do that then. Extension authors may not like it much, but every
serious extension author understands that the PostgreSQL code base
needs to keep moving forward, and that this will sometimes require
them to adjust for new major versions. This would be a bigger
adjustment than many, but I have confidence that if the benefits are
worthwhile, people will cope with it and adjust their extensions to
adhere to whatever new rules we put in place.

But between now and then, refusing to apply PGDLLIMPORT to a basically
random subset of GUCs is just annoying people without any compensating
benefit.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Mon, Feb 14, 2022 at 12:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think you are attributing straw-man positions to me.  What I'd actually
> *like* is some solution that has the effect of (1) without having to mark
> up our code with a bunch of Microsoft-isms.  However I don't know how to
> do that, and I do agree that (2), (3), and (4) are not better than (1).

OK, that sounds like we might be making some progress toward a useful agreement.

> There are some technical issues though:
>
> * AFAICS, the patch of record doesn't actually do (1), it does something
> else that adds yet more new notation.  The cfbot says it fails, too.

That sounds like a problem. Should be fixable.

> * If we institute a policy that all GUCs should be PGDLLIMPORT'd,
> how will we enforce that new patches follow that?  I don't want to be
> endlessly going back and adding forgotten PGDLLIMPORT markers.

It seems to me that it would be no different from bumping catversion
or updating nodefuncs.c: yes, it will get forgotten sometimes, but
hopefully people will mostly get used to it just like they do with
dozens of other PG-specific coding rules. I don't see that it's likely
to generate more than a handful of commits per year, so it doesn't
really seem worth worrying about to me, but YMMV. Maybe it's possible
to write a perl script to verify it, although it seems like it might
be complicated to code that up.

> * There's a moderately sizable subset of GUCs where the underlying
> variable is not visible at all because it's static in guc.c.
> Typically this is because that variable is only used for display
> and there's an assign hook that stores the real data somewhere else.
> I suppose what we want in such cases is for the "somewhere else"
> to be PGDLLIMPORT'd, but in a lot of cases those variables are also
> static in some other module.  Does this proposal include exporting
> variables that currently aren't visible to extensions at all?
> I'm a little resistant to that.  I can buy making sure that Windows
> has a level playing field, but that's as far as I want to go.

I can live with that. If someone complains about those variables being
static-to-file instead of globally visible, we can address that
complaint on its merits when it is presented.

An alternative rule which would dodge that particular issue would be
to just slap PGDLLIMPORT on every global variable in every header
file. That would arguably be a simpler rule, though it means even more
PGDLLIMPORT declarations floating around.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> An alternative rule which would dodge that particular issue would be
> to just slap PGDLLIMPORT on every global variable in every header
> file. That would arguably be a simpler rule, though it means even more
> PGDLLIMPORT declarations floating around.

Yeah, if the objective is "level playing field for Windows",
then it's hard to avoid the conclusion that we should just do that.
Again, I've never had an objection to that as the end result ---
I just wish that we could get the toolchain to do it for us.
But if we can't, we can't.

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Mon, Feb 14, 2022 at 12:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > An alternative rule which would dodge that particular issue would be
> > to just slap PGDLLIMPORT on every global variable in every header
> > file. That would arguably be a simpler rule, though it means even more
> > PGDLLIMPORT declarations floating around.
>
> Yeah, if the objective is "level playing field for Windows",
> then it's hard to avoid the conclusion that we should just do that.
> Again, I've never had an objection to that as the end result ---
> I just wish that we could get the toolchain to do it for us.
> But if we can't, we can't.

100% agreed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Andres Freund
Date:
Hi,

On 2022-02-14 12:45:08 -0500, Robert Haas wrote:
> On Mon, Feb 14, 2022 at 12:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > * If we institute a policy that all GUCs should be PGDLLIMPORT'd,
> > how will we enforce that new patches follow that?  I don't want to be
> > endlessly going back and adding forgotten PGDLLIMPORT markers.
> 
> It seems to me that it would be no different from bumping catversion
> or updating nodefuncs.c: yes, it will get forgotten sometimes, but
> hopefully people will mostly get used to it just like they do with
> dozens of other PG-specific coding rules. I don't see that it's likely
> to generate more than a handful of commits per year, so it doesn't
> really seem worth worrying about to me, but YMMV. Maybe it's possible
> to write a perl script to verify it, although it seems like it might
> be complicated to code that up.

I think it'd be better if we had scripts ensuring this stays sane. Several of
your examples do have that. Whereas I don't see what would cause us to missing
PGDLLIMPORTs for GUCs, given it's a windows only issue and that you need an
extension using the GUC with omitted PGDLLIMPORT.


> An alternative rule which would dodge that particular issue would be
> to just slap PGDLLIMPORT on every global variable in every header
> file. That would arguably be a simpler rule, though it means even more
> PGDLLIMPORT declarations floating around.

That would have the advantage of being comparatively easy to check in an
automated way. Might even be cheap enough to just make it part of the build.

But it seems like it'd be a fair amount of work and cause a lot of patch
rebasing pain? If we end up going that way, we should schedule this to happen
just after the feature freeze, I think.

If we consider doing this for all extern variables, we should think about
doing this for headers *and* functions. That'd allow us to get rid of the
fairly complicated logic to generate the .def file for the postgres binary on
windows (src/tools/gendef.pl). And maybe also the related thing on AIX
(src/backend/port/aix/mkldexport.sh)

I'm kind of partial to the "add explicit visibility information to everything"
approach because the windows export file generation is a decent chunk of the
meson patchset. And what's missing to make it work on AIX...

Greetings,

Andres Freund



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Mon, Feb 14, 2022 at 8:53 PM Andres Freund <andres@anarazel.de> wrote:
> > An alternative rule which would dodge that particular issue would be
> > to just slap PGDLLIMPORT on every global variable in every header
> > file. That would arguably be a simpler rule, though it means even more
> > PGDLLIMPORT declarations floating around.
>
> That would have the advantage of being comparatively easy to check in an
> automated way. Might even be cheap enough to just make it part of the build.

I wasn't able to quickly write something that was smart enough to use
as part of the build, but I wrote something dumb that I think works
well enough to use with a little bit of human intelligence alongside.
See attached.

> But it seems like it'd be a fair amount of work and cause a lot of patch
> rebasing pain? If we end up going that way, we should schedule this to happen
> just after the feature freeze, I think.

We could do that. I'd sort of rather get it done. We still have two
weeks before the last CommitFest officially starts, and it's not as if
there won't be tons of uncommitted patches floating around after that.

> If we consider doing this for all extern variables, we should think about
> doing this for headers *and* functions. That'd allow us to get rid of the
> fairly complicated logic to generate the .def file for the postgres binary on
> windows (src/tools/gendef.pl). And maybe also the related thing on AIX
> (src/backend/port/aix/mkldexport.sh)

I don't know what you mean by this.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

Re: Mark all GUC variable as PGDLLIMPORT

From
Andres Freund
Date:
Hi,

On 2022-02-15 08:58:05 -0500, Robert Haas wrote:
> On Mon, Feb 14, 2022 at 8:53 PM Andres Freund <andres@anarazel.de> wrote:
> > > An alternative rule which would dodge that particular issue would be
> > > to just slap PGDLLIMPORT on every global variable in every header
> > > file. That would arguably be a simpler rule, though it means even more
> > > PGDLLIMPORT declarations floating around.
> >
> > That would have the advantage of being comparatively easy to check in an
> > automated way. Might even be cheap enough to just make it part of the build.
> 
> I wasn't able to quickly write something that was smart enough to use
> as part of the build, but I wrote something dumb that I think works
> well enough to use with a little bit of human intelligence alongside.
> See attached.

Cool.


> > But it seems like it'd be a fair amount of work and cause a lot of patch
> > rebasing pain? If we end up going that way, we should schedule this to happen
> > just after the feature freeze, I think.
> 
> We could do that. I'd sort of rather get it done. We still have two
> weeks before the last CommitFest officially starts, and it's not as if
> there won't be tons of uncommitted patches floating around after that.

Breaking close to every patch 6-7 weeks before freeze doesn't seem
great. Particularly because this is a mostly automatically generated patch, so
I don't really see a forcing function to do this now, rather than at a more
opportune moment.

The more I think about it the more I'm convinced that if we want to do this,
we should do it for variables and functions.


> > If we consider doing this for all extern variables, we should think about
> > doing this for headers *and* functions. That'd allow us to get rid of the
> > fairly complicated logic to generate the .def file for the postgres binary on
> > windows (src/tools/gendef.pl). And maybe also the related thing on AIX
> > (src/backend/port/aix/mkldexport.sh)
> 
> I don't know what you mean by this.

We only need gendef.pl because we don't mark functions with visibility
information. If we attach PGDLLIMPORT to functions and variables we can a fair
bit of complexity.  And I think not just for windows, but also AIX (there's a
whole section in src/backend/Makefile about AIX oddity), but I'm not sure
about it.



> diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
> index a4b5dc853b..13849a3790 100644
> --- a/src/include/common/relpath.h
> +++ b/src/include/common/relpath.h
> @@ -56,7 +56,7 @@ typedef enum ForkNumber
>  
>  #define FORKNAMECHARS    4        /* max chars for a fork name */
>  
> -extern const char *const forkNames[];
> +extern PGDLLIMPORT const char *const forkNames[];
>  
>  extern ForkNumber forkname_to_number(const char *forkName);
>  extern int    forkname_chars(const char *str, ForkNumber *fork);

I think we might need a new form of PGDLLIMPORT to mark these correctly - I
don't think they should be marked PGDLLIMPORT in frontend environment.


> diff --git a/src/include/fe_utils/print.h b/src/include/fe_utils/print.h
> index 836b4e29a8..5caf9e2d08 100644
> --- a/src/include/fe_utils/print.h
> +++ b/src/include/fe_utils/print.h
> @@ -177,10 +177,10 @@ typedef struct printQueryOpt
>  } printQueryOpt;
>  
>  
> -extern volatile sig_atomic_t cancel_pressed;
> +extern PGDLLIMPORT volatile sig_atomic_t cancel_pressed;
>  
> -extern const printTextFormat pg_asciiformat;
> -extern const printTextFormat pg_asciiformat_old;
> +extern PGDLLIMPORT const printTextFormat pg_asciiformat;
> +extern PGDLLIMPORT const printTextFormat pg_asciiformat_old;
>  extern printTextFormat pg_utf8format;    /* ideally would be const, but... */

Are these right? I don't think we should make frontend stuff exported without
a very clear reason.


>  extern void jit_reset_after_error(void);
> diff --git a/src/include/jit/llvmjit.h b/src/include/jit/llvmjit.h
> index 66143afccc..4541f9a2c4 100644
> --- a/src/include/jit/llvmjit.h
> +++ b/src/include/jit/llvmjit.h
> @@ -56,30 +56,30 @@ typedef struct LLVMJitContext
>  } LLVMJitContext;
>  
>  /* llvm module containing information about types */
> -extern LLVMModuleRef llvm_types_module;
> +extern PGDLLIMPORT LLVMModuleRef llvm_types_module;

These are wrong I think, this is a shared library.

Greetings,

Andres Freund



Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
Hi,

On Mon, Feb 14, 2022 at 12:45:08PM -0500, Robert Haas wrote:
> On Mon, Feb 14, 2022 at 12:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> > * There's a moderately sizable subset of GUCs where the underlying
> > variable is not visible at all because it's static in guc.c.
> > Typically this is because that variable is only used for display
> > and there's an assign hook that stores the real data somewhere else.
> > I suppose what we want in such cases is for the "somewhere else"
> > to be PGDLLIMPORT'd, but in a lot of cases those variables are also
> > static in some other module.  Does this proposal include exporting
> > variables that currently aren't visible to extensions at all?
> > I'm a little resistant to that.  I can buy making sure that Windows
> > has a level playing field, but that's as far as I want to go.
> 
> I can live with that. If someone complains about those variables being
> static-to-file instead of globally visible, we can address that
> complaint on its merits when it is presented.

Same here, if any third-party project had any use of such variable, they would
have sent some patch for that already so I don't see any reason to change it
now.



Re: Mark all GUC variable as PGDLLIMPORT

From
Bruce Momjian
Date:
On Wed, Feb 16, 2022 at 01:10:50AM +0800, Julien Rouhaud wrote:
> Hi,
> 
> On Mon, Feb 14, 2022 at 12:45:08PM -0500, Robert Haas wrote:
> > On Mon, Feb 14, 2022 at 12:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > 
> > > * There's a moderately sizable subset of GUCs where the underlying
> > > variable is not visible at all because it's static in guc.c.
> > > Typically this is because that variable is only used for display
> > > and there's an assign hook that stores the real data somewhere else.
> > > I suppose what we want in such cases is for the "somewhere else"
> > > to be PGDLLIMPORT'd, but in a lot of cases those variables are also
> > > static in some other module.  Does this proposal include exporting
> > > variables that currently aren't visible to extensions at all?
> > > I'm a little resistant to that.  I can buy making sure that Windows
> > > has a level playing field, but that's as far as I want to go.
> > 
> > I can live with that. If someone complains about those variables being
> > static-to-file instead of globally visible, we can address that
> > complaint on its merits when it is presented.
> 
> Same here, if any third-party project had any use of such variable, they would
> have sent some patch for that already so I don't see any reason to change it
> now.

Agreed.  I think the big goal here is to make Windows have the same
GUC variable visibility as Unix --- when we change things in a way that
breaks extensions, we hear about Unix breakage quickly and adjust for
it.  It is Windows being different and only getting the problem reports
later that is the real problem.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
Hi,

Sorry for taking so much time to answer here.  I definitely wanted to work on
that but I was under the impression that although we now had an agreement to
apply PGDLLIMPORT globally a patch itself wasn't really rushed, so I spent the
last two days trying to setup a new Windows environment as my previous one
isn't working anymore.  That's taking much longer than I planned.

On Tue, Feb 15, 2022 at 08:06:58AM -0800, Andres Freund wrote:
> Hi,
> 
> On 2022-02-15 08:58:05 -0500, Robert Haas wrote:
> > On Mon, Feb 14, 2022 at 8:53 PM Andres Freund <andres@anarazel.de> wrote:
> > > > An alternative rule which would dodge that particular issue would be
> > > > to just slap PGDLLIMPORT on every global variable in every header
> > > > file. That would arguably be a simpler rule, though it means even more
> > > > PGDLLIMPORT declarations floating around.
> > >
> > > That would have the advantage of being comparatively easy to check in an
> > > automated way. Might even be cheap enough to just make it part of the build.
> > 
> > I wasn't able to quickly write something that was smart enough to use
> > as part of the build, but I wrote something dumb that I think works
> > well enough to use with a little bit of human intelligence alongside.
> > See attached.
> 
> Cool.

+1

> > > But it seems like it'd be a fair amount of work and cause a lot of patch
> > > rebasing pain? If we end up going that way, we should schedule this to happen
> > > just after the feature freeze, I think.
> > 
> > We could do that. I'd sort of rather get it done. We still have two
> > weeks before the last CommitFest officially starts, and it's not as if
> > there won't be tons of uncommitted patches floating around after that.
> 
> Breaking close to every patch 6-7 weeks before freeze doesn't seem
> great. Particularly because this is a mostly automatically generated patch, so
> I don't really see a forcing function to do this now, rather than at a more
> opportune moment.

Agreed, especially if we can do some cleanup in gendef.pl.

> The more I think about it the more I'm convinced that if we want to do this,
> we should do it for variables and functions.
> 
> 
> > > If we consider doing this for all extern variables, we should think about
> > > doing this for headers *and* functions. That'd allow us to get rid of the
> > > fairly complicated logic to generate the .def file for the postgres binary on
> > > windows (src/tools/gendef.pl). And maybe also the related thing on AIX
> > > (src/backend/port/aix/mkldexport.sh)
> > 
> > I don't know what you mean by this.
> 
> We only need gendef.pl because we don't mark functions with visibility
> information. If we attach PGDLLIMPORT to functions and variables we can a fair
> bit of complexity.  And I think not just for windows, but also AIX (there's a
> whole section in src/backend/Makefile about AIX oddity), but I'm not sure
> about it.

I will try to have a look at it once my build environment is ready.



Re: Mark all GUC variable as PGDLLIMPORT

From
John Naylor
Date:
On Tue, Feb 15, 2022 at 11:07 PM Andres Freund <andres@anarazel.de> wrote:

> > diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
> > index a4b5dc853b..13849a3790 100644
> > --- a/src/include/common/relpath.h
> > +++ b/src/include/common/relpath.h
> > @@ -56,7 +56,7 @@ typedef enum ForkNumber
> >
> >  #define FORKNAMECHARS        4               /* max chars for a fork name */
> >
> > -extern const char *const forkNames[];
> > +extern PGDLLIMPORT const char *const forkNames[];
> >
> >  extern ForkNumber forkname_to_number(const char *forkName);
> >  extern int   forkname_chars(const char *str, ForkNumber *fork);
>
> I think we might need a new form of PGDLLIMPORT to mark these correctly - I
> don't think they should be marked PGDLLIMPORT in frontend environment.

That has been taken care of already, IIUC:

commit e04a8059a74c125a8af94acdcb7b15b92188470a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Mon Nov 29 11:00:00 2021 -0500

    Simplify declaring variables exported from libpgcommon and libpgport.

    This reverts commits c2d1eea9e and 11b500072, as well as similar hacks
    elsewhere, in favor of setting up the PGDLLIMPORT macro so that it can
    just be used unconditionally.  That can work because in frontend code,
    we need no marking in either the defining or consuming files for a
    variable exported from these libraries; and frontend code has no need
    to access variables exported from the core backend, either.

-- 
John Naylor
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Andres Freund
Date:
Hi,

On 2022-02-15 08:06:58 -0800, Andres Freund wrote:
> The more I think about it the more I'm convinced that if we want to do this,
> we should do it for variables and functions.

Btw, if we were to do this, we should just use -fvisibility=hidden everywhere
and would see the same set of failures on unixoid systems as on windows. Of
course only in in-core extensions, but it'd still be better than nothing.

I proposed using -fvisibility=hidden in extensions:
https://postgr.es/m/20211101020311.av6hphdl6xbjbuif@alap3.anarazel.de

Mostly because using explicit symbol exports makes it a lot easier to build
extensions libraries on windows (with meson, but also everything built outside
of postgres with msvc). But also because it makes function calls *inside* an
extension have noticeably lower overhead. And it makes the set of symbols
exported smaller.


One problem I encountered was that it's actually kind of hard to set build
flags only for extension libraries:
https://postgr.es/m/20220111025328.iq5g6uck53j5qtin%40alap3.anarazel.de

But if we added explicit exports to everything we export, we'd not need to
restrict the use of -fvisibility=hidden to extension libraries anymore. Would
get decent error messages on all platforms.


Subsequently we could yield a bit of performance from critical paths by
marking selected symbols as *not* exported. That'd make them a tad cheaper to
call and allow the optimizer more leeway.

Greetings,

Andres Freund



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Fri, Feb 18, 2022 at 7:02 PM Andres Freund <andres@anarazel.de> wrote:
> On 2022-02-15 08:06:58 -0800, Andres Freund wrote:
> > The more I think about it the more I'm convinced that if we want to do this,
> > we should do it for variables and functions.
>
> Btw, if we were to do this, we should just use -fvisibility=hidden everywhere
> and would see the same set of failures on unixoid systems as on windows. Of
> course only in in-core extensions, but it'd still be better than nothing.

Let's be less ambitious for this release, and just get the variables
marked with PGDLLIMPORT. We seem to have consensus to create parity
between Windows and non-Windows builds, which means precisely applying
PGDLLIMPORT to variables marked in header files, and nothing more. The
merits of -fvisibility=hidden or PGDLLIMPORT on functions are a
separate question that can be debated on its own merits, but I don't
want that larger discussion to bog down this effort. Here are updated
patches for that.

@RMT: Andres proposed upthread that we should plan to do this just
after feature freeze. Accordingly I propose to commit at least 0002
and perhaps 0001 if people want it just after feature freeze. I
therefore ask that the RMT either (a) regard this change as not being
a feature (and thus not subject to the freeze) or (b) give it a 1-day
extension. The reason for committing it just after freeze is to
minimize the number of conflicts that it creates for other patches.
The reason why that's probably an OK thing to do is that applying
PGDLLIMPORT markings is low-risk.

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

Re: Mark all GUC variable as PGDLLIMPORT

From
Andrew Dunstan
Date:
On 3/30/22 14:37, Robert Haas wrote:
> On Fri, Feb 18, 2022 at 7:02 PM Andres Freund <andres@anarazel.de> wrote:
>> On 2022-02-15 08:06:58 -0800, Andres Freund wrote:
>>> The more I think about it the more I'm convinced that if we want to do this,
>>> we should do it for variables and functions.
>> Btw, if we were to do this, we should just use -fvisibility=hidden everywhere
>> and would see the same set of failures on unixoid systems as on windows. Of
>> course only in in-core extensions, but it'd still be better than nothing.
> Let's be less ambitious for this release, and just get the variables
> marked with PGDLLIMPORT. We seem to have consensus to create parity
> between Windows and non-Windows builds, which means precisely applying
> PGDLLIMPORT to variables marked in header files, and nothing more. The
> merits of -fvisibility=hidden or PGDLLIMPORT on functions are a
> separate question that can be debated on its own merits, but I don't
> want that larger discussion to bog down this effort. Here are updated
> patches for that.
>
> @RMT: Andres proposed upthread that we should plan to do this just
> after feature freeze. Accordingly I propose to commit at least 0002
> and perhaps 0001 if people want it just after feature freeze. I
> therefore ask that the RMT either (a) regard this change as not being
> a feature (and thus not subject to the freeze) or (b) give it a 1-day
> extension. The reason for committing it just after freeze is to
> minimize the number of conflicts that it creates for other patches.
> The reason why that's probably an OK thing to do is that applying
> PGDLLIMPORT markings is low-risk.
>
> Thanks,
>


WFM. I think Tom also has an item he wants to do right at the end of
feature freeze.


The script looks fine, needs a copyright notice and a comment at the top
describing what it does. It seems like something we might need to do
from time to time, as it will be easy to forget to mark variables and we
should periodically run this as a check.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 3/30/22 14:37, Robert Haas wrote:
>> @RMT: Andres proposed upthread that we should plan to do this just
>> after feature freeze. Accordingly I propose to commit at least 0002
>> and perhaps 0001 if people want it just after feature freeze. I
>> therefore ask that the RMT either (a) regard this change as not being
>> a feature (and thus not subject to the freeze) or (b) give it a 1-day
>> extension. The reason for committing it just after freeze is to
>> minimize the number of conflicts that it creates for other patches.
>> The reason why that's probably an OK thing to do is that applying
>> PGDLLIMPORT markings is low-risk.

> WFM. I think Tom also has an item he wants to do right at the end of
> feature freeze.

Yeah, the frontend error message rework in [1].  That has exactly
the same constraint that it's likely to break other open patches,
so it'd be better to do it after the CF cutoff.  I think that doing
that concurrently with Robert's thing shouldn't be too risky, because
it only affects frontend code while his patch should touch only backend.

            regards, tom lane

[1] https://commitfest.postgresql.org/37/3574/



Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Tue, Apr 5, 2022 at 10:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, the frontend error message rework in [1].  That has exactly
> the same constraint that it's likely to break other open patches,
> so it'd be better to do it after the CF cutoff.  I think that doing
> that concurrently with Robert's thing shouldn't be too risky, because
> it only affects frontend code while his patch should touch only backend.

So when *exactly* do we want to land these patches? None of the
calendar programs I use support "anywhere on earth" as a time zone,
but I think that feature freeze is 8am on Friday on the East coast of
the United States. If I commit the PGDLLIMPORT change within a few
hours after that time, is that good? Should I try to do it earlier,
before we technically hit 8am? Should I do it the night before, last
thing before I go to bed on Thursday? Do you care whether your commit
or mine goes in first?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
John Naylor
Date:
On Tue, Apr 5, 2022 at 10:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Apr 5, 2022 at 10:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Yeah, the frontend error message rework in [1].  That has exactly
> > the same constraint that it's likely to break other open patches,
> > so it'd be better to do it after the CF cutoff.  I think that doing
> > that concurrently with Robert's thing shouldn't be too risky, because
> > it only affects frontend code while his patch should touch only backend.
>
> So when *exactly* do we want to land these patches? None of the
> calendar programs I use support "anywhere on earth" as a time zone,
> but I think that feature freeze is 8am on Friday on the East coast of
> the United States.

I understand it to be noon UTC on Friday.

> If I commit the PGDLLIMPORT change within a few
> hours after that time, is that good? Should I try to do it earlier,
> before we technically hit 8am? Should I do it the night before, last
> thing before I go to bed on Thursday? Do you care whether your commit
> or mine goes in first?

For these two patches, I'd say a day or two after feature freeze is a
reasonable goal.

--
John Naylor
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Do you care whether your commit
> or mine goes in first?

I do not.  If they're not independent, at least one of us has messed up.

I have family commitments on Saturday, so if I don't get mine in
on Friday it'll likely not happen before Sunday.

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Michael Paquier
Date:
On Wed, Apr 06, 2022 at 12:57:29AM +0700, John Naylor wrote:
> For these two patches, I'd say a day or two after feature freeze is a
> reasonable goal.

Yeah.  For patches as invasive as the PGDLLIMPORT business and the
frontend error refactoring, I am also fine to have two exceptions with
the freeze deadline.
--
Michael

Attachment

Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Wed, Apr 6, 2022 at 7:56 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Wed, Apr 06, 2022 at 12:57:29AM +0700, John Naylor wrote:
> > For these two patches, I'd say a day or two after feature freeze is a
> > reasonable goal.
>
> Yeah.  For patches as invasive as the PGDLLIMPORT business and the
> frontend error refactoring, I am also fine to have two exceptions with
> the freeze deadline.

Done now.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Magnus Hagander
Date:


On Fri, Apr 8, 2022 at 2:42 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Apr 6, 2022 at 7:56 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Wed, Apr 06, 2022 at 12:57:29AM +0700, John Naylor wrote:
> > For these two patches, I'd say a day or two after feature freeze is a
> > reasonable goal.
>
> Yeah.  For patches as invasive as the PGDLLIMPORT business and the
> frontend error refactoring, I am also fine to have two exceptions with
> the freeze deadline.

Done now.

\o/
 
--

Re: Mark all GUC variable as PGDLLIMPORT

From
Julien Rouhaud
Date:
On Fri, Apr 08, 2022 at 03:04:18PM +0200, Magnus Hagander wrote:
> On Fri, Apr 8, 2022 at 2:42 PM Robert Haas <robertmhaas@gmail.com> wrote:
> 
> > On Wed, Apr 6, 2022 at 7:56 PM Michael Paquier <michael@paquier.xyz>
> > wrote:
> > > On Wed, Apr 06, 2022 at 12:57:29AM +0700, John Naylor wrote:
> > > > For these two patches, I'd say a day or two after feature freeze is a
> > > > reasonable goal.
> > >
> > > Yeah.  For patches as invasive as the PGDLLIMPORT business and the
> > > frontend error refactoring, I am also fine to have two exceptions with
> > > the freeze deadline.
> >
> > Done now.
> >
> 
> \o/

Woohoo!  Thanks a lot!



Re: Mark all GUC variable as PGDLLIMPORT

From
Andres Freund
Date:
Hi,

On 2022-04-08 08:42:38 -0400, Robert Haas wrote:
> On Wed, Apr 6, 2022 at 7:56 PM Michael Paquier <michael@paquier.xyz> wrote:
> > On Wed, Apr 06, 2022 at 12:57:29AM +0700, John Naylor wrote:
> > > For these two patches, I'd say a day or two after feature freeze is a
> > > reasonable goal.
> >
> > Yeah.  For patches as invasive as the PGDLLIMPORT business and the
> > frontend error refactoring, I am also fine to have two exceptions with
> > the freeze deadline.
> 
> Done now.

Just noticed that
extern sigset_t UnBlockSig,
             BlockSig,
             StartupBlockSig;
are unadorned.


There's also a number of variables that are only declared in .c files that
!windows can still access. Some likely aren't worth caring about, but some
others are underlying GUCs, so we probably do? E.g.
CommitDelay
CommitSiblings
default_tablespace
ignore_checksum_failure
ignore_invalid_pages
Log_disconnections
ssl_renegotiation_limit
temp_tablespaces
Unix_socket_group
Unix_socket_permissions
wal_level_options

Likely don't care:
dynamic_shared_memory_options
gistBufferingOptValues
recovery_target_action_options
ssl_protocol_versions_info

Also noticed that the bbsink_ops variables are const, instead of static const,
was that intentional?

Greetings,

Andres Freund



Re: Mark all GUC variable as PGDLLIMPORT

From
Michael Paquier
Date:
On Fri, May 06, 2022 at 04:49:24PM -0700, Andres Freund wrote:
> Just noticed that
> extern sigset_t UnBlockSig,
>              BlockSig,
>              StartupBlockSig;
> are unadorned.

mark_pgdllimport.pl is not able to capture this part of the change
because of this logic, where we assume that the header line has to
finish with a semicolon:
# Variable declarations should end in a semicolon. If we see an
# opening parenthesis, it's probably a function declaration.
$needs_pgdllimport = 0 if $stripped_line !~ /;$/ ||
    $stripped_line =~ /\(/;

It is quite common to define one variable per line, so I would not put
the blame on the script and just update pqsignal.h.  And it is common
to finish lines with a comma for function declarations..

> There's also a number of variables that are only declared in .c files that
> !windows can still access. Some likely aren't worth caring about, but some
> others are underlying GUCs, so we probably do? E.g.
> CommitDelay
> CommitSiblings
> default_tablespace
> ignore_checksum_failure
> ignore_invalid_pages
> Log_disconnections
> ssl_renegotiation_limit
> temp_tablespaces
> wal_level_options

These are indeed declared in .c files.  So you mean that we'd better
declare them in headers and mark them as PGDLLIMPORT?  I am not sure
if that's worth the addition, nobody has asked for these to be
available yet, AFAIK.

> Unix_socket_group
> Unix_socket_permissions

Already marked with PGDLLIMPORT.

> Also noticed that the bbsink_ops variables are const, instead of static const,
> was that intentional?

Yep, that looks like an error.

While on it, I think that it would be a good idea to document in the
script that we need pass down a list of header files as arguments to
rewrite them.
--
Michael

Attachment

Re: Mark all GUC variable as PGDLLIMPORT

From
Robert Haas
Date:
On Mon, May 9, 2022 at 3:15 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, May 06, 2022 at 04:49:24PM -0700, Andres Freund wrote:
> > Just noticed that
> > extern sigset_t UnBlockSig,
> >                       BlockSig,
> >                       StartupBlockSig;
> > are unadorned.
>
> mark_pgdllimport.pl is not able to capture this part of the change
> because of this logic, where we assume that the header line has to
> finish with a semicolon:
> # Variable declarations should end in a semicolon. If we see an
> # opening parenthesis, it's probably a function declaration.
> $needs_pgdllimport = 0 if $stripped_line !~ /;$/ ||
>     $stripped_line =~ /\(/;
>
> It is quite common to define one variable per line, so I would not put
> the blame on the script and just update pqsignal.h.  And it is common
> to finish lines with a comma for function declarations..

Right. I didn't try to equip the script with a real C parser. Just
enough to catch our typical style - which those declarations aren't.

> > There's also a number of variables that are only declared in .c files that
> > !windows can still access. Some likely aren't worth caring about, but some
> > others are underlying GUCs, so we probably do? E.g.
> > CommitDelay
> > CommitSiblings
> > default_tablespace
> > ignore_checksum_failure
> > ignore_invalid_pages
> > Log_disconnections
> > ssl_renegotiation_limit
> > temp_tablespaces
> > wal_level_options
>
> These are indeed declared in .c files.  So you mean that we'd better
> declare them in headers and mark them as PGDLLIMPORT?  I am not sure
> if that's worth the addition, nobody has asked for these to be
> available yet, AFAIK.

Completely agree. The agreed-upon charter was to adjust the header
files, not move any declarations into header files. I'm not against
doing that; I think it's good for extensions to have access to GUC
values. But it wasn't the mission.

> > Also noticed that the bbsink_ops variables are const, instead of static const,
> > was that intentional?
>
> Yep, that looks like an error.
>
> While on it, I think that it would be a good idea to document in the
> script that we need pass down a list of header files as arguments to
> rewrite them.

Either of you please feel free to change these things, at least as far
as I'm concerned.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Mark all GUC variable as PGDLLIMPORT

From
Michael Paquier
Date:
On Mon, May 09, 2022 at 09:23:47AM -0400, Robert Haas wrote:
> Either of you please feel free to change these things, at least as far
> as I'm concerned.

Well, what about the attached then?  While looking at all the headers
in the tree, I have noticed that __pg_log_level is not marked, but
we'd better paint a PGDLLIMPORT also for it?  This is getting used by
pgbench for some unlikely() business, as one example.
--
Michael

Attachment

Re: Mark all GUC variable as PGDLLIMPORT

From
Michael Paquier
Date:
On Tue, May 10, 2022 at 04:09:47PM +0900, Michael Paquier wrote:
> Well, what about the attached then?  While looking at all the headers
> in the tree, I have noticed that __pg_log_level is not marked, but
> we'd better paint a PGDLLIMPORT also for it?  This is getting used by
> pgbench for some unlikely() business, as one example.

After an extra look, PGDLLIMPORT missing from __pg_log_level looks
like an imbroglio between 8ec5694, that has added the marking, and
9a374b77 that has removed it the same day.  All that has been fixed in
5edeb57.
--
Michael

Attachment

Re: Mark all GUC variable as PGDLLIMPORT

From
Andres Freund
Date:
Hi,

On 2022-05-12 15:15:10 +0900, Michael Paquier wrote:
> On Tue, May 10, 2022 at 04:09:47PM +0900, Michael Paquier wrote:
> > Well, what about the attached then?  While looking at all the headers
> > in the tree, I have noticed that __pg_log_level is not marked, but
> > we'd better paint a PGDLLIMPORT also for it?  This is getting used by
> > pgbench for some unlikely() business, as one example.
> 
> After an extra look, PGDLLIMPORT missing from __pg_log_level looks
> like an imbroglio between 8ec5694, that has added the marking, and
> 9a374b77 that has removed it the same day.  All that has been fixed in
> 5edeb57.

It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers /
variables. What is that supposed to mean?

Greetings,

Andres Freund



Re: Mark all GUC variable as PGDLLIMPORT

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-05-12 15:15:10 +0900, Michael Paquier wrote:
>> After an extra look, PGDLLIMPORT missing from __pg_log_level looks
>> like an imbroglio between 8ec5694, that has added the marking, and
>> 9a374b77 that has removed it the same day.  All that has been fixed in
>> 5edeb57.

> It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers /
> variables. What is that supposed to mean?

Yeah, that's why I removed it in 9a374b77.

            regards, tom lane



Re: Mark all GUC variable as PGDLLIMPORT

From
Michael Paquier
Date:
On Thu, May 12, 2022 at 11:59:49AM -0400, Tom Lane wrote:
>> It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers /
>> variables. What is that supposed to mean?
>
> Yeah, that's why I removed it in 9a374b77.

Perhaps we should try to remove it from the header itself in the long
run, even if that's used in a couple of macros?  pgbench relies on it
to avoid building a debug string for a meta-command, and logging.h has
it in those compat macros..  I won't fight on that, though.

Anyway, I'll go remove the marking.  My apologies for the noise.
--
Michael

Attachment