Re: Higher level questions around shared memory stats - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Higher level questions around shared memory stats
Date
Msg-id CAKFQuwYupmWdVdnCE0f4daMxPz9hMi+NEYOSiyb0QiLOyrEe9g@mail.gmail.com
Whole thread Raw
In response to Re: Higher level questions around shared memory stats  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Higher level questions around shared memory stats  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, Mar 31, 2022 at 7:33 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Thu, 31 Mar 2022 14:04:16 -0700, Andres Freund <andres@anarazel.de> wrote in
> Hi,
>
> On 2022-03-31 16:16:31 +0900, Kyotaro Horiguchi wrote:
> > After moving to shared stats, we might want to expose the GUC variable
> > itself. Then hide/remove the macro PG_STAT_TMP_DIR.  This breaks the
> > extensions but it is better than keeping using PG_STAT_TMP_DIR for
> > uncertain reasons. The existence of the macro can be used as the
> > marker of the feature change.  This is the chance to break the (I
> > think) bad practice shared among the extensions.  At least I am okay
> > with that.
>
> I don't really understand why we'd want to do it that way round? Why not just
> leave PG_STAT_TMP_DIR in place, and remove the GUC? Since nothing uses the
> GUC, we're not loosing anything, and we'd not break extensions unnecessarily?

Yeah, I'm two-sided here.

I think so and did so in the past versions of this patch.  But when
asked anew, I came to think I might want to keep (and make use more
of) the configuarable aspect of the dbserver's dedicated temporary
directory.  The change is reliably detectable on extensions and the
requried change is easy.

> Obviously there's no strong demand for pg_stat_statements et al to use the
> user-configurable stats_temp_directory, given they've not done so for years
> without complaints?  The code to support the configurable stats_temp_dir isn't
> huge, but it's not small either.

I even doubt anyone have stats_temp_directory changed in a cluster.
Thus I agree that it is reasonable that we take advantage of the
chance to remove the feature of little significance.


Do we really think no one has taken our advice in the documentation and moved their stats_temp_directory to a RAM-based file system?


It doesn't seem right that extensions are making the decision of where to place their temporary statistics files.  If the user has specified a directory for them the system should be placing them there.

The question is whether current uses of PG_STAT_TMP_DIR can be made to use the value of the GUC without them knowing or caring about the fact we changed PG_STAT_TMP_DIR from a static define for pg_stat_tmp to whatever the current value of stats_temp_directory is.  I take it from the compiler directive of #define that this isn't doable.

Given all that I'd say we remove stats_temp_directory (noting it as being obsolete as only core code was ever given leave to use it and that core code now uses shared memory instead - basically forcing the choice of RAM-based file system onto the user).

If we later want to coerce extensions (and even our own code) to use a user-supplied directory we can then remove the define and give them an API to use instead.  I suspect such an API would look different than just "here is a directory name" anyway (e.g., force extensions to use a subdirectory under the base directory for their data).  We would name the base directory GUC something like "stats_temp_base_directory" and be happy that stats_temp_directory isn't sitting there already giving us the stink eye.

David J.

pgsql-hackers by date:

Previous
From: "Euler Taveira"
Date:
Subject: Re: Logical replication timeout problem
Next
From: Amit Langote
Date:
Subject: Re: generic plans and "initial" pruning