Re: "unix_socket_directories" should be GUC_LIST_INPUT? - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: "unix_socket_directories" should be GUC_LIST_INPUT?
Date
Msg-id 20201104035914.GC1711@paquier.xyz
Whole thread Raw
In response to Re: "unix_socket_directories" should be GUC_LIST_INPUT?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: "unix_socket_directories" should be GUC_LIST_INPUT?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Oct 27, 2020 at 12:23:22PM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> If we change this, is it going to be a compatibility break for the
>> contents of postgresql.conf files?
>
> I think not, at least not for the sorts of values you'd ordinarily
> find in that variable, say '/tmp, /var/run/postgresql'.  Possibly
> the behavior would change for pathnames containing spaces or the
> like, but it is probably kinda broken for such cases today anyway.
>
> In any case, Michael had promised to test this aspect before committing.

Paths with spaces or commas would be fine as long as we associate
GUC_LIST_QUOTE with GUC_LIST_INPUT so as commas within quoted entries
are handled consistently.  postmaster.c uses SplitDirectoriesString()
with a comma do decide how to split things.  This discards leading and
trailing whitespaces, requires a double-quote to have a matching
double-quote where trailing/leading whitespaces are allowed, and
nothing to escape quotes.  Those patterns fail:
"/tmp/repo1,"/tmp/repo2
"/tmp/repo1,/tmp/repo2
These are split as a single entry:
"/tmp/repo1,/tmp/repo2"
"/tmp/ repo1,/tmp/ repo2"
"/tmp/ repo1 , /tmp/ repo2 "
These are split as two entries:
"/tmp/repo1,",/tmp/repo2
/tmp /repo1 , /tmp/ repo2
"/tmp/""sock1", "/tmp/, sock2" (here first path has one double quote)

If we use GUC_LIST_INPUT | GUC_LIST_QUOTE, paths are handled the same
way as the original, but we would run into problems if not using
GUC_LIST_QUOTE as mentioned upthread.

Anyway, we have a compatibility problem once we use ALTER SYSTEM.
Just take the following command:
alter system set unix_socket_directories = '/tmp/sock1, /tmp/sock2';

On HEAD, this would be treated and parsed as two items.  However, with
the patch, this becomes one item as this is considered as one single
element of the list of paths, as that's written to
postgresql.auto.conf as '"/tmp/sock1, /tmp/sock2"'.

This last argument would be IMO a reason enough to not do the switch.
Even if I have never seen cases where ALTER SYSTEM was used with
unix_socket_directories, we cannot say either that nobody relies on
the existing behavior (perhaps some failover solutions care?).  So at
least we should add a comment as proposed in
https://postgr.es/m/122596.1603427777@sss.pgh.pa.us.

Thoughts?
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: "osumi.takamichi@fujitsu.com"
Date:
Subject: RE: extension patch of CREATE OR REPLACE TRIGGER
Next
From: Thomas Munro
Date:
Subject: Re: PANIC: could not fsync file "pg_multixact/..." since commit dee663f7843