Re: making the backend's json parser work in frontend code - Mailing list pgsql-hackers

From David Steele
Subject Re: making the backend's json parser work in frontend code
Date
Msg-id 12b96994-47c2-c87d-2c9b-710d3e052b3b@pgmasters.net
In response to Re: making the backend's json parser work in frontend code  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: making the backend's json parser work in frontend code  (Mark Dilger <mark.dilger@enterprisedb.com>)
List pgsql-hackers
On 1/24/20 9:27 AM, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>> On 2020-01-23 18:04, Robert Haas wrote:
>>> Now, you might say "well, why don't we just do an encoding
>>> conversion?", but we can't. When the filesystem tells us what the file
>>> names are, it does not tell us what encoding the person who created
>>> those files had in mind. We don't know that they had *any* encoding in
>>> mind. IIUC, a file in the data directory can have a name that consists
>>> of any sequence of bytes whatsoever, so long as it doesn't contain
>>> prohibited characters like a path separator or \0 byte. But only some
>>> of those possible octet sequences can be stored in a manifest that has
>>> to be valid UTF-8.
> 
>> I think it wouldn't be unreasonable to require that file names in the
>> database directory be consistently encoded (as defined by pg_control,
>> probably).  After all, this information is sometimes also shown in
>> system views, so it's already difficult to process total junk.  In
>> practice, this shouldn't be an onerous requirement.
> 
> I don't entirely follow why we're discussing this at all, if the
> requirement is backing up a PG data directory.  There are not, and
> are never likely to be, any legitimate files with non-ASCII names
> in that context.  Why can't we just skip any such files?

It's not uncommon in my experience for users to drop odd files into 
PGDATA (usually versioned copies of postgresql.conf, etc.), but I agree 
that it should be discouraged.  Even so, I don't recall ever seeing any 
non-ASCII filenames.

Skipping files sounds scary; I'd prefer an error or a warning (and then
base64-encode the filename).
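
A minimal sketch of the warn-then-encode idea, in Python for brevity. It
assumes file names arrive as raw bytes and that the manifest is JSON that
must be valid UTF-8. The function name manifest_entry and the keys "path"
and "path_base64" are hypothetical, chosen only for illustration; they are
not taken from any existing manifest format.

```python
import base64
import json
import warnings


def manifest_entry(raw_name: bytes) -> dict:
    """Build a JSON-safe manifest entry for a file name given as raw bytes.

    Hypothetical helper: the key names below are illustrative assumptions.
    """
    try:
        # Names that decode cleanly as UTF-8 are stored verbatim.
        return {"path": raw_name.decode("utf-8")}
    except UnicodeDecodeError:
        # Do not skip the file silently: warn, then keep the exact bytes
        # by storing them base64-encoded (plain ASCII, so JSON-safe).
        warnings.warn(
            f"file name {raw_name!r} is not valid UTF-8; "
            "storing it base64-encoded"
        )
        return {"path_base64": base64.b64encode(raw_name).decode("ascii")}


def manifest_json(raw_names: list) -> str:
    """Serialize entries for a list of raw file names as a JSON array."""
    return json.dumps([manifest_entry(name) for name in raw_names])
```

A reader of such a manifest could recover the original bytes with
base64.b64decode() on the "path_base64" value, so no file is lost and no
guess about the original encoding is needed.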

Regards,
-- 
-David
david@pgmasters.net


