Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE - Mailing list pgsql-general

From Gavan Schneider
Subject Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE
Date
Msg-id A15C12F9-029C-4B03-9633-FE9D5B176CD1@pendari.org
Whole thread Raw
In response to Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE  (Bryn Llewellyn <bryn@yugabyte.com>)
Responses Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE
List pgsql-general

On 6 Oct 2022, at 16:04, Bryn Llewellyn wrote:

Does this imply a risk that a future PG version will go against the SQL standard and reject any non-latin name that is free of all punctuation characters, when used in the role of a SQL identifier, unless it's double quoted?

From my perspective this thread seems to miss the essential purposes behind quote_ident(). It is part of processing external/user input —

  1. Protecting from PostgreSQL which always maps everything to lower case before anything gets to the parser
  2. Protection against SQL injection when processing input from outside the trusted perimeter

Expecting an arbitrary string to be equal to itself after it has been through string processing code is risky unless that processing is part of the design, and quote_ident() was never designed to be part of any such arrangement.

Expanding —

  1. It is a complex question what happens to non-ASCII characters when they are mapped to lower case… sometimes this is a meaningful concept e.g., ∏ -> π, sometimes it is not, e.g., pick any Chinese/Korean/Japanese character. If the designer decides to use non-ASCII characters in the identifier they can… just double-quote those identifiers. If the designer wants to use camelCase ASCII they can, but the identifier will be camelcase inside the machine unless it was double quoted.

AFAIK we never really use quote_ident() except to process external input. As noted above this function is not designed to be part of an equality test when attempting system introspection, rather —

  1. The simple quote_ident() function can also be used to wrap untrusted input so it will not mess with the parser. It is used with quote_literal() when building dynamic SQL statements from user (i.e., untrusted) input.

From my perspective any use of these function outside their scope is just that… outside their scope, with no promise this usage will work or comply with any current or future standard, or imply anything useful about pretty much anything.

Maybe I’m oversimplifying but I believe the current functions work and do their specific jobs, and have nothing to do with anything else. So there is no surprise for me in the subject line. There is mild surprise the question was asked.

BTW this ignores whether or not PG mapping everything that’s not quoted to lower case is standards compliant. This whole topic would be simpler if the case was left alone but that’s a long road ago and I believe most of the bridges have been burnt :)

Regards

Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Explanations exist; they have existed for all time; there is always a well-known solution to every human problem — neat, plausible, and wrong.
— H. L. Mencken, 1920

pgsql-general by date:

Previous
From: Bryn Llewellyn
Date:
Subject: Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE
Next
From: Karsten Hilbert
Date:
Subject: Aw: Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE