Thread: Seems to be impossible to set a NULL search_path

Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
The section "Writing SECURITY DEFINER Functions Safely":

https://www.postgresql.org/docs/current/sql-createfunction.html#id-1.9.3.67.10.2

explains the risk brought if a bad actor creates an object that preemps what the developer intended by putting it in a
schemathat's ahead of the intended object in the search_path. 

You can avoid this risk by always using fully qualified object names. It seems strange that the section doesn't mention
thisobvious approach. Is it vulnerable to subversion in a way that I haven't spotted? 

I suppose that there are use cases where the actual plan is to resolve to the first object that has the right name as
thesearch_path is traversed. (But this seems not to be the common case.) This is where setting the search_path as an
attributeof a subprogram helps. 

I wondered about a self-documenting belt-and-braces approach: use fully qualified object names in the subprograms
sourcecode and declare that I want no risk of mis-use of the search_path by setting it to null. But this seems not to
bepossible. Am I right? 

I've confirmed that even a superuser cannot create objects in a "special" schema like "pg_catalog" or "pg_toast". So
thisgives me a workaround to the limitation that I cannot force the use of fully qualified names by setting a null
search_path:I could set the attribute of my subprogram to "pg_catalog". 

Apart from the fact that, as I suppose, this would be a rare and therefore possibly puzzling pattern (so clear doc
aboutthe purpose would be needed), are there any risks that I haven't spotted? 

Finally, what do you think of a possible future enhancement to allow setting a null search_path?


Re: Seems to be impossible to set a NULL search_path

From
Adrian Klaver
Date:
On 7/5/22 11:12 AM, Bryn Llewellyn wrote:
> The section "Writing SECURITY DEFINER Functions Safely":
> 
> https://www.postgresql.org/docs/current/sql-createfunction.html#id-1.9.3.67.10.2
> 

> Finally, what do you think of a possible future enhancement to allow setting a null search_path?
> 

set search_path = '';

show search_path ;
  search_path
-------------
  ""

\d
Did not find any relations.


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Seems to be impossible to set a NULL search_path

From
Christophe Pettus
Date:

> On Jul 5, 2022, at 11:12, Bryn Llewellyn <bryn@yugabyte.com> wrote:
> Finally, what do you think of a possible future enhancement to allow setting a null search_path?

You use the empty string, rather than NULL, but it works right now:

xof=# show search_path;
   search_path
-----------------
 "$user", public
(1 row)

xof=# select * from t;
 i | d1 | d2
---+----+----
(0 rows)

xof=# set search_path='';
SET
xof=# show search_path;
 search_path
-------------
 ""
(1 row)

xof=# select * from t;
ERROR:  relation "t" does not exist
LINE 1: select * from t;
                      ^
xof=#





Re: Seems to be impossible to set a NULL search_path

From
Pavel Stehule
Date:


út 5. 7. 2022 v 20:18 odesílatel Adrian Klaver <adrian.klaver@aklaver.com> napsal:
On 7/5/22 11:12 AM, Bryn Llewellyn wrote:
> The section "Writing SECURITY DEFINER Functions Safely":
>
> https://www.postgresql.org/docs/current/sql-createfunction.html#id-1.9.3.67.10.2
>

> Finally, what do you think of a possible future enhancement to allow setting a null search_path?
>

set search_path = '';

show search_path ;
  search_path
-------------
  ""

\d
Did not find any relations.

But still in this case, there is pg_catalog in search path.

Regards

Pavel
 


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Seems to be impossible to set a NULL search_path

From
Adrian Klaver
Date:
On 7/5/22 11:24 AM, Pavel Stehule wrote:

>      >
> 
>     set search_path = '';
> 
>     show search_path ;
>        search_path
>     -------------
>        ""
> 
>     \d
>     Did not find any relations.
> 
> 
> But still in this case, there is pg_catalog in search path.

Yes but from OP:

"I've confirmed that even a superuser cannot create objects in a 
"special" schema like "pg_catalog" or "pg_toast". So this gives me a 
workaround to the limitation that I cannot force the use of fully 
qualified names by setting a null search_path: I could set the attribute 
of my subprogram to "pg_catalog"."

So Bryn Llewellyn does not seem to be concerned about that.

> 
> Regards
> 
> Pavel
> 
> 
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
adrian.klaver@aklaver.com wrote:

set search_path = '';
show search_path ;
search_path
   -------------
      ""

...But still in this case, there is pg_catalog in search path.

Yes but from OP:

« I've confirmed that even a superuser cannot create objects in a "special" schema like "pg_catalog" or "pg_toast". So this gives me a workaround to the limitation that I cannot force the use of fully qualified names by setting a null search_path: I could set the attribute of my subprogram to "pg_catalog". »

So Bryn Llewellyn does not seem to be concerned about that.

Thanks, all, for your replies. I'd assumed that the arguments of "set search_path" had to be SQL names. so I tried "". But that caused an error. I didn't try the ordinary empty string because I'd assumed that, as an illegal SQL name, it would be illegal in "set search_path". Hmm...

I'm slightly troubled by "works right now":


You use the empty string, rather than NULL... it works right now: 

But because you experts all recommend it, I'll go with it. It's more expressive of what I mean than is using "pg_catalog".

(I'm assuming that having such an unwritable schema inevitably on the search_path is simply an entirely benign benefit. But it could give a strange message to the reader in my use case.)

Re: Seems to be impossible to set a NULL search_path

From
Tom Lane
Date:
Bryn Llewellyn <bryn@yugabyte.com> writes:
> Thanks, all, for your replies. I'd assumed that the arguments of "set search_path" had to be SQL names. so I tried
"".But that caused an error. I didn't try the ordinary empty string because I'd assumed that, as an illegal SQL name,
itwould be illegal in "set search_path". Hmm... 

search_path's value is not a SQL name.  It's a list of SQL names
wrapped in a string ... and the list can be empty.

A bit off topic: I'm not sure how you came to the conclusion that
superusers can't write into pg_catalog.  They can.  But I don't see
much point in being paranoid about whether the contents of pg_catalog
are trustworthy.  If an adversary has already obtained superuser
privileges, he needn't bother with anything as complicated as
trojan-horsing something you might call later.

            regards, tom lane



Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
tgl@sss.pgh.pa.us wrote:

bryn@yugabyte.com writes:

...I'd assumed that the arguments of "set search_path" had to be SQL names...

search_path's value is not a SQL name. It's a list of SQL names wrapped in a string ... and the list can be empty.

I was informed by this precedent:

truncate table u1.t1, t2;

It uses a comma-separated list of optionally qualified  SQL names.

And this:

«
CREATE SCHEMA schema_name ...
»

It requires a single unqualified SQL name.

And then this:

«
SET [ SESSION | LOCAL ] configuration_parameter { TO | = } { value | 'value' | DEFAULT }

value — New value of parameter. Values can be specified as... identifiers... or comma-separated lists of these, as appropriate for the particular parameter...
»

Notice that the token « value » in the syntax is not in quotes. I took it to mean (for "set search_path" a SQL name that you would type bare when it's simple and surrounded by double quotes when it's exotic—in line with the much broader general rule.

And I did ad hoc tests like these.

create schema "s1, s2";
create table "s1, s2".t(k int);
insert into "s1, s2".t(k) values(42);
set search_path = "s1, s2";
select k from t;

After all, "s1, s2" is a perfectly legal SQL name—even though folks usually have rules of practice to avoid exotic names like these.

I saw that the test behaves the same if I use this:

set search_path = 's1, s2';

I put that down to an unnecessary and confusing forgiveness that got grandfathered it.

So I'm very confused by your comment. What am I missing?.

A bit off topic: I'm not sure how you came to the conclusion that superusers can't write into pg_catalog.  They can.

With tests like these:

\c demo postgres
create table pg_catalog.t(n int);

It fails with this:

42501: permission denied to create "pg_catalog.t"

I did note this detail: "System catalog modifications are currently disallowed." Is there a configuration parameter that controls this?

I don't see much point in being paranoid... if an adversary has already obtained superuser privileges

Yes, that point is very well taken. But I like to know the limit's of what's technically possible.

Re: Seems to be impossible to set a NULL search_path

From
Adrian Klaver
Date:
On 7/5/22 15:08, Bryn Llewellyn wrote:

> I was informed by this precedent:
> 
> *truncate table u1.t1, t2;
> *
> It uses a comma-separated list of optionally qualified  SQL names.
> 
> And this:
> 
> «
> *CREATE SCHEMA schema_name ...
> *»

Those are creating objects. Set search_path is setting a configuration 
value.


> And then this:
> 
> «
> *SET [ SESSION | LOCAL ] configuration_parameter { TO | = } { value | 
> 'value' | DEFAULT }*
> 
> value — New value of parameter. Values can be specified as... 
> identifiers... or comma-separated lists of these, as appropriate for the 
> particular parameter...
> »
> 
> Notice that the token « value » in the syntax is not in quotes. I took 

Pretty sure it is:

{ TO | = } { value | 'value' | DEFAULT }





-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Seems to be impossible to set a NULL search_path

From
"David G. Johnston"
Date:
On Tue, Jul 5, 2022 at 12:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bryn Llewellyn <bryn@yugabyte.com> writes:
> Thanks, all, for your replies. I'd assumed that the arguments of "set search_path" had to be SQL names. so I tried "". But that caused an error. I didn't try the ordinary empty string because I'd assumed that, as an illegal SQL name, it would be illegal in "set search_path". Hmm...

search_path's value is not a SQL name.  It's a list of SQL names
wrapped in a string ... and the list can be empty.


This doesn't seem to be correct - wrapping them in single quotes in the SET command ends up behaving as if you wrapped them in double quotes anywhere else (and wrapping them individually in double quotes here works just fine too).

("testschema" exists, schemas "a" and "b" do not)
                     ^
postgres=# set search_path to 'a, b, testschema';
SET
postgres=# show search_path;
    search_path
--------------------
 "a, b, testschema"
(1 row)

postgres=# create table inab (id serial primary key);
ERROR:  no schema has been selected to create in
LINE 1: create table inab (id serial primary key);
                     ^
postgres=# set search_path to a, b, testschema;
SET
postgres=# show search_path;
   search_path
------------------
 a, b, testschema
(1 row)

postgres=# create table inab (id serial primary key);
CREATE TABLE
postgres=#

postgres=# set search_path to '';
SET
postgres=# show search_path;
 search_path
-------------
 ""
(1 row)

postgres=# set search_path to "";
ERROR:  zero-length delimited identifier at or near """"
LINE 1: set search_path to "";
                           ^
postgres=# create table testtable."" (id serial primary key);
ERROR:  zero-length delimited identifier at or near """"
LINE 1: create table testtable."" (id serial primary key);
                               ^
postgres=#

So you cannot actually write an identifier that is double quoted empty string but if you write  < SET search_path TO '' > that is basically what the system thinks you have done.

postgres=# set search_path to '';
SET
postgres=# create table "es" (id serial primary key);
ERROR:  no schema has been selected to create in
LINE 1: create table "es" (id serial primary key);
                     ^
postgres=#

Thus, it must be set using a list of identifiers, separated by commas, but single quotes can be used instead of double quotes if desired, and single quotes must be used to set it to a value where there are no additional schemas added to the mandatory (and thus implied) < pg_temp, pg_catalog > specification.

A bit off topic: I'm not sure how you came to the conclusion that
superusers can't write into pg_catalog.  They can.

True, but they need to jump through the hoop of modifying a setting before they are allowed to do so.

David J.

Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
david.g.johnston@gmail.com wrote:

tgl@sss.pgh.pa.us wrote:

search_path's value is not a SQL name.  It's a list of SQL names wrapped in a string ... and the list can be empty.

This doesn't seem to be correct - wrapping them in single quotes in the SET command ends up behaving as if you wrapped them in double quotes anywhere else (and wrapping them individually in double quotes here works just fine too).

And then...


Those are creating objects. Set search_path is setting a configuration value. Pretty sure it is:

{ TO | = } { value | 'value' | DEFAULT 

There's different use cases. For example:

set my_namspace.x = 'Dog house';
show my_namspace.x ;

I can't reconcile what you three (Tom, David, and Adrian) have said. I'm interested to hear how you interpret what I showed in this reply:


and in particular to this:

create schema "s1, s2";
create table "s1, s2".t(k int);
insert into "s1, s2".t(k) values(42);
set search_path = "s1, s2";
show search_path;
select k from t;

OR (with single quotes in "set search_path":

create schema "s1, s2";
create table "s1, s2".t(k int);
insert into "s1, s2".t(k) values(42);
set search_path = 's1, s2';
show search_path;
select k from t;

I get a resounding 42 in both cases. Now try this:

set search_path = no_such_schema, "No Such Schema";
show search_path;

All outcomes accord with the mental model that you tell me is wrong.

Re: Seems to be impossible to set a NULL search_path

From
"David G. Johnston"
Date:
On Tue, Jul 5, 2022 at 11:12 AM Bryn Llewellyn <bryn@yugabyte.com> wrote:
The section "Writing SECURITY DEFINER Functions Safely":

https://www.postgresql.org/docs/current/sql-createfunction.html#id-1.9.3.67.10.2

explains the risk brought if a bad actor creates an object that preemps what the developer intended by putting it in a schema that's ahead of the intended object in the search_path.

You can avoid this risk by always using fully qualified object names. It seems strange that the section doesn't mention this obvious approach. Is it vulnerable to subversion in a way that I haven't spotted?

No, it is probably more that by naming the schema explicitly the vulnerability regarding search_path is by definition a non-factor.  This is knowledge the reader was probably assumed to have already so did not need to be repeated here. Also, unless you are going to write: operator(pg_catalog.=) in your function the advice to always use schema qualifications is not going to be taken seriously.  You still have to protect the search_path from being unknown.
 

I suppose that there are use cases where the actual plan is to resolve to the first object that has the right name as the search_path is traversed. (But this seems not to be the common case.)

Due to function caching effects this rarely ends up working well anyway.
 
This is where setting the search_path as an attribute of a subprogram helps.

I wondered about a self-documenting belt-and-braces approach: use fully qualified object names in the subprograms source code and declare that I want no risk of mis-use of the search_path by setting it to null. But this seems not to be possible. Am I right?

Based upon the advice regarding the temporary schema the correct search_path to set isn't "empty" but "pg_catalog", "pg_temp".  While this does violate "DRY" principles it is the solution you are looking for.

I'm admittedly unsure why a non-security-definer doesn't care that the pg_temp schema comes before pg_catalog by default, probably this advice is good for any function.

The main problem is that by doing search_path manipulation on the function you prevent inlining it into the main query.  I would much appreciate attention being given to solving that limitation and making the advice "always set search_path to < pg_catalog, pg_temp > on your functions and procedures" be more viable in practice.

Attaching a set search_path clause to a view would be another nice feature.  Even if all it did was, at compile time, replace all operator invocations with their operator(schema.op) version without having to force the view author to do so manually.

Being able to write:  "SET search_path TO null;" instead of "SET search_path TO '';" doesn't make my list.  Now, "SET search_path TO DEFAULT" maybe would work - if it helped solve the inlining limitation.  Have it go along with updating postgresql.conf to default to 'pg_catalog, pg_temp'.

David J.

Re: Seems to be impossible to set a NULL search_path

From
Adrian Klaver
Date:
On 7/5/22 20:55, Bryn Llewellyn wrote:
> //
>> /david.g.johnston@gmail.com <mailto:david.g.johnston@gmail.com> wrote:
>> /
>>> tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us> wrote:
>>>
>>> search_path's value is not a SQL name.  It's a list of SQL 
>>> names wrapped in a string ... and the list can be empty.
>>
>> This doesn't seem to be correct - wrapping them in single quotes in 
>> the SET command ends up behaving as if you wrapped them in double 
>> quotes anywhere else (and wrapping them individually in double quotes 
>> here works just fine too).
> 
> And then...
> 
>> /adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> wrote:/
>>
>> Those are creating objects. Set search_path is setting a configuration 
>> value. Pretty sure it is:
>>
>> { TO | = } { value | 'value' | DEFAULT 
> 
> There's different use cases. For example:
> 
> *set my_namspace.x = 'Dog house';*
> *show my_namspace.x ;

Not sure what your point is?

> *
> I can't reconcile what you three (Tom, David, and Adrian) have said. I'm 
> interested to hear how you interpret what I showed in this reply:
> 
> https://www.postgresql.org/message-id/48E1391E-5A21-4736-B4B1-8B9468ECAFD4%40yugabyte.com 
> <https://www.postgresql.org/message-id/48E1391E-5A21-4736-B4B1-8B9468ECAFD4%40yugabyte.com>
> 
> and in particular to this:
> 
> *create schema "s1, s2";
> create table "s1, s2".t(k int);
> insert into "s1, s2".t(k) values(42);
> set search_path = "s1, s2";*
> *show **search_path;**
> select k from t;*
> 
> OR (with single quotes in "set search_path":
> 
> *create schema "s1, s2";
> create table "s1, s2".t(k int);
> insert into "s1, s2".t(k) values(42);
> set search_path = 's1, s2';
> **show **search_path;**
> **select k from t;
> *

 From here 
https://www.postgresql.org/docs/current/runtime-config-client.html:

search_path (string)
...
The value for search_path must be a comma-separated list of schema names
...

By quoting the above in either single or double quotes you change what 
looks like a list pf schemas into a single schema:

show search_path;
  search_path
-------------
  "s1, s2"

If you want that to be a list of schemas then:

set search_path = s1, s2;
SET

show search_path;
  search_path
-------------
  s1, s2


> I get a resounding 42 in both cases. Now try this:
> 
> *set search_path = no_such_schema, "No Such Schema";*
> *show **search_path;**

Which is same as:

set search_path = no_such_schema, 'No Such Schema';

show search_path;
            search_path
----------------------------------
  no_such_schema, "No Such Schema"

Since the list of names will end up being identifiers for schema any 
name with spaces needs to be quoted. Otherwise:

set search_path = no_such_schema, No Such Schema;
ERROR:  syntax error at or near "Such"
LINE 1: set search_path = no_such_schema, No Such Schema;


> *
> 
> All outcomes accord with the mental model that you tell me is wrong.
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
adrian.klaver@aklaver.com wrote:

Not sure what your point is?

Try these two:

set timezone = 'America/New_York';
show timezone;

set timezone = "America/New_York";
show timezone;

Neither causes an error. The "show", in each case, prints the bare value with no quotes. It never struck me try try double quotes around the timezone argument. I'm shocked that they are silently accepted here and seem to have the same effect (in this syntax setting) as single quotes. (I realize that quoting is required.) This comes as shock. It seems to be a "solution" to problem that I don't have—and it brings confusion. Try this:

set search_path = '"x"'; -- Hard to read. It's double-quoted x surrounded by single quotes.
show search_path;

This is the result:

 search_path 
-------------
 """x"""

That's a run of three double quotes each side of x. (For sport, try a single-quoted x surrounded by double-quotes.) I have no idea what this means—or why it's allowed. But the fact that the quoting has different effects in my different examples led me to say what I did.

When I read "string" in doc, then I understand that a manifest string constant (like the timezone setting that I used) must be single-quoted in SQL statements and the like. Not double-quoted.

All this seems to be 100% distinct from the discussion of exotic SQL names that have to be double-quoted in SQL syntax where a simple name doesn't need this (be can be double quoted if you want).

These two produces different outcomes:

select 'cat';
select "cat";

I won't risk explaining why I'm, not at all surprised.

Enough! I'll assume that we're talking past each other. But the cost/benefit of trying to work out why comes out wrong—so I'll stop. Thanks for trying to help me.

Re: Seems to be impossible to set a NULL search_path

From
Christophe Pettus
Date:

> On Jul 6, 2022, at 09:48, Bryn Llewellyn <bryn@yugabyte.com> wrote:
> Neither causes an error. The "show", in each case, prints the bare value with no quotes. It never struck me try try
doublequotes around the timezone argument. I'm shocked that they are silently accepted here and seem to have the same
effect(in this syntax setting) as single quotes. 

It's really a lot easier than it is being made out to be.

GUCs accept a string as a value, which might then be converted to other values based on the semantics of the GUC.
PostgreSQLgenerously accepts three different ways of delimiting this string: 

1. Single quotes (as in SQL).
2. Double quotes (which in SQL, yes, are used to delimit names).
3. No delimiter at all if there are no embedded characters that would terminate the value prematurely.

That's pretty much it.  That's the rule.  In the case of search_path, the argument is "a list of SQL names, including
possiblyone or none".  Now, one could perhaps argue that PostgreSQL is being overly-accepting by allowing #2, since it
doescreate odd situations like you describe with search_path, quotes, and so forth (there might be others, but
search_pathis the only one that jumps to mind).  That being said, it does, it has for decades, and there's really no
compellingreason to change it.  Of all the things that might be described as "shocking" about PostgreSQL, "GUCs allow
double-quotesto delimit values for convenience and historic reasons" is not quite the one I would pick. 


Re: Seems to be impossible to set a NULL search_path

From
"David G. Johnston"
Date:
On Wed, Jul 6, 2022 at 9:49 AM Bryn Llewellyn <bryn@yugabyte.com> wrote:
adrian.klaver@aklaver.com wrote:

Not sure what your point is?

Try these two:

set timezone = 'America/New_York';
show timezone;

set timezone = "America/New_York";
show timezone;

Neither causes an error. The "show", in each case, prints the bare value with no quotes. It never struck me try try double quotes around the timezone argument. I'm shocked that they are silently accepted here and seem to have the same effect (in this syntax setting) as single quotes. (I realize that quoting is required.) This comes as shock. It seems to be a "solution" to problem that I don't have—and it brings confusion. Try this:

set search_path = '"x"'; -- Hard to read. It's double-quoted x surrounded by single quotes.
show search_path;

This is the result:

 search_path 
-------------
 """x"""

That's a run of three double quotes each side of x. (For sport, try a single-quoted x surrounded by double-quotes.) I have no idea what this means—or why it's allowed. But the fact that the quoting has different effects in my different examples led me to say what I did.

It is the documented way to represent a double-quote in an identifier.  Just like '''' (4 single quotes in a row) is a literal that contains just a single quote. 

"Quoted identifiers can contain any character, except the character with code zero. (To include a double quote, write two double quotes.)"

At the level of discussion you want to have when you encounter unfamiliar syntax please read the syntax chapter for the related concept (expression identifiers).

 

When I read "string" in doc, then I understand that a manifest string constant (like the timezone setting that I used) must be single-quoted in SQL statements and the like. Not double-quoted.

In an expression a string contained in single quotes is a literal, a string contained in double quotes is considered a name/identifier.

SET guc = value;

value is not defined to be some expression.  It is defined to be its own unique thing.

I will agree that you seem to have uncovered at least an inconsistency or a lack of documentation.  I'm still unsure exactly what needs to be done here, or if I too am missing something.  I think part of the answer is that you can put the value of an identifier anywhere you are expecting a plain literal.  But you cannot put a plain literal in places that are expecting identifiers.  SET is looking for literal values, which it stores, then interprets as identifiers during use.  A query doesn't "store" things for later use so it looks for and requires actual identifier syntax (no single quotes) - and more generally uses expressions which likewise care about the difference.

David J.

Re: Seems to be impossible to set a NULL search_path

From
"David G. Johnston"
Date:
On Wed, Jul 6, 2022 at 10:03 AM Christophe Pettus <xof@thebuild.com> wrote:


> On Jul 6, 2022, at 09:48, Bryn Llewellyn <bryn@yugabyte.com> wrote:
> Neither causes an error. The "show", in each case, prints the bare value with no quotes. It never struck me try try double quotes around the timezone argument. I'm shocked that they are silently accepted here and seem to have the same effect (in this syntax setting) as single quotes.

It's really a lot easier than it is being made out to be.

GUCs accept a string as a value, which might then be converted to other values based on the semantics of the GUC.  PostgreSQL generously accepts three different ways of delimiting this string:

1. Single quotes (as in SQL).
2. Double quotes (which in SQL, yes, are used to delimit names).
3. No delimiter at all if there are no embedded characters that would terminate the value prematurely.


4. Dollar quoting

postgres=# set search_path to $$testschema$$;
SET
postgres=# show search_path;
 search_path
-------------
 testschema
(1 row)

Thus a more accurate syntax specification would be (at least for search_path):

SET [ SESSION | LOCAL ] configuration_parameter { TO | = } value [, ...]

Where value is one of:
 { identifier | string_constant | DEFAULT }

And those first two are the defined terms in SQL Syntax.

That covers all four options split among the two defined types.

Now, maybe that doesn't work for other GUCs and so this level of imprecision is acceptable...I'm ok with coming away from this conversation with the status quo but I'm not going to try and claim there isn't anything confusing here.

David J.

Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
david.g.johnston@gmail.com wrote:

unless you are going to write: operator(pg_catalog.=) in your function the advice to always use schema qualifications is not going to be taken seriously... the correct search_path to set isn't "empty" but "pg_catalog", "pg_temp".  While this does violate "DRY" principles it is the solution you are looking for.
 
The advice always to use fully qualified names is orthogonal to the advice to set an explicit search_path in a subprogram's definition. Am I right that your point is about what search path to set and has nothing to do with the possible use of qualified names?

I do see that I can achieve my goal by setting the search_path to pg_catalog, pg_temp in my functions rather than to empty. So I can adopt that practice. But it's at odds with an example in this section:

Writing SECURITY DEFINER Functions Safely
https://www.postgresql.org/docs/current/sql-createfunction.html#id-1.9.3.67.10.2

The example sets a search path the does *not* mention pg_catalog.

I hadn't come across this locution:

operator(pg_catalog.=)

But I found an account here:

4.2.5. Operator Invocations
https://www.postgresql.org/docs/current/sql-expressions.html#SQL-EXPRESSIONS-OPERATOR-CALLS

To be sure, I tried this:

select 17 operator(pg_catalog.=) 17;

It ran without error and produced TRUE.

I believe that you're saying that because (as I presume) the "normal" definition of the = operator is done by code in the pg_catalog schema, pg_catalog must be on the path for ordinary equality tests to work. (I also found random folks saying this on StackExchange.) So I tried this:

create function f()
  returns boolean
  set search_path = ''
  security definer
  language plpgsql
as $body$
begin
  return (2 + 5) = 7;
end;
$body$;
select f()::text;

It succeeded and said true. I don't know how to explain this outcome. For good measure, I tried this too:

create procedure p()
  set search_path = ''
  security definer
  language plpgsql
as $body$
begin
  create temporary table x(k int);
end;
$body$;
call p();

It succeeded. And the \d metacommand showed me that I now have a table pg_temp_3.x. Using a different database, it ends up in "pg_temp_1. What's going on here? Is "pg_temp" a kind of generic nickname for ANY "pg_temp_N"?

I hate to realize that I'm failing to understand a fundamental principle.

It rather looks like the name-res to pg_catalog and (some) pg_temp is hard-coded and doesn't rely on the reigning search_path. Or, to put it another way, these two schemas are inevitably at the end of the search_path no matter what you set explicitly, and never mind that "show search_path" doesn't show them unless you also put them on the path (again) explicitly.

I can't make sense of this wording from "Writing SECURITY DEFINER Functions Safely":

« A secure arrangement can be obtained by forcing the temporary schema to be searched last. To do this, write pg_temp as the last entry in search_path. »

If I do this:

set search_path = 'pg_catalog, pg_temp';
show search_path;

Then I see what I set—in that order. But if I set the search_path to empty (and don't see pg_catalog or pg_temp with "show") PG behaves as if they're still there. Not only as my f() and p() above show. But even, say, "select count(*) from pg_class". Moreover, this is allowed too:

set search_path = 'pg_temp, pg_catalog, pg_temp';
show search_path;

Now I see exactly what I set. It seems strange that this is allowed. How does the implementation handle this when a to-be-resolved name exists nowhere? Does it just crank on, repeatedly searching where it already failed, right up to the bitter end?

Here's another test whose outcome surprises me and seems to be at odds with what you're saying and what the "Writing SECURITY DEFINER Functions Safely" section says:

select count(*) from pg_class; -- 399
create temporary table pg_class(k int);
select count(*) from pg_class; -- 0
set search_path = 'pg_catalog, pg_temp';
select count(*) from pg_class; -- STILL 0

Why does the final "select" show that the temp table's name has still captured the one in pg_catalog even though it's ahead in the path.

Might I trouble you to explain more carefully—or to give me a doc reference that will allow me to see why the equality function in my example f() is found when the path for the function is empty? And why my example procedure p() manages to create a temporary table while pg_temp is not on the search_path?

The main problem is that by doing search_path manipulation on the function you prevent inlining it into the main query.

The "Writing SECURITY DEFINER Functions Safely" section explicitly recommends that a subprogram includes a "set search_path" specification. But, as I read it, you're saying that this advice is wrong (at least when a function will be invoked in more than a bare "select" because it prevents inlining.

How should I resolve these two conflicting pieces of advice?

Re: Seems to be impossible to set a NULL search_path

From
"David G. Johnston"
Date:
On Wed, Jul 6, 2022 at 11:50 AM Bryn Llewellyn <bryn@yugabyte.com> wrote:

It succeeded. And the \d metacommand showed me that I now have a table pg_temp_3.x. Using a different database, it ends up in "pg_temp_1. What's going on here? Is "pg_temp" a kind of generic nickname for ANY "pg_temp_N"?

I hate to realize that I'm failing to understand a fundamental principle.

It rather looks like the name-res to pg_catalog and (some) pg_temp is hard-coded and doesn't rely on the reigning search_path. Or, to put it another way, these two schemas are inevitably at the end of the search_path no matter what you set explicitly, and never mind that "show search_path" doesn't show them unless you also put them on the path (again) explicitly.

You either didn't read or failed or retain knowledge of the words in the documentation that are the canonical reference for search_path and explain exactly this.  I suggest you (re-)read them.


And elsewhere I'm sure it is written that since temporary objects are session-local it was decided that a useful implementation detail for that would be for each session to have its own temporary schema, hence the appended integer to distinguish them (referencing pg_temp works, the system resolves the session specific schema name for you).
 

I can't make sense of this wording from "Writing SECURITY DEFINER Functions Safely":

« A secure arrangement can be obtained by forcing the temporary schema to be searched last. To do this, write pg_temp as the last entry in search_path. »

If I do this:

set search_path = 'pg_catalog, pg_temp';
show search_path;

Then I see what I set—in that order. But if I set the search_path to empty (and don't see pg_catalog or pg_temp with "show") PG behaves as if they're still there.

Those same docs also explain why search_path shows what it does (it is the literally saved value) and to find out how the system actually resolved it at runtime to come up with a final search_path you need to use a different thing (via a function).

 
Not only as my f() and p() above show. But even, say, "select count(*) from pg_class". Moreover, this is allowed too:

set search_path = 'pg_temp, pg_catalog, pg_temp';
show search_path;

Now I see exactly what I set.

If you meant that to be a literal thing you've only identified one very oddly named schema...otherwise yes I get your point.

 
It seems strange that this is allowed. How does the implementation handle this when a to-be-resolved name exists nowhere? Does it just crank on, repeatedly searching where it already failed, right up to the bitter end?

Probably...or maybe it ignores the second reference and it is treated the same as 'pg_temp, pg_catalog'

Here's another test whose outcome surprises me and seems to be at odds with what you're saying and what the "Writing SECURITY DEFINER Functions Safely" section says:

select count(*) from pg_class; -- 399
create temporary table pg_class(k int);
select count(*) from pg_class; -- 0
set search_path = 'pg_catalog, pg_temp';
select count(*) from pg_class; -- STILL 0

Why does the final "select" show that the temp table's name has still captured the one in pg_catalog even though it's ahead in the path.

Remember that session scoped relation cache we went on about a little while back...I think that by creating the object you got a cache invalidation but simply changing the search_path does not cause a cache invalidation.


The "Writing SECURITY DEFINER Functions Safely" section explicitly recommends that a subprogram includes a "set search_path" specification. But, as I read it, you're saying that this advice is wrong (at least when a function will be invoked in more than a bare "select" because it prevents inlining.

How should I resolve these two conflicting pieces of advice?

There is no "conflict" - you basically get to choose safety or performance.  Though since performance isn't guaranteed nor always a need I would say choose safety unless you've confirmed that you need performance.

David J.

Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
david.g.johnston@gmail.com wrote:

At the level of discussion you want to have when you encounter unfamiliar syntax please read the syntax chapter for the related concept (expression identifiers).

https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
 
Thanks. That section makes a valiant attempt to distinguish between "identifier" (as a token in a SQL statement or, say PL/pgSQL source code) and "name" as what the identifier denotes. But (I think) it slips up here:

«
A convention often used is to write key words in upper case and names in lower case, e.g.:

UPDATE my_table SET a = 5;
»

It should be « to write key words in upper case and unquoted identifiers in lower case », yes?

About my

 search_path 
-------------
 """x"""

yes, I was confused—by, for example, this precedent.

create table "He said ""dog"", I think."(k int);
\d

I see this:

                List of relations
 Schema |          Name           | Type  | Owner 
--------+-------------------------+-------+-------
 u1     | He said "dog", I think. | table | u1

So here the convention is to list the name of the table and not the identifier that denoted the intended name at creation time.

The situation is analogous with setting the search_path. I want to talk about schemas. Schemas have names. So in SQL syntax, I must denote these names by writing identifiers. It could have been decided that the proper way to display a search_path is by listing the schema names (just as \d does for tables). But it was decided, instead, to denote the path by the list of identifiers that denote the schema names. This doesn't present a huge usability challenge. But it is, nevertheless, a rule that you have to learn (which I had) and then remember (which I didn't).





Re: Seems to be impossible to set a NULL search_path

From
"David G. Johnston"
Date:
On Wed, Jul 6, 2022 at 1:13 PM Bryn Llewellyn <bryn@yugabyte.com> wrote:
david.g.johnston@gmail.com wrote:

At the level of discussion you want to have when you encounter unfamiliar syntax please read the syntax chapter for the related concept (expression identifiers).

https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
 
Thanks. That section makes a valiant attempt to distinguish between "identifier" (as a token in a SQL statement or, say PL/pgSQL source code) and "name" as what the identifier denotes. But (I think) it slips up here:

«
A convention often used is to write key words in upper case and names in lower case, e.g.:

UPDATE my_table SET a = 5;
»

It should be « to write key words in upper case and unquoted identifiers in lower case », yes?


I would say: ... , and identifiers without quotes and in lower case
 

The situation is analogous with setting the search_path. I want to talk about schemas. Schemas have names. So in SQL syntax, I must denote these names by writing identifiers.

I think this very example shows why holding that position as an absolute is problematic.
 
It could have been decided that the proper way to display a search_path is by listing the schema names (just as \d does for tables). But it was decided, instead, to denote the path by the list of identifiers that denote the schema names. This doesn't present a huge usability challenge. But it is, nevertheless, a rule that you have to learn (which I had) and then remember (which I didn't).


An "identifier" is supposed to exist in the system and when you use it that existence leads you to the object that is named.  search_path accepts labels that aren't true identifiers because they don't have to exist in the system.

I'm not saying you are totally wrong and that a different choice somewhere along in this could have been better, but there is a unique aspect about search_path and it has manifested in its own unique behavior.  Pointing out it is confusing is fine, but as you haven't actually suggested a reasoned alternative nor, when you started this conversation, seemed like you had taken the time to resolve your initial confusions via the documentation, it is hard to want to spend much time here trying to improve things.

I've enjoyed this conversation precisely because it forces me to dig deeper, think more critically, and understand the reasons behind the system's design better.  Yet in terms of being able to properly use search_path to achieve a goal the couple of paragraphs in the documentation are sufficient for all practical purposes I can see.  And do not, with any frequency, seem to generate questions from our users.

David J.

Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
david.g.johnston@gmail.com wrote:

bryn@yugabyte.com wrote
...

You either didn't read or failed or retain knowledge of the words in the documentation that are the canonical reference for search_path and explain exactly this. I suggest you (re-)read them.

https://www.postgresql.org/docs/current/runtime-config-client.html#RUNTIME-CONFIG-CLIENT-STATEMENT  (under search_path)

And elsewhere I'm sure it is written that since temporary objects are session-local it was decided that a useful implementation detail for that would be for each session to have its own temporary schema, hence the appended integer to distinguish them (referencing pg_temp works, the system resolves the session specific schema name for you).
 
Right. Mea maxima culpa. « the current session's temporary-table schema... can be explicitly listed in the path by using the alias pg_temp. »

It seems that the wording is wrong here:

« The value for search_path must be a comma-separated list of schema names. »

It's schema identifiers—and not schema names. Yes?

Here's another test whose outcome surprises me...

Remember that session scoped relation cache we went on about a little while back...I think that by creating the object you got a cache invalidation but simply changing the search_path does not cause a cache invalidation.

The problem was my stupid typo: writing « set search_path = 'pg_catalog, pg_temp'; » when I should *not* have typed those single quotes. Now the demo that I'd intended gets the outcome that I'd expected:

select count(*) from pg_class; --------------<< 399
create temporary table pg_class(k int);
select count(*) from pg_class; --------------<<   0
set search_path = pg_catalog, pg_temp;
select count(*) from pg_class; --------------<< 400

The "Writing SECURITY DEFINER Functions Safely" section explicitly recommends that a subprogram includes a "set search_path" specification. But, as I read it, you're saying that this advice is wrong (at least when a function will be invoked in more than a bare "select" because it prevents inlining.

How should I resolve these two conflicting pieces of advice?

There is no "conflict" - you basically get to choose safety or performance. Though since performance isn't guaranteed nor always a need I would say choose safety unless you've confirmed that you need performance.

Thanks, that's crystal clear now.

Re: Seems to be impossible to set a NULL search_path

From
Rob Sargent
Date:
On 7/6/22 14:47, Bryn Llewellyn wrote:
The problem was my stupid typo: writing « set search_path = 'pg_catalog, pg_temp'; » when I should *not* have typed those single quotes. Now the demo that I'd intended gets the outcome that I'd expected:

select count(*) from pg_class; --------------<< 399
create temporary table pg_class(k int);
select count(*) from pg_class; --------------<<   0
set search_path = pg_catalog, pg_temp;
select count(*) from pg_class; --------------<< 400

There's no surprise here.  you read the system catalogue, made a table in whatever is the current schema which hid system's table, then hid your table by the new path.

Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
> david.g.johnston@gmail.com wrote:
>
>> bryn@yugabyte.com wrote:
>>
>> «
>> A convention often used is to write key words in upper case and names in lower case, e.g.:
>>
>> UPDATE my_table SET a = 5;
>> »
>>
>> It should be « to write key words in upper case and unquoted identifiers in lower case », yes?
>
> I would say: ... , and identifiers without quotes and in lower case

Yes, that works.

> An "identifier" is supposed to exist in the system and when you use it that existence leads you to the object that is
named. search_path accepts labels that aren't true identifiers because they don't have to exist in the system. 

I think that you and I differ on this point. As I see things. the "identifier" notion belongs entirely to the domain of
languages(like SQL and PL/pgSQL) and as such they don't "exist" anywhere except there (as notions) and then in specific
languageutterances (as instances of the notion). For example, just because « select * from "no such table" » causes the
42P01error (when the identifier denotes a name that meets the expectation of its plain English reading), this doesn't
meanthat "no such table", qua identifier, didn't exist. After all, it exists in the SQL text where you see it. But I'll
readilyagree that, in most contexts of documentation and the like, the intended meaning is clear and it can be too hard
(andeven put off readers) always to be a stickler for the precise and correct use of terms of art. 

> I've enjoyed this conversation precisely because it forces me to dig deeper, think more critically, and understand
thereasons behind the system's design better. Yet in terms of being able to properly use search_path to achieve a goal
thecouple of paragraphs in the documentation are sufficient for all practical purposes I can see. And do not, with any
frequency,seem to generate questions from our users. 

It's a huge relief to hear this, David. I was afraid that I might be annoying you. Yes, I'll agree with your "for all
practicalpurposes" stance—but with a caveat. There's a famous SQL injection example, beloved of bloggers, that rests on
thefact that a developer didn't handle the possibility that, say, a table might have an exotic name. Then, maybe, they
getthe name from some metadata and build the text of a SQL statement that uses it. Because they don't know the risks,
anddon't have the concepts and the associated well-defined terms of art to discuss what they need to consider with
themselves,they neglect to use available techniques to avoid risks. 

I've all too often managed to survive with a partial understanding (in any corner, using any software setting) and then
managedto slip up when I stray from the typical scenarios. Just like the apocryphal developer who doesn't double-quote
supposedidentifies that come from some external source. That's why I strive to get the deepest possible understanding
ofthings (and this includes "why was it done this way?"). So thank you vey much for helping me with the present
particularcorner of postgreSQL that's been my latest obsession. I'm ready to lay it aside now, and move on... 


Re: Seems to be impossible to set a NULL search_path

From
"David G. Johnston"
Date:
On Wed, Jul 6, 2022 at 1:47 PM Bryn Llewellyn <bryn@yugabyte.com> wrote:

It seems that the wording is wrong here:

« The value for search_path must be a comma-separated list of schema names. »

It's schema identifiers—and not schema names. Yes?


To add further clarity (or confusion) there is also set_config(...)

postgres=# select set_config('search_path','a,"b",testSchema',false);
    set_config
------------------
 a,"b",testSchema
(1 row)

postgres=# select current_schemas(true);
     current_schemas
-------------------------
 {pg_catalog,testschema}
(1 row)

As for "schema identifiers" vs. "schema names" - they both seem equally wrong.  The list can very much contain sequences of characters that when interpreted as an identifier and looked for in the pg_namespace catalog do not find a matching entry and are therefore by definition not the name of any known schema in that database.

Besides, I hazard to guess how many times we write "table name" and "column name" in the documentation when your argument is that "table identifier" and "column identifier" is the correct choice.  No, rather "name" and "identifier" in the context of database objects are known to mean the same thing - the alphabetic name of the object.

David J.

Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
david.g.johnston@gmail.com wrote:

As for "schema identifiers" vs. "schema names" - they both seem equally wrong. The list can very much contain sequences of characters that when interpreted as an identifier and looked for in the pg_namespace catalog do not find a matching entry and are therefore by definition not the name of any known schema in that database.

Besides, I hazard to guess how many times we write "table name" and "column name" in the documentation when your argument is that "table identifier" and "column identifier" is the correct choice.  No, rather "name" and "identifier" in the context of database objects are known to mean the same thing - the alphabetic name of the object.

Well, "putative" or "candidate" can be used to resolve your existence criterion. But why bother? In my book, Bertie Wooster (or Bertram Wilberforce Wooster if you prefer) is a perfectly fine candidate name in the general English speaking culture. It's neither here nor there if there happens to be any living person who has the name...

But never mind. If you'd like a diverting read on this topic, go here:

https://blogs.oracle.com/sql/post/a-collection-of-plsql-essays

look for this, and download the PDF:

«
Names vs identifiers

Databases are full of things: tables, sequences, columns, views, PL/SQL units, what have you. Things have names and are manipulated by mentioning the names. The programming languages SQL and PL/SQL use identifiers, not names. Questions show many programmers are confused about the difference. This note describes the relationships between things, names, and identifiers. Once the programming rules are absorbed, developers can write code faster and with less heartburn.
»

It's written by a former colleague with whom I spent many happy hours discussing the topic.

Over and out?

Re: Seems to be impossible to set a NULL search_path

From
"Peter J. Holzer"
Date:
On 2022-07-06 20:47:19 -0700, Bryn Llewellyn wrote:
> But never mind. If you'd like a diverting read on this topic, go here:
>
> https://blogs.oracle.com/sql/post/a-collection-of-plsql-essays
>
> look for this, and download the PDF:
>
> «
> Names vs identifiers
>
> Databases are full of things: tables, sequences, columns, views, PL/SQL units,
> what have you. Things have names and are manipulated by mentioning the names.
> The programming languages SQL and PL/SQL use identifiers, not names.
[...]

If there's one thing I've learned in 39 years of programming it's that
technical terms rarely have a fixed, universally accepted meaning.
There's a reason standards include a "terms and definition" section
which often define seemingly trivial terms like "byte" and why project
management coaches often stress the importance of a glossary in the
documentation.

So I think how a single Oracle employee uses those terms has little
bearing on the PostgreSQL documentation (I haven't read it yet - maybe I
agree with him, but I'm also only a single PostgreSQL user). If you want
to argue that the usage is wrong you would have to point at something
within the PostgreSQL documentation (ideally an entry in the glossary)
or some really wide-spread convention and the absence of a local
definition.

> Questions show many programmers are confused about the difference.

Which might be a hint that no wide-spread convention on how to use those
terms exists.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment

Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
bryn@yugabyte.com wrote:

david.g.johnston@gmail.com wrote:

As for "schema identifiers" vs. "schema names" - they both seem equally wrong. The list can very much contain sequences of characters that when interpreted as an identifier and looked for in the pg_namespace catalog do not find a matching entry and are therefore by definition not the name of any known schema in that database.

Besides, I hazard to guess how many times we write "table name" and "column name" in the documentation when your argument is that "table identifier" and "column identifier" is the correct choice.  No, rather "name" and "identifier" in the context of database objects are known to mean the same thing - the alphabetic name of the object.

Well, "putative" or "candidate" can be used to resolve your existence criterion. But why bother? In my book, Bertie Wooster (or Bertram Wilberforce Wooster if you prefer) is a perfectly fine candidate name in the general English speaking culture. It's neither here nor there if there happens to be any living person who has the name...

But never mind. If you'd like a diverting read on this topic, go here:

https://blogs.oracle.com/sql/post/a-collection-of-plsql-essays

look for this, and download the PDF:

«
Names vs identifiers

Databases are full of things: tables, sequences, columns, views, PL/SQL units, what have you. Things have names and are manipulated by mentioning the names. The programming languages SQL and PL/SQL use identifiers, not names. Questions show many programmers are confused about the difference. This note describes the relationships between things, names, and identifiers. Once the programming rules are absorbed, developers can write code faster and with less heartburn.
»

It's written by a former colleague with whom I spent many happy hours discussing the topic.

Over and out?

Something has been nagging me since I sent this. I was sure that I’d seen this:

"Bllewell"

in some catalog view where I expected the name of a role. I just found it in pg_proc. The spelling Bllewell with no quotes is the name of an o/s user on my MacBook. I don't know why it ends up in my MacBook's PG installation—but that's a story for a different day. My PG cluster has a database called demo. I just did this test:

\c demo postgres

-- What a pain to have to do this to get silent "if exists" behavior.
do $body$
begin
  begin
    drop owned by "Exotic Me" cascade;
  exception when undefined_object then
    null;
  end;
  begin
    drop owned by """Exotic Me""" cascade;
  exception when undefined_object then
    null;
  end;
end;
$body$;

create role "Exotic Me" login password 'p';
grant connect on database demo to "Exotic Me";

create role """Exotic Me""" login password 'p';

\c demo "Exotic Me"
create schema s;

create procedure s.p()
  language plpgsql
as $body$
begin
  null;
end;
$body$;

with c as (
  select
    rolname                           as role_name
  from pg_roles
  union all
  select
    distinct proowner::regrole::text  as role_name
  from pg_proc)
select role_name
from c
where
  lower(role_name) like '%bllewell%' or
  lower(role_name) like '%exotic%';

This is the result:

  role_name  
-------------
 Bllewell
 Exotic Me
 "Exotic Me"
 "Bllewell"

Of course I understand why I see both Exotic Me with no quotes and "Exotic Me" with double quotes: I asked for it. But why do I see both Bllewell with no quotes and "Bllewell" with double quotes. Are there really two distinct roles with those two names? Or did pg_roles end up with the identifier for the exotic name Bllewell rather than the bare name itself?

And on the disputed notion that the identifier for a name is a distinct phenomenon from the name itself, I noted this here:

«
The input functions for these types allow whitespace between tokens, and will fold upper-case letters to lower case, except within double quotes; this is done to make the syntax rules similar to the way object names are written in SQL. Conversely, the output functions will use double quotes if needed to make the output be a valid SQL identifier.
»

It comes close (but no cigar) to making the distinction. It uses both "object name" and "SQL identifier" and says (more or less) that a "SQL identifier" is the way an object name is written in SQL by double-quoting it. I'm using "exotic" as a tautological shorthand for what you *must* surround with double-quotes in SQL and PL/pgSQL to express what you want.

Re: Seems to be impossible to set a NULL search_path

From
"David G. Johnston"
Date:
On Wed, Jul 13, 2022 at 4:33 PM Bryn Llewellyn <bryn@yugabyte.com> wrote:
  role_name  
-------------
 Bllewell
 Exotic Me
 "Exotic Me"
 "Bllewell"

Of course I understand why I see both Exotic Me with no quotes and "Exotic Me" with double quotes: I asked for it. But why do I see both Bllewell with no quotes and "Bllewell" with double quotes. Are there really two distinct roles with those two names? Or did pg_roles end up with the identifier for the exotic name Bllewell rather than the bare name itself?


You seem perfectly capable of answering those last two questions yourself, better than we who do not have the access to your database; access needed to query pg_roles.

Given that you used UNION ALL I'm not clear why Exotic Me isn't duplicated in the output you profess comes from the test case you wrote.  Is this another one of your mistakes in presenting a self-contained test case?

David J.

Re: Seems to be impossible to set a NULL search_path

From
Bryn Llewellyn
Date:
david.g.johnston@gmail.com wrote:

bryn@yugabyte.com wrote:

  role_name  
-------------
 Bllewell
 ...
 "Bllewell"

...Are there really two distinct roles with those two names?...

Is this another one of your mistakes in presenting a self-contained test case?

I was far, far too elliptical in what I wrote. Sorry, David. Yes, that was a mistake. These are the two fundamental observations that got me started on this:

Observation #1 (the native content of "pg_roles"):

select rolname as role_name from pg_roles order by 1;

         role_name         
---------------------------
 Bllewell
 pg_database_owner
 ...
 postgres

Observation #2 (the native content of "pg_proc"):

select
  proowner::regrole as role_name,
  proname
from pg_proc
order by 1, 2
limit 3;

 role_name  |        proname         
------------+------------------------
 "Bllewell" | RI_FKey_cascade_del
 "Bllewell" | RI_FKey_cascade_upd
 "Bllewell" | RI_FKey_check_ins

The clue to the different spellings is the ::regrole typecast. I knew all along that I could get the name of the owner of a function from: 

pg_proc p inner join pg_roles r on p.proowner = r.oid

But I wondered if I could find an operator (using the word loosely) to save me some typing. Google took me to the "Postgres get function owner" post on Stack Exchange:
https://dba.stackexchange.com/questions/259001/postgres-get-function-owner

Laurenz Albe answered the question thus:

«
The complicated way is to join with pg_roles and get the user name from there. But since PostgreSQL hackers don't want to type more than necessary, they came up with something else... Internally, every object is... identified.. by its “object ID”. This is the number that is for example used in the proowner column of pg_proc... but the type output function, which is used for display of the type, renders it as the object's name. Put that together with PostgreSQL's type cast operator ::, and you can do... ::regrole.
»

This was exactly what I wanted! I took Lawrence's « ::regrole renders proowner as the [owner]'s name » at face value. But when the name happens to be exotic, as is Joe, you see that ::regrole in fact renders proowner as the *identifier* for the function's owner's name. The testcase at the end makes the point more explicitly. It does some set-up that lets me do this:

select
  r.rolname            as "The owner name",
  p.proowner::regrole  as "The *identifier* for the owner name"
from
  pg_roles r
  inner join
  pg_proc p
  on r.oid = p.proowner
where p.proname = 'something_strange';

 The owner name | The *identifier* for the owner name 
----------------+-------------------------------------
 Joe            | "Joe"

This is what I've been banging on about all the time. It seems that I'm the only person in the pgsql-general community who wants some proper terms of art (as I used them in my column aliases) that let me say simply and clearly, why pg_proc.proowner::regrole gives a differently spelled answer than does pg_roles.role_name.

B.t.w., I looked at "8.19. Object Identifier Types". It uses the term "identifier" in the more ordinary sense of "unique identifier" (as are the values in a table's PK column that might be numeric, text, uuid, or whatever. It mentions quote_ident(). I tried it thus:

select quote_ident('Joe') as joe, quote_ident('"Dog"') as dog;

  joe  |    dog    
-------+-----------
 "Joe" | """Dog"""

(so that's right). And it goes on to say "...names that require quoting". So the difference is in the air and is hinted at with "ident" and "name". But the wording doesn't tie things down.

I'll shut up on this now.

----------------------------------------------------------------------
-- Testcase setup

\c postgres postgres
create role "Joe" login password 'p';

create database tmp;
grant connect on database tmp to "Joe";
grant create on database tmp to "Joe";

\c tmp "Joe"
create schema s;
create procedure s.something_strange()
  language plpgsql
as $body$
begin
  null;
end;

Re: Seems to be impossible to set a NULL search_path

From
"David G. Johnston"
Date:
On Wed, Jul 13, 2022 at 10:02 PM Bryn Llewellyn <bryn@yugabyte.com> wrote:

 The owner name | The *identifier* for the owner name 
----------------+-------------------------------------
 Joe            | "Joe"

This is what I've been banging on about all the time. It seems that I'm the only person in the pgsql-general community who wants some proper terms of art

You might get a more trustworthy representation if this wasn't buried in a thread about NULL and search_path.

But even with that I'd probably say while there are people who might find the pureness of such a nomenclature to be appealing the project at this point has no need to seek such perfection in this area.
 
(as I used them in my column aliases) that let me say simply and clearly, why pg_proc.proowner::regrole gives a differently spelled answer than does pg_roles.role_name. 

So you want "identifier" to be used whenever the printed or written value allows the usage of structural double quotes and "name" when it doesn't?

IOW, you seem to have a long-winded way of saying, for the following documentation paragraph:

"The input functions for these types allow whitespace between tokens, and will fold upper-case letters to lower case, except within double quotes; this is done to make the syntax rules similar to the way object names are written in SQL. Conversely, the output functions will use double quotes if needed to make the output be a valid SQL identifier."

Change "object names" to "object identifiers".

Feel free to tug on that string if you'd like, I'm not convinced enough that it is even incorrect and suspect that style of writing quite prevalent in the docs.  You'd need a larger sample size to argue with and propose fixes for before a patch would likely be considered.

David J.