Re: search_path for PL/pgSQL functions partially cached? - Mailing list pgsql-general

From Pavel Stehule
Subject Re: search_path for PL/pgSQL functions partially cached?
Date
Msg-id CAFj8pRCb1aRzB4MmPX-X5BNF6-JgKMfDeirgnYNL-_gbCOqp8w@mail.gmail.com
Whole thread Raw
In response to Re: search_path for PL/pgSQL functions partially cached?  (Jan Behrens <jbe-mlist@magnetkern.de>)
Responses Re: search_path for PL/pgSQL functions partially cached?
List pgsql-general
Hi

čt 2. 1. 2025 v 11:37 odesílatel Jan Behrens <jbe-mlist@magnetkern.de> napsal:
On Wed, 1 Jan 2025 11:19:32 -0700
"David G. Johnston" <david.g.johnston@gmail.com> wrote:

> On Wed, Jan 1, 2025 at 10:55 AM Jan Behrens <jbe-mlist@magnetkern.de> wrote:
>
> > On Sat, 28 Dec 2024 00:40:09 +0100
> > Jan Behrens <jbe-mlist@magnetkern.de> wrote:
> >
> > > On Fri, 27 Dec 2024 13:26:28 -0700
> > > "David G. Johnston" <david.g.johnston@gmail.com> wrote:
> > >
> > > > > Or is it documented somewhere?
> > > >
> > > >
> > https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING
> > >
> > > I can't find any notes regarding functions and schemas in that section.
> >
> >
> "Because PL/pgSQL saves prepared statements and sometimes execution plans
> in this way, SQL commands that appear directly in a PL/pgSQL function must
> refer to the same tables and columns on every execution; that is, you
> cannot use a parameter as the name of a table or column in an SQL command."
>
> Changing search_path is just one possible way to change out which object a
> name tries to refer to so it is not called out explicitly.

The first part of the cited sentence seems helpful ("you must always
refer to the same tables and columns on every execution"). I would thus
conclude that using a dynamic search_path when running functions or
procedures is *always* considered errorneous (even though not reported
by the database as an error), except when using EXECUTE.

I wonder if the database could/should generate an error (or at least a
warning?) when a function or procedure without a "SET search_path"
statement uses a non-qualified name? According to the documentation
using a dynamic search_path to refer to different entities in the
database is a case that "must" not happen.

But following through, this might lead to more warnings one might
expect, e.g. when using simple operators such as "=" or the "IN" or
"CASE expression WHEN" statements, as these rely on the search_path as
well. Should such code be considered non-idiomatic, dangerous, or even
errorneous if a "SET search_path" option is missing in the
function's/procedure's definition?

Maybe I'm overthinking this. But in practice, I've been running into
surprising issues whenever functions and schemas are involved, and I'm
not sure if every programmer will be aware of how important it is to
properly set a search_path in the function's defintion after reading
the documentation. (Besides, it's not always possible in procedures.)

How can you identify unwanted usage of non qualified identifiers from wanted usage of non qualified identifiers? It is a common pattern for sharding. Using not qualified identifiers of operators, functions is common when you are using orafce extensions, etc.

Using qualified identifiers everywhere strongly reduces readability. There are no aliases to the schema, so aliases cannot help.

you can identify the functions where search_path is not explicitly assigned

select oid::regprocedure 
  from pg_proc 
where pronamespace::regnamespace not in ('pg_catalog', 'information_schema') 
   and not exists(select 1 from unnest(proconfig) g(v) where  v ~ '^search_path');


Regards

Pavel


>
> > "SQL-language and PL-language functions provided by extensions are at
> > risk of search-path-based attacks when they are executed, since parsing
> > of these functions occurs at execution time not creation time."
>
> > Moreover, it isn't true for all
> > SQL-language functions, as can be demonstrated with the following code:
>
> Yeah, when we added a second method to write an SQL-language function, one
> that doesn't simply accept a string body, we didn't update that section to
> point out that is the string input variant of create function that is
> affected in this manner, the non-string (atomic) variant stores the result
> of parsing the inline code as opposed to storing the raw text.
>
> David J.

I missed that other part in the manual (which is in a totally different
section). Should I report the missing update in section 36.17.6.1. of
the documentation as a documentation issue, or is it not necessary?

Kind regards,
Jan Behrens


pgsql-general by date:

Previous
From: Jan Behrens
Date:
Subject: Re: search_path for PL/pgSQL functions partially cached?
Next
From: Jan Behrens
Date:
Subject: Re: search_path for PL/pgSQL functions partially cached?