Thread: wrong search_path being used

wrong search_path being used

From
Rodrigo Rosenfeld Rosas
Date:
I didn't want to report the bug using the form before confirming it here.

Here is a gist of what I'm trying:

https://gist.github.com/49fcc8c4a5a810f66833#file-cleanup-sql

The relevant part being this:

old_path := pg_catalog.current_setting('search_path');

raise notice 'setting search_path from % to %', old_path, templ;

perform pg_catalog.set_config('search_path', templ, true);

     ...

select count(distinct transaction_id) from public.transaction_condition
into temp_count;

raise notice '% remaining transactions in public!', temp_count;

select count(distinct transaction_id) from transaction_condition into
temp_count;

raise notice '% remaining transactions', temp_count;



For which I get this result (NOTA is Portuguese word for NOTE or NOTICE,
not sure...):

NOTA:  setting search_path from "$user",public to public
NOTA:  1030 remaining transactions in public!
NOTA:  66 remaining transactions

Why do I get different results for both count() queries? The only
difference between them is that I made the schema explicit in the first
call but since current_path is set to "public" there shouldn't be any
difference, right?

By the way, 66 is the record count for stock.transaction_condition after
calling that function the first time for the stock schema.

I've tested the above using PG 9.2.2. Any ideas on what is happening?

Thanks in advance,
Rodrigo.

Re: wrong search_path being used

From
Tom Lane
Date:
Rodrigo Rosenfeld Rosas <rr.rosas@gmail.com> writes:
> perform pg_catalog.set_config('search_path', templ, true);

>      ...

> select count(distinct transaction_id) from public.transaction_condition
> into temp_count;

> raise notice '% remaining transactions in public!', temp_count;

> select count(distinct transaction_id) from transaction_condition into
> temp_count;

If this is inside a plpgsql function that's been executed more than
once, the SELECTs would have plans that were cached the first time
around, so that what would matter is the search_path that prevailed
during the first execution.  There have been discussions about changing
that but we wouldn't treat it as a back-patchable bug fix, because
it would almost certainly break things for somebody.

            regards, tom lane

Re: wrong search_path being used

From
Rodrigo Rosenfeld Rosas
Date:
Em 07-01-2013 17:17, Tom Lane escreveu:
> Rodrigo Rosenfeld Rosas<rr.rosas@gmail.com>  writes:
>> perform pg_catalog.set_config('search_path', templ, true);
>>       ...
>> select count(distinct transaction_id) from public.transaction_condition
>> into temp_count;
>> raise notice '% remaining transactions in public!', temp_count;
>> select count(distinct transaction_id) from transaction_condition into
>> temp_count;
> If this is inside a plpgsql function that's been executed more than
> once, the SELECTs would have plans that were cached the first time
> around, so that what would matter is the search_path that prevailed
> during the first execution.  There have been discussions about changing
> that but we wouldn't treat it as a back-patchable bug fix, because
> it would almost certainly break things for somebody.

So, Tom, if I understand it correctly, you do consider it a bug but you
don't want to backport a fix because it might break existent behavior in
some dbs, right?

But it is not clear to me if you're willing to fix it for 9.2.3 for
instance?

It is likely I'll be creating lots of functions to perform the same
operations in different schemas as my application is being designed to
handle multiple field templates using separate schemas and the
application will switch what templates are used based on current_path
but lots of database migrations (not sure if you know what I mean for
that, but I'm using Rails web framework terminology here) will perform
the same changes to the database for each of the supported
schemas/templates.

So, I'd really like this behavior to be fixed in future versions of PG
if possible.

What is your opinion on the subject?

Thanks for the prompt answer!

Cheers,
Rodrigo.

Re: wrong search_path being used

From
"Kevin Grittner"
Date:
Rodrigo Rosenfeld Rosas wrote:
> Tom Lane wrote:

>> There have been discussions about changing that

> if I understand it correctly, you do consider it a bug but you
> don't want to backport a fix because it might break existent
> behavior in some dbs, right?

No, there has been discussion about whether different behavior
would be better in future major releases, but no consensus has been
reached.

>> but we wouldn't treat it as a back-patchable bug fix, because
>> it would almost certainly break things for somebody.
>
> But it is not clear to me if you're willing to fix it for 9.2.3
> for instance?

Back-patching means changing things in a minor release, where
things only change after the second dot. We don't make changes in
user-visible behavior like this in minor releases; so no, we would
not make a change like this in 9.2.3 or any other 9.2 version.

-Kevin

Re: wrong search_path being used

From
Rodrigo Rosenfeld Rosas
Date:
Em 09-01-2013 20:09, Kevin Grittner escreveu:
> Rodrigo Rosenfeld Rosas wrote:
>> Tom Lane wrote:
>>> There have been discussions about changing that
>> if I understand it correctly, you do consider it a bug but you
>> don't want to backport a fix because it might break existent
>> behavior in some dbs, right?
> No, there has been discussion about whether different behavior
> would be better in future major releases, but no consensus has been
> reached.
>
>>> but we wouldn't treat it as a back-patchable bug fix, because
>>> it would almost certainly break things for somebody.
>> But it is not clear to me if you're willing to fix it for 9.2.3
>> for instance?
> Back-patching means changing things in a minor release, where
> things only change after the second dot. We don't make changes in
> user-visible behavior like this in minor releases; so no, we would
> not make a change like this in 9.2.3 or any other 9.2 version.
>

Ok, thanks for the explanation, Kevin.

I'm curious though. Why wouldn't this behavior be considered a bug? Is
there any link to previous discussions about this subject that I could read?

Re: wrong search_path being used

From
"Kevin Grittner"
Date:
Rodrigo Rosenfeld Rosas wrote:

> I'm curious though. Why wouldn't this behavior be considered a
> bug? Is there any link to previous discussions about this subject
> that I could read?

A plpgsql function generates a plan on initial execution which
chooses which particular tables are used. On subsequent executions
that saves the overhead of parsing out the statement and looking up
the table and its details, which can help performance quite a bit.
If you want to have source code evaluated each time the function is
called, you should use the EXECUTE statement within your function
body, which will force parsing and path resolution of the statement
for each execution. That should give you the behavior you want,
with some cost in performance.

To try to get your function code to work as you expect, the
language would essentially need to identify which statements could
be pre-planned, and which would needed to be treated as raw source
on each execution. That would be tricky to implement, and would
itself have some run-time cost. At this point we've put the burden
on the programmer to identify this at the time the code is written,
rather than adding run-time expense.

-Kevin

Re: wrong search_path being used

From
Tom Lane
Date:
"Kevin Grittner" <kgrittn@mail.com> writes:
> To try to get your function code to work as you expect, the
> language would essentially need to identify which statements could
> be pre-planned, and which would needed to be treated as raw source
> on each execution. That would be tricky to implement, and would
> itself have some run-time cost. At this point we've put the burden
> on the programmer to identify this at the time the code is written,
> rather than adding run-time expense.

I think that the alternative most likely to succeed is to consider any
change in the active value of search_path as forcing replanning of
cached plans.  This wouldn't be that hard to implement but there's
a definite risk of loss of performance due to unnecessary replanning
(since the path change might or might not affect the particular query).
It's also not unlikely that it could break applications that work today,
because they depend -- perhaps without being aware of it -- on the
existing first-path-wins behavior.

Having said that, it seems likely that more people would prefer that
behavior than the existing one.  But it hasn't been clear enough to
justify making such a subtly incompatible change.

            regards, tom lane

Re: wrong search_path being used

From
Andres Freund
Date:
On 2013-01-12 14:29:38 -0500, Tom Lane wrote:
> "Kevin Grittner" <kgrittn@mail.com> writes:
> > To try to get your function code to work as you expect, the
> > language would essentially need to identify which statements could
> > be pre-planned, and which would needed to be treated as raw source
> > on each execution. That would be tricky to implement, and would
> > itself have some run-time cost. At this point we've put the burden
> > on the programmer to identify this at the time the code is written,
> > rather than adding run-time expense.
>
> I think that the alternative most likely to succeed is to consider any
> change in the active value of search_path as forcing replanning of
> cached plans.  This wouldn't be that hard to implement but there's
> a definite risk of loss of performance due to unnecessary replanning
> (since the path change might or might not affect the particular query).
> It's also not unlikely that it could break applications that work today,
> because they depend -- perhaps without being aware of it -- on the
> existing first-path-wins behavior.
>
> Having said that, it seems likely that more people would prefer that
> behavior than the existing one.  But it hasn't been clear enough to
> justify making such a subtly incompatible change.

Its a somewhat common practise to use SET in functions or as a
configuration parameter to functions. I think at least the latter should
still work without forcing to replan any query. Given that we advice
setting the search path for SECURITY DEFINER...

I guess it wouldn't really be feasible to keep the search path used to
plan a query in its cached form and check that it fits the one currently
used on every use of the cached plan?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: wrong search_path being used

From
"Kevin Grittner"
Date:
Andres Freund wrote:
> On 2013-01-12 14:29:38 -0500, Tom Lane wrote:
>> "Kevin Grittner" <kgrittn@mail.com> writes:
>>> To try to get your function code to work as you expect, the
>>> language would essentially need to identify which statements could
>>> be pre-planned, and which would needed to be treated as raw source
>>> on each execution. That would be tricky to implement, and would
>>> itself have some run-time cost. At this point we've put the burden
>>> on the programmer to identify this at the time the code is written,
>>> rather than adding run-time expense.
>>
>> I think that the alternative most likely to succeed is to consider any
>> change in the active value of search_path as forcing replanning of
>> cached plans. This wouldn't be that hard to implement but there's
>> a definite risk of loss of performance due to unnecessary replanning
>> (since the path change might or might not affect the particular query).
>> It's also not unlikely that it could break applications that work today,
>> because they depend -- perhaps without being aware of it -- on the
>> existing first-path-wins behavior.
>>
>> Having said that, it seems likely that more people would prefer that
>> behavior than the existing one. But it hasn't been clear enough to
>> justify making such a subtly incompatible change.
>
> Its a somewhat common practise to use SET in functions or as a
> configuration parameter to functions. I think at least the latter should
> still work without forcing to replan any query. Given that we advice
> setting the search path for SECURITY DEFINER...
>
> I guess it wouldn't really be feasible to keep the search path used to
> plan a query in its cached form and check that it fits the one currently
> used on every use of the cached plan?

The particular complaint here is about:

  perform pg_catalog.set_config('search_path', text_parameter, true);
  select ... from unqualified_tablename ...;

The table used is based on the parameter passed on the first execution;
the local search_path set on subsequent executions is ignored.  Perhaps
we should just not save a plan for any function which sets a new
search_path while it is executing?

-Kevin

Re: wrong search_path being used

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-01-12 14:29:38 -0500, Tom Lane wrote:
>> I think that the alternative most likely to succeed is to consider any
>> change in the active value of search_path as forcing replanning of
>> cached plans.

> I guess it wouldn't really be feasible to keep the search path used to
> plan a query in its cached form and check that it fits the one currently
> used on every use of the cached plan?

Actually that's exactly what I meant: every time we arrive at a query
with a cached plan, check to see if the active search_path value is the
same as what it was when we made the cached plan, and replan if not.

There's already infrastructure to save the search_path value for a plan,
but what it's being used for right now is to restore the first-plan-time
value of the path when a replan is forced for some other reason.  It
wouldn't be that hard to change it around to use it this way instead.

            regards, tom lane

Re: wrong search_path being used

From
Rodrigo Rosenfeld Rosas
Date:
Em 12-01-2013 18:13, Tom Lane escreveu:
> Andres Freund<andres@2ndquadrant.com>  writes:
>> On 2013-01-12 14:29:38 -0500, Tom Lane wrote:
>>> I think that the alternative most likely to succeed is to consider any
>>> change in the active value of search_path as forcing replanning of
>>> cached plans.
>> I guess it wouldn't really be feasible to keep the search path used to
>> plan a query in its cached form and check that it fits the one currently
>> used on every use of the cached plan?
> Actually that's exactly what I meant: every time we arrive at a query
> with a cached plan, check to see if the active search_path value is the
> same as what it was when we made the cached plan, and replan if not.
>
> There's already infrastructure to save the search_path value for a plan,
> but what it's being used for right now is to restore the first-plan-time
> value of the path when a replan is forced for some other reason.  It
> wouldn't be that hard to change it around to use it this way instead.
>
>             regards, tom lane

Something that would be really handy for applications using schemas for
implementing multi-tenant support would be allowing usage of a function
param in the SET section of the function. Something like this:


CREATE FUNCTION some_function(p_schema, ...)
RETURNS VOID AS $$
BEGIN
    ...
END;
$$  LANGUAGE plpgsql
     SET search_path = p_schema, public;

Not sure what syntax to use since p_schema could be the name of some
existent schema but you got the idea.

This would avoid all the wrapper lines to save and restore the old
search_path (specially when there are earlier returns in the function body).

Re: wrong search_path being used

From
Andres Freund
Date:
On 2013-01-12 15:13:51 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2013-01-12 14:29:38 -0500, Tom Lane wrote:
> >> I think that the alternative most likely to succeed is to consider any
> >> change in the active value of search_path as forcing replanning of
> >> cached plans.
>
> > I guess it wouldn't really be feasible to keep the search path used to
> > plan a query in its cached form and check that it fits the one currently
> > used on every use of the cached plan?
>
> Actually that's exactly what I meant: every time we arrive at a query
> with a cached plan, check to see if the active search_path value is the
> same as what it was when we made the cached plan, and replan if not.

Okay. I was afraid it would add noticeable overhead to stuff like
plpgsql... Maybe would lower that by having a "search_path" generation
counter that gets increased everytime it gets changed but that seems a
bit too complicated.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: wrong search_path being used

From
Pavel Stehule
Date:
2013/1/13 Andres Freund <andres@2ndquadrant.com>:
> On 2013-01-12 15:13:51 -0500, Tom Lane wrote:
>> Andres Freund <andres@2ndquadrant.com> writes:
>> > On 2013-01-12 14:29:38 -0500, Tom Lane wrote:
>> >> I think that the alternative most likely to succeed is to consider any
>> >> change in the active value of search_path as forcing replanning of
>> >> cached plans.
>>
>> > I guess it wouldn't really be feasible to keep the search path used to
>> > plan a query in its cached form and check that it fits the one currently
>> > used on every use of the cached plan?
>>
>> Actually that's exactly what I meant: every time we arrive at a query
>> with a cached plan, check to see if the active search_path value is the
>> same as what it was when we made the cached plan, and replan if not.
>
> Okay. I was afraid it would add noticeable overhead to stuff like
> plpgsql... Maybe would lower that by having a "search_path" generation
> counter that gets increased everytime it gets changed but that seems a
> bit too complicated.
>

we can use same mechanism, that is used for plpgsql polymorphic
functions - different parameters => different instances of plpgsql
function.

we can store n instances per function - probably there can be memory
issues when too much different search paths will be used - it is
optimal probably for 10 - 100 different search paths

different solution - can we specify search_path early in connection
string? Then this information can be used by connection pooler - and
we can do replan when different search_path is identified.

Regards

Pavel


> Greetings,
>
> Andres Freund
>
> --
>  Andres Freund                     http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs

Re: wrong search_path being used

From
Casey Allen Shobe
Date:
Just adding in my two cents...

At my last company we had several hundred stored procedures which all set a
search path dynamically based on an input argument prior to running any
queries.  The current behavior is very alarming and challenging to
consistently workaround.

Would be very happy to see a future version without the search path
catching issues.

Cheers,
--
Casey Allen Shobe
casey@shobe.info

Re: wrong search_path being used

From
Erika31
Date:
Hi Tom,

Why not implement a session/database option to use (or not) the replanning
of such functions. This would allow existing PG clusters to behave as they
historically did, and would simultaneously allow users who want to have
benefit of dynamically changing "search_path" inside a function to obtain
the expected result.

IMO, the query plan cache is a huge mistake, if it couldn't be disabled (in
another way than using EXECUTE). Perhaps based on the volatility of the
function ?
I intensively use schemas in a "heavy" PG database (understand
business-oriented logic db) and I'm today blocked by not being able to
select schemas in functions called several times in a session, leading to
wrong results. If performance is a good thing, getting the expected result
is primordial.

Lastly, is it planned to resolve this bug quickly ?

Thanks for your attention - and your involvement in PG project ;-)




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/wrong-search-path-being-used-tp5739027p5752394.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.