RE: [bug?] Missed parallel safety checks, and wrong parallel safety - Mailing list pgsql-hackers

From tsunakawa.takay@fujitsu.com
Subject RE: [bug?] Missed parallel safety checks, and wrong parallel safety
Date
Msg-id TYAPR01MB2990D5B8D0689F1B7A7F387DFE589@TYAPR01MB2990.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [bug?] Missed parallel safety checks, and wrong parallel safety  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [bug?] Missed parallel safety checks, and wrong parallel safety  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
From: Robert Haas <robertmhaas@gmail.com>
> On Tue, May 4, 2021 at 11:47 PM Greg Nancarrow <gregn4422@gmail.com>
> wrote:
> > Problem is, for built-in functions, the changes are allowed, but for
> > some properties (like strict) the allowed changes don't actually take
> > effect (this is what Amit was referring to - so why allow those
> > changes?).
> > It's because some of the function properties are cached in
> > FmgrBuiltins[] (for a "fast-path" lookup for built-ins), according to
> > their state at build time (from pg_proc.dat), but ALTER FUNCTION is
> > just changing it in the system catalogs. Also, with sufficient
> > privileges, a built-in function can be redefined, yet the original
> > function (whose info is cached in FmgrBuiltins[]) is always invoked,
> > not the newly-defined version.
> 
> I agree. I think that's not ideal. I think we should consider putting
> some more restrictions on updating system catalog changes, and I also
> think that if we can get out of having strict need to be part of
> FmgrBuiltins[] that would be good. But what I don't agree with is the
> idea that since strict already has this problem, it's OK to do the
> same thing with parallel-safety. That seems to me to be making a bad
> situation worse, and I can't see what problem it actually solves.

Let me divide the points:


(1) Is it better to get hardcoded function properties out of fmgr_builtins[]?
It's little worth doing so or thinking about that.  It's no business for users to change system objects, in this case
systemfunctions.
 

Also, hardcoding is a worthwhile strategy for good performance or other inevitable reasons.  Postgres is using it as in
thesystem catalog relcache below.
 

[relcache.c]
/*
 *      hardcoded tuple descriptors, contents generated by genbki.pl
 */
static const FormData_pg_attribute Desc_pg_class[Natts_pg_class] = {Schema_pg_class};
static const FormData_pg_attribute Desc_pg_attribute[Natts_pg_attribute] = {Schema_pg_attribute};
...


(2) Should it be disallowed for users to change system function properties with ALTER FUNCTION?
Maybe yes, but it's not an important issue for achieving parallel INSERT SELECT at the moment.  So, I think this can be
discussedin an independent separate thread.
 

As a reminder, Postgres have safeguards against modifying system objects as follows.

test=# drop table^C
test=# drop function pg_wal_replay_pause();
ERROR:  cannot drop function pg_wal_replay_pause() because it is required by the database system
test=# drop table pg_largeobject;
ERROR:  permission denied: "pg_largeobject" is a system catalog

OTOH, Postgres doesn't disallow changing the system table column values directly, such as UPDATE pg_proc SET ....  But
it'swarned in the manual that such operations are dangerous.  So, we don't have to care about it.
 

Chapter 52. System Catalogs
https://www.postgresql.org/docs/devel/catalogs.html

"You can drop and recreate the tables, add columns, insert and update values, and severely mess up your system that
way.Normally, one should not change the system catalogs by hand, there are normally SQL commands to do that. (For
example,CREATE DATABASE inserts a row into the pg_database catalog — and actually creates the database on disk.) There
aresome exceptions for particularly esoteric operations, but many of those have been made available as SQL commands
overtime, and so the need for direct manipulation of the system catalogs is ever decreasing."
 


(3) Why do we want to have parallel-safety in fmgr_builtins[]?
As proposed in this thread and/or "Parallel INSERT SELECT take 2", we thought of detecting parallel unsafe function
executionduring SQL statement execution, instead of imposing much overhead to check parallel safety during query
planning. Specifically, we add parallel safety check in fmgr_info() and/or FunctionCallInvoke().
 

(Alternatively, I think we can conclude that we assume parallel unsafe built-in functions won't be used in parallel
DML. In that case, we don't change FmgrBuiltin and we just skip the parallel safety check for built-in functions when
thefunction is called.  Would you buy this?)
 


Regards
Takayuki Tsunakawa



pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: MaxOffsetNumber for Table AMs
Next
From: Masahiko Sawada
Date:
Subject: Re: decoupling table and index vacuum