Thread: BUG #18464: Replacing a SQL function silently drops the generated columns that use this function

The following bug has been logged on the website:

Bug reference:      18464
Logged by:          Rémi Cura
Email address:      remi.cura@gmail.com
PostgreSQL version: 14.10
Operating system:   Linux
Description:

I have a SQL function that I defined myself. Let's call it MyFunction().
I use MyFunction in generated columns of 2 distinct tables.
Whe trying to delete MyFunction, the DB correctly raises an error.
However, when I `CREATE OR REPLACE` this existing function in place, the DB
silently drop all the generated columns depending on this function.
This is not the expected behavior.
I should get either an error, or no columns should be dropped.
if they are dropped, I should get at the very least a warning.

This is pretty major: having columns disappear magically on prod dbs is not
what postgres usually do.


When you drop any object with cascade option all associated object will be dropped. That is the normal behavior just and other behavior is not normal

2024년 5월 14일 (화) 오후 8:37, PG Bug reporting form <noreply@postgresql.org>님이 작성:
The following bug has been logged on the website:

Bug reference:      18464
Logged by:          Rémi Cura
Email address:      remi.cura@gmail.com
PostgreSQL version: 14.10
Operating system:   Linux
Description:       

I have a SQL function that I defined myself. Let's call it MyFunction().
I use MyFunction in generated columns of 2 distinct tables.
Whe trying to delete MyFunction, the DB correctly raises an error.
However, when I `CREATE OR REPLACE` this existing function in place, the DB
silently drop all the generated columns depending on this function.
This is not the expected behavior.
I should get either an error, or no columns should be dropped.
if they are dropped, I should get at the very least a warning.

This is pretty major: having columns disappear magically on prod dbs is not
what postgres usually do.

PG Bug reporting form <noreply@postgresql.org> writes:
> I have a SQL function that I defined myself. Let's call it MyFunction().
> I use MyFunction in generated columns of 2 distinct tables.
> Whe trying to delete MyFunction, the DB correctly raises an error.
> However, when I `CREATE OR REPLACE` this existing function in place, the DB
> silently drop all the generated columns depending on this function.
> This is not the expected behavior.

Indeed not, and it does not happen like that for me:

$ psql
psql (14.12)
Type "help" for help.

regression=# create function testfunc(int) returns int strict immutable language sql as 'select $1+1';
CREATE FUNCTION
regression=# create table foo (f1 int, f2 int generated always as (testfunc(f1)) stored);
CREATE TABLE
regression=# insert into foo values(1),(10);
INSERT 0 2
regression=# table foo;
 f1 | f2
----+----
  1 |  2
 10 | 11
(2 rows)

regression=# create or replace function testfunc(int) returns int strict immutable language sql as 'select $1+2';
CREATE FUNCTION
regression=# table foo;
 f1 | f2
----+----
  1 |  2
 10 | 11
(2 rows)

Now you could make a reasonable argument that the generated column
should have auto-updated.  We don't implement that (at least not
yet), on the grounds that what I did above broke the function's
promise of immutability.  The same change would also have broken
indexes using the function, so generated columns are not any worse.

In any case, please provide a self-contained example demonstrating
your claim.

            regards, tom lane



Hey Tom, thanks for taking the time to answer (and more generally your work on Postgres! ).
I tried to reproduce, and as you pointed out, updating the function in place is not what dropped the columns.
Instead, I must have dropped and recreated the function, which indeed drops the generated columns.

I still feel this is a very dangerous behavior.
For instance, if you use this function in an index, postgres will reject dropping the function because it's used in the index
`ERROR: cannot drop function testfunc(integer) because other objects depend on it
   Detail: index delete_me depends on function testfunc(integer)`
Of course you can use the `DROP FUNCTION .. CASCADE`, which will cascade to deleting the index.

The issue is that it should be the same for generated columns. They are a "dependency", they should have the same behavior as index on function.
--> trying to delete the function used in generated column --> it errors, listing the generated column as a dependency

Here is the updated test script:
```
create function testfunc(int) returns int strict immutable language sql as 'select $1+1';
DROP TABLE IF EXISTS foo;
create table foo (f1 int, f2 int generated always as (testfunc(f1)) stored);
CREATE INDEX delete_me ON foo (testfunc(f1));
insert into foo values(1),(10);
SELECT * FROM foo;  -- 2 cols

DROP FUNCTION testfunc(int);
--> raise error bc index use the function, does not say anything about generated column using this function

DROP INDEX delete_me ;
DROP FUNCTION testfunc(int);
  --> raise NO error
  --> cascade to deleting the generated column (not that the CASCADE key word was not user)

SELECT * FROM foo; -- *1* col
```

On Tue, May 14, 2024 at 5:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> I have a SQL function that I defined myself. Let's call it MyFunction().
> I use MyFunction in generated columns of 2 distinct tables.
> Whe trying to delete MyFunction, the DB correctly raises an error.
> However, when I `CREATE OR REPLACE` this existing function in place, the DB
> silently drop all the generated columns depending on this function.
> This is not the expected behavior.

Indeed not, and it does not happen like that for me:

$ psql
psql (14.12)
Type "help" for help.

regression=# create function testfunc(int) returns int strict immutable language sql as 'select $1+1';
CREATE FUNCTION
regression=# create table foo (f1 int, f2 int generated always as (testfunc(f1)) stored);
CREATE TABLE
regression=# insert into foo values(1),(10);
INSERT 0 2
regression=# table foo;
 f1 | f2
----+----
  1 |  2
 10 | 11
(2 rows)

regression=# create or replace function testfunc(int) returns int strict immutable language sql as 'select $1+2';
CREATE FUNCTION
regression=# table foo;
 f1 | f2
----+----
  1 |  2
 10 | 11
(2 rows)

Now you could make a reasonable argument that the generated column
should have auto-updated.  We don't implement that (at least not
yet), on the grounds that what I did above broke the function's
promise of immutability.  The same change would also have broken
indexes using the function, so generated columns are not any worse.

In any case, please provide a self-contained example demonstrating
your claim.

                        regards, tom lane
=?UTF-8?Q?R=C3=A9mi_Cura?= <remi.cura@gmail.com> writes:
> I tried to reproduce, and as you pointed out, updating the function in
> place is not what dropped the columns.
> Instead, I must have dropped and recreated the function, which indeed drops
> the generated columns.
> I still feel this is a very dangerous behavior.

Yeah, we changed this at v15.  See

https://www.postgresql.org/message-id/flat/17439-7df4421197e928f0%40postgresql.org

https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=cb02fcb4c

            regards, tom lane



Awesome!
The irony is that I triggered this behavior while solving issues in order to upgrade to PG16 :-)
thank you very much again
Rémi



On Wed, May 15, 2024, 17:56 Tom Lane <tgl@sss.pgh.pa.us> wrote:
Rémi Cura <remi.cura@gmail.com> writes:
> I tried to reproduce, and as you pointed out, updating the function in
> place is not what dropped the columns.
> Instead, I must have dropped and recreated the function, which indeed drops
> the generated columns.
> I still feel this is a very dangerous behavior.

Yeah, we changed this at v15.  See

https://www.postgresql.org/message-id/flat/17439-7df4421197e928f0%40postgresql.org

https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=cb02fcb4c

                        regards, tom lane