Thread: BUG #16631: postgres_fdw tries to insert into generated columns

BUG #16631: postgres_fdw tries to insert into generated columns

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16631
Logged by:          Daniel Cherniy
Email address:      dacherny@gmail.com
PostgreSQL version: 12.4
Operating system:   Debian (as "docker pull postgres:latest")
Description:

Postgres FDW imports generated (stored) columns from foreign table like
usual columns and tries to insert values into them instead of skipping.

Steps to reproduce:

create extension postgres_fdw;
create server host_as_foreign foreign data wrapper postgres_fdw options
(dbname 'postgres');
create user mapping for current_user server host_as_foreign;

create schema src;
create table src.test (
  id int primary key,
  name text not null,
  name_hash char(32) generated always as (md5(name)) stored
);

insert into src.test (id, name) values (1, 'Hello') returning *;
+--+-----+--------------------------------+
|id|name |name_hash                       |
+--+-----+--------------------------------+
|1 |Hello|8b1a9953c4611296a827abf8c47804d7|
+--+-----+--------------------------------+

-- lets import schema
create schema fgn;
import foreign schema src limit to (test) from server host_as_foreign into
fgn;

-- and check what we've got
select * from fgn.test;
+--+-----+--------------------------------+
|id|name |name_hash                       |
+--+-----+--------------------------------+
|1 |Hello|8b1a9953c4611296a827abf8c47804d7|
+--+-----+--------------------------------+

-- try to insert only columns what we suppose to
insert into fgn.test (id, name) values (2, 'Try to insert without generated
column');
> [42601] ERROR: cannot insert into column "name_hash"
> Detail: Column "name_hash" is a generated column.
> Where: remote SQL command: INSERT INTO src.test(id, name, name_hash)
VALUES ($1, $2, $3)

-- maybe now?
insert into fgn.test (id, name, name_hash) values (2, 'Try to insert with
"default" value', default);
> [42601] ERROR: cannot insert into column "name_hash"
> Detail: Column "name_hash" is a generated column.
> Where: remote SQL command: INSERT INTO src.test(id, name, name_hash)
VALUES ($1, $2, $3)


select version();
+----------------------------------------------------------------------------------------------------------------+
|version
                                    |
+----------------------------------------------------------------------------------------------------------------+
|PostgreSQL 12.4 (Debian 12.4-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled
by gcc (Debian 8.3.0-6) 8.3.0, 64-bit|
+----------------------------------------------------------------------------------------------------------------+


Re: BUG #16631: postgres_fdw tries to insert into generated columns

From
Etsuro Fujita
Date:
I CCed Peter E.

On Thu, Sep 24, 2020 at 4:10 AM PG Bug reporting form
<noreply@postgresql.org> wrote:
> Postgres FDW imports generated (stored) columns from foreign table like
> usual columns and tries to insert values into them instead of skipping.
>
> Steps to reproduce:
>
> create extension postgres_fdw;
> create server host_as_foreign foreign data wrapper postgres_fdw options
> (dbname 'postgres');
> create user mapping for current_user server host_as_foreign;
>
> create schema src;
> create table src.test (
>   id int primary key,
>   name text not null,
>   name_hash char(32) generated always as (md5(name)) stored
> );
>
> insert into src.test (id, name) values (1, 'Hello') returning *;
> +--+-----+--------------------------------+
> |id|name |name_hash                       |
> +--+-----+--------------------------------+
> |1 |Hello|8b1a9953c4611296a827abf8c47804d7|
> +--+-----+--------------------------------+
>
> -- lets import schema
> create schema fgn;
> import foreign schema src limit to (test) from server host_as_foreign into
> fgn;
>
> -- and check what we've got
> select * from fgn.test;
> +--+-----+--------------------------------+
> |id|name |name_hash                       |
> +--+-----+--------------------------------+
> |1 |Hello|8b1a9953c4611296a827abf8c47804d7|
> +--+-----+--------------------------------+
>
> -- try to insert only columns what we suppose to
> insert into fgn.test (id, name) values (2, 'Try to insert without generated
> column');
> > [42601] ERROR: cannot insert into column "name_hash"
> > Detail: Column "name_hash" is a generated column.
> > Where: remote SQL command: INSERT INTO src.test(id, name, name_hash)
> VALUES ($1, $2, $3)

Reproduced.  Thanks for the report!

I studied the handling of generated columns in foreign tables, but I’m
not sure it is very well designed.  This is the documentation note
about it in create_foreign_table.sgml:

    Similar considerations apply to generated columns.  Stored generated
    columns are computed on insert or update on the local
    <productname>PostgreSQL</productname> server and handed to the
    foreign-data wrapper for writing out to the foreign data store, but it is
    not enforced that a query of the foreign table returns values for stored
    generated columns that are consistent with the generation expression.
    Again, this might result in incorrect query results.

I’m not sure why this is similar to the constraint case.  But rather
than computing the generated columns locally, I’m wondering that we
should compute them remotely, assuming that the corresponding
generated columns are defined on the remote sides.  (It would be the
user’s responsibility to ensure that.)  This seems to me similar to
the constraint case, and if we did so, I think we could fix the
reported issue by extending postgresImportForeignSchema to support
generated columns.  Maybe I’m missing something, though.

(Column defaults are also computed locally, but as discussed in [1],
that wouldn’t be ideal, and it would be good if we fixed to compute
them remotely.)

While looking into this, I noticed this:

create schema fgn2;
import foreign schema src limit to (test) from server host_as_foreign
into fgn2 options (import_default 'true');
ERROR:  cannot use column reference in DEFAULT expression
LINE 4: ... 'name_hash') COLLATE pg_catalog."default" DEFAULT md5(name)
                                                                  ^
QUERY:  CREATE FOREIGN TABLE test (
  id integer OPTIONS (column_name 'id') NOT NULL,
  name text OPTIONS (column_name 'name') COLLATE pg_catalog."default" NOT NULL,
  name_hash character(32) OPTIONS (column_name 'name_hash') COLLATE
pg_catalog."default" DEFAULT md5(name)
) SERVER host_as_foreign
OPTIONS (schema_name 'src', table_name 'test');
CONTEXT:  importing foreign table "test"

When enabling the import_default option, postgresImportForeignSchema
incorrectly imports the generation expression for generated column
name_hash defined on the remote table as a default expression for it.
Attached is a patch for fixing this issue.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/26654.1380145647%40sss.pgh.pa.us

Attachment

Re: BUG #16631: postgres_fdw tries to insert into generated columns

From
Etsuro Fujita
Date:
On Sat, Jul 3, 2021 at 7:44 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> But rather
> than computing the generated columns locally, I’m wondering that we
> should compute them remotely, assuming that the corresponding
> generated columns are defined on the remote sides.  (It would be the
> user’s responsibility to ensure that.)  This seems to me similar to
> the constraint case, and if we did so, I think we could fix the
> reported issue by extending postgresImportForeignSchema to support
> generated columns.

I updated the patch as such:

* Modified nodeModifyTable.c and copyfrom.c so that they don't compute
generated columns for FDWs anymore.
* Modified deparse functions such as deparseInsertSql() so that the
query is deparsed to set target generated columns (if any) to DEFAULT.
* Added an option import_generated to postgres_fdw to support
importing generated columns.
* Added a bit more test cases to postgres_fdw.sql.
* Updated docs.

Currently, foreign tables are not allowed to be updated directly when
they have stored generated columns, but I think they could be.  But I
left that as future work as that would be an improvement rather than a
fix.

Attached is an updated version of the patch.

Best regards,
Etsuro Fujita

Attachment

Re: BUG #16631: postgres_fdw tries to insert into generated columns

From
Peter Eisentraut
Date:
On 07.07.21 09:20, Etsuro Fujita wrote:
> * Modified nodeModifyTable.c and copyfrom.c so that they don't compute
> generated columns for FDWs anymore.

I don't agree with that change.  What is the point of declaring a 
generated column on a foreign table if you ignore it?  Then you might as 
well prohibit declaring such columns in the first place.



Re: BUG #16631: postgres_fdw tries to insert into generated columns

From
Etsuro Fujita
Date:
On Wed, Jul 7, 2021 at 5:45 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> On 07.07.21 09:20, Etsuro Fujita wrote:
> > * Modified nodeModifyTable.c and copyfrom.c so that they don't compute
> > generated columns for FDWs anymore.
>
> I don't agree with that change.  What is the point of declaring a
> generated column on a foreign table if you ignore it?  Then you might as
> well prohibit declaring such columns in the first place.

I made that change as FDWs could call ExecComputeStoredGenerated() by
themselves if necessary.  I think that that would be consistent with
the constraint case.

Best regards,
Etsuro Fujita



Re: BUG #16631: postgres_fdw tries to insert into generated columns

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 07.07.21 09:20, Etsuro Fujita wrote:
>> * Modified nodeModifyTable.c and copyfrom.c so that they don't compute
>> generated columns for FDWs anymore.

> I don't agree with that change.  What is the point of declaring a 
> generated column on a foreign table if you ignore it?  Then you might as 
> well prohibit declaring such columns in the first place.

Maybe what we need here is some documentation clarifying exactly what
the point of a generated column on a foreign table actually *is*.
When would you do it, and what relationship if any has it got to the
generation status of the underlying remote column?

            regards, tom lane



Re: BUG #16631: postgres_fdw tries to insert into generated columns

From
Peter Eisentraut
Date:
On 07.07.21 17:54, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
>> On 07.07.21 09:20, Etsuro Fujita wrote:
>>> * Modified nodeModifyTable.c and copyfrom.c so that they don't compute
>>> generated columns for FDWs anymore.
> 
>> I don't agree with that change.  What is the point of declaring a
>> generated column on a foreign table if you ignore it?  Then you might as
>> well prohibit declaring such columns in the first place.
> 
> Maybe what we need here is some documentation clarifying exactly what
> the point of a generated column on a foreign table actually *is*.
> When would you do it, and what relationship if any has it got to the
> generation status of the underlying remote column?

The behavior is explained here: 
https://www.postgresql.org/docs/current/sql-createforeigntable.html

If you have a foreign table with a column "a" and a column "b" generated 
as (a * 2), and say the underlying storage is a flat file, then the 
generated value is computed by the server and written to the file.  The 
file behaves essentially analogous to the regular table heap file.

The way I understand this bug report is, what if the underlying storage 
is another SQL database with its own generated columns.  Then how do we 
manage it so that IMPORT FOREIGN SCHEMA can mark the foreign table 
column as "column exists to be read, but don't write to it".  The 
proposal was to repurpose the generated column syntax for that, but that 
breaks the flat-file use case above.

The follow-up proposal was to let the FDW implementation decide.  I 
don't think that is always the right thing either.  What if I want the 
foreign-table node to do the computations, because it has access to 
special parameters or something.

There might be multiple possibilities of what is appropriate here.  But 
that's not something we can just change around as part of a bug fix.



Re: BUG #16631: postgres_fdw tries to insert into generated columns

From
Etsuro Fujita
Date:
On Tue, Jul 13, 2021 at 5:36 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> On 07.07.21 17:54, Tom Lane wrote:
> > Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> >> On 07.07.21 09:20, Etsuro Fujita wrote:
> >>> * Modified nodeModifyTable.c and copyfrom.c so that they don't compute
> >>> generated columns for FDWs anymore.
> >
> >> I don't agree with that change.  What is the point of declaring a
> >> generated column on a foreign table if you ignore it?  Then you might as
> >> well prohibit declaring such columns in the first place.
> >
> > Maybe what we need here is some documentation clarifying exactly what
> > the point of a generated column on a foreign table actually *is*.
> > When would you do it, and what relationship if any has it got to the
> > generation status of the underlying remote column?
>
> The behavior is explained here:
> https://www.postgresql.org/docs/current/sql-createforeigntable.html
>
> If you have a foreign table with a column "a" and a column "b" generated
> as (a * 2), and say the underlying storage is a flat file, then the
> generated value is computed by the server and written to the file.  The
> file behaves essentially analogous to the regular table heap file.
>
> The way I understand this bug report is, what if the underlying storage
> is another SQL database with its own generated columns.  Then how do we
> manage it so that IMPORT FOREIGN SCHEMA can mark the foreign table
> column as "column exists to be read, but don't write to it".  The
> proposal was to repurpose the generated column syntax for that, but that
> breaks the flat-file use case above.

I think this may also depend on the FDW implementation.

> The follow-up proposal was to let the FDW implementation decide.  I
> don't think that is always the right thing either.  What if I want the
> foreign-table node to do the computations, because it has access to
> special parameters or something.

The follow-up proposal was mainly for consistency with the handling of
constraints on foreign tables; they are not enforced on the local PG
server at all, as noted in the above documentation.

> There might be multiple possibilities of what is appropriate here.  But
> that's not something we can just change around as part of a bug fix.

I agree on this part.  The changes I made to the core side would go
beyond a fix for the reported issue.  I'll remove it in the next
version of the patch.

Best regards,
Etsuro Fujita



Re: BUG #16631: postgres_fdw tries to insert into generated columns

From
Etsuro Fujita
Date:
On Thu, Jul 15, 2021 at 6:15 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> On Tue, Jul 13, 2021 at 5:36 AM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:

> > The follow-up proposal was to let the FDW implementation decide.  I
> > don't think that is always the right thing either.  What if I want the
> > foreign-table node to do the computations, because it has access to
> > special parameters or something.
>
> The follow-up proposal was mainly for consistency with the handling of
> constraints on foreign tables; they are not enforced on the local PG
> server at all, as noted in the above documentation.
>
> > There might be multiple possibilities of what is appropriate here.  But
> > that's not something we can just change around as part of a bug fix.
>
> I agree on this part.  The changes I made to the core side would go
> beyond a fix for the reported issue.  I'll remove it in the next
> version of the patch.

I updated the patch as such.  Attached is an update patch
(postgres_fdw-generated-columns-PG14-HEAD.patch), which would apply to
HEAD and v14.  I'm also attaching a patch for v12 and v13
(postgres_fdw-generated-columns-PG12-PG13.patch).  I'll apply these if
there are no objections.

Best regards,
Etsuro Fujita

Attachment

Re: BUG #16631: postgres_fdw tries to insert into generated columns

From
Etsuro Fujita
Date:
On Wed, Aug 4, 2021 at 5:30 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> I'll apply these if there are no objections.

Applied.

Best regards,
Etsuro Fujita