Re: postgres_fdw: Use COPY to speed up batch inserts - Mailing list pgsql-hackers

From Matheus Alcantara
Subject Re: postgres_fdw: Use COPY to speed up batch inserts
Date
Msg-id 72ec1708-0c02-4ae9-b4f5-ee2bac5fd2f3@gmail.com
Whole thread Raw
In response to Re: postgres_fdw: Use COPY to speed up batch inserts  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Thank you for the review.

On 03/03/26 16:47, Masahiko Sawada wrote:
> Thank you for updating the patch! Here are some review comments:
> 
> +                Form_pg_attribute attr = TupleDescAttr(tupdesc, attnum - 1);
> +
> +                if (attr->attgenerated)
> +                        continue;
> +
> +                if (!first)
> +                        appendStringInfoString(buf, ", ");
> +
> +                first = false;
> +
> +                appendStringInfoString(buf,
> quote_identifier(NameStr(attr->attname)));
> 
> We need to take care of the 'column_name' option here. If it's set we
> should not use attr->attname as it is.
> 

Fixed

> --
> +        }
> +        if (nattrs > 0)
> +                appendStringInfoString(buf, ") FROM STDIN");
> +        else
> +                appendStringInfoString(buf, " FROM STDIN");
> +}
> 
> It might be better to explicitly specify the format 'text'.
> 

Fixed

> ---
> +        if (useCopy)
> +        {
> +                deparseCopySql(&sql, rel, targetAttrs);
> +                values_end_len = 0;            /* Keep compiler quiet */
> +        }
> +        else
> +                deparseInsertSql(&sql, rte, resultRelation, rel,
> targetAttrs, doNothing,
> +
> resultRelInfo->ri_WithCheckOptions,
> +
> resultRelInfo->ri_returningList,
> +                                                 &retrieved_attrs,
> &values_end_len);
> 
> I think we should consider whether it's okay to use the COPY command
> even if resultRelInfo->ri_WithCheckOptions is non-NULL. As far as I
> researched, it's okay as we currently don't support COPY to a view but
> please consider it as well. We might want to explain it too in the
> comment.
> 

Good point, fixed.

> How about initializing values_end_len with 0 at its declaration?
> 

Fixed

> ---
> +-- test that fdw also use COPY FROM as a remote sql
> +set client_min_messages to 'log';
> +
> +create function insert_or_copy() returns trigger as $$
> +declare query text;
> +begin
> +    query := current_query();
> +    if query ~* '^COPY' then
> +        raise notice 'COPY command';
> +    elsif query ~* '^INSERT' then
> +        raise notice 'INSERT command';
> +    end if;
> +return new;
> +end;
> +$$ language plpgsql;
> 
> On second thoughts, it might be okay to output the current_query() as it is.
> 

Fixed

> ---
> +copy grem1 from stdin;
> +3
> +\.
> 
> I think it would be good to have more tests, for example, checking if
> the COPY command method can work properly with batch_size and
> column_name options.
> 

I've added new test cases for these cases. To test the batch_size case 
I've added an elog(DEBUG1) because using the trigger with 
current_query() log an entry for each row that we send for the foreign 
server, with the elog(DEBUG1) we can expect one log message for each 
batch operation. Please let me know if there is better ways to do this.

Please see the new attached version.

--
Matheus Alcantara
EDB: https://www.enterprisedb.com
Attachment

pgsql-hackers by date:

Previous
From: Jim Jones
Date:
Subject: Re: POC: PLpgSQL FOREACH IN JSON ARRAY
Next
From: Amul Sul
Date:
Subject: Re: pg_waldump: support decoding of WAL inside tarfile