Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers

From Sutou Kouhei
Subject Re: Make COPY format extendable: Extract COPY TO format implementations
Date
Msg-id 20250404.173839.2270460917518093037.kou@clear-code.com
Whole thread Raw
In response to Re: Make COPY format extendable: Extract COPY TO format implementations  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi,

In <CAD21AoDOcYah-nREv09BB3ZoB-k+Yf1XUfJcDMoq=LLvV1v75w@mail.gmail.com>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on Mon, 31 Mar 2025 12:35:23 -0700,
  Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> Most of the queries under test_copy_format/sql verifies the input
> patterns of the FORMAT option. I find that the regression tests
> included in that directory probably should focus on testing new
> callback APIs and some regression tests for FORMAT option handling can
> be moved into the normal regression test suite (e.g., in copy.sql or a
> new file like copy_format.sql). IIUC testing for invalid input
> patterns can be done even without creating artificial wrong handler
> functions.

Can we clarify what should we do for the next patch set
before we create the next patch set? Are the followings
correct?

1. Move invalid input patterns in
   src/test/modules/test_copy_format/sql/invalid.sql to
   src/test/regress/sql/copy.sql as much as possible.
   * We can do only 4 patterns in 16 patterns.
   * Other tests in
     src/test/modules/test_copy_format/sql/*.sql depend on
     custom COPY handler for test. So we can't move to
     src/test/regress/sql/copy.sql.
2. Create
   src/test/modules/test_copy_format/sql/test_copy_format.sql
   and move all contents in existing *.sql to the file 

> I'd like to see in the comment what the tests expect. Taking the
> following queries as an example,
> 
> COPY public.test FROM stdin WITH (FORMAT 'test_copy_format');
> \.
> COPY public.test TO stdout WITH (FORMAT 'test_copy_format');
> 
> it would help readers understand the test case better if we have a
> comment like for example:
> 
> -- Specify the custom format name without schema. We test if both
> -- COPY TO and COPY FROM can find the correct handler function
> -- in public schema.

I agree with you that the comment is useful when we use
src/test/modules/test_copy_format/sql/test_copy_format.sql
for all tests. (I feel that it's redundant when we use
no_schema.sql.) I'll add it when I create
test_copy_format.sql in the next patch set.

>> BTW, the current implementation always uses
>> pg_catalog.{text,csv,binary} for (not-qualified) "text",
>> "csv" and "binary" even when there are
>> myschema.{text,csv,binary}. See
>> src/test/modules/test_copy_format/sql/builtin.sql. But I
>> haven't looked into it why...
> 
> Sorry, I don't follow that. IIUC test_copy_format extension doesn't
> create a handler function in myschema schema, and SQLs in builtin.sql
> seem to work as expected (specifying a non-qualified built-in format
> unconditionally uses the built-in format).

Ah, sorry. I should have not used "myschema." in the text
with builtin.sql reference. I just wanted to say "qualified
text,csv,binary formats" by "myschema.{text,csv,binary}". In
builtin.sql uses "public" schema not "myschema"
schema. Sorry.

Yes. builtin.sql works as expected but I don't know why. I
don't add any special codes for them. If "test_copy_format"
exists in public schema, "FORMAT 'test_copy_format'" uses
it. But if "text" exists in public schema, "FORMAT 'text'"
doesn't uses it. ("pg_catalog.text" is used instead.)

>> We can do it but I suggest that we do it as a refactoring
>> (or cleanup) in a separated patch for easy to review.
> 
> I think that csv_mode and binary are used mostly in
> ProcessCopyOptions() so probably we can use local variables for that.
> I find there are two other places where to use csv_mode:
> NextCopyFromRawFields() and CopyToTextLikeStart(). I think we can
> simply check the handler function's OID there, or we can define macros
> like COPY_FORMAT_IS_TEXT/CSV/BINARY checking the OID and use them
> there.

We need this change for "ready for merge", right?


Can we clarify items should be resolved for "ready for
merge"?

Are the followings correct?

1. Move invalid input patterns in
   src/test/modules/test_copy_format/sql/invalid.sql to
   src/test/regress/sql/copy.sql as much as possible.
2. Create
   src/test/modules/test_copy_format/sql/test_copy_format.sql
   and move all contents in existing *.sql to the file.
3. Add comments what the tests expect to
   src/test/modules/test_copy_format/sql/test_copy_format.sql.
4. Remove CopyFormatOptions::{binary,csv_mode}.
5. Squash the "Support custom format" patch and the "Define
   handler functions for built-in formats" patch.
   * Could you do it when you push it? Or is it required for
     "ready for merge"?
6. Use handler OID for detecting the default built-in format
   instead of comparing the given format as string.
7. Update documentation.

There are 3 unconfirmed suggested changes for tests in:
https://www.postgresql.org/message-id/20250330.113126.433742864258096312.kou%40clear-code.com

Here are my opinions for them:

> 1.: There is no difference between single-quoting and
>     double-quoting here. Because the information what quote
>     was used for the given FORMAT value isn't remained
>     here. Should we update gram.y?
> 
> 2.: I don't have a strong opinion for it. If nobody objects
>     it, I'll remove them.
> 
> 3.: I don't have a strong opinion for it. If nobody objects
>     it, I'll remove them.

Is the 1. required for "ready for merge"? If so, is there
any suggestion? I don't have a strong opinion for it.

If there are no more opinions for 2. and 3., I'll remove
them.



Thanks,
-- 
kou



pgsql-hackers by date:

Previous
From: Andrei Lepikhov
Date:
Subject: Re: Removing unneeded self joins
Next
From: Daniel Gustafsson
Date:
Subject: Re: Making sslrootcert=system work on Windows psql