Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers
From | Sutou Kouhei |
---|---|
Subject | Re: Make COPY format extendable: Extract COPY TO format implementations |
Date | |
Msg-id | 20250404.173839.2270460917518093037.kou@clear-code.com Whole thread Raw |
In response to | Re: Make COPY format extendable: Extract COPY TO format implementations (Masahiko Sawada <sawada.mshk@gmail.com>) |
List | pgsql-hackers |
Hi, In <CAD21AoDOcYah-nREv09BB3ZoB-k+Yf1XUfJcDMoq=LLvV1v75w@mail.gmail.com> "Re: Make COPY format extendable: Extract COPY TO format implementations" on Mon, 31 Mar 2025 12:35:23 -0700, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Most of the queries under test_copy_format/sql verifies the input > patterns of the FORMAT option. I find that the regression tests > included in that directory probably should focus on testing new > callback APIs and some regression tests for FORMAT option handling can > be moved into the normal regression test suite (e.g., in copy.sql or a > new file like copy_format.sql). IIUC testing for invalid input > patterns can be done even without creating artificial wrong handler > functions. Can we clarify what should we do for the next patch set before we create the next patch set? Are the followings correct? 1. Move invalid input patterns in src/test/modules/test_copy_format/sql/invalid.sql to src/test/regress/sql/copy.sql as much as possible. * We can do only 4 patterns in 16 patterns. * Other tests in src/test/modules/test_copy_format/sql/*.sql depend on custom COPY handler for test. So we can't move to src/test/regress/sql/copy.sql. 2. Create src/test/modules/test_copy_format/sql/test_copy_format.sql and move all contents in existing *.sql to the file > I'd like to see in the comment what the tests expect. Taking the > following queries as an example, > > COPY public.test FROM stdin WITH (FORMAT 'test_copy_format'); > \. > COPY public.test TO stdout WITH (FORMAT 'test_copy_format'); > > it would help readers understand the test case better if we have a > comment like for example: > > -- Specify the custom format name without schema. We test if both > -- COPY TO and COPY FROM can find the correct handler function > -- in public schema. I agree with you that the comment is useful when we use src/test/modules/test_copy_format/sql/test_copy_format.sql for all tests. (I feel that it's redundant when we use no_schema.sql.) I'll add it when I create test_copy_format.sql in the next patch set. >> BTW, the current implementation always uses >> pg_catalog.{text,csv,binary} for (not-qualified) "text", >> "csv" and "binary" even when there are >> myschema.{text,csv,binary}. See >> src/test/modules/test_copy_format/sql/builtin.sql. But I >> haven't looked into it why... > > Sorry, I don't follow that. IIUC test_copy_format extension doesn't > create a handler function in myschema schema, and SQLs in builtin.sql > seem to work as expected (specifying a non-qualified built-in format > unconditionally uses the built-in format). Ah, sorry. I should have not used "myschema." in the text with builtin.sql reference. I just wanted to say "qualified text,csv,binary formats" by "myschema.{text,csv,binary}". In builtin.sql uses "public" schema not "myschema" schema. Sorry. Yes. builtin.sql works as expected but I don't know why. I don't add any special codes for them. If "test_copy_format" exists in public schema, "FORMAT 'test_copy_format'" uses it. But if "text" exists in public schema, "FORMAT 'text'" doesn't uses it. ("pg_catalog.text" is used instead.) >> We can do it but I suggest that we do it as a refactoring >> (or cleanup) in a separated patch for easy to review. > > I think that csv_mode and binary are used mostly in > ProcessCopyOptions() so probably we can use local variables for that. > I find there are two other places where to use csv_mode: > NextCopyFromRawFields() and CopyToTextLikeStart(). I think we can > simply check the handler function's OID there, or we can define macros > like COPY_FORMAT_IS_TEXT/CSV/BINARY checking the OID and use them > there. We need this change for "ready for merge", right? Can we clarify items should be resolved for "ready for merge"? Are the followings correct? 1. Move invalid input patterns in src/test/modules/test_copy_format/sql/invalid.sql to src/test/regress/sql/copy.sql as much as possible. 2. Create src/test/modules/test_copy_format/sql/test_copy_format.sql and move all contents in existing *.sql to the file. 3. Add comments what the tests expect to src/test/modules/test_copy_format/sql/test_copy_format.sql. 4. Remove CopyFormatOptions::{binary,csv_mode}. 5. Squash the "Support custom format" patch and the "Define handler functions for built-in formats" patch. * Could you do it when you push it? Or is it required for "ready for merge"? 6. Use handler OID for detecting the default built-in format instead of comparing the given format as string. 7. Update documentation. There are 3 unconfirmed suggested changes for tests in: https://www.postgresql.org/message-id/20250330.113126.433742864258096312.kou%40clear-code.com Here are my opinions for them: > 1.: There is no difference between single-quoting and > double-quoting here. Because the information what quote > was used for the given FORMAT value isn't remained > here. Should we update gram.y? > > 2.: I don't have a strong opinion for it. If nobody objects > it, I'll remove them. > > 3.: I don't have a strong opinion for it. If nobody objects > it, I'll remove them. Is the 1. required for "ready for merge"? If so, is there any suggestion? I don't have a strong opinion for it. If there are no more opinions for 2. and 3., I'll remove them. Thanks, -- kou
pgsql-hackers by date: