Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers
From | Sutou Kouhei |
---|---|
Subject | Re: Make COPY format extendable: Extract COPY TO format implementations |
Date | |
Msg-id | 20250718.190553.1172585000083080334.kou@clear-code.com |
In response to | Re: Make COPY format extendable: Extract COPY TO format implementations (Masahiko Sawada <sawada.mshk@gmail.com>) |
List | pgsql-hackers |
Hi,

In <CAD21AoAZL2RzPM4RLOJKm_73z5LXq2_VOVF+S+T0tnbjHdWTFA@mail.gmail.com>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on Thu, 17 Jul 2025 13:44:11 -0700,
  Masahiko Sawada <sawada.mshk@gmail.com> wrote:

>> > How about adding accessors instead of splitting
>> > Copy{From,To}State to Copy{From,To}ExecutionData? If we use
>> > the accessors approach, we can export only needed
>> > information step by step without breaking ABI.
>
> Yeah, while it can export required fields without breaking ABI, I'm
> concerned that setter and getter functions could be bloated if we need
> to have them for many fields.

In general, I choose this approach in my projects even when
I need to define many accessors, because I can hide
implementation details from users and change them later
without breaking API/ABI.

But PostgreSQL isn't my project. Is there any guideline for
PostgreSQL API(/ABI?) design that we can refer to for this
case?

FYI: We need to export at least the following fields:

https://www.postgresql.org/message-id/flat/20250714.173803.865595983884510428.kou%40clear-code.com#78fdbccf89742f856aa2cf95eaf42032

> FROM:
>
> - attnumlist (*)
> - bytes_processed
> - cur_attname
> - escontext
> - in_functions (*)
> - input_buf
> - input_reached_eof
> - line_buf
> - opts (*)
> - raw_buf
> - raw_buf_index
> - raw_buf_len
> - rel (*)
> - typioparams (*)
>
> TO:
>
> - attnumlist (*)
> - fe_msgbuf
> - opts (*)

Here are pros/cons of the Copy{From,To}ExecutionData
approach, right?

Pros:
1. We can hide internal data from extensions

Cons:
1. Built-in format routines need to refer to fields via
   Copy{From,To}ExecutionData.
   * This MAY have a performance impact. If there is no
     performance impact, this is not a con.
2. API/ABI compatibility will be broken when we change
   exported fields.
   * I'm not sure whether this is a con in the PostgreSQL
     design.

Here are pros/cons of the accessors approach:

Pros:
1. We can hide internal data from extensions
2. We can export new fields and change field names without
   breaking API/ABI compatibility
3. We don't need to change built-in format routines, so we
   can assume that there is no performance impact.

Cons:
1. We may need to define many accessors
   * I'm not sure whether this is a con in the PostgreSQL
     design.

>> Another idea: We'll add Copy{From,To}State::opaque
>> eventually. (For example, the v40-0003 patch includes it.)
>>
>> How about using it to hide fields only for built-in formats?
>
> What is the difference between your idea and splitting CopyToState
> into CopyToState and CopyToExecutionData?

1. We don't need to manage two similar data structures for
   built-in formats and extensions.
   * Built-in formats use CopyToExecutionData and extensions
     use opaque.
2. We can introduce the registration API now.
   * We can work on this topic AFTER we introduce the
     registration API.
   * e.g.: Add registration API -> Add opaque -> Use opaque
     for internal fields (we will benchmark this
     implementation at this time)

Thanks,
-- 
kou
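
For readers following the thread, the accessors approach discussed above can be illustrated with a minimal, self-contained C sketch. This is not PostgreSQL code and not taken from any patch in this thread: every name here (DemoCopyToState, the demo_copy_to_* functions) is hypothetical, and the "public header" and "private implementation" parts are merged into one file only for brevity. The point is that extensions see an opaque pointer plus accessor functions, so the struct layout can change without breaking them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* ---- What a public header might expose (all names hypothetical) ---- */

/* Opaque handle: the struct layout is not part of the API/ABI. */
typedef struct DemoCopyToStateData *DemoCopyToState;

DemoCopyToState demo_copy_to_state_create(const int *attnums, size_t natts);
void demo_copy_to_state_free(DemoCopyToState cstate);
size_t demo_copy_to_get_natts(DemoCopyToState cstate);
int demo_copy_to_get_attnum(DemoCopyToState cstate, size_t i);

/* ---- What would stay private to the core implementation ---- */

struct DemoCopyToStateData
{
    size_t bytes_processed; /* internal: no accessor, so not exported */
    int *attnumlist;        /* exported read-only through accessors */
    size_t natts;
};

/* Allocate a state and copy the attribute numbers; NULL on failure. */
DemoCopyToState
demo_copy_to_state_create(const int *attnums, size_t natts)
{
    DemoCopyToState cstate = calloc(1, sizeof(*cstate));

    if (cstate == NULL)
        return NULL;
    /* Allocate at least one element so a zero-length list is still valid. */
    cstate->attnumlist = malloc((natts ? natts : 1) * sizeof(int));
    if (cstate->attnumlist == NULL)
    {
        free(cstate);
        return NULL;
    }
    if (natts > 0)
        memcpy(cstate->attnumlist, attnums, natts * sizeof(int));
    cstate->natts = natts;
    return cstate;
}

void
demo_copy_to_state_free(DemoCopyToState cstate)
{
    if (cstate == NULL)
        return;
    free(cstate->attnumlist);
    free(cstate);
}

/* Accessors: the only way an extension reads the exported fields. */
size_t
demo_copy_to_get_natts(DemoCopyToState cstate)
{
    assert(cstate != NULL);
    return cstate->natts;
}

int
demo_copy_to_get_attnum(DemoCopyToState cstate, size_t i)
{
    assert(cstate != NULL && i < cstate->natts);
    return cstate->attnumlist[i];
}
```

In this sketch, renaming attnumlist or adding a new private field only touches the private part, and code built against the accessors keeps working. The trade-off raised in the thread also shows up here: each exported field needs its own function, so the number of accessors grows with the number of fields.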