Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Make COPY format extendable: Extract COPY TO format implementations
Date
Msg-id CAD21AoDjrMWf4bmqGGt8py3xEs98JPUH0ePBD2LV5D4Ts5JpYQ@mail.gmail.com
Whole thread Raw
In response to Re: Make COPY format extendable: Extract COPY TO format implementations  (Junwang Zhao <zhjwpku@gmail.com>)
Responses Re: Make COPY format extendable: Extract COPY TO format implementations
List pgsql-hackers
On Mon, Jan 29, 2024 at 12:10 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
>
> On Mon, Jan 29, 2024 at 10:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Jan 26, 2024 at 6:02 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
> > >
> > > On Fri, Jan 26, 2024 at 4:55 PM Sutou Kouhei <kou@clear-code.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > In <CAEG8a3KhS6s1XQgDSvc8vFTb4GkhBmS8TxOoVSDPFX+MPExxxQ@mail.gmail.com>
> > > >   "Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 26 Jan 2024 16:41:50 +0800,
> > > >   Junwang Zhao <zhjwpku@gmail.com> wrote:
> > > >
> > > > > CopyToProcessOption()/CopyFromProcessOption() can only handle
> > > > > single option, and store the options in the opaque field,  but it can not
> > > > > check the relation of two options, for example, considering json format,
> > > > > the `header` option can not be handled by these two functions.
> > > > >
> > > > > I want to find a way when the user specifies the header option, customer
> > > > > handler can error out.
> > > >
> > > > Ah, you want to use a built-in option (such as "header")
> > > > value from a custom handler, right? Hmm, it may be better
> > > > that we call CopyToProcessOption()/CopyFromProcessOption()
> > > > for all options including built-in options.
> > > >
> > > Hmm, still I don't think it can handle all cases, since we don't know
> > > the sequence of the options, we need all the options been parsed
> > > before we check the compatibility of the options, or customer
> > > handlers will need complicated logic to resolve that, which might
> > > lead to ugly code :(
> > >
> >
> > Does it make sense to pass only non-builtin options to the custom
> > format callback after parsing and evaluating the builtin options? That
> > is, we parse and evaluate only the builtin options and populate
> > opts_out first, then pass each rest option to the custom format
> > handler callback. The callback can refer to the builtin option values.
>
> Yeah, I think this makes sense.
>
> > The callback is expected to return false if the passed option is not
> > supported. If one of the builtin formats is specified and the rest
> > options list has at least one option, we raise "option %s not
> > recognized" error.  IOW it's the core's responsibility to ranse the
> > "option %s not recognized" error, which is in order to raise a
> > consistent error message. Also, I think the core should check the
> > redundant options including bultiin and custom options.
>
> It would be good that core could check all the redundant options,
> but where should core do the book-keeping of all the options? I have
> no idea about this, in my implementation of pg_copy_json extension,
> I handle redundant options by adding a xxx_specified field for each
> xxx.

What I imagined is that while parsing the all specified options, we
evaluate builtin options and we add non-builtin options to another
list. Then when parsing a non-builtin option, we check if this option
already exists in the list. If there is, we raise the "option %s not
recognized" error.". Once we complete checking all options, we pass
each option in the list to the callback.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500
Next
From: Will Mortensen
Date:
Subject: Re: Exposing the lock manager's WaitForLockers() to SQL