Re: generic copy options - Mailing list pgsql-hackers

From Robert Haas
Subject Re: generic copy options
Msg-id 603c8f070909201458x4e0a9b31w90d066fba37f143a@mail.gmail.com
In response to Re: generic copy options  (Emmanuel Cecchet <manu@asterdata.com>)
Responses Re: generic copy options
List pgsql-hackers
On Sun, Sep 20, 2009 at 2:25 PM, Emmanuel Cecchet <manu@asterdata.com> wrote:
> Tom Lane wrote:
>> Emmanuel Cecchet <manu@asterdata.com> writes:
>>>
>>> Here you will force every format to use the same set of options
>>
>> How does this "force" any such thing?
>>
>
> As far as I understand it, every format will have to handle every format
> option that may exist, so that it can either implement it or throw an
> error.

I don't think this is really true.  To be honest with you, I think
it's exactly backwards.  The way the option-parsing logic works, we
parse each option individually FIRST.  Then at the end we do
cross-checks to see whether there is an incompatibility in the
combination specified.  So if two different formats support the same
option, we just change the cross-check to say that foo is OK with
either format bar or format baz.  On the other hand, if we split the
option into bar_foo and baz_foo, then the first loop that does the
initial parsing has to support both cases, and then you still need a
separate cross-check for each one.
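
To make the two phases concrete, here is a minimal sketch in Python rather than the backend's C. It is not the actual COPY code: the function name, the option names (format, header, delimiter) and the error messages are all assumptions chosen for illustration. It only shows the shape of the logic: parse each option on its own first, then cross-check the combination at the end.

```python
# Hypothetical sketch of "parse individually, then cross-check".
# Names and messages are illustrative assumptions, not PostgreSQL's code.

def parse_copy_options(options):
    """Take a list of (name, value) pairs and return the parsed settings."""
    parsed = {"format": "text", "header": False, "delimiter": None}

    # Phase 1: parse each option individually, without looking at the others.
    for name, value in options:
        if name == "format":
            if value not in ("text", "csv", "binary"):
                raise ValueError(f"unknown format {value!r}")
            parsed["format"] = value
        elif name == "header":
            parsed["header"] = bool(value)
        elif name == "delimiter":
            if len(value) != 1:
                raise ValueError("delimiter must be a single character")
            parsed["delimiter"] = value
        else:
            raise ValueError(f"unrecognized option {name!r}")

    # Phase 2: cross-check the combination that was specified.
    # If another format later supports "header", only this tuple changes.
    if parsed["header"] and parsed["format"] not in ("csv",):
        raise ValueError("header is only allowed with format csv")
    if parsed["delimiter"] is not None and parsed["format"] == "binary":
        raise ValueError("delimiter is not allowed with format binary")

    return parsed
```

Because the cross-checks run only after every option has been parsed, the order in which the options are written does not matter: (header on, format csv) and (format csv, header on) are treated the same.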

> That would argue in favor of a format option that defines the format. Right
> now I find it bogus to have to say (csv on, csv_header on). If csv_header is
> on that should imply csv on.
> The only problem I have is that it is not obvious what options are generic
> COPY options and what are options of an option (like format options).
> So maybe a tradeoff is to differentiate format specific options like in:
> (delimiter '.', format csv, format_header, format_escape...)
> This should also make clear if someone develops a new format what options
> need to be addressed.

I think this is a false dichotomy.  It isn't necessarily the case that
every format will support a delimiter option either.  For example, if
we were to add an XML or JSON format (which I'm not at all convinced
is a good idea, but I'm sure someone is going to propose it!) it
certainly won't support specifying an arbitrary delimiter.

IOW, *every* format will have different needs and we can't necessarily
know which options will be applicable to those needs.  But as long as
we agree that we won't use the same option name for two
format-specific options with wildly different semantics, I don't think
that undecorated names are going to cause us much trouble.  It's also
less typing.
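
As a rough illustration of undecorated names shared across formats, the cross-check could be driven by a small table. This is a sketch under stated assumptions: the table contents, the function name, and the idea that an "xml" format would accept a header option are all hypothetical.

```python
# Hypothetical table: which formats accept each undecorated option name.
# The entries (including "xml" accepting "header") are assumptions.
APPLICABLE = {
    "delimiter": {"text", "csv"},
    "header": {"csv", "xml"},
    "escape": {"csv"},
}

def incompatible_options(fmt, option_names):
    """Return the given option names that do not apply to format fmt."""
    rejected = []
    for name in option_names:
        if name not in APPLICABLE:
            raise ValueError(f"unrecognized option {name!r}")
        if fmt not in APPLICABLE[name]:
            rejected.append(name)
    return rejected
```

With this shape, a new format that reuses an existing option only adds itself to that option's set; no bar_foo/baz_foo variants are needed, and an option like delimiter is simply rejected for a format that cannot use it.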

> PS: I don't know why but as I write this message I already feel that Tom
> hates this new proposal :-D

I get that feeling sometimes myself.  :-)  Anyway, FWIW, I think Tom
has analyzed this one correctly...

...Robert

