Thread: Re: [PATCHES] WITH DELIMITERS in COPY

Re: [PATCHES] WITH DELIMITERS in COPY

From
Gavin Sherry
Date:
Hi Bruce,

On Tue, 5 Mar 2002, Bruce Momjian wrote:

> 
> Seems the original title about "feature causes performance in COPY" was
> confusing. 

Oops.

> This patch merely fixes the identified TODO item in the
> grammar about using WITH in COPY.

Now that I look at this patch again I don't think I like the
syntax.

COPY [BINARY] <relation> [WITH OIDS] TO | FROM <file> [[USING DELIMITERS |
WITH DELIMITER] <delimiter> [WITH NULL AS <char>]

It isn't very elegant.

1) I forced the parser to be able to handle multiple WITHs, but that
doesn't mean its right. I can't remember why I didn't propose a better
syntax back then, such as:

... [WITH [DELIMITER <delimiter>,] [NULL AS <char>]]

2) Given (1), Why does WITH OIDS belong where it is now? Why not have it
as an 'option' at the end?

Anyone have any opinion on this?




Re: [PATCHES] WITH DELIMITERS in COPY

From
Tom Lane
Date:
Gavin Sherry <swm@linuxworld.com.au> writes:
> Now that I look at this patch again I don't think I like the
> syntax.

> COPY [BINARY] <relation> [WITH OIDS] TO | FROM <file> [[USING DELIMITERS |
> WITH DELIMITER] <delimiter> [WITH NULL AS <char>]

> It isn't very elegant.

> 1) I forced the parser to be able to handle multiple WITHs, but that
> doesn't mean its right.

It seems wrong to me.  The other statements that use WITH use only one
WITH to introduce a list of option clauses.
I can't remember why I didn't propose a better
> syntax back then, such as:

> ... [WITH [DELIMITER <delimiter>,] [NULL AS <char>]]

The other statements you might model this on don't use commas either.
The closest thing to the precedents would be
... [WITH copyoption [copyoption ...]]
copyoption := DELIMITER delim              | NULL AS nullstring              | etc

To get some modicum of backwards compatibility, we could allow either
DELIMITER or DELIMITERS as a copy-option keyword, and we could allow
USING as a substitute for the initial WITH.  This still won't be quite
backwards compatible for statements that use both of the old option
clauses; how worried are we about that?  Maybe we could persuade the
parser to handle
... [ WITH | USING copyoption [ [WITH] copyoption ... ]]

but my that seems ugly.

> 2) Given (1), Why does WITH OIDS belong where it is now? Why not have it
> as an 'option' at the end?

Historical precedent, mainly.  Changing this would break existing
pg_dump files, so I'm not eager to do it.  (AFAIR pg_dump never uses
DELIMITERS nor NULL AS, so it doesn't care if you change that part
of the syntax.)

If we were working in a green field I'd vote for moving BINARY into the
WITH-options too, but we aren't.  Again that seems too likely to break
things in the name of a small amount of added consistency.
        regards, tom lane


Re: [PATCHES] WITH DELIMITERS in COPY

From
Lee Kindness
Date:
Tom Lane writes:> [ COPY syntax ]> If we were working in a green field I'd vote for moving BINARY into the>
WITH-optionstoo, but we aren't.  Again that seems too likely to break> things in the name of a small amount of added
consistency.

Hawabout 'COPY ...' retains the existing syntax but if used like 'COPY
TABLE ...' sane syntax is used?


Re: [PATCHES] WITH DELIMITERS in COPY

From
Bruce Momjian
Date:
Lee Kindness wrote:
> Tom Lane writes:
>  > [ COPY syntax ]
>  > If we were working in a green field I'd vote for moving BINARY into the
>  > WITH-options too, but we aren't.  Again that seems too likely to break
>  > things in the name of a small amount of added consistency.
> 
> Hawabout 'COPY ...' retains the existing syntax but if used like 'COPY
> TABLE ...' sane syntax is used?

Interesting idea, but TABLE just seems to be a noise word to me.  COPY
is one of those commands that it is hard to change because pg_dump
relies on it.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [PATCHES] WITH DELIMITERS in COPY

From
Bruce Momjian
Date:
Gavin, I will do the legwork on this if you wish.  I think we need to
use DefElem to store the COPY params, rather than using specific fields
in CopyStmt.

Would you send me your original patch so I am make sure I hit
everything.  I can't seem to find a copy.  If you would like to work on
it, I can give you what I have and walk you through the process.

---------------------------------------------------------------------------

Gavin Sherry wrote:
> Hi Bruce,
> 
> On Tue, 5 Mar 2002, Bruce Momjian wrote:
> 
> > 
> > Seems the original title about "feature causes performance in COPY" was
> > confusing. 
> 
> Oops.
> 
> > This patch merely fixes the identified TODO item in the
> > grammar about using WITH in COPY.
> 
> Now that I look at this patch again I don't think I like the
> syntax.
> 
> COPY [BINARY] <relation> [WITH OIDS] TO | FROM <file> [[USING DELIMITERS |
> WITH DELIMITER] <delimiter> [WITH NULL AS <char>]
> 
> It isn't very elegant.
> 
> 1) I forced the parser to be able to handle multiple WITHs, but that
> doesn't mean its right. I can't remember why I didn't propose a better
> syntax back then, such as:
> 
> ... [WITH [DELIMITER <delimiter>,] [NULL AS <char>]]
> 
> 2) Given (1), Why does WITH OIDS belong where it is now? Why not have it
> as an 'option' at the end?
> 
> Anyone have any opinion on this?
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [PATCHES] WITH DELIMITERS in COPY

From
Gavin Sherry
Date:
On Sun, 14 Apr 2002, Bruce Momjian wrote:

> 
> Gavin, I will do the legwork on this if you wish.  I think we need to

No matter. I intended to submit a patch to fix this.

> use DefElem to store the COPY params, rather than using specific fields
> in CopyStmt.

DefElem would have required modification of code outside the parser (to
keep utility.c and DoCopy() happy) or otherwise an even messier loop
executed as a result of CopyStmt than I have given in the attached patch, 
plus other issues with Yacc.

The patch attached maintains backward compatibility. The syntax is as
follows:

COPY [BINARY] <relname> [WITH OIDS] FROM/TO             [USING DELIMITERS <delimiter>]             [WITH [ DELIMITER
<delimiter>| NULL AS <char> | OIDS ]]
 

I was also going to allow BINARY in the WITH list, but there seems to be
little point.

Note that if you execute a query such as:

COPY pg_class TO '/some/path/file/out'USING DELIMITERS <tab>WITH DELIMITER '|';

The code will give preference to WITH DELIMITER.

If no one can find fault with this or my implementation, I'll follow up
with documentation and psql patches (not sure that there is much point
patching pg_dump).

Gavin

Re: [PATCHES] WITH DELIMITERS in COPY

From
Peter Eisentraut
Date:
Gavin Sherry writes:

> The patch attached maintains backward compatibility. The syntax is as
> follows:
>
> COPY [BINARY] <relname> [WITH OIDS] FROM/TO
>               [USING DELIMITERS <delimiter>]
>               [WITH [ DELIMITER <delimiter> | NULL AS <char> | OIDS ]]

I think we should lose the WITH altogether.  It's not any better than
USING.

But just saying "OIDS" is not very clear.  In this case the WITH is
necessary.

> Note that if you execute a query such as:
>
> COPY pg_class TO '/some/path/file/out'
>     USING DELIMITERS <tab>
>     WITH DELIMITER '|';
>
> The code will give preference to WITH DELIMITER.

That should be an error.

> If no one can find fault with this or my implementation, I'll follow up
> with documentation and psql patches (not sure that there is much point
> patching pg_dump).

pg_dump should use the new syntax.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: [PATCHES] WITH DELIMITERS in COPY

From
Bruce Momjian
Date:
Gavin, I see where you are going with the patch;  creating a list in
gram.y and stuffing CopyStmt directly there.  However, I can't find any
other instance of our stuffing things like that in gram.y.  We do have
cases using options like COPY in CREATE USER, and we do use DefElem.

I realize it will require changes to other files like copy.c.  However,
it seems like the cleanest solution.  I guess I am not excited about
adding another way to handle WITH options into the code.  Now, if you
want to argue that CREATE USER shouldn't use DefElem either, we can
discuss that, but I think we need to be consistent in how we handle COPY
vs. the other commands that use parameters.  

See commands/user.c for an example of how it uses DefElem, and the tests
done to make sure conflicting arguments are not used or in copy's case,
specified twice.  It just seems like that is the cleanest way to go.

One idea I had for the code is to allow BINARY as $2, and WITH OIDS in
its current place, and all options in the new WITH location, and
concatentate them together into one DefElem list in gram.y, and pass
that to copy.c.  That way, you can allow BINARY and others at the end
too and the list is in one central place.

---------------------------------------------------------------------------

Gavin Sherry wrote:
> On Sun, 14 Apr 2002, Bruce Momjian wrote:
> 
> > 
> > Gavin, I will do the legwork on this if you wish.  I think we need to
> 
> No matter. I intended to submit a patch to fix this.
> 
> > use DefElem to store the COPY params, rather than using specific fields
> > in CopyStmt.
> 
> DefElem would have required modification of code outside the parser (to
> keep utility.c and DoCopy() happy) or otherwise an even messier loop
> executed as a result of CopyStmt than I have given in the attached patch, 
> plus other issues with Yacc.
> 
> The patch attached maintains backward compatibility. The syntax is as
> follows:
> 
> COPY [BINARY] <relname> [WITH OIDS] FROM/TO
>               [USING DELIMITERS <delimiter>]
>               [WITH [ DELIMITER <delimiter> | NULL AS <char> | OIDS ]]
> 
> I was also going to allow BINARY in the WITH list, but there seems to be
> little point.
> 
> Note that if you execute a query such as:
> 
> COPY pg_class TO '/some/path/file/out'
>     USING DELIMITERS <tab>
>     WITH DELIMITER '|';
> 
> The code will give preference to WITH DELIMITER.
> 
> If no one can find fault with this or my implementation, I'll follow up
> with documentation and psql patches (not sure that there is much point
> patching pg_dump).
> 
> Gavin

Content-Description: 

[ Attachment, skipping... ]

> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [PATCHES] WITH DELIMITERS in COPY

From
Gavin Sherry
Date:
On Sun, 14 Apr 2002, Bruce Momjian wrote:

> 
> Gavin, I see where you are going with the patch;  creating a list in
> gram.y and stuffing CopyStmt directly there.  However, I can't find any
> other instance of our stuffing things like that in gram.y.  We do have
> cases using options like COPY in CREATE USER, and we do use DefElem.

CREATE DATABASE also fills out a list in the same fashion =). I will
however have a look at revising this patch to use DefElem later today.

Thanks,

Gavin



Re: [PATCHES] WITH DELIMITERS in COPY

From
Bruce Momjian
Date:
Gavin Sherry wrote:
> On Sun, 14 Apr 2002, Bruce Momjian wrote:
> 
> > 
> > Gavin, I see where you are going with the patch;  creating a list in
> > gram.y and stuffing CopyStmt directly there.  However, I can't find any
> > other instance of our stuffing things like that in gram.y.  We do have
> > cases using options like COPY in CREATE USER, and we do use DefElem.
> 
> CREATE DATABASE also fills out a list in the same fashion =). I will
> however have a look at revising this patch to use DefElem later today.

Oh, I see that now.  Which method do people prefer.  We should probably
make them all use the same mechanism.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [PATCHES] WITH DELIMITERS in COPY

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Gavin Sherry wrote:
>> CREATE DATABASE also fills out a list in the same fashion =). I will
>> however have a look at revising this patch to use DefElem later today.

> Oh, I see that now.  Which method do people prefer.  We should probably
> make them all use the same mechanism.

Consistency?  Who needs consistency ;-) ?

Seriously, I do not see a need to change either of these approaches
just for the sake of changing it.  CREATE DATABASE is okay as-is, and
so are the statements that use DefElem.  I tend to like DefElem better
for the statements that we change around frequently ... for instance
the recent changes to the set of volatility keywords for functions
didn't require any changes to the grammar or the parsenode definitions.
But I think that a simple struct definition is easier to understand,
so I favor that for stable feature sets.

As for which one is better suited for COPY, I don't have a strong
opinion, but lean to DefElem.  Seems like COPY will probably keep
accreting new features.
        regards, tom lane


Re: [PATCHES] WITH DELIMITERS in COPY

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Gavin Sherry wrote:
> >> CREATE DATABASE also fills out a list in the same fashion =). I will
> >> however have a look at revising this patch to use DefElem later today.
> 
> > Oh, I see that now.  Which method do people prefer.  We should probably
> > make them all use the same mechanism.
> 
> Consistency?  Who needs consistency ;-) ?
> 
> Seriously, I do not see a need to change either of these approaches
> just for the sake of changing it.  CREATE DATABASE is okay as-is, and
> so are the statements that use DefElem.  I tend to like DefElem better
> for the statements that we change around frequently ... for instance
> the recent changes to the set of volatility keywords for functions
> didn't require any changes to the grammar or the parsenode definitions.
> But I think that a simple struct definition is easier to understand,
> so I favor that for stable feature sets.
> 
> As for which one is better suited for COPY, I don't have a strong
> opinion, but lean to DefElem.  Seems like COPY will probably keep
> accreting new features.

The code that bothered me about the CREATE DATABASE param processing
was:
   /* process additional options */   foreach(l, $5)   {List   *optitem = (List *) lfirst(l);   switch
(lfirsti(optitem)){   case 1:       n->dbpath = (char *) lsecond(optitem);    break;         case 2:      n->dbtemplate
=(char *) lsecond(optitem);    break;    case 3:    n->encoding = lfirsti(lnext(optitem));    break;    case 4:
n->dbowner= (char *) lsecond(optitem);    break;}   }
 

I see what it is doing, but it seems quite unclear.  Seeing that people
are using this as a pattern for other param processing, I will work on a
patch to convert this to DefElem.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [PATCHES] WITH DELIMITERS in COPY

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> The code that bothered me about the CREATE DATABASE param processing
> was:

>     /* process additional options */
>     foreach(l, $5)
>     {
>     List   *optitem = (List *) lfirst(l);   
>     switch (lfirsti(optitem))
>     {
>         case 1:   
>         n->dbpath = (char *) lsecond(optitem);
>         break;     
>         case 2:  
>         n->dbtemplate = (char *) lsecond(optitem);
>         break;
>         case 3:
>         n->encoding = lfirsti(lnext(optitem));
>         break;
>         case 4:
>         n->dbowner = (char *) lsecond(optitem);
>         break;
>     }
>     }

> I see what it is doing, but it seems quite unclear.  Seeing that people
> are using this as a pattern for other param processing, I will work on a
> patch to convert this to DefElem.

Oh, I think we were talking at cross-purposes then.  What you're really
unhappy about is that this uses a list of two-element sublists?  Yeah,
I agree, that's a messy data structure; a list of DefElem would be
perhaps cleaner.  Not sure if it matters all that much though, since the
list only exists in the context of a few productions in gram.y.  Perhaps
adding a couple of lines of documentation would be better than changing
the code.
        regards, tom lane


Re: [PATCHES] WITH DELIMITERS in COPY

From
Bruce Momjian
Date:
Tom Lane wrote:
> Oh, I think we were talking at cross-purposes then.  What you're really
> unhappy about is that this uses a list of two-element sublists?  Yeah,
> I agree, that's a messy data structure; a list of DefElem would be
> perhaps cleaner.  Not sure if it matters all that much though, since the
> list only exists in the context of a few productions in gram.y.  Perhaps
> adding a couple of lines of documentation would be better than changing
> the code.

Yea, documenation and/or a list of DefElem would be nicer.  The problem
is that people are going to copy this in the future, so I may as well do
it right.  Can't take more than 20 minutes.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [PATCHES] WITH DELIMITERS in COPY

From
Gavin Sherry
Date:
On Tue, 16 Apr 2002, Bruce Momjian wrote:

> The code that bothered me about the CREATE DATABASE param processing
> was:
> 
>     /* process additional options */
>     foreach(l, $5)
>     {
>     List   *optitem = (List *) lfirst(l);
>     
>     switch (lfirsti(optitem))
>     {
>         case 1:   
>         n->dbpath = (char *) lsecond(optitem);
>         break;     
>         case 2:  
>         n->dbtemplate = (char *) lsecond(optitem);
>         break;
>         case 3:
>         n->encoding = lfirsti(lnext(optitem));
>         break;
>         case 4:
>         n->dbowner = (char *) lsecond(optitem);
>         break;
>     }
>     }
> 
> I see what it is doing, but it seems quite unclear.  Seeing that people
> are using this as a pattern for other param processing, I will work on a
> patch to convert this to DefElem.

Wouldn't a few macros clean this up better (ie, make it clearer)?

#define CDBOPTDBPATH 1

#define optparam(l)    (char *)lsecond(l)
#define optparami(l)    (int)lfirsti(lnext(l))
    foreach(l, $5)    {      List   *optitem = (List *) lfirst(l);
      switch (lfirsti(optitem))      {          case CDBOPTDBPATH:              n->dbpath = optparam(optitem);
   break;
 

...


Regardless, I guess that code is pointless since the consensus seems to be
that the use of DefElem is better since it allows for the abstraction of
the parameters list. Obviously a good thing if CREATE DATABASE, COPY etc
are to be extended often enough.

Gavin



Re: [PATCHES] WITH DELIMITERS in COPY

From
Bruce Momjian
Date:
Gavin Sherry wrote:
> > I see what it is doing, but it seems quite unclear.  Seeing that people
> > are using this as a pattern for other param processing, I will work on a
> > patch to convert this to DefElem.
> 
> Wouldn't a few macros clean this up better (ie, make it clearer)?
> 
> #define CDBOPTDBPATH 1
> 
> #define optparam(l)    (char *)lsecond(l)
> #define optparami(l)    (int)lfirsti(lnext(l))
> 
>      foreach(l, $5)
>      {
>        List   *optitem = (List *) lfirst(l);
> 
>        switch (lfirsti(optitem))
>        {
>            case CDBOPTDBPATH:
>                n->dbpath = optparam(optitem);
>                break;
> 
> ...
> 
> 
> Regardless, I guess that code is pointless since the consensus seems to be
> that the use of DefElem is better since it allows for the abstraction of
> the parameters list. Obviously a good thing if CREATE DATABASE, COPY etc
> are to be extended often enough.
> 

Yes, macros would be the way to go if we didn't have a cleaner
alternative.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026