Thread: COPY with column headings

COPY with column headings

From

Bruce Momjian

Date:

16 August 2004, 11:53:40

Someone just asked about a COPY capability to supply the column headings
as the first line of the copy statement.  Do we want to support
something like that?  Does anyone else want such functionality?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: COPY with column headings

From

Tom Lane

Date:

16 August 2004, 12:07:43

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Someone just asked about a COPY capability to supply the column headings
> as the first line of the copy statement.  Do we want to support
> something like that?

No.  Tell him to use SELECT.

(Such a capability would be a permanent hazard, because any time you
tried to import with the opposite setting from what you'd exported,
you'd either get a failure, or bad data, or silent loss of the first
line of real data.)
        regards, tom lane

Re: COPY with column headings

From

"Jeroen T. Vermeulen"

Date:

16 August 2004, 12:16:14

On Mon, Aug 16, 2004 at 10:53:36AM -0400, Bruce Momjian wrote:
> Someone just asked about a COPY capability to supply the column headings
> as the first line of the copy statement.  Do we want to support
> something like that?  Does anyone else want such functionality?

Wouldn't it be more logical, and more compatible, to keep this information
as regular column information in the PGresult returned by the COPY?  Or
would that clash with the difference between "command" and "query" result
handling?

Jeroen

Re: COPY with column headings

From

Andrew Dunstan

Date:

16 August 2004, 12:22:52

Bruce Momjian wrote:

>Someone just asked about a COPY capability to supply the column headings
>as the first line of the copy statement.  Do we want to support
>something like that?  Does anyone else want such functionality?
>
>  
>

I had it in mind all along as a possible option for CSV mode, but given 
the heat that was generated by the minimal implementation we did, I 
thought I'd let it rest.

I certainly think it's worth considering, although I see Tom has 
objected :-) . Say we have an option called FIRSTLINELABELS, then on 
copy out it would write the headings on the first line, and on copy in 
it could just ignore the first line (so it could work symmetrically).

cheers

andrew

Re: COPY with column headings

From

Bruce Momjian

Date:

16 August 2004, 12:28:04

Andrew Dunstan wrote:
> 
> 
> Bruce Momjian wrote:
> 
> >Someone just asked about a COPY capability to supply the column headings
> >as the first line of the copy statement.  Do we want to support
> >something like that?  Does anyone else want such functionality?
> >
> >  
> >
> 
> I had it in mind all along as a possible option for CSV mode, but given 
> the heat that was generated by the minimal implementation we did, I 
> thought I'd let it rest.
> 
> I certainly think it's worth considering, although I see Tom has 
> objected :-) . Say we have an option called FIRSTLINELABELS, then on 
> copy out it would write the headings on the first line, and on copy in 
> it could just ignore the first line (so it could work symmetrically).

Yes, that is what I was thinking.  Is this a TODO?  Sure it can be
misused but most copy options can be similarly misused.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: COPY with column headings

From

Tom Lane

Date:

16 August 2004, 12:30:51

"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes:
> Wouldn't it be more logical, and more compatible, to keep this information
> as regular column information in the PGresult returned by the COPY?

That would work but would require a protocol change, which this is
surely not worth by itself.  (We could put it on the to-do list for the
next protocol rev though.)

The bigger question is whether this would do anything to satisfy the
requestor.  He probably wants the headings to appear in the file
resulting from COPY TO file (or the psql equivalent), which this would
not do.  Providing the info in the COPY PGresult would change nothing
except the behavior of client applications specially rewritten to use it.
        regards, tom lane

Re: COPY with column headings

From

"Jeroen T. Vermeulen"

Date:

16 August 2004, 12:45:32

On Mon, Aug 16, 2004 at 11:30:49AM -0400, Tom Lane wrote:

> The bigger question is whether this would do anything to satisfy the
> requestor.  He probably wants the headings to appear in the file
> resulting from COPY TO file (or the psql equivalent), which this would
> not do.  Providing the info in the COPY PGresult would change nothing
> except the behavior of client applications specially rewritten to use it.

True, and from a compatibility standpoint that would probably be good...

Output post-processing (translation to CSV or XML output, XSLT transforms,
what have you) IMHO should probably go on top of COPY anyway, rather than
into it.  Are people really having problems with
select * from table where 1=0 ; copy table to stdout

and first printing the column names found in the first PGresult, then
dumping the COPY lines to the same fd?  I think that should do the trick for
most uses, although obviously I'm no expert.

Jeroen

Re: COPY with column headings

From

Tom Lane

Date:

16 August 2004, 12:49:58

Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> I certainly think it's worth considering, although I see Tom has 
>> objected :-) . Say we have an option called FIRSTLINELABELS, then on 
>> copy out it would write the headings on the first line, and on copy in 
>> it could just ignore the first line (so it could work symmetrically).

> Yes, that is what I was thinking.  Is this a TODO?  Sure it can be
> misused but most copy options can be similarly misused.

Most copy options will result in obvious failures if misused.  I dislike
this one because of the high risk of silent corruption of your data
(loss of a row or insertion of a row that shouldn't be there).
Considering no one has even made a positive case why we should add it
(for instance, a use-case where it's necessary), I don't think it
belongs in the to-do list.
        regards, tom lane

Re: COPY with column headings

From

Andrew Dunstan

Date:

16 August 2004, 15:41:30


Tom Lane wrote:

>Bruce Momjian <pgman@candle.pha.pa.us> writes:
>  
>
>>>I certainly think it's worth considering, although I see Tom has 
>>>objected :-) . Say we have an option called FIRSTLINELABELS, then on 
>>>copy out it would write the headings on the first line, and on copy in 
>>>it could just ignore the first line (so it could work symmetrically).
>>>      
>>>
>
>  
>
>>Yes, that is what I was thinking.  Is this a TODO?  Sure it can be
>>misused but most copy options can be similarly misused.
>>    
>>
>
>Most copy options will result in obvious failures if misused.  I dislike
>this one because of the high risk of silent corruption of your data
>(loss of a row or insertion of a row that shouldn't be there).
>Considering no one has even made a positive case why we should add it
>(for instance, a use-case where it's necessary), I don't think it
>belongs in the to-do list.
>
>
>  
>

The use case is that it fits in with the way spreadsheets usually do 
data tables, and many will only allow you to export a whole worksheet 
(including the heading row) to CSV, not a part of one. Conversely, 
working with imported data tables will be harder if they lack headings.

cheers

andrew

Re: COPY with column headings

From

Tom Lane

Date:

16 August 2004, 16:04:26

Andrew Dunstan <andrew@dunslane.net> writes:
> The use case is that it fits in with the way spreadsheets usually do 
> data tables, and many will only allow you to export a whole worksheet 
> (including the heading row) to CSV, not a part of one. Conversely, 
> working with imported data tables will be harder if they lack headings.

Well, if we wanted to *require* a heading row in CSV mode, it would be
relatively foolproof.  What I don't like is the proposal for an
independent option; you've got a 50-50 chance of getting it wrong on
import, and that's too high odds for me.

Next question: are you imagining that the header row will actually have
any semantic significance on input?  Will we check the column names?
Will we be willing to rearrange the columns if the header row claims the
column order is different than the COPY command says?
        regards, tom lane

Re: COPY with column headings

From

Andrew Dunstan

Date:

16 August 2004, 16:27:49

Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>  
>
>>The use case is that it fits in with the way spreadsheets usually do 
>>data tables, and many will only allow you to export a whole worksheet 
>>(including the heading row) to CSV, not a part of one. Conversely, 
>>working with imported data tables will be harder if they lack headings.
>>    
>>
>
>Well, if we wanted to *require* a heading row in CSV mode, it would be
>relatively foolproof.  What I don't like is the proposal for an
>independent option; you've got a 50-50 chance of getting it wrong on
>import, and that's too high odds for me.
>  
>

No, we can't require it. Not all will have headers, and then we sure 
would skip a wanted row. And while my use case referred to spreadsheets, 
they are not the only sources/recipients of CSVs.

Your 50-50 chance assumes the user knows nothing about the data and uses 
the switch at random. I think we're entitled to assume the user knows 
something about their data and uses the switch according to what they 
have/want.

Right now, if you have a CSV with a header line you have to remove the 
line with something like 'sed 1d' before importing to PostgreSQL, and 
when importing from PostgreSQL you have to insert the headers manually 
after importing the CSV to your spreadsheet. That's got to be annoying 
and at least as error-prone as providing a switch to accomodate header 
lines.

>Next question: are you imagining that the header row will actually have
>any semantic significance on input?  Will we check the column names?
>Will we be willing to rearrange the columns if the header row claims the
>column order is different than the COPY command says?
>
>
>  
>

I'm leaning towards not trying to interpret the line on copy in. 
Certainly that is the minimalist way to approach this.

cheers

andrew

Re: COPY with column headings

From

"Dann Corbit"

Date:

16 August 2004, 16:29:58

Might be worthwhile to look at SQL*Server BCP format files:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/adminsq
l/ad_impt_bcp_9yat.asp
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref
/ts_ba-bz_4fec.asp

A TCL/TK front end would be especially nice.  Of course, being a lazy
slug, I am not volunteering to write any of this.  Just a thought in
case someone else is interested in doing it.

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
> Sent: Monday, August 16, 2004 12:04 PM
> To: Andrew Dunstan
> Cc: Bruce Momjian; PostgreSQL-development
> Subject: Re: [HACKERS] COPY with column headings
>
>
> Andrew Dunstan <andrew@dunslane.net> writes:
> > The use case is that it fits in with the way spreadsheets usually do
> > data tables, and many will only allow you to export a whole
> worksheet
> > (including the heading row) to CSV, not a part of one. Conversely,
> > working with imported data tables will be harder if they
> lack headings.
>
> Well, if we wanted to *require* a heading row in CSV mode, it
> would be relatively foolproof.  What I don't like is the
> proposal for an independent option; you've got a 50-50 chance
> of getting it wrong on import, and that's too high odds for me.
>
> Next question: are you imagining that the header row will
> actually have any semantic significance on input?  Will we
> check the column names? Will we be willing to rearrange the
> columns if the header row claims the column order is
> different than the COPY command says?
>
>             regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>               http://archives.postgresql.org

Re: COPY with column headings

From

Bruce Momjian

Date:

16 August 2004, 16:38:51

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > The use case is that it fits in with the way spreadsheets usually do 
> > data tables, and many will only allow you to export a whole worksheet 
> > (including the heading row) to CSV, not a part of one. Conversely, 
> > working with imported data tables will be harder if they lack headings.
> 
> Well, if we wanted to *require* a heading row in CSV mode, it would be
> relatively foolproof.  What I don't like is the proposal for an
> independent option; you've got a 50-50 chance of getting it wrong on
> import, and that's too high odds for me.

But we have other CVS options with even lower odds of success on a
random guess.  People have to know if their input file has headings,
period.  Telling people they can't have the option because they might
get it wrong seems strange to me.

> Next question: are you imagining that the header row will actually have
> any semantic significance on input?  Will we check the column names?
> Will we be willing to rearrange the columns if the header row claims the
> column order is different than the COPY command says?

I assume no semantic significance to header row values on input.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: COPY with column headings

From

CSN

Date:

17 August 2004, 01:33:07

That'd likely be useful. Either specify the column
names with COPY or \copy, or put them in the data file
and pass an option to the command to look for them.
The only time this could be a problem is if you forget
to tell COPY to look for the field names in the file
(AND specify the field names to COPY), and the values
for the field names are such that they don't cause
data errors (probably pretty slim chance).

I'd also really like to see the ability to specify
default values for fields not included in the data
file (field defaults in the table definition is
limited and doesn't fill the order when you need to
specify various foreign key ids, specific times,
etc.).

CSN


>>>>>>>>>>>>>>>>>>>
Someone just asked about a COPY capability to supply
the column headings
as the first line of the copy statement.  Do we want
to support
something like that?  Does anyone else want such functionality?

    
__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - Send 10MB messages!
http://promotions.yahoo.com/new_mail

Re: COPY with column headings

From

Bruce Momjian

Date:

18 August 2004, 23:36:34

Added to TODO:

>       o Allow COPY to optionally include column headings as the first

I know Tom didn't like it but there were others who did.

---------------------------------------------------------------------------

Andrew Dunstan wrote:
> 
> 
> Tom Lane wrote:
> 
> >Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >  
> >
> >>>I certainly think it's worth considering, although I see Tom has 
> >>>objected :-) . Say we have an option called FIRSTLINELABELS, then on 
> >>>copy out it would write the headings on the first line, and on copy in 
> >>>it could just ignore the first line (so it could work symmetrically).
> >>>      
> >>>
> >
> >  
> >
> >>Yes, that is what I was thinking.  Is this a TODO?  Sure it can be
> >>misused but most copy options can be similarly misused.
> >>    
> >>
> >
> >Most copy options will result in obvious failures if misused.  I dislike
> >this one because of the high risk of silent corruption of your data
> >(loss of a row or insertion of a row that shouldn't be there).
> >Considering no one has even made a positive case why we should add it
> >(for instance, a use-case where it's necessary), I don't think it
> >belongs in the to-do list.
> >
> >
> >  
> >
> 
> The use case is that it fits in with the way spreadsheets usually do 
> data tables, and many will only allow you to export a whole worksheet 
> (including the heading row) to CSV, not a part of one. Conversely, 
> working with imported data tables will be harder if they lack headings.
> 
> cheers
> 
> andrew
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073