Re: COPY and default values - Mailing list pgsql-patches
From | Brent Verner |
---|---|
Subject | Re: COPY and default values |
Date | |
Msg-id | 20020528035908.GA62382@rcfile.org Whole thread Raw |
In response to | Re: COPY and default values (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-patches |
[2002-05-27 19:28] Tom Lane said: | Neil Conway <nconway@klamath.dyndns.org> writes: | > Can you elaborate on exactly how you'd like to see COPY's behavior | > changed? What's the rationale for disallowing missing fields in | > COPY input? | | Robustness. We've seen problem reports in the past when COPY got out of | sync with the data stream because of loss or insertion of a newline (eg, | due to faulty quoting logic in an application; this scenario is not as | improbable as you might think). Rejecting lines that don't seem to have | the right number of fields would go a long way towards detecting that | sort of mistake. And, since the existing behavior with unexpected | numbers of fields has never been documented, no one feels any strong | compunction about changing it. | | This has been hashed out in several prior threads, and in fact I think | Brent Verner has a nearly-ready-to-go patch implementing the agreed-on | behavior. Now that I think about it, your patch is heading off in a | quite different direction from what was agreed to: we were never | intending that omitting trailing fields would be the cue for inserting | default values. Without a way to specify a target column list, that's | far too restrictive; and with it, it's unnecessary. attached is an updated patch (copy.colspec.20020527.diff), against HEAD, implementing the following COPY syntax COPY sleep (n,o,w) FROM stdin; where (n,o,w) may be a subset of columns in the "sleep" table. If a column is not specified, and has an attribute default, the default value will be evaluated and used. There is no support for COPY (<attlist>) TO yet, but that should not be too hairy. A second patch for pg_dump to support this syntax is attached, but I have not even tried to apply it, sorry. The pg_dump patch is against a circa 7.2-RELEASE cvs head. I attach it in case someone else wants to help wrap this up. I have been death-marching for 3-4 months and I'm not sure I see the top of the hill yet... How it does what it does. I modified gram.y to add an opt_column_list to CopyStmt. This List is sent down to DoCopy. If no attlist is present, I build an attlist from the relation's available attributes, enabling the old behavior. Once inside CopyFrom, the List attlist is scanned to build two arrays, one representing the actual index of the specified column, and another containing available default exprs. Attributes are read from the input stream and assigned to their appropriate location in the new tuple. If there are any defaults available, they are then evaluated and placed in the new tuple arrays. That's really it. I'm not _at all_ confident that there is not a better way to do this, but this is what I've done. The patch will elog(ERROR) in DoCopy, before Copy(To|From) is entered, if a column list is specified that cannot be handled (non-existent columns, duplicate columns). Should I be waiting until I'm _in_ Copy(To|From) to elog(ERROR) out? I'll try to add some commentary to that snakey code over the next week, and will resubmit an updated patch. Anyone questions regarding this patch should be sent 'To:' me and 'CC:' the list, as I haven't had the luxury of reading the good list lately :-(. comments/brain-death-fixes are welcome. cheers. b -- "Develop your talent, man, and leave the world something. Records are really gifts from people. To think that an artist would love you enough to share his music with anyone is a beautiful thing." -- Duane Allman
Attachment
pgsql-patches by date: