Re: COPY enhancements - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: COPY enhancements |
Date | |
Msg-id | 603c8f070910080832o3b83a332p63575301a44c4c23@mail.gmail.com |
In response to | Re: COPY enhancements (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: COPY enhancements; Re: COPY enhancements; Re: COPY enhancements |
List | pgsql-hackers |
On Thu, Oct 8, 2009 at 11:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Lest there be any unclarity, I am NOT trying to shoot down this
>> feature with my laser-powered bazooka.
>
> Well, if you need somebody to do that

Well, I'm trying not to demoralize people who have put in hard work, however much it may not be usable. Still, your points are well taken. I did raise some of them (with a lot less technical detail) in my review of last night.

> So as far as I can see, the only form of COPY error handling that
> wouldn't be a cruel joke is to run a separate subtransaction for each
> row, and roll back the subtransaction on error. Of course the problems
> with that are (a) speed, (b) the 2^32 limit on command counter IDs
> would mean a max of 2^32 rows per COPY, which is uncomfortably small
> these days. Previous discussions of the problem have mentioned trying
> to batch multiple rows per subtransaction to alleviate both issues.
> Not easy of course, but that's why it's not been done yet. With a
> patch like this you'd also have (c) how to avoid rolling back the
> insertions into the logging table.

Yeah. I think it's going to be hard to make this work without having standalone transactions. One idea would be to start a subtransaction, insert tuples until one fails, then roll back the subtransaction and start a new one, and continue on until the error limit is reached. At the end, if the number of rollbacks is > 0, then roll back the final subtransaction also. This wouldn't have the property of getting the error-free rows into the table, but at least it would let you report all the errors in a single pass, hopefully without being gratingly slow. Subcommitting every single row is going to be really painful, especially after Hot Standby goes in and we have to issue a WAL record after every 64 subtransactions (AIUI).
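A rough sketch of the subtransaction idea above, using SQLite savepoints through Python's sqlite3 module as a stand-in for PostgreSQL subtransactions. This is not COPY code: the table `t`, its columns, and the helper name `copy_report_errors` are all made up for illustration, and the connection is assumed to be opened with `isolation_level=None` so that transactions are controlled explicitly.

```python
import sqlite3


def copy_report_errors(conn, rows, error_limit):
    """Hypothetical helper: try to load every row, collecting errors.

    Either all rows are loaded (no errors) or none are, but every
    failing row up to error_limit is reported in a single pass.
    """
    errors = []
    conn.execute("BEGIN")
    conn.execute("SAVEPOINT batch")  # first subtransaction
    for lineno, row in enumerate(rows, start=1):
        try:
            conn.execute("INSERT INTO t (id, qty) VALUES (?, ?)", row)
        except sqlite3.Error as exc:
            errors.append((lineno, str(exc)))
            # Throw away everything inserted in this subtransaction.
            conn.execute("ROLLBACK TO SAVEPOINT batch")
            conn.execute("RELEASE SAVEPOINT batch")
            if len(errors) >= error_limit:
                break
            conn.execute("SAVEPOINT batch")  # start a new one
    if errors:
        # At least one rollback happened, so discard the final
        # subtransaction (and the whole load) as well.
        conn.execute("ROLLBACK")
    else:
        conn.execute("COMMIT")
    return errors
```

One limitation of the scheme shows up even in this toy version: once an earlier subtransaction has been rolled back, a later row that conflicts only with a discarded row (for example a duplicate key) will not be reported.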
Another possible approach, which isn't perfect either, is the idea of allowing COPY to generate a single column of output of type text[]. That greatly reduces the number of possible error cases, and at least gets the data into the DB where you can hack on it. But it's still going to be painful for some use cases.

...Robert