Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) - Mailing list pgsql-hackers

From torikoshia
Subject Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date
Msg-id e78a86704ab2b2b967f0d674e5a72643@oss.nttdata.com
Whole thread Raw
In response to Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (Damir Belyalov <dam.bel07@gmail.com>)
Responses Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
List pgsql-hackers
On 2022-07-19 21:40, Damir Belyalov wrote:
> Hi!
> 
> Improved my patch by adding block subtransactions.
> The block size is determined by the REPLAY_BUFFER_SIZE parameter.
> I used the idea of a buffer for accumulating tuples in it.
> If we read REPLAY_BUFFER_SIZE rows without errors, the subtransaction
> will be committed.
> If we find an error, the subtransaction will rollback and the buffer
> will be replayed containing tuples.

Thanks for working on this!

I tested 0002-COPY-IGNORE_ERRORS.patch and faced an unexpected behavior.

I loaded 10000 rows which contained 1 wrong row.
I expected I could see 9999 rows after COPY, but just saw 999 rows.

Since when I changed MAX_BUFFERED_TUPLES from 1000 to other values, the 
number of loaded rows also changed, I imagine MAX_BUFFERED_TUPLES might 
be giving influence of this behavior.

```sh
$ cat /tmp/test10000.dat

1   aaa
2   aaa
3   aaa
4   aaa
5   aaa
6   aaa
7   aaa
8   aaa
9   aaa
10  aaa
11  aaa
...
9994    aaa
9995    aaa
9996    aaa
9997    aaa
9998    aaa
9999    aaa
xxx aaa
```

```SQL
=# CREATE TABLE test (id int, data text);

=# COPY test FROM '/tmp/test10000.dat' WITH (IGNORE_ERRORS);
WARNING:  COPY test, line 10000, column i: "xxx"
COPY 9999

=# SELECT COUNT(*) FROM test;
  count
-------
    999
(1 row)
```

BTW I may be overlooking it, but have you submit this proposal to the 
next CommitFest?

https://commitfest.postgresql.org/39/


-- 
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Marina Polyakova
Date:
Subject: Re: ICU for global collation
Next
From: Christoph Berg
Date:
Subject: pg_receivewal and SIGTERM