Re: [PATCH] Hooks at XactCommand level - Mailing list pgsql-hackers
From | Gilles Darold |
---|---|
Subject | Re: [PATCH] Hooks at XactCommand level |
Date | |
Msg-id | d3ca6093-156d-e8c3-fb92-bd892ca905ec@darold.net Whole thread Raw |
In response to | Re: [PATCH] Hooks at XactCommand level (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [PATCH] Hooks at XactCommand level
|
List | pgsql-hackers |
Le 13/08/2021 à 11:58, Andres Freund a écrit : > Hi, > > On 2021-08-10 10:12:26 +0200, Gilles Darold wrote: >> Sorry for the response delay. I have though about adding this odd hook to be >> able to implement this feature through an extension because I don't think >> this is something that should be implemented in core. There were also >> patches proposals which were all rejected. >> >> We usually implement the feature at client side which is imo enough for the >> use cases. But the problem is that this a catastrophe in term of >> performances. I have done a small benchmark to illustrate the problem. This >> is a single process client on the same host than the PG backend. >> >> For 10,000 tuples inserted with 50% of failures and rollback at statement >> level handled at client side: >> >> Expected: 5001, Count: 5001 >> DML insert took: 13 wallclock secs ( 0.53 usr + 0.94 sys = 1.47 >> CPU) > Something seems off here. This suggests every insert took 2.6ms. That > seems awfully long, unless your network latency is substantial. I did a > quick test implementing this in the naive-most way in pgbench, and I get > better times - and there's *lots* of room for improvement. > > I used a pgbench script that sent the following: > BEGIN; > SAVEPOINT insert_fail; > INSERT INTO testinsert(data) VALUES (1); > ROLLBACK TO SAVEPOINT insert_fail; > SAVEPOINT insert_success; > INSERT INTO testinsert(data) VALUES (1); > RELEASE SAVEPOINT insert_success; > {repeat 5 times} > COMMIT; > > I.e. 5 failing and 5 succeeding insertions wrapped in one transaction. I > get >2500 tps, i.e. > 25k rows/sec. And it's not hard to optimize that > further - the {ROLLBACK TO,RELEASE} SAVEPOINT; SAVEPOINT; INSERT can be > sent in one roundtrip. That gets me to somewhere around 40k rows/sec. > > > BEGIN; > > \startpipeline > SAVEPOINT insert_fail; > INSERT INTO testinsert(data) VALUES (1); > \endpipeline > > \startpipeline > ROLLBACK TO SAVEPOINT insert_fail; > SAVEPOINT insert_success; > INSERT INTO testinsert(data) VALUES (1); > \endpipeline > > \startpipeline > RELEASE SAVEPOINT insert_success; > SAVEPOINT insert_fail; > INSERT INTO testinsert(data) VALUES (1); > \endpipeline > > \startpipeline > ROLLBACK TO SAVEPOINT insert_fail; > SAVEPOINT insert_success; > INSERT INTO testinsert(data) VALUES (1); > \endpipeline > > {repeat last two blocks three times} > COMMIT; > > Greetings, > > Andres Freund I have written a Perl script to mimic what I have found in an Oracle batch script to import data in a table. I had this use case in a recent migration the only difference is that the batch was written in Java. $dbh->do("BEGIN") or die "FATAL: " . $dbh->errstr . "\n"; my $start = new Benchmark; my $sth = $dbh->prepare("INSERT INTO t1 VALUES (?, ?)"); exit 1 if (not defined $sth); for (my $i = 0; $i <= 10000; $i++) { $dbh->do("SAVEPOINT foo") or die "FATAL: " . $dbh->errstr . "\n"; # Generate a duplicate key each two row inserted my $val = $i; $val = $i-1 if ($i % 2 != 0); unless ($sth->execute($val, 'insert '.$i)) { $dbh->do("ROLLBACK TO foo") or die "FATAL: " . $dbh->errstr . "\n"; } else { $dbh->do("RELEASE foo") or die "FATAL: " . $dbh->errstr . "\n"; } } $sth->finish(); my $end = new Benchmark; $dbh->do("COMMIT;"); my $td = timediff($end, $start); print "DML insert took: " . timestr($td) . "\n"; The timing reported are from my personal computer, there is no network latency, it uses localhost. Anyway, the objective was not to bench the DML throughput but the overhead of the rollback at statement level made at client side versus server side. I guess that you might have the same speed gain around x3 to x5 or more following the number of tuples? The full script can be found here https://github.com/darold/pg_statement_rollbackv2/blob/main/test/batch_script_example.pl Cheers, -- Gilles Darold
pgsql-hackers by date: