Thread: COPY LOCK for WAL bypass
Following patch implements COPY ... FROM ... LOCK as discussed earlier this year on these threads: http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php http://archives.postgresql.org/pgsql-hackers/2005-06/msg00075.php The purpose of the new command is to make an explicit request to run COPY without producing WAL records (i.e. no logging), so as to improve the performance of data loads. (This is the first of a number of COPY performance optimizations, discussed on -hackers). Default COPY is unchanged. LOCK option takes an EXCLUSIVE lock (but perhaps that should be a SHARE ROW EXCLUSIVE?), allowing it to block out CREATE INDEX builds and VACUUM. LOCK option will also cause writing of WAL records to be skipped when XLogArchivingActive() and there are no indexes. The implementation refactors the code used by CTAS for syncing the data to disk once command is complete; COPY and CTAS now both use that code. COPY .. LOCK doesn't write any XLog records as was previously suggested. My train of thought: After some thought, no other heap-based xlog action would leave the table in a consistent state after failure. Nobody wants to see a random single row in the table. I looked into UPDATEing the last row loaded to generate an xlog rec but it would be difficult to do that without being horribly kludgy. I looked at adding a new xlog action, but there is only one slot left for a heap-based xlog action, so that seemed unwise. I wrote this using an RM_XLOG xlrec, but that doesn't actually trigger a commit write (I discover). I've simply set a flag to tell the transaction to record the commit anyway. That was already there in heapam.c anyway, but just for temp relations; I've changed the name of the variable to indicate what it does now, across a number of files. (It is also arguable that I should implement a WAL record that truncates the file back down to the starting size, in the event of a failure. I'm not sure where we were on that; there seem to be comments both in favour and against that idea. I can see the use for that, so I'll be happy to add that also, if we agree there is no danger.) I've added a few lock options onto the copy.sql test script, but not added (yet) a full suite of testing. No docs, yet. Short performance test shown below for 10^6 rows, one col table. 1. Normal COPY 4.5s 11.4s 6.0s 6.1s 2. COPY LOCK 3.0s 2.7s 2.8s 2.7s with postgresql.conf all default apart from: checkpoint_segments=30 This test was an "all in cache" test. The improvement is substantial, but the numbers above are best case, IMHO: I would expect only 10-40% improvement for larger loads in the general case. Short output shown below, with checkpoint_segments=3, so timings for the standard non-LOCK COPY probably include checkpoint time also. ======================================================= postgres=# create table ctest as select generate_series(1,1000000)::integer as col1; SELECT postgres=# copy ctest to '/usr/local/pgsql/ctest.data'; COPY postgres=# truncate ctest; TRUNCATE TABLE Time: 41.343 ms postgres=# copy ctest from '/usr/local/pgsql/ctest.data'; COPY Time: 7111.205 ms postgres=# truncate ctest; TRUNCATE TABLE Time: 23.175 ms postgres=# copy ctest from '/usr/local/pgsql/ctest.data' lock; COPY Time: 2992.482 ms postgres=# truncate ctest; TRUNCATE TABLE Time: 8.306 ms postgres=# copy ctest from '/usr/local/pgsql/ctest.data'; COPY Time: 7433.166 ms Best Regards, Simon Riggs
Attachment
On Sat, 2005-12-10 at 12:07 +0000, Simon Riggs wrote: > Following patch implements COPY ... FROM ... LOCK Patch now updated so that it includes an additional optimization of COPY, so that WAL will not be written in the transaction that created the table. This now gives two fast paths for COPY: 1) COPY LOCK 2) COPY in same transaction (e.g. reloading a pg_dump) Patch passes make check on cvstip. No docs yet, but let me know if this is OK and I'll work on them. [Other copied in from the related patch thread on Single-Transaction Utility options. With this new COPY optimization the --single-transaction option will considerably increase performance.] Performance tests shown on previous post for this thread. Best Regards, Simon Riggs
Attachment
simon@2ndquadrant.com (Simon Riggs) writes: > On Sat, 2005-12-10 at 12:07 +0000, Simon Riggs wrote: >> Following patch implements COPY ... FROM ... LOCK > > Patch now updated so that it includes an additional optimization of > COPY, so that WAL will not be written in the transaction that created > the table. > > This now gives two fast paths for COPY: > 1) COPY LOCK > 2) COPY in same transaction (e.g. reloading a pg_dump) I presume that if this doesn't go into WAL, that means that this kind of update wouldn't play with PITR, right? That's presumably something that otta be pretty carefully documented :-). -- select 'cbbrowne' || '@' || 'cbbrowne.com'; http://cbbrowne.com/info/nonrdbms.html Microsoft Outlook: Deploying Viruses Has Never Been This Easy!
On Sun, 2005-12-18 at 22:03 -0500, Chris Browne wrote: > simon@2ndquadrant.com (Simon Riggs) writes: > > On Sat, 2005-12-10 at 12:07 +0000, Simon Riggs wrote: > >> Following patch implements COPY ... FROM ... LOCK > > > > Patch now updated so that it includes an additional optimization of > > COPY, so that WAL will not be written in the transaction that created > > the table. > > > > This now gives two fast paths for COPY: > > 1) COPY LOCK > > 2) COPY in same transaction (e.g. reloading a pg_dump) > > I presume that if this doesn't go into WAL, that means that this kind > of update wouldn't play with PITR, right? You're right. PITR is designed for normal production use, rather than initial loading. It is also fairly common to turn PITR off permanently in larger data warehouses, which is also where these optimizations are aimed. I'll take the hint and write the docs then. :-) Best Regards, Simon Riggs
On Mon, Dec 19, 2005 at 07:51:54AM +0000, Simon Riggs wrote: > On Sun, 2005-12-18 at 22:03 -0500, Chris Browne wrote: > > simon@2ndquadrant.com (Simon Riggs) writes: > > > On Sat, 2005-12-10 at 12:07 +0000, Simon Riggs wrote: > > >> Following patch implements COPY ... FROM ... LOCK > > > > > > Patch now updated so that it includes an additional optimization of > > > COPY, so that WAL will not be written in the transaction that created > > > the table. > > > > > > This now gives two fast paths for COPY: > > > 1) COPY LOCK > > > 2) COPY in same transaction (e.g. reloading a pg_dump) > > > > I presume that if this doesn't go into WAL, that means that this kind > > of update wouldn't play with PITR, right? > > You're right. > > PITR is designed for normal production use, rather than initial loading. > It is also fairly common to turn PITR off permanently in larger data > warehouses, which is also where these optimizations are aimed. Hrm... I'd say we need an option to disable the fast-copy then, in case you wanted the copy to make it into PITR. Or perhaps we just disallow the fast-copy when PITR is in use. I believe that's what other databases do... -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Simon Riggs wrote: > On Sun, 2005-12-18 at 22:03 -0500, Chris Browne wrote: > > simon@2ndquadrant.com (Simon Riggs) writes: > > > On Sat, 2005-12-10 at 12:07 +0000, Simon Riggs wrote: > > >> Following patch implements COPY ... FROM ... LOCK > > > > > > Patch now updated so that it includes an additional optimization of > > > COPY, so that WAL will not be written in the transaction that created > > > the table. > > > > > > This now gives two fast paths for COPY: > > > 1) COPY LOCK > > > 2) COPY in same transaction (e.g. reloading a pg_dump) > > > > I presume that if this doesn't go into WAL, that means that this kind > > of update wouldn't play with PITR, right? > > You're right. > > PITR is designed for normal production use, rather than initial loading. > It is also fairly common to turn PITR off permanently in larger data > warehouses, which is also where these optimizations are aimed. > > I'll take the hint and write the docs then. :-) I assume in addition to a documentation mention, you will disable this feature when PITR is on, right? (Rather than just document something that is unsafe, we disable it.) -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Jim C. Nasby wrote: > On Mon, Dec 19, 2005 at 07:51:54AM +0000, Simon Riggs wrote: > > On Sun, 2005-12-18 at 22:03 -0500, Chris Browne wrote: > > > simon@2ndquadrant.com (Simon Riggs) writes: > > > > On Sat, 2005-12-10 at 12:07 +0000, Simon Riggs wrote: > > > >> Following patch implements COPY ... FROM ... LOCK > > > > > > > > Patch now updated so that it includes an additional optimization of > > > > COPY, so that WAL will not be written in the transaction that created > > > > the table. > > > > > > > > This now gives two fast paths for COPY: > > > > 1) COPY LOCK > > > > 2) COPY in same transaction (e.g. reloading a pg_dump) > > > > > > I presume that if this doesn't go into WAL, that means that this kind > > > of update wouldn't play with PITR, right? > > > > You're right. > > > > PITR is designed for normal production use, rather than initial loading. > > It is also fairly common to turn PITR off permanently in larger data > > warehouses, which is also where these optimizations are aimed. > > Hrm... I'd say we need an option to disable the fast-copy then, in case > you wanted the copy to make it into PITR. Or perhaps we just disallow > the fast-copy when PITR is in use. I believe that's what other databases > do... Right, just disable with a clear error message. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Wed, 2005-12-21 at 08:18 -0500, Bruce Momjian wrote: > Jim C. Nasby wrote: > > On Mon, Dec 19, 2005 at 07:51:54AM +0000, Simon Riggs wrote: > > > On Sun, 2005-12-18 at 22:03 -0500, Chris Browne wrote: > > > > simon@2ndquadrant.com (Simon Riggs) writes: > > > > > On Sat, 2005-12-10 at 12:07 +0000, Simon Riggs wrote: > > > > >> Following patch implements COPY ... FROM ... LOCK > > > > > > > > > > Patch now updated so that it includes an additional optimization of > > > > > COPY, so that WAL will not be written in the transaction that created > > > > > the table. > > > > > > > > > > This now gives two fast paths for COPY: > > > > > 1) COPY LOCK > > > > > 2) COPY in same transaction (e.g. reloading a pg_dump) > > > > > > > > I presume that if this doesn't go into WAL, that means that this kind > > > > of update wouldn't play with PITR, right? > > > > > > You're right. > > > > > > PITR is designed for normal production use, rather than initial loading. > > > It is also fairly common to turn PITR off permanently in larger data > > > warehouses, which is also where these optimizations are aimed. > > > > Hrm... I'd say we need an option to disable the fast-copy then, in case > > you wanted the copy to make it into PITR. Or perhaps we just disallow > > the fast-copy when PITR is in use. I believe that's what other databases > > do... > > Right, just disable with a clear error message. > Forgive me if my earlier reply was not clear: This patch works just the same way CREATE INDEX and CREATE TABLE AS SELECT already works: - if the server has archive_command set, WAL is written - if archive_command is not set then (assuming other conditions apply) the writing WAL will be bypassed for performance. There is no error-message generated, by COPY LOCK nor the other two previously mentioned commands. Writing docs now. Best Regards, Simon Riggs
On Wed, 2005-12-21 at 16:23 +0000, Simon Riggs wrote: > Writing docs now. Here's the docs... Hopefully that explains everything a little better....but this doc patch contains a ref to the as yet unapplied --single-transaction patch, but also explains why that patch exists also. [Patch via doc/src/sgml/ref rather than from tree top] Best Regards, Simon Riggs
Attachment
Simon Riggs wrote: > > Right, just disable with a clear error message. > > > > Forgive me if my earlier reply was not clear: This patch works just the > same way CREATE INDEX and CREATE TABLE AS SELECT already works: > - if the server has archive_command set, WAL is written > - if archive_command is not set then (assuming other conditions apply) > the writing WAL will be bypassed for performance. > There is no error-message generated, by COPY LOCK nor the other two > previously mentioned commands. Sounds good. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073