Thread: Microvacuum support for Hash Index
Hi All,

I have added microvacuum support for the hash index access method, and attached is the v1 patch for the same. The patch basically takes care of the following things:

1. Firstly, it changes the marking of dead tuples from 'tuple-at-a-time' to 'page-at-a-time' during a hash index scan. For this we accumulate the heap TIDs and offsets of all the hash index tuples flagged by kill_prior_tuple during the scan, and then mark all the accumulated tuples as LP_DEAD either while stepping from one page to another (for scans in both the forward and backward direction), at the end of the hash index scan, or during rescan.

2. Secondly, when inserting a tuple into the hash index, if not enough space is found on the current page, it ensures that we first clean any dead tuples found on the current hash index page before moving to the next page in the bucket chain or going for a bucket split. This increases page reusability and reduces the number of page splits, thereby reducing the overall size of the hash index.

I have compared the hash index size with and without my patch (microvacuum_hash_index_v1.patch, attached with this mail) on a high-end machine at various scale factors, and the results are shown below. For this test, I created a hash index (pgbench_accounts_aid) on the aid column of the 'pgbench_accounts' table instead of the primary key, and the results shown in the table below are for that setup. The patch (pgbench.patch) with these changes is also attached with this mail. Moreover, I am using my own script file (file_hash_kill_prior_tuple) for updating the index column with the pgbench read-write command. The script file 'file_hash_kill_prior_tuple' is also attached with this mail.
Here are some initial test results showing the benefit of this patch:

postgresql.conf and pgbench settings:
autovacuum = off
client count = 64
run time duration = 15 mins
./pgbench -c $threads -j $threads -T 900 postgres -f ~/file_hash_kill_prior_tuple

Scale Factor | hash index size @ start | HEAD    | HEAD + Patch
10           | 32 MB                   | 579 MB  | 158 MB
50           | 128 MB                  | 630 MB  | 350 MB
100          | 256 MB                  | 1255 MB | 635 MB
300          | 1024 MB                 | 2233 MB | 1093 MB

As shown in the above results, at scale factor 10 the hash index size is reduced by almost 4 times, whereas at scale factors 50, 100 and 300 it is reduced by about half with my patch. This shows that we can reduce the hash index size to a good extent with this patch.

System specifications:
---------------------------------
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                128
On-line CPU(s) list:   0-127
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             8
NUMA node(s):          8
Vendor ID:             GenuineIntel

Note: The patch (microvacuum_hash_index_v1.patch) is prepared on top of concurrent_hash_index_v8.patch [1] and wal_hash_index_v5.1.patch [2] for hash index.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BX%3D8sUd1UCZDZnE3D9CGi9kw%2Bkjxp2Tnw7SX5w8pLBNw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAA4eK1KE%3D%2BkkowyYD0vmch%3Dph4ND3H1tViAB%2B0cWTHqjZDDfqg%40mail.gmail.com
Attachment
On Mon, Oct 24, 2016 at 2:21 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> Hi All,
>
> I have added a microvacuum support for hash index access method and
> attached is the v1 patch for the same.

This is an important functionality for hash index as we already have the same functionality for other types of indexes like btree.

> The patch basically takes care of the following things:
>
> 1. Firstly, it changes the marking of dead tuples from
> 'tuple-at-a-time' to 'page-at-a-time' during hash index scan. For this
> we accumulate the heap tids and offset of all the hash index tuples if
> it is pointed by kill_prior_tuple during scan and then mark all
> accumulated tids as LP_DEAD either while stepping from one page to
> another (assuming the scan in both forward and backward direction) or
> during end of the hash index scan or during rescan.
>
> 2. Secondly, when inserting tuple into hash index table, if not enough
> space is found on a current page then it ensures that we first clean
> the dead tuples if found in the current hash index page before moving
> to the next page in a bucket chain or going for a bucket split. This
> basically increases the page reusability and reduces the number of
> page splits, thereby reducing the overall size of hash index table.

Few comments on the patch:

1.
+static void
+hash_xlog_vacuum_one_page(XLogReaderState *record)
+{
+   XLogRecPtr lsn = record->EndRecPtr;
+   xl_hash_vacuum *xldata = (xl_hash_vacuum *) XLogRecGetData(record);
+   Buffer bucketbuf = InvalidBuffer;
+   Buffer buffer;
+   Buffer metabuf;
+   Page page;
+   XLogRedoAction action;

While replaying the delete/vacuum record on standby, it can conflict with some already running queries. Basically the replay can remove some row which can be visible on standby. You need to resolve conflicts similar to what we do in btree delete records (refer btree_xlog_delete).

2.
+   /*
+    * Write-lock the meta page so that we can decrement
+    * tuple count.
+    */
+   _hash_chgbufaccess(rel, metabuf, HASH_NOLOCK, HASH_WRITE);
+
+   _hash_vacuum_one_page(rel, metabuf, buf, bucket_buf,
+                         (buf == bucket_buf) ? true : false);
+
+   _hash_chgbufaccess(rel, metabuf, HASH_READ, HASH_NOLOCK);

It seems here the meta page lock is acquired for a longer duration than required, and it is also not required when there are no deletable items on the page. You can take the metapage lock just before decrementing the count.

3.
Assert(offnum <= maxoff);
+

Spurious space. There are some other similar spurious whitespace changes in the patch; remove them as well.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Hi Amit,

Thanks for showing your interest and reviewing my patch. I have started looking into your review comments. I will share the updated patch in a day or two.

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Hi,

> While replaying the delete/vacuum record on standby, it can conflict
> with some already running queries. Basically the replay can remove
> some row which can be visible on standby. You need to resolve
> conflicts similar to what we do in btree delete records (refer
> btree_xlog_delete).

Agreed. Thanks for raising this point. I have taken care of it in the attached v2 patch.

> +   /*
> +    * Write-lock the meta page so that we can decrement
> +    * tuple count.
> +    */
> +   _hash_chgbufaccess(rel, metabuf, HASH_NOLOCK, HASH_WRITE);
> +
> +   _hash_vacuum_one_page(rel, metabuf, buf, bucket_buf,
> +                         (buf == bucket_buf) ? true : false);
> +
> +   _hash_chgbufaccess(rel, metabuf, HASH_READ, HASH_NOLOCK);
>
> It seems here meta page lock is acquired for duration more than
> required and also it is not required when there are no deletable items
> on page. You can take the metapage lock before decrementing the count.

Ok. Corrected. Please refer to the attached v2 patch.

> Spurious space. There are some other similar spurious white space
> changes in patch, remove them as well.

Corrected. Please refer to the attached v2 patch.
Attachment
Hi,

On 11/02/2016 01:38 AM, Ashutosh Sharma wrote:
>> While replaying the delete/vacuum record on standby, it can conflict
>> with some already running queries. Basically the replay can remove
>> some row which can be visible on standby. You need to resolve
>> conflicts similar to what we do in btree delete records (refer
>> btree_xlog_delete).
>
> Agreed. Thanks for putting this point. I have taken care of it in the
> attached v2 patch.

Some initial comments.

_hash_vacuum_one_page:

+ END_CRIT_SECTION();
+ _hash_chgbufaccess(rel, metabuf, HASH_READ, HASH_NOLOCK);

The _hash_chgbufaccess() needs a comment.

You also need a place where you call pfree for so->killedItems - maybe in hashkillitems().

Best regards,
Jesper
Hi Jesper,

> Some initial comments.
>
> _hash_vacuum_one_page:
>
> + END_CRIT_SECTION();
> + _hash_chgbufaccess(rel, metabuf, HASH_READ, HASH_NOLOCK);
>
> The _hash_chgbufaccess() needs a comment.
>
> You also need a place where you call pfree for so->killedItems - maybe in
> hashkillitems().

Thanks for reviewing this patch. I would like to update you that this patch has a dependency on the patches for concurrent hash index and WAL logging in hash index. So, until those two patches for hash index are stable, I won't be able to share the next version of the patch for supporting microvacuum in hash index.

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On 11/11/2016 12:11 AM, Ashutosh Sharma wrote: > Thanks for reviewing this patch. I would like to update you that this > patch has got dependency on a patch for concurrent hash index and WAL > log in hash index. So, till these two patches for hash index are not > stable I won't be able to share you a next version of patch for > supporting microvacuum in hash index. > As the concurrent hash index patch was committed in 6d46f4 this patch needs a rebase. I have moved this submission to the next CF. Thanks for working on this ! Best regards, Jesper
On 11/11/2016 12:11 AM, Ashutosh Sharma wrote: > Hi Jesper, > >> Some initial comments. >> >> _hash_vacuum_one_page: >> >> + END_CRIT_SECTION(); >> + _hash_chgbufaccess(rel, metabuf, HASH_READ, HASH_NOLOCK); >> >> The _hash_chgbufaccess() needs a comment. >> >> You also need a place where you call pfree for so->killedItems - maybe in >> hashkillitems(). > > Thanks for reviewing this patch. I would like to update you that this > patch has got dependency on a patch for concurrent hash index and WAL > log in hash index. So, till these two patches for hash index are not > stable I won't be able to share you a next version of patch for > supporting microvacuum in hash index. > This can be rebased on the WAL v7 patch [1]. In addition to the previous comments you need to take commit 7819ba into account. [1] https://www.postgresql.org/message-id/CAA4eK1%2BdmGNTFMnLO4EbOWJDHUq%3D%2Ba2L8T%3D72ifXeh-Kd8HOsg%40mail.gmail.com Best regards, Jesper
Hi,

> This can be rebased on the WAL v7 patch [1]. In addition to the previous
> comments you need to take commit 7819ba into account.

Attached is the v3 patch rebased on PostgreSQL HEAD and the WAL v7 patch. It also takes care of all the previous comments from Jesper [1]. Also, I have changed the status of this patch to "Needs review" for this commit-fest.

[1] https://www.postgresql.org/message-id/a751842f-2aed-9f2e-104c-34cfe06bfbe2%40redhat.com

With Regards,
Ashutosh Sharma.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Hi Ashutosh,

On 01/04/2017 06:13 AM, Ashutosh Sharma wrote:
> Attached is the v3 patch rebased on postgreSQL HEAD and WAL v7 patch.
> It also takes care of all the previous comments from Jesper - [1].

With an --enable-cassert build (master / WAL v7 / MV v3) and

-- ddl.sql --
CREATE TABLE test AS SELECT generate_series(1, 10) AS id, 0 AS val;
CREATE INDEX IF NOT EXISTS idx_id ON test USING hash (id);
CREATE INDEX IF NOT EXISTS idx_val ON test USING hash (val);
ANALYZE;
-- ddl.sql --

-- test.sql --
\set id random(1,10)
\set val random(0,10)
BEGIN;
UPDATE test SET val = :val WHERE id = :id;
COMMIT;
-- test.sql --

using

pgbench -M prepared -c 10 -j 10 -T 600 -f test.sql test

crashes after a few minutes with

TRAP: FailedAssertion("!(LWLockHeldByMeInMode(((LWLock*) (&(bufHdr)->content_lock)), LW_EXCLUSIVE))", File: "bufmgr.c", Line: 3781)

BTW, better rename 'hashkillitems' to '_hash_kill_items' to follow the naming convention in hash.h

Best regards,
Jesper
Hi,

> using pgbench -M prepared -c 10 -j 10 -T 600 -f test.sql test
>
> crashes after a few minutes with
>
> TRAP: FailedAssertion("!(LWLockHeldByMeInMode(((LWLock*)
> (&(bufHdr)->content_lock)), LW_EXCLUSIVE))", File: "bufmgr.c", Line: 3781)

Attached v4 patch fixes this assertion failure.

> BTW, better rename 'hashkillitems' to '_hash_kill_items' to follow the
> naming convention in hash.h

okay, I have renamed 'hashkillitems' to '_hash_kill_items'. Please check the attached v4 patch.

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Attachment
Hi Ashutosh,

On 01/06/2017 12:54 AM, Ashutosh Sharma wrote:
>> using pgbench -M prepared -c 10 -j 10 -T 600 -f test.sql test
>>
>> crashes after a few minutes with
>>
>> TRAP: FailedAssertion("!(LWLockHeldByMeInMode(((LWLock*)
>> (&(bufHdr)->content_lock)), LW_EXCLUSIVE))", File: "bufmgr.c", Line: 3781)
>
> Attached v4 patch fixes this assertion failure.

Yes, that fixes the problem. However (master / WAL v7 / MV v4),

--- ddl.sql ---
CREATE TABLE test AS SELECT generate_series(1, 10) AS id, 0 AS val;
CREATE INDEX IF NOT EXISTS idx_id ON test USING hash (id);
CREATE INDEX IF NOT EXISTS idx_val ON test USING hash (val);
ANALYZE;
--- ddl.sql ---

--- test.sql ---
\set id random(1,10)
\set val random(0,10)
BEGIN;
DELETE FROM test WHERE id = :id;
INSERT INTO test VALUES (:id, :val);
COMMIT;
--- test.sql ---

gives

#9  0x000000000098a83e in elog_finish (elevel=20, fmt=0xb6ea92 "incorrect local pin count: %d") at elog.c:1378
#10 0x00000000007f0b33 in LockBufferForCleanup (buffer=1677) at bufmgr.c:3605
#11 0x0000000000549390 in XLogReadBufferForRedoExtended (record=0x2afced8, block_id=1 '\001', mode=RBM_NORMAL, get_cleanup_lock=1 '\001', buf=0x7ffe3ee27c8c) at xlogutils.c:394
#12 0x00000000004c5026 in hash_xlog_vacuum_one_page (record=0x2afced8) at hash_xlog.c:1109
#13 0x00000000004c5547 in hash_redo (record=0x2afced8) at hash_xlog.c:1214
#14 0x000000000053a361 in StartupXLOG () at xlog.c:6975
#15 0x00000000007a4ca0 in StartupProcessMain () at startup.c:216

on the slave instance in a master-slave setup.

Also, the src/backend/access/README file should be updated with a description of the changes which this patch provides.

Best regards,
Jesper
Hi Jesper,

> However (master / WAL v7 / MV v4),
>
> --- ddl.sql ---
> CREATE TABLE test AS SELECT generate_series(1, 10) AS id, 0 AS val;
> CREATE INDEX IF NOT EXISTS idx_id ON test USING hash (id);
> CREATE INDEX IF NOT EXISTS idx_val ON test USING hash (val);
> ANALYZE;
> --- ddl.sql ---
>
> --- test.sql ---
> \set id random(1,10)
> \set val random(0,10)
> BEGIN;
> DELETE FROM test WHERE id = :id;
> INSERT INTO test VALUES (:id, :val);
> COMMIT;
> --- test.sql ---
>
> gives
>
> #9  0x000000000098a83e in elog_finish (elevel=20, fmt=0xb6ea92 "incorrect
> local pin count: %d") at elog.c:1378
> #10 0x00000000007f0b33 in LockBufferForCleanup (buffer=1677) at
> bufmgr.c:3605
> #11 0x0000000000549390 in XLogReadBufferForRedoExtended (record=0x2afced8,
> block_id=1 '\001', mode=RBM_NORMAL, get_cleanup_lock=1 '\001',
> buf=0x7ffe3ee27c8c) at xlogutils.c:394
> #12 0x00000000004c5026 in hash_xlog_vacuum_one_page (record=0x2afced8) at
> hash_xlog.c:1109
> #13 0x00000000004c5547 in hash_redo (record=0x2afced8) at hash_xlog.c:1214
> #14 0x000000000053a361 in StartupXLOG () at xlog.c:6975
> #15 0x00000000007a4ca0 in StartupProcessMain () at startup.c:216
>
> on the slave instance in a master-slave setup.

Thanks for reporting this problem. It is basically coming because I forgot to unpin the bucketbuf in hash_xlog_vacuum_one_page(). Please find the attached v5 patch that fixes the issue.

> Also, the src/backend/access/README file should be updated with a
> description of the changes which this patch provides.

okay, I have updated the insertion algorithm in the README file.

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Attachment
Hi Ashutosh,

On 01/10/2017 05:24 AM, Ashutosh Sharma wrote:
> Thanks for reporting this problem. It is basically coming because i
> forgot to unpin the bucketbuf in hash_xlog_vacuum_one_page(). Please
> find the attached v5 patch that fixes the issue.

The crash is now fixed, but the

--- test.sql ---
\set id random(1,10)
\set val random(0,10)
BEGIN;
UPDATE test SET val = :val WHERE id = :id;
COMMIT;
--- test.sql ---

case gives

client 6 aborted in command 3 of script 0; ERROR:  deadlock detected
DETAIL:  Process 14608 waits for ShareLock on transaction 1444620; blocked by process 14610.
Process 14610 waits for ShareLock on transaction 1444616; blocked by process 14608.
HINT:  See server log for query details.
CONTEXT:  while rechecking updated tuple (12,3) in relation "test"
...

using pgbench -M prepared -c 10 -j 10 -T 300 -f test.sql test

Best regards,
Jesper
Hi Ashutosh, On 01/10/2017 08:40 AM, Jesper Pedersen wrote: > On 01/10/2017 05:24 AM, Ashutosh Sharma wrote: >> Thanks for reporting this problem. It is basically coming because i >> forgot to unpin the bucketbuf in hash_xlog_vacuum_one_page(). Please >> find the attached v5 patch that fixes the issue. >> > > The crash is now fixed, but the > > --- test.sql --- > \set id random(1,10) > \set val random(0,10) > BEGIN; > UPDATE test SET val = :val WHERE id = :id; > COMMIT; > --- test.sql --- > > case gives > > client 6 aborted in command 3 of script 0; ERROR: deadlock detected > DETAIL: Process 14608 waits for ShareLock on transaction 1444620; > blocked by process 14610. > Process 14610 waits for ShareLock on transaction 1444616; blocked by > process 14608. > HINT: See server log for query details. > CONTEXT: while rechecking updated tuple (12,3) in relation "test" > ... > > using pgbench -M prepared -c 10 -j 10 -T 300 -f test.sql test > I'm not seeing this deadlock with just the WAL v8 patch applied. Best regards, Jesper
Hi Jesper,

> I'm not seeing this deadlock with just the WAL v8 patch applied.

okay, thanks for confirming that.

I would like to update you that I am not able to reproduce this issue at my end. I suspect that the steps I am following might be slightly different from yours. Could you please have a look at the steps mentioned below and confirm if there is something different that I am doing.

Firstly, I am running the test-case on the following git commit in head:

<git-commit>
commit ba61a04bc7fefeee03416d9911eb825c4897c223
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Thu Jan 19 19:52:13 2017 -0500

    Avoid core dump for empty prepared statement in an aborted transaction.

    Brown-paper-bag bug in commit ab1f0c822: the old code here coped with
    null CachedPlanSource.raw_parse_tree, the new code not so much.
    Per report from Dave Cramer.
</git-commit>

On top of the above commit, I have applied the WAL v8 patch for hash index and the MV v5 patch.

Now, with an --enable-cassert build I am following the steps below:

1) Created a 'test' database

2) psql -d test -f ~/ddl.sql

where ddl.sql is,

-- ddl.sql --
CREATE TABLE test AS SELECT generate_series(1, 10) AS id, 0 AS val;
CREATE INDEX IF NOT EXISTS idx_id ON test USING hash (id);
CREATE INDEX IF NOT EXISTS idx_val ON test USING hash (val);
ANALYZE;
-- ddl.sql --

3) pgbench -M prepared -c 10 -j 10 -T 1800 -f ~/test.sql test

where test.sql is,

-- test.sql --
\set id random(1,10)
\set val random(0,10)
BEGIN;
UPDATE test SET val = :val WHERE id = :id;
COMMIT;
-- test.sql --

Machine details are as follows:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                128
On-line CPU(s) list:   0-127
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             8

Also, it would be great if you could confirm whether you have been getting this issue repeatedly. Thanks.

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Hi Ashutosh, On 01/20/2017 04:18 AM, Ashutosh Sharma wrote: > okay, Thanks for confirming that. > > I would like to update you that I am not able to reproduce this issue > at my end. I suspect that the steps i am following might be slightly > different than your's. Could you please have a look at steps mentioned > below and confirm if there is something different that I am doing. > > Firstly, I am running the test-case on following git commit in head: > > <git-commmit> > commit ba61a04bc7fefeee03416d9911eb825c4897c223 > Author: Tom Lane <tgl@sss.pgh.pa.us> > Date: Thu Jan 19 19:52:13 2017 -0500 > > Avoid core dump for empty prepared statement in an aborted transaction. > > Brown-paper-bag bug in commit ab1f0c822: the old code here coped with > null CachedPlanSource.raw_parse_tree, the new code not so much. > Per report from Dave Cramer. > </git-commit> > > On top of above commit, I have applied WAL v8 patch for hash index and > MV v5 patch. > > Now, with an --enable-cassert build I am following below steps: > > 1) Created a 'test' database > > 2) psql -d test -f ~/ddl.sql > > where ddl.sql is, > > -- ddl.sql -- > CREATE TABLE test AS SELECT generate_series(1, 10) AS id, 0 AS val; > CREATE INDEX IF NOT EXISTS idx_id ON test USING hash (id); > CREATE INDEX IF NOT EXISTS idx_val ON test USING hash (val); > ANALYZE; > -- ddl.sql -- > > 3) pgbench -M prepared -c 10 -j 10 -T 1800 -f ~/test.sql test > > where test.sql is, > > -- test.sql -- > \set id random(1,10) > \set val random(0,10) > BEGIN; > UPDATE test SET val = :val WHERE id = :id; > COMMIT; > -- test.sql -- > > > Machine details are as follows: > > Architecture: x86_64 > CPU op-mode(s): 32-bit, 64-bit > Byte Order: Little Endian > CPU(s): 128 > On-line CPU(s) list: 0-127 > Thread(s) per core: 2 > Core(s) per socket: 8 > Socket(s): 8 > > Also, It would be great if you could confirm as if you have been > getting this issue repeatedly. Thanks. > Yeah, those are the steps; just with a Skylake laptop. 
However, I restarted with a fresh master, with WAL v8 and MV v5, and can't reproduce the issue. Best regards, Jesper
Hi Ashutosh, On 01/20/2017 03:24 PM, Jesper Pedersen wrote: > Yeah, those are the steps; just with a Skylake laptop. > > However, I restarted with a fresh master, with WAL v8 and MV v5, and > can't reproduce the issue. > I have done some more testing with this, and have moved to the patch back to 'Needs Review' pending Amit's comments. Best regards, Jesper
On 01/23/2017 02:53 PM, Jesper Pedersen wrote: > I have done some more testing with this, and have moved to the patch > back to 'Needs Review' pending Amit's comments. > Moved to "Ready for Committer". Best regards, Jesper
On Thu, Jan 26, 2017 at 6:38 PM, Jesper Pedersen <jesper.pedersen@redhat.com> wrote: > On 01/23/2017 02:53 PM, Jesper Pedersen wrote: >> >> I have done some more testing with this, and have moved to the patch >> back to 'Needs Review' pending Amit's comments. >> > > Moved to "Ready for Committer". > Don't you think we should try to identify the reason of the deadlock error reported by you up thread [1]? I know that you and Ashutosh are not able to reproduce it, but still I feel some investigation is required to find the reason. It is quite possible that the test case is such that the deadlock is expected in rare cases, if that is the case then it is okay. I have not spent enough time on that to comment whether it is a test or code issue. [1] - https://www.postgresql.org/message-id/dc6d7247-050f-4014-8c80-a4ee676eb384%40redhat.com -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
>>> I have done some more testing with this, and have moved to the patch
>>> back to 'Needs Review' pending Amit's comments.
>>
>> Moved to "Ready for Committer".
>
> Don't you think we should try to identify the reason of the deadlock
> error reported by you up thread [1]? I know that you and Ashutosh are
> not able to reproduce it, but still I feel some investigation is
> required to find the reason. It is quite possible that the test case
> is such that the deadlock is expected in rare cases, if that is the
> case then it is okay. I have not spent enough time on that to comment
> whether it is a test or code issue.
>
> [1] - https://www.postgresql.org/message-id/dc6d7247-050f-4014-8c80-a4ee676eb384%40redhat.com

I am finally able to reproduce the issue using the attached script file (deadlock_report). Basically, once I was able to reproduce the issue with hash index, I also thought of checking it with a non-unique B-Tree index and was able to reproduce it with a B-Tree index as well. This certainly tells us that there is nothing wrong at the code level; rather, there is something wrong with the test script which is causing this deadlock issue. Well, I have run pgbench with two different configurations and my observations are as follows:

1) With primary keys, i.e. unique values: I have never encountered the deadlock issue with this configuration.

2) With non-unique indexes (be it hash or B-Tree): I have seen the deadlock many times with this configuration. Basically, when we have non-unique values associated with a column, there is a high probability that multiple records will get updated by a single 'UPDATE' statement, and when there are a huge number of backends trying to do so, there is a high chance of getting a deadlock, which I assume is expected behaviour in a database.

Also, attached are the pgbench_bt.patch and pgbench_hash.patch files that have the changes done to create the btree and hash indexes.
pgbench settings:

Scale Factor = 300
Shared Buffers = 1GB
Client count = 64
run time duration = 30 mins
read-write workload

./pgbench -c $threads -j $threads -T $time_for_reading -M prepared postgres -f /home/ashu/deadlock_report

I hope this makes things clear now, and if there are no more concerns it can be moved to the 'Ready for Committer' state. Thank you.

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Attachment
On Fri, Jan 27, 2017 at 5:15 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >> >> Don't you think we should try to identify the reason of the deadlock >> error reported by you up thread [1]? I know that you and Ashutosh are >> not able to reproduce it, but still I feel some investigation is >> required to find the reason. It is quite possible that the test case >> is such that the deadlock is expected in rare cases, if that is the >> case then it is okay. I have not spent enough time on that to comment >> whether it is a test or code issue. >> >> >> [1] - https://www.postgresql.org/message-id/dc6d7247-050f-4014-8c80-a4ee676eb384%40redhat.com >> > > I am finally able to reproduce the issue using the attached script > file (deadlock_report). Basically, once I was able to reproduce the > issue with hash index I also thought of checking it with a non unique > B-Tree index and was able to reproduce it with B-Tree index as well. > This certainly tells us that there is nothing wrong at the code level > rather there is something wrong with the test script which is causing > this deadlock issue. Well, I have ran pgbench with two different > configurations and my observations are as follows: > > 1) With Primary keys i.e. uinque values: I have never encountered > deadlock issue with this configuration. > > 2) With non unique indexes (be it hash or B-Tree): I have seen > deadlock many a times with this configuration. Basically when we have > non-unique values associated with a column then there is a high > probability that multiple records will get updated with a single > 'UPDATE' statement and when there are huge number of backends trying > to do so there is high chance of getting deadlock which i assume is > expected behaviour in database. > I agree with your analysis, surely trying to update multiple rows with same values from different backends can lead to deadlock. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Sat, Jan 28, 2017 at 8:02 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: > On Fri, Jan 27, 2017 at 5:15 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >>> >>> Don't you think we should try to identify the reason of the deadlock >>> error reported by you up thread [1]? I know that you and Ashutosh are >>> not able to reproduce it, but still I feel some investigation is >>> required to find the reason. It is quite possible that the test case >>> is such that the deadlock is expected in rare cases, if that is the >>> case then it is okay. I have not spent enough time on that to comment >>> whether it is a test or code issue. >> >> I am finally able to reproduce the issue using the attached script >> file (deadlock_report). Basically, once I was able to reproduce the >> issue with hash index I also thought of checking it with a non unique >> B-Tree index and was able to reproduce it with B-Tree index as well. >> This certainly tells us that there is nothing wrong at the code level >> rather there is something wrong with the test script which is causing >> this deadlock issue. Well, I have ran pgbench with two different >> configurations and my observations are as follows: >> >> 1) With Primary keys i.e. uinque values: I have never encountered >> deadlock issue with this configuration. >> >> 2) With non unique indexes (be it hash or B-Tree): I have seen >> deadlock many a times with this configuration. Basically when we have >> non-unique values associated with a column then there is a high >> probability that multiple records will get updated with a single >> 'UPDATE' statement and when there are huge number of backends trying >> to do so there is high chance of getting deadlock which i assume is >> expected behaviour in database. >> > > I agree with your analysis, surely trying to update multiple rows with > same values from different backends can lead to deadlock. Moved that to CF 2017-03. -- Michael
Hi, Attached is the v6 patch for microvacuum in hash index, rebased on top of 'v10 patch for WAL in hash index - [1]' and 'v1 patch for WAL consistency check for hash index - [2]'. [1] - https://www.postgresql.org/message-id/CAA4eK1%2Bk5wR4-kAjPqLoKemuHayQd6RkQQT9gheTfpn%2B72o1UA%40mail.gmail.com [2] - https://www.postgresql.org/message-id/flat/CAGz5QCJLERUn_zoO0eDv6_Y_d0o4tNTMPeR7ivTLBg4rUrJdwg@mail.gmail.com#CAGz5QCJLERUn_zoO0eDv6_Y_d0o4tNTMPeR7ivTLBg4rUrJdwg@mail.gmail.com Also, the patch (mask_hint_bit_LH_PAGE_HAS_DEAD_TUPLES.patch) to mask the 'LH_PAGE_HAS_DEAD_TUPLES' flag, which got added as part of the microvacuum patch, is attached with this mail. -- With Regards, Ashutosh Sharma EnterpriseDB: http://www.enterprisedb.com On Wed, Feb 1, 2017 at 10:30 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Sat, Jan 28, 2017 at 8:02 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: >> On Fri, Jan 27, 2017 at 5:15 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >>>> >>>> Don't you think we should try to identify the reason for the deadlock >>>> error reported by you up thread [1]? I know that you and Ashutosh are >>>> not able to reproduce it, but still I feel some investigation is >>>> required to find the reason. It is quite possible that the test case >>>> is such that the deadlock is expected in rare cases; if that is the >>>> case then it is okay. I have not spent enough time on that to comment on >>>> whether it is a test or code issue. >>> >>> I am finally able to reproduce the issue using the attached script >>> file (deadlock_report). Basically, once I was able to reproduce the >>> issue with the hash index, I also thought of checking it with a non-unique >>> B-Tree index and was able to reproduce it with a B-Tree index as well. >>> This certainly tells us that there is nothing wrong at the code level; >>> rather, there is something wrong with the test script, which is causing >>> this deadlock issue.
Well, I have run pgbench with two different >>> configurations and my observations are as follows: >>> >>> 1) With primary keys, i.e. unique values: I have never encountered the >>> deadlock issue with this configuration. >>> >>> 2) With non-unique indexes (be it hash or B-Tree): I have seen >>> deadlocks many times with this configuration. Basically, when we have >>> non-unique values associated with a column, there is a high >>> probability that multiple records will get updated by a single >>> 'UPDATE' statement, and when there are a huge number of backends trying >>> to do so, there is a high chance of getting a deadlock, which I assume is >>> expected behaviour in a database. >>> >> >> I agree with your analysis; surely trying to update multiple rows with >> the same values from different backends can lead to a deadlock. > > Moved that to CF 2017-03. > -- > Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Tue, Mar 14, 2017 at 8:02 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > Attached is the v6 patch for microvacuum in hash index rebased on top > of 'v10 patch for WAL in hash index - [1]' and 'v1 patch for WAL > consistency check for hash index - [2]'. > > [1] - https://www.postgresql.org/message-id/CAA4eK1%2Bk5wR4-kAjPqLoKemuHayQd6RkQQT9gheTfpn%2B72o1UA%40mail.gmail.com > [2] - https://www.postgresql.org/message-id/flat/CAGz5QCJLERUn_zoO0eDv6_Y_d0o4tNTMPeR7ivTLBg4rUrJdwg@mail.gmail.com#CAGz5QCJLERUn_zoO0eDv6_Y_d0o4tNTMPeR7ivTLBg4rUrJdwg@mail.gmail.com > > Also, the patch (mask_hint_bit_LH_PAGE_HAS_DEAD_TUPLES.patch) to mask > 'LH_PAGE_HAS_DEAD_TUPLES' flag which got added as a part of > Microvacuum patch is attached with this mail. Generally, this patch looks like a pretty straightforward adaptation of the similar btree mechanism to hash indexes, so if it works for btree it ought to work here, too. But I noticed a few things while reading through it. + /* Get RelfileNode from relation OID */ + rel = relation_open(htup->t_tableOid, NoLock); + rnode = rel->rd_node; + relation_close(rel, NoLock); itup->t_tid = htup->t_self; - _hash_doinsert(index, itup); + _hash_doinsert(index, itup, rnode); This is an awfully low-level place to be doing something like this. I'm not sure exactly where this should be happening, but not in the per-tuple callback. + /* + * If there's nothing running on the standby we don't need to derive a + * full latestRemovedXid value, so use a fast path out of here. This + * returns InvalidTransactionId, and so will conflict with all HS + * transactions; but since we just worked out that that's zero people, + * it's OK. + */ + if (CountDBBackends(InvalidOid) == 0) + return latestRemovedXid; I see that this comment (and most of what surrounds it) was just copied from the existing btree example, but isn't there a discrepancy between the comment and the code? It says it returns InvalidTransactionId, but it doesn't. 
Also, you dropped the XXX from the btree original, and the following reachedConsistency check. + * Hash Index delete records can conflict with standby queries.You might + * think that vacuum records would conflict as well, but we've handled But they're not called delete records in a hash index. The function is called hash_xlog_vacuum_one_page. The corresponding btree function is btree_xlog_delete. So this comment needs a little more updating. + if (IsBufferCleanupOK(bucket_buf)) + { + _hash_vacuum_one_page(rel, metabuf, buf, bucket_buf, + (buf == bucket_buf) ? true : false, + hnode); + if (bucket_buf != buf) + LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK); + + if (PageGetFreeSpace(page) >= itemsz) + break; /* OK, now we have enough space */ + } I might be missing something, but I don't quite see why this needs a cleanup lock on the primary bucket page. I would think a cleanup lock on the page we're actually modifying would suffice, and I think if that's correct it would be a better way to go. If that's not correct, then I think the comment needs some work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Mar 15, 2017 at 1:15 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Mar 14, 2017 at 8:02 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >> Attached is the v6 patch for microvacuum in hash index rebased on top >> of 'v10 patch for WAL in hash index - [1]' and 'v1 patch for WAL >> consistency check for hash index - [2]'. >> >> [1] - https://www.postgresql.org/message-id/CAA4eK1%2Bk5wR4-kAjPqLoKemuHayQd6RkQQT9gheTfpn%2B72o1UA%40mail.gmail.com >> [2] - https://www.postgresql.org/message-id/flat/CAGz5QCJLERUn_zoO0eDv6_Y_d0o4tNTMPeR7ivTLBg4rUrJdwg@mail.gmail.com#CAGz5QCJLERUn_zoO0eDv6_Y_d0o4tNTMPeR7ivTLBg4rUrJdwg@mail.gmail.com >> >> Also, the patch (mask_hint_bit_LH_PAGE_HAS_DEAD_TUPLES.patch) to mask >> 'LH_PAGE_HAS_DEAD_TUPLES' flag which got added as a part of >> Microvacuum patch is attached with this mail. > > Generally, this patch looks like a pretty straightforward adaptation > of the similar btree mechanism to hash indexes, so if it works for > btree it ought to work here, too. But I noticed a few things while > reading through it. > > + /* Get RelfileNode from relation OID */ > + rel = relation_open(htup->t_tableOid, NoLock); > + rnode = rel->rd_node; > + relation_close(rel, NoLock); > itup->t_tid = htup->t_self; > - _hash_doinsert(index, itup); > + _hash_doinsert(index, itup, rnode); > > This is an awfully low-level place to be doing something like this. > I'm not sure exactly where this should be happening, but not in the > per-tuple callback. > I think one possibility is to get it using indexrel->rd_index->indrelid in _hash_doinsert(). > > But they're not called delete records in a hash index. The function > is called hash_xlog_vacuum_one_page. The corresponding btree function > is btree_xlog_delete. So this comment needs a little more updating. > > + if (IsBufferCleanupOK(bucket_buf)) > + { > + _hash_vacuum_one_page(rel, metabuf, buf, bucket_buf, > + (buf == bucket_buf) ? 
true : false, > + hnode); > + if (bucket_buf != buf) > + LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK); > + > + if (PageGetFreeSpace(page) >= itemsz) > + break; /* OK, now we have enough space */ > + } > > I might be missing something, but I don't quite see why this needs a > cleanup lock on the primary bucket page. I would think a cleanup lock > on the page we're actually modifying would suffice, and I think if > that's correct it would be a better way to go. > Offhand, I also don't see any problem with it. A few other comments: 1. + if (ndeletable > 0) + { + /* No ereport(ERROR) until changes are logged */ + START_CRIT_SECTION(); + + PageIndexMultiDelete(page, deletable, ndeletable); + + pageopaque = (HashPageOpaque) PageGetSpecialPointer(page); + pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES; You are clearing this flag while logging the action, but the same is not taken care of during replay. Any reason? 2. + /* No ereport(ERROR) until changes are logged */ + START_CRIT_SECTION(); + + PageIndexMultiDelete(page, deletable, ndeletable); + + pageopaque = (HashPageOpaque) PageGetSpecialPointer(page); + pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES; + + /* + * Write-lock the meta page so that we can decrement + * tuple count. + */ + LockBuffer(metabuf, BUFFER_LOCK_EXCLUSIVE); The lock on the buffer should be acquired before the critical section. 3. - /* - * Since this can be redone later if needed, mark as a hint. - */ - MarkBufferDirtyHint(buf, true); + if (so->numKilled < MaxIndexTuplesPerPage) + { + so->killedItems[so->numKilled].heapTid = so->hashso_heappos; + so->killedItems[so->numKilled].indexOffset = + ItemPointerGetOffsetNumber(&(so->hashso_curpos)); + so->numKilled++; + } By looking at the above code, the first thing that comes to mind is when numKilled can become greater than MaxIndexTuplesPerPage, and why we are ignoring the marking of dead tuples when it becomes greater than MaxIndexTuplesPerPage.
After looking at the similar btree code, I realize that it could happen if the user reverses the scan direction. I think you should mention in the comments to see btgettuple to know the reason for the numKilled overrun test, or something like that. 4. + * We match items by heap TID before assuming they are the right ones to + * delete. If an item has moved off the current page due to a split, we'll + * fail to find it and do nothing (this is not an error case --- we assume + * the item will eventually get marked in a future indexscan). + */ +void +_hash_kill_items(IndexScanDesc scan) I think this comment doesn't make much sense for a hash index, because an item won't move off the current page due to a split; only a later cleanup can remove it. 5. +/* * Maximum size of a hash index item (it's okay to have only one per page) Spurious white space change. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
> Generally, this patch looks like a pretty straightforward adaptation > of the similar btree mechanism to hash indexes, so if it works for > btree it ought to work here, too. But I noticed a few things while > reading through it. > > + /* Get RelfileNode from relation OID */ > + rel = relation_open(htup->t_tableOid, NoLock); > + rnode = rel->rd_node; > + relation_close(rel, NoLock); > itup->t_tid = htup->t_self; > - _hash_doinsert(index, itup); > + _hash_doinsert(index, itup, rnode); > > This is an awfully low-level place to be doing something like this. > I'm not sure exactly where this should be happening, but not in the > per-tuple callback. Okay, now I have done this inside _hash_doinsert() instead of in the callback function. Please have a look at the attached v7 patch. > > + /* > + * If there's nothing running on the standby we don't need to derive a > + * full latestRemovedXid value, so use a fast path out of here. This > + * returns InvalidTransactionId, and so will conflict with all HS > + * transactions; but since we just worked out that that's zero people, > + * it's OK. > + */ > + if (CountDBBackends(InvalidOid) == 0) > + return latestRemovedXid; > > I see that this comment (and most of what surrounds it) was just > copied from the existing btree example, but isn't there a discrepancy > between the comment and the code? It says it returns > InvalidTransactionId, but it doesn't. Also, you dropped the XXX from > the btree original, and the following reachedConsistency check. It does return InvalidTransactionId if there are no backends running across any database on the standby. As shown below, 'latestRemovedXid' is initialised with InvalidTransactionId: TransactionId latestRemovedXid = InvalidTransactionId; So, if there are no backend processes running across any databases on the standby, latestRemovedXid will be returned as is. I have also added the XXX note in the above comment. Please check the v7 patch attached with this mail.
> > + * Hash Index delete records can conflict with standby queries.You might > + * think that vacuum records would conflict as well, but we've handled > > But they're not called delete records in a hash index. The function > is called hash_xlog_vacuum_one_page. The corresponding btree function > is btree_xlog_delete. So this comment needs a little more updating. Okay, I have tried to rephrase it to avoid the confusion. > > + if (IsBufferCleanupOK(bucket_buf)) > + { > + _hash_vacuum_one_page(rel, metabuf, buf, bucket_buf, > + (buf == bucket_buf) ? true : false, > + hnode); > + if (bucket_buf != buf) > + LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK); > + > + if (PageGetFreeSpace(page) >= itemsz) > + break; /* OK, now we have enough space */ > + } > > I might be missing something, but I don't quite see why this needs a > cleanup lock on the primary bucket page. I would think a cleanup lock > on the page we're actually modifying would suffice, and I think if > that's correct it would be a better way to go. If that's not correct, > then I think the comment needs some work. Thanks for that suggestion. I spent a lot of time thinking about this and also had a small discussion with Amit, but could not find any issue with taking the cleanup lock on the modified page instead of the primary bucket page. I had to make some significant code changes for this. The attached v7 patch has the changes. -- With Regards, Ashutosh Sharma EnterpriseDB: http://www.enterprisedb.com
Attachment
> > I think one possibility is to get it using > indexrel->rd_index->indrelid in _hash_doinsert(). > Thanks. I have tried the same in the v7 patch shared upthread. > >> >> But they're not called delete records in a hash index. The function >> is called hash_xlog_vacuum_one_page. The corresponding btree function >> is btree_xlog_delete. So this comment needs a little more updating. >> >> + if (IsBufferCleanupOK(bucket_buf)) >> + { >> + _hash_vacuum_one_page(rel, metabuf, buf, bucket_buf, >> + (buf == bucket_buf) ? true : false, >> + hnode); >> + if (bucket_buf != buf) >> + LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK); >> + >> + if (PageGetFreeSpace(page) >= itemsz) >> + break; /* OK, now we have enough space */ >> + } >> >> I might be missing something, but I don't quite see why this needs a >> cleanup lock on the primary bucket page. I would think a cleanup lock >> on the page we're actually modifying would suffice, and I think if >> that's correct it would be a better way to go. >> > > Offhand, I also don't see any problem with it. I too found no problem with that. > > A few other comments: > 1. > + if (ndeletable > 0) > + { > + /* No ereport(ERROR) until changes are logged */ > + START_CRIT_SECTION(); > + > + PageIndexMultiDelete(page, deletable, ndeletable); > + > + pageopaque = (HashPageOpaque) PageGetSpecialPointer(page); > + pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES; > > You are clearing this flag while logging the action, but the same is not taken > care of during replay. Any reason? That's because we conditionally WAL-log this flag status, and when we do so, we take its FPI. > > 2. > + /* No ereport(ERROR) until changes are logged */ > + START_CRIT_SECTION(); > + > + PageIndexMultiDelete(page, deletable, ndeletable); > + > + pageopaque = (HashPageOpaque) PageGetSpecialPointer(page); > + pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES; > + > + /* > + * Write-lock the meta page so that we can decrement > + * tuple count.
> + */ > + LockBuffer(metabuf, BUFFER_LOCK_EXCLUSIVE); > > The lock on the buffer should be acquired before the critical section. Point taken. I have taken care of it in the v7 patch. > > 3. > - /* > - * Since this can be redone later if needed, mark as a hint. > - */ > - MarkBufferDirtyHint(buf, true); > + if (so->numKilled < MaxIndexTuplesPerPage) > + { > + so->killedItems[so->numKilled].heapTid = so->hashso_heappos; > + so->killedItems[so->numKilled].indexOffset = > + ItemPointerGetOffsetNumber(&(so->hashso_curpos)); > + so->numKilled++; > + } > > By looking at the above code, the first thing that comes to mind is when > numKilled can become greater than MaxIndexTuplesPerPage and why we are > ignoring the marking of dead tuples when it becomes greater than > MaxIndexTuplesPerPage. After looking at the similar btree code, I realize > that it could happen if the user reverses the scan direction. I think you should > mention in the comments to see btgettuple to know the reason for the > numKilled overrun test, or something like that. Added the comment. Please refer to the v7 patch. > > 4. > + * We match items by heap TID before assuming they are the right ones to > + * delete. If an item has moved off the current page due to a split, we'll > + * fail to find it and do nothing (this is not an error case --- we assume > + * the item will eventually get marked in a future indexscan). > + */ > +void > +_hash_kill_items(IndexScanDesc scan) > > I think this comment doesn't make much sense for a hash index because > an item won't move off the current page due to a split; only a later > cleanup can remove it. Yes. The reason being that no cleanup will happen while a scan is in progress. Corrected it. > > 5. > > + > /* > * Maximum size of a hash index item (it's okay to have only one per page) > > Spurious white space change. Fixed.
On Wed, Mar 15, 2017 at 11:37 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >> + /* Get RelfileNode from relation OID */ >> + rel = relation_open(htup->t_tableOid, NoLock); >> + rnode = rel->rd_node; >> + relation_close(rel, NoLock); >> itup->t_tid = htup->t_self; >> - _hash_doinsert(index, itup); >> + _hash_doinsert(index, itup, rnode); >> >> This is an awfully low-level place to be doing something like this. >> I'm not sure exactly where this should be happening, but not in the >> per-tuple callback. > > Okay, now I have done this inside _hash_doinsert() instead of in the callback > function. Please have a look at the attached v7 patch. In the btree case, the heap relation isn't re-opened from anywhere in the btree code. I think we should try to do the same thing here. If we add an argument for the heap relation to _hash_doinsert(), hashinsert() can easily pass it down; it's already got that value available. There are two other calls to _hash_doinsert: 1. _h_indexbuild() calls _hash_doinsert(). It's called only from hashbuild(), which has the heap relation available. So we can just add that as an extra argument to _h_indexbuild() and then from there pass it to _hash_doinsert(). 2. hashbuildCallback calls _hash_doinsert(). Its sixth argument is a HashBuildState, which is set up by hashbuild(), which has the heap relation available. So we can just add an extra member to the HashBuildState and have hashbuild() set it before calling IndexBuildHeapScan. hashbuildCallback can then fish it out of the HashBuildState and pass it to _hash_doinsert(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Mar 15, 2017 at 9:28 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Mar 15, 2017 at 11:37 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >>> + /* Get RelfileNode from relation OID */ >>> + rel = relation_open(htup->t_tableOid, NoLock); >>> + rnode = rel->rd_node; >>> + relation_close(rel, NoLock); >>> itup->t_tid = htup->t_self; >>> - _hash_doinsert(index, itup); >>> + _hash_doinsert(index, itup, rnode); >>> >>> This is an awfully low-level place to be doing something like this. >>> I'm not sure exactly where this should be happening, but not in the >>> per-tuple callback. >> >> Okay, now I have done this inside _hash_doinsert() instead of in the callback >> function. Please have a look at the attached v7 patch. > > In the btree case, the heap relation isn't re-opened from anywhere in > the btree code. I think we should try to do the same thing here. If > we add an argument for the heap relation to _hash_doinsert(), > hashinsert() can easily pass it down; it's already got that value > available. There are two other calls to _hash_doinsert: > > 1. _h_indexbuild() calls _hash_doinsert(). It's called only from > hashbuild(), which has the heap relation available. So we can just > add that as an extra argument to _h_indexbuild() and then from there > pass it to _hash_doinsert(). > > 2. hashbuildCallback calls _hash_doinsert(). Its sixth argument is a > HashBuildState, which is set up by hashbuild(), which has the heap > relation available. So we can just add an extra member to the > HashBuildState and have hashbuild() set it before calling > IndexBuildHeapScan. hashbuildCallback can then fish it out of the > HashBuildState and pass it to _hash_doinsert(). Okay, I have done the changes as suggested by you. Please refer to the attached v8 patch. Thanks.
-- With Regards, Ashutosh Sharma EnterpriseDB: http://www.enterprisedb.com
Attachment
On Wed, Mar 15, 2017 at 2:10 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > Okay, I have done the changes as suggested by you. Please refer to the > attached v8 patch. Thanks. Cool, but this doesn't look right: + action = XLogReadBufferForRedo(record, 0, &buffer); + + if (!IsBufferCleanupOK(buffer)) + elog(PANIC, "hash_xlog_vacuum_one_page: failed to acquire cleanup lock"); That could fail, I think, because of a pin from a Hot Standby backend. You want to call XLogReadBufferForRedoExtended() with a third argument of true. Come to think of it, shouldn't hash_xlog_split_allocate_page be changed the same way? + /* + * Let us mark the page as clean if vacuum removes the DEAD tuples + * from an index page. We do this by clearing LH_PAGE_HAS_DEAD_TUPLES + * flag. + */ Maybe add: Clearing this flag is just a hint; replay won't redo this. + * Hash Index records that are marked as LP_DEAD and being removed during + * hash index tuple insertion can conflict with standby queries.You might The word Index shouldn't be capitalized here. There should be a space before "You". The formatting of this comment is oddly narrow: + * _hash_vacuum_one_page - vacuum just one index page. + * Try to remove LP_DEAD items from the given page. We + * must acquire cleanup lock on the page being modified + * before calling this function. I'd add a blank line after the first line and reflow the rest to be something more like 75 characters. pgindent evidently doesn't think this needs reformatting, but it's oddly narrow. I suggest changing the header comment of hash_xlog_vacuum_get_latestRemovedXid like this: + * Get the latestRemovedXid from the heap pages pointed at by the index + * tuples being deleted. See also btree_xlog_delete_get_latestRemovedXid, + * on which this function is based. This is looking good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> + action = XLogReadBufferForRedo(record, 0, &buffer); > + > + if (!IsBufferCleanupOK(buffer)) > + elog(PANIC, "hash_xlog_vacuum_one_page: failed to acquire cleanup lock"); > > That could fail, I think, because of a pin from a Hot Standby backend. > You want to call XLogReadBufferForRedoExtended() with a third argument > of true. Yes, there is a possibility that a new backend may start on the standby after we kill the conflicting backends. If the new backend has a pin on the buffer that the startup process is trying to read, then 'IsBufferCleanupOK' will fail, thereby causing the startup process to PANIC. > Come to think of it, shouldn't hash_xlog_split_allocate_page > be changed the same way? No, the reason being that we are trying to allocate a new bucket page on the standby, so there can't be any backend which could have a pin on a page that is yet to be initialised. > > + /* > + * Let us mark the page as clean if vacuum removes the DEAD tuples > + * from an index page. We do this by clearing LH_PAGE_HAS_DEAD_TUPLES > + * flag. > + */ > > Maybe add: Clearing this flag is just a hint; replay won't redo this. Added. Please check the attached v9 patch. > > + * Hash Index records that are marked as LP_DEAD and being removed during > + * hash index tuple insertion can conflict with standby queries.You might > > The word Index shouldn't be capitalized here. There should be a space > before "You". Corrected. > > The formatting of this comment is oddly narrow: > > + * _hash_vacuum_one_page - vacuum just one index page. > + * Try to remove LP_DEAD items from the given page. We > + * must acquire cleanup lock on the page being modified > + * before calling this function. > > I'd add a blank line after the first line and reflow the rest to be > something more like 75 characters. pgindent evidently doesn't think > this needs reformatting, but it's oddly narrow. Corrected.
> > I suggest changing the header comment of > hash_xlog_vacuum_get_latestRemovedXid like this: > > + * Get the latestRemovedXid from the heap pages pointed at by the index > + * tuples being deleted. See also btree_xlog_delete_get_latestRemovedXid, > + * on which this function is based. > > This is looking good. Changed as per the suggestions. Attached is the v9 patch. Thanks. -- With Regards, Ashutosh Sharma EnterpriseDB: http://www.enterprisedb.com
Attachment
On Wed, Mar 15, 2017 at 3:54 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > Changed as per suggestions. Attached v9 patch. Thanks. Wow, when do you sleep? Will have a look. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Mar 15, 2017 at 4:31 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Mar 15, 2017 at 3:54 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >> Changed as per suggestions. Attached v9 patch. Thanks. > > Wow, when do you sleep? Will have a look. Committed with a few corrections. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017-03-15 16:31:11 -0400, Robert Haas wrote: > On Wed, Mar 15, 2017 at 3:54 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Changed as per suggestions. Attached v9 patch. Thanks. > > Wow, when do you sleep? I think that applies to a bunch of people, including yourself ;)
On Mar 16, 2017 7:49 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
On Wed, Mar 15, 2017 at 4:31 PM, Robert Haas <robertmhaas@gmail.com> wrote:
Committed with a few corrections.
> On Wed, Mar 15, 2017 at 3:54 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> Changed as per suggestions. Attached v9 patch. Thanks.
>
> Wow, when do you sleep? Will have a look.
Thanks Robert for the commit. Thank you Amit and Jesper for reviewing this patch.
With Regards,
Ashutosh Sharma
On Wed, Mar 15, 2017 at 9:23 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > >> >> A few other comments: >> 1. >> + if (ndeletable > 0) >> + { >> + /* No ereport(ERROR) until changes are logged */ >> + START_CRIT_SECTION(); >> + >> + PageIndexMultiDelete(page, deletable, ndeletable); >> + >> + pageopaque = (HashPageOpaque) PageGetSpecialPointer(page); >> + pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES; >> >> You are clearing this flag while logging the action, but the same is not taken >> care of during replay. Any reason? > > That's because we conditionally WAL-log this flag status, and when we > do so, we take its FPI. > Sure, but we are not clearing it conditionally. I am not sure how, after recovery, it will be cleared if it gets set during normal operation. Moreover, btree already clears the similar flag during replay (refer to btree_xlog_delete). -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Mar 16, 2017 at 11:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: > On Wed, Mar 15, 2017 at 9:23 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >> >>> >>> A few other comments: >>> 1. >>> + if (ndeletable > 0) >>> + { >>> + /* No ereport(ERROR) until changes are logged */ >>> + START_CRIT_SECTION(); >>> + >>> + PageIndexMultiDelete(page, deletable, ndeletable); >>> + >>> + pageopaque = (HashPageOpaque) PageGetSpecialPointer(page); >>> + pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES; >>> >>> You are clearing this flag while logging the action, but the same is not taken >>> care of during replay. Any reason? >> >> That's because we conditionally WAL-log this flag status, and when we >> do so, we take its FPI. >> > > Sure, but we are not clearing it conditionally. I am not sure how, > after recovery, it will be cleared if it gets set during normal operation. > Moreover, btree already clears the similar flag during replay (refer to > btree_xlog_delete). You were right. In case data checksums are enabled or wal_log_hints is set to true, 'LH_PAGE_HAS_DEAD_TUPLES' will get WAL-logged and therefore needs to be cleared on the standby as well. Attached is the patch that clears this flag on the standby during replay. -- With Regards, Ashutosh Sharma EnterpriseDB: http://www.enterprisedb.com
Attachment
On Thu, Mar 16, 2017 at 1:02 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > On Thu, Mar 16, 2017 at 11:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: >> On Wed, Mar 15, 2017 at 9:23 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >>> >>>> >>>> A few other comments: >>>> 1. >>>> + if (ndeletable > 0) >>>> + { >>>> + /* No ereport(ERROR) until changes are logged */ >>>> + START_CRIT_SECTION(); >>>> + >>>> + PageIndexMultiDelete(page, deletable, ndeletable); >>>> + >>>> + pageopaque = (HashPageOpaque) PageGetSpecialPointer(page); >>>> + pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES; >>>> >>>> You are clearing this flag while logging the action, but the same is not taken >>>> care of during replay. Any reason? >>> >>> That's because we conditionally WAL-log this flag status, and when we >>> do so, we take its FPI. >>> >> >> Sure, but we are not clearing it conditionally. I am not sure how, >> after recovery, it will be cleared if it gets set during normal operation. >> Moreover, btree already clears the similar flag during replay (refer to >> btree_xlog_delete). > > You were right. In case data checksums are enabled or wal_log_hints is set > to true, 'LH_PAGE_HAS_DEAD_TUPLES' will get WAL-logged and therefore > needs to be cleared on the standby as well. > I was thinking about what bad things can happen if we don't clear this flag during replay. The main thing that comes to mind is that after crash recovery, if the flag is set, inserts on that page might need to traverse all the tuples on the page once it is full, even if there are no dead tuples on it. The flag can be cleared later when there are dead tuples on the page and we actually delete them, but I don't think that is worth the price of not clearing the flag during replay. > Attached is the patch that clears this flag on standby during replay. > Don't you think we should also clear it during the replay of XLOG_HASH_DELETE? We might want to log the clearing of the flag along with the WAL record for XLOG_HASH_DELETE.
-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
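[Editor's note] The hint-bit behavior being debated above can be sketched in isolation. This is a simplified, self-contained model, not PostgreSQL code: the struct and function names are invented stand-ins, and only the flag name and the bit-clearing idiom mirror the patch. The point it illustrates is that the standby must perform the same clear as the primary, or the two diverge:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the flag bit in the hash page's special space; the value
 * mirrors LH_PAGE_HAS_DEAD_TUPLES in src/include/access/hash.h, but this
 * is an illustrative model only. */
#define LH_PAGE_HAS_DEAD_TUPLES (1 << 7)

typedef struct HashPageOpaqueData
{
    uint16_t    hasho_flag;     /* page type code plus hint bits */
} HashPageOpaqueData;

/* Primary side: after the LP_DEAD items are pruned (in the real code,
 * via PageIndexMultiDelete() inside a critical section), the hint bit
 * is cleared so future inserts don't rescan the page for dead tuples. */
static void
vacuum_one_page(HashPageOpaqueData *opaque)
{
    opaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
}

/* Standby side: replay must perform the same clear; otherwise, after
 * failover, the page keeps advertising dead tuples that are gone. */
static void
redo_vacuum_one_page(HashPageOpaqueData *opaque)
{
    opaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
}
```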
>>> Sure, but we are not clearing it unconditionally. I am not sure how it
>>> will be cleared after recovery if it gets set during normal operation.
>>> Moreover, btree already clears a similar flag during replay (refer
>>> btree_xlog_delete).
>>
>> You were right. If data checksums are enabled or wal_log_hints is set
>> to true, 'LH_PAGE_HAS_DEAD_TUPLES' will get WAL-logged and therefore
>> needs to be cleared on the standby as well.
>>
>
> I was thinking about what bad can happen if we don't clear this flag
> during replay. The main thing that comes to mind is that after crash
> recovery, if the flag is set, inserts on that page might need to
> traverse all the tuples on the page once it is full, even if there are
> no dead tuples in it. The flag can be cleared later, when there
> actually are dead tuples on the page and we delete them, but I don't
> think that is worth the price of not clearing it during replay.

Yes, you are absolutely correct. If we do not clear this flag during
replay, there is a possibility of _hash_doinsert() unnecessarily
scanning a page that has no free space, on the assumption that the page
contains some dead tuples, which is not true.

>
>> Attached is the patch that clears this flag on standby during replay.
>>
>
> Don't you think we should also clear it during the replay of
> XLOG_HASH_DELETE? We might want to log the clearing of the flag along
> with the WAL record for XLOG_HASH_DELETE.
>

Yes, it should be cleared. I completely missed this part in my hurry.
Thanks for pointing it out. I have taken care of it in the attached v2
patch.

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
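[Editor's note] The cost of a stale flag described above — an insert uselessly scanning a full page — can be made concrete with a toy model. This is not the actual _hash_doinsert() code; the page struct and function names below are invented for illustration, and only the decision logic ("attempt reclamation only when the hint bit is set, and pay a wasted scan when the hint is stale") reflects the discussion:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LH_PAGE_HAS_DEAD_TUPLES (1 << 7)    /* mirrors the real flag bit */

/* Toy model of a hash index page: a free-space counter, a count of
 * tuples already marked LP_DEAD, and the flag word. */
typedef struct ToyHashPage
{
    int         free_space;
    int         dead_tuples;
    uint16_t    hasho_flag;
} ToyHashPage;

/* Returns true if enough space for 'itemsize' could be reclaimed.
 * When the hint bit is wrongly left set (e.g. not cleared during
 * replay), we pay for a scan of the page that finds nothing. */
static bool
try_reclaim_space(ToyHashPage *page, int itemsize, int *wasted_scans)
{
    if ((page->hasho_flag & LH_PAGE_HAS_DEAD_TUPLES) == 0)
        return false;           /* hint says there is nothing to reclaim */

    if (page->dead_tuples == 0)
    {
        (*wasted_scans)++;      /* stale hint: the scan was useless */
        return false;
    }

    /* Prune the dead tuples and clear the hint, as the patch does. */
    page->free_space += page->dead_tuples;
    page->dead_tuples = 0;
    page->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
    return page->free_space >= itemsize;
}
```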
On Thu, Mar 16, 2017 at 9:39 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>
>> Don't you think we should also clear it during the replay of
>> XLOG_HASH_DELETE? We might want to log the clearing of the flag along
>> with the WAL record for XLOG_HASH_DELETE.
>>
>
> Yes, it should be cleared. I completely missed this part in my hurry.
> Thanks for pointing it out. I have taken care of it in the attached v2
> patch.
>

+   /*
+    * Mark the page as not containing any LP_DEAD items. See comments
+    * in hashbucketcleanup() for details.
+    */
+   pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
+   pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;

Your comment here refers to hashbucketcleanup(), but the comment in
that function says "Clearing this flag is just a hint; replay won't
redo this." The two seem contradictory, so you need to change the
comment in hashbucketcleanup().

As I said in my previous e-mail, I think you need to record the
clearing of this flag in the XLOG_HASH_DELETE WAL record, since you are
not doing it unconditionally, and then during replay clear it only when
the WAL record indicates so.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Fri, Mar 17, 2017 at 8:20 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Mar 16, 2017 at 9:39 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> Yes, it should be cleared. I completely missed this part in my hurry.
>> Thanks for pointing it out. I have taken care of it in the attached v2
>> patch.
>
> Your comment here refers to hashbucketcleanup(), but the comment in
> that function says "Clearing this flag is just a hint; replay won't
> redo this." The two seem contradictory, so you need to change the
> comment in hashbucketcleanup().

Done. Please check the attached v3 patch.

> As I said in my previous e-mail, I think you need to record the
> clearing of this flag in the XLOG_HASH_DELETE WAL record, since you are
> not doing it unconditionally, and then during replay clear it only when
> the WAL record indicates so.

Thank you so much for raising that point. I too think that we should
record the flag status in the WAL record and clear it only when
required during replay.

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Fri, Mar 17, 2017 at 12:27 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> On Fri, Mar 17, 2017 at 8:20 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> As I said in my previous e-mail, I think you need to record the
>> clearing of this flag in the XLOG_HASH_DELETE WAL record, since you
>> are not doing it unconditionally, and then during replay clear it
>> only when the WAL record indicates so.
>
> Thank you so much for raising that point. I too think that we should
> record the flag status in the WAL record and clear it only when
> required during replay.
>

I think hashdesc.c needs an update as well (refer case XLOG_HASH_DELETE:).

-   * flag. Clearing this flag is just a hint; replay won't redo this.
+   * flag. Clearing this flag is just a hint; replay will check the
+   * status of clear_dead_marking flag before redo it.
    */
    if (tuples_removed && *tuples_removed > 0 &&
        opaque->hasho_flag & LH_PAGE_HAS_DEAD_TUPLES)
+   {
        opaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
+       clear_dead_marking = true;
+   }

I feel the above comment is not required, as you are logging this
action explicitly.

+   bool clear_dead_marking;    /* TRUE if VACUUM clears

No need to write VACUUM explicitly; you can simply say "TRUE if this
operation clears ...".

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
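[Editor's note] The mechanism settled on here — remember in the WAL record whether the flag was actually cleared, and have replay honor that decision — can be sketched as follows. This is a simplified stand-in, not the actual patch: the struct and function names are invented, and only the clear_dead_marking field name and the conditional-clear logic come from the discussion:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LH_PAGE_HAS_DEAD_TUPLES (1 << 7)    /* mirrors the real flag bit */

/* Stand-in for the WAL record payload of a hash-page delete. */
typedef struct ToyXlHashDelete
{
    bool    clear_dead_marking;     /* TRUE if this operation clears
                                     * LH_PAGE_HAS_DEAD_TUPLES */
} ToyXlHashDelete;

/* Primary: clear the hint only when tuples were removed and the flag
 * was set, and record that decision so the standby can mirror it. */
static ToyXlHashDelete
log_hash_delete(uint16_t *hasho_flag, int tuples_removed)
{
    ToyXlHashDelete rec = { false };

    if (tuples_removed > 0 && (*hasho_flag & LH_PAGE_HAS_DEAD_TUPLES))
    {
        *hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
        rec.clear_dead_marking = true;
    }
    return rec;
}

/* Standby: redo clears the flag only when the record says it was
 * cleared on the primary, since the clear is not unconditional. */
static void
redo_hash_delete(uint16_t *hasho_flag, const ToyXlHashDelete *rec)
{
    if (rec->clear_dead_marking)
        *hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
}
```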
On Fri, Mar 17, 2017 at 6:13 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> I think hashdesc.c needs an update as well (refer case XLOG_HASH_DELETE:).

Done. Thanks!

> -   * flag. Clearing this flag is just a hint; replay won't redo this.
> +   * flag. Clearing this flag is just a hint; replay will check the
> +   * status of clear_dead_marking flag before redo it.
>     */
>     if (tuples_removed && *tuples_removed > 0 &&
>         opaque->hasho_flag & LH_PAGE_HAS_DEAD_TUPLES)
> +   {
>         opaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
> +       clear_dead_marking = true;
> +   }
>
> I feel the above comment is not required, as you are logging this
> action explicitly.

That's right. I have removed it in the attached v4 patch.

> +   bool clear_dead_marking;    /* TRUE if VACUUM clears
>
> No need to write VACUUM explicitly; you can simply say "TRUE if this
> operation clears ...".

Corrected. Please find the attached v4 patch.

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Wed, Mar 15, 2017 at 07:26:45PM -0700, Andres Freund wrote:
> On 2017-03-15 16:31:11 -0400, Robert Haas wrote:
>> On Wed, Mar 15, 2017 at 3:54 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>> Changed as per suggestions. Attached v9 patch. Thanks.
>>
>> Wow, when do you sleep?
>
> I think that applies to a bunch of people, including yourself ;)

Gee, no one asks when I sleep. I wonder why. ;-)

-- 
Bruce Momjian <bruce@momjian.us>    http://momjian.us
EnterpriseDB                        http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                 Ancient Roman grave inscription     +
On Fri, Mar 17, 2017 at 8:34 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> On Fri, Mar 17, 2017 at 6:13 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> I think hashdesc.c needs an update (refer case XLOG_HASH_DELETE:).
>
> Done. Thanks!
>

This version looks good to me.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 18, 2017 at 4:35 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> This version looks good to me.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company