Thread: Tid scan increments value of pg_stat_all_tables.seq_scan. (but not seq_tup_read)
Tid scan increments value of pg_stat_all_tables.seq_scan. (but not seq_tup_read)
From
Kasahara Tatsuhito
Date:
Hi, I noticed that since PostgreSQL 12, Tid scan increments value of pg_stat_all_tables.seq_scan. (but not seq_tup_read) The following is an example. CREATE TABLE t (c int); INSERT INTO t SELECT 1; SET enable_seqscan to off; (v12 -) =# EXPLAIN ANALYZE SELECT * FROM t WHERE ctid = '(0,1)'; QUERY PLAN ------------------------------------------------------------------------------------------- Tid Scan on t (cost=0.00..4.01 rows=1 width=4) (actual time=0.034..0.035 rows=1 loops=1) TID Cond: (ctid = '(0,1)'::tid) Planning Time: 0.341 ms Execution Time: 0.059 ms (4 rows) =# SELECT seq_scan, seq_tup_read FROM pg_stat_user_tables WHERE relname = 't'; seq_scan | seq_tup_read ----------+-------------- 1 | 0 (1 row) (- v11) =# EXPLAIN ANALYZE SELECT * FROM t WHERE ctid = '(0,1)'; QUERY PLAN ------------------------------------------------------------------------------------------- Tid Scan on t (cost=0.00..4.01 rows=1 width=4) (actual time=0.026..0.027 rows=1 loops=1) TID Cond: (ctid = '(0,1)'::tid) Planning Time: 1.003 ms Execution Time: 0.068 ms (4 rows) postgres=# SELECT seq_scan, seq_tup_read FROM pg_stat_user_tables WHERE relname = 't'; seq_scan | seq_tup_read ----------+-------------- 0 | 0 (1 row) Exactly, this change occurred from following commit. https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=147e3722f7e531f15ba389a4d518efe8cd0bd736) I think, from this commit, TidListEval() came to call table_beginscan() , so this increments would be happen. I'm not sure this change whether intention or not, it can confuse some users. Best regards, -- NTT Open Source Software Center Tatsuhito Kasahara
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
Hi. Attached patch solve this problem. This patch adds table_beginscan_tid() and call it in TidListEval() instead of table_beginscan(). table_beginscan_tid() is the same as table_beginscan() but do not set SO_TYPE_SEQSCAN to flags. Although I'm not sure this behavior is really problem or not, it seems to me that previous behavior is more prefer. Is it worth to apply to HEAD and v12 branch ? Best regards, 2020年1月27日(月) 14:35 Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com>: > > Hi, I noticed that since PostgreSQL 12, Tid scan increments value of > pg_stat_all_tables.seq_scan. (but not seq_tup_read) > > The following is an example. > > CREATE TABLE t (c int); > INSERT INTO t SELECT 1; > SET enable_seqscan to off; > > (v12 -) > =# EXPLAIN ANALYZE SELECT * FROM t WHERE ctid = '(0,1)'; > QUERY PLAN > ------------------------------------------------------------------------------------------- > Tid Scan on t (cost=0.00..4.01 rows=1 width=4) (actual > time=0.034..0.035 rows=1 loops=1) > TID Cond: (ctid = '(0,1)'::tid) > Planning Time: 0.341 ms > Execution Time: 0.059 ms > (4 rows) > > =# SELECT seq_scan, seq_tup_read FROM pg_stat_user_tables WHERE relname = 't'; > seq_scan | seq_tup_read > ----------+-------------- > 1 | 0 > (1 row) > > (- v11) > =# EXPLAIN ANALYZE SELECT * FROM t WHERE ctid = '(0,1)'; > QUERY PLAN > ------------------------------------------------------------------------------------------- > Tid Scan on t (cost=0.00..4.01 rows=1 width=4) (actual > time=0.026..0.027 rows=1 loops=1) > TID Cond: (ctid = '(0,1)'::tid) > Planning Time: 1.003 ms > Execution Time: 0.068 ms > (4 rows) > > postgres=# SELECT seq_scan, seq_tup_read FROM pg_stat_user_tables > WHERE relname = 't'; > seq_scan | seq_tup_read > ----------+-------------- > 0 | 0 > (1 row) > > > Exactly, this change occurred from following commit. > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=147e3722f7e531f15ba389a4d518efe8cd0bd736) > I think, from this commit, TidListEval() came to call > table_beginscan() , so this increments would be happen. > > I'm not sure this change whether intention or not, it can confuse some users. > > Best regards, > -- > NTT Open Source Software Center > Tatsuhito Kasahara -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Attachment
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Fujii Masao
Date:
On 2020/01/29 20:06, Kasahara Tatsuhito wrote: > Hi. > > Attached patch solve this problem. > > This patch adds table_beginscan_tid() and call it in TidListEval() > instead of table_beginscan(). > table_beginscan_tid() is the same as table_beginscan() but do not set > SO_TYPE_SEQSCAN to flags. > > Although I'm not sure this behavior is really problem or not, > it seems to me that previous behavior is more prefer. > > Is it worth to apply to HEAD and v12 branch ? I've not read the patch yet, but I agree that updating only seq_scan but not seq_tup_read in Tid Scan sounds strange. IMO at least both should be update together or neither should be updated. Regards, -- Fujii Masao NTT DATA CORPORATION Advanced Platform Technology Group Research and Development Headquarters
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kyotaro Horiguchi
Date:
Hello. At Wed, 29 Jan 2020 23:24:09 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > On 2020/01/29 20:06, Kasahara Tatsuhito wrote: > > Hi. > > Attached patch solve this problem. > > This patch adds table_beginscan_tid() and call it in TidListEval() > > instead of table_beginscan(). > > table_beginscan_tid() is the same as table_beginscan() but do not set > > SO_TYPE_SEQSCAN to flags. > > Although I'm not sure this behavior is really problem or not, > > it seems to me that previous behavior is more prefer. > > Is it worth to apply to HEAD and v12 branch ? > > I've not read the patch yet, but I agree that updating only seq_scan > but not seq_tup_read in Tid Scan sounds strange. IMO at least > both should be update together or neither should be updated. Basically agreed, but sample scan doesn't increment seq_scan but increments seq_tup_read. Aside from that fact, before 147e3722f7 TidScan didn't need a scan descriptor so didn't call table_beginscan. table_beginscan didn't increment the counter for bitmapscan and samplescan. The commit changes TidScan to call beginscan but didn't change table_beginscan not to increment the counter for tidscans. From the view of the view pg_stat_*_tables, the counters moves as follows. increments scan type table_beginscan?, per scan, per tuple , SO_TYPE flags ============================================================================= seq scan : yes , seq_scan, seq_tup_read , SO_TYPE_SEQSCAN index scan : no , idx_scan, idx_tup_fetch , <none> bitmap scan: yes , idx_scan, idx_tup_fetch , SO_TYPE_BITMAPSCAN sample scan: yes , <none> , seq_tup_read , SO_TYPE_SAMPLESCAN TID scan : yes , seq_scan, <none> , <none> bitmap scan and sample scan are historically excluded by corresponding flags is_bitmapscan and is_samplescan and the commit c3b23ae457 moved the work to SO_TYPE_* flags. After 147e3722f7, TID scan has the same characteristics, that is, it calls table_beginscan but doesn't increment seq_scan. But it doesn't have corresponding flag value. I'd rather think that whatever calls table_beginscan should have corresponding SO_TYPE_* flags. (Note: index scan doesn't call it.) It would be another issue what we should do for the sample scan case. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
Hi, On Thu, Jan 30, 2020 at 10:55 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > At Wed, 29 Jan 2020 23:24:09 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > On 2020/01/29 20:06, Kasahara Tatsuhito wrote: > > > Although I'm not sure this behavior is really problem or not, > > > it seems to me that previous behavior is more prefer. > > > Is it worth to apply to HEAD and v12 branch ? > > > > I've not read the patch yet, but I agree that updating only seq_scan > > but not seq_tup_read in Tid Scan sounds strange. IMO at least > > both should be update together or neither should be updated. > > Basically agreed, but sample scan doesn't increment seq_scan but > increments seq_tup_read. Yeah, sample scan's behavior is also bit annoying. > From the view of the view pg_stat_*_tables, the counters moves as follows. Thanks for your clarification. > TID scan : yes , seq_scan, <none> , <none> Here is wrong, because TID scan came to have SO_TYPE_SEQSCAN flags from commit 147e3722f7. So, currently( v12 and HEAD) TID scan status as following increments scan type table_beginscan?, per scan, per tuple , SO_TYPE flags ============================================================================= TID scan : yes , seq_scan, <none> , SO_TYPE_SEQSCAN And my patch change the status to following (same as -v11) increments scan type table_beginscan?, per scan, per tuple , SO_TYPE flags ============================================================================= TID scan : yes , <none>, <none> , <none> > I'd rather think that whatever calls table_beginscan should have > corresponding SO_TYPE_* flags. (Note: index scan doesn't call it.) Agreed. It may be better to add new flag such like SO_TYPE_TIDSCAN, and handles some statistics updating and other things. But it may be a bit overkill, since I want to revert to the previous behavior this time. Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kyotaro Horiguchi
Date:
At Thu, 30 Jan 2020 13:30:56 +0900, Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote in > > TID scan : yes , seq_scan, <none> , <none> > Here is wrong, because TID scan came to have SO_TYPE_SEQSCAN flags > from commit 147e3722f7. It is reflectings the discussion below, which means TID scan doesn't have corresponding SO_TYPE_ value. Currently it is setting SO_TYPE_SEQSCAN by accedent. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
On Thu, Jan 30, 2020 at 1:49 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > At Thu, 30 Jan 2020 13:30:56 +0900, Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote in > > > TID scan : yes , seq_scan, <none> , <none> > > Here is wrong, because TID scan came to have SO_TYPE_SEQSCAN flags > > from commit 147e3722f7. > > It is reflectings the discussion below, which means TID scan doesn't > have corresponding SO_TYPE_ value. Currently it is setting > SO_TYPE_SEQSCAN by accedent. Ah, sorry I misunderstood.. Upon further investigation, the SO_TYPE_SEQSCAN flag was also used at heap_beginscan() to determine whether a predicate lock was taken on the entire relation. if (scan->rs_base.rs_flags & (SO_TYPE_SEQSCAN | SO_TYPE_SAMPLESCAN)) { /* * Ensure a missing snapshot is noticed reliably, even if the * isolation mode means predicate locking isn't performed (and * therefore the snapshot isn't used here). */ Assert(snapshot); PredicateLockRelation(relation, snapshot); } Therefore, it can not simply remove the SO_TYPE_SEQSCAN flag from a TID scan. To keep the old behavior, I think it would be better to add a new SO_TYPE_TIDSCAN flag and take a predicate lock on the entire relation. Attach the v2 patch which change the status to following. (same as -v11 but have new SO_TYPE_TIDSCAN flag) increments scan type table_beginscan?, per scan, per tuple , SO_TYPE flags ============================================================================= TID scan : yes , <none>, <none> , SO_TYPE_TIDSCAN Is it acceptable change for HEAD and v12? Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Attachment
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Fujii Masao
Date:
On 2020/02/01 16:05, Kasahara Tatsuhito wrote: > On Thu, Jan 30, 2020 at 1:49 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: >> At Thu, 30 Jan 2020 13:30:56 +0900, Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote in >>>> TID scan : yes , seq_scan, <none> , <none> >>> Here is wrong, because TID scan came to have SO_TYPE_SEQSCAN flags >>> from commit 147e3722f7. >> >> It is reflectings the discussion below, which means TID scan doesn't >> have corresponding SO_TYPE_ value. Currently it is setting >> SO_TYPE_SEQSCAN by accedent. > Ah, sorry I misunderstood.. > > Upon further investigation, the SO_TYPE_SEQSCAN flag was also used at > heap_beginscan() to determine whether a predicate lock was taken on > the entire relation. > > if (scan->rs_base.rs_flags & (SO_TYPE_SEQSCAN | SO_TYPE_SAMPLESCAN)) > { > /* > * Ensure a missing snapshot is noticed reliably, even if the > * isolation mode means predicate locking isn't performed (and > * therefore the snapshot isn't used here). > */ > Assert(snapshot); > PredicateLockRelation(relation, snapshot); > } > > Therefore, it can not simply remove the SO_TYPE_SEQSCAN flag from a TID scan. > To keep the old behavior, I think it would be better to add a new > SO_TYPE_TIDSCAN flag and take a predicate lock on the entire relation. But in the old behavior, PredicateLockRelation() was not called in TidScan case because its flag was not SO_TYPE_SEQSCAN. No? Regards, -- Fujii Masao NTT DATA CORPORATION Advanced Platform Technology Group Research and Development Headquarters
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
On Mon, Feb 3, 2020 at 4:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > On 2020/02/01 16:05, Kasahara Tatsuhito wrote: > > if (scan->rs_base.rs_flags & (SO_TYPE_SEQSCAN | SO_TYPE_SAMPLESCAN)) > > { > > /* > > * Ensure a missing snapshot is noticed reliably, even if the > > * isolation mode means predicate locking isn't performed (and > > * therefore the snapshot isn't used here). > > */ > > Assert(snapshot); > > PredicateLockRelation(relation, snapshot); > > } > > > > Therefore, it can not simply remove the SO_TYPE_SEQSCAN flag from a TID scan. > > To keep the old behavior, I think it would be better to add a new > > SO_TYPE_TIDSCAN flag and take a predicate lock on the entire relation. > > But in the old behavior, PredicateLockRelation() was not called in TidScan case > because its flag was not SO_TYPE_SEQSCAN. No? No. Tid scan called PredicateLockRelation() both previous and current. In the current (v12 and HEAD), Tid scan has SO_TYPE_SEQSCAN flag so that PredicateLockRelation()is called in Tid scan. In the Previous (- v11), in heap_beginscan_internal(), checks is_bitmapscan flags. If is_bitmapscan is set to false, calls PredicateLockRelation(). (- v11) if (!is_bitmapscan) PredicateLockRelation(relation, snapshot); And in the Tid scan, is_bitmapscan is set to false, so that PredicateLockRelation()is called in Tid scan. Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Fujii Masao
Date:
On 2020/02/03 16:39, Kasahara Tatsuhito wrote: > On Mon, Feb 3, 2020 at 4:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >> On 2020/02/01 16:05, Kasahara Tatsuhito wrote: >>> if (scan->rs_base.rs_flags & (SO_TYPE_SEQSCAN | SO_TYPE_SAMPLESCAN)) >>> { >>> /* >>> * Ensure a missing snapshot is noticed reliably, even if the >>> * isolation mode means predicate locking isn't performed (and >>> * therefore the snapshot isn't used here). >>> */ >>> Assert(snapshot); >>> PredicateLockRelation(relation, snapshot); >>> } >>> >>> Therefore, it can not simply remove the SO_TYPE_SEQSCAN flag from a TID scan. >>> To keep the old behavior, I think it would be better to add a new >>> SO_TYPE_TIDSCAN flag and take a predicate lock on the entire relation. >> >> But in the old behavior, PredicateLockRelation() was not called in TidScan case >> because its flag was not SO_TYPE_SEQSCAN. No? > No. Tid scan called PredicateLockRelation() both previous and current. > > In the current (v12 and HEAD), Tid scan has SO_TYPE_SEQSCAN flag so > that PredicateLockRelation()is called in Tid scan. > In the Previous (- v11), in heap_beginscan_internal(), checks > is_bitmapscan flags. > If is_bitmapscan is set to false, calls PredicateLockRelation(). > > (- v11) > if (!is_bitmapscan) > PredicateLockRelation(relation, snapshot); > > And in the Tid scan, is_bitmapscan is set to false, so that > PredicateLockRelation()is called in Tid scan. Thanks for explaining that! But heap_beginscan_internal() was really called in TidScan case? Regards, -- Fujii Masao NTT DATA CORPORATION Advanced Platform Technology Group Research and Development Headquarters
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
On Mon, Feb 3, 2020 at 5:33 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > Thanks for explaining that! But heap_beginscan_internal() was really > called in TidScan case? Oh, you are right. Tid Scan started calling table_beginscan from v12 (commit 147e3722f7). So previously(-v11), Tid Scan might never calls heap_beginscan_internal(). Therefore, from v12, Tid scan not only increases the value of seq_scan, but also acquires a predicate lock. Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
On Mon, Feb 3, 2020 at 6:20 PM Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote: > Therefore, from v12, Tid scan not only increases the value of > seq_scan, but also acquires a predicate lock. Based on further investigation and Fujii's advice, I've summarized this issue as follows. From commit 147e3722f7, Tid Scan came to (A) increments num of seq_scan on pg_stat_*_tables and (B) take a predicate lock on the entire relation. (A) may be confusing to users, so I think it is better to fix it. For (B), an unexpected serialization error has occurred as follows, so I think it should be fix. ========================================================================= [Preparation] CREATE TABLE tid_test (c1 int, c2 int); INSERT INTO tid_test SELECT generate_series(1,1000), 0; [Session-1:] BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE ; [Session-2:] BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE ; [Session-1:] SELECT * FROM tid_test WHERE ctid = '(0,1)'; [Session-2:] SELECT * FROM tid_test WHERE ctid = '(1,1)'; [Session-1:] INSERT INTO tid_test SELECT 1001, 10; [Session-2:] INSERT INTO tid_test SELECT 1001, 10; [Session-1:] COMMIT; [Session-2:] COMMIT; Result: (-v11): Both session could commit. (v12-): Session-2 raised error as following because of taking a predicate lock on the entire table... -------- ERROR: could not serialize access due to read/write dependencies among transactions DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt. HINT: The transaction might succeed if retried. -------- ========================================================================= Attached patch fix both (A) and (B), so that the behavior of Tid Scan back to the same as before v11. (As a result, this patch is the same as the one that first attached.) Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Attachment
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Andres Freund
Date:
Hi, On 2020-02-05 16:25:25 +0900, Kasahara Tatsuhito wrote: > On Mon, Feb 3, 2020 at 6:20 PM Kasahara Tatsuhito > <kasahara.tatsuhito@gmail.com> wrote: > > Therefore, from v12, Tid scan not only increases the value of > > seq_scan, but also acquires a predicate lock. > > Based on further investigation and Fujii's advice, I've summarized > this issue as follows. > > From commit 147e3722f7, Tid Scan came to > (A) increments num of seq_scan on pg_stat_*_tables > and > (B) take a predicate lock on the entire relation. > > (A) may be confusing to users, so I think it is better to fix it. > For (B), an unexpected serialization error has occurred as follows, so > I think it should be fix. I think it'd be good if we could guard against b) via an isolation test. It's more painful to do that for a), due to the unreliability of stats at the moment (we have some tests, but they take a long time). Greetings, Andres Freund
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
Hi, On Thu, Feb 6, 2020 at 11:48 AM Andres Freund <andres@anarazel.de> wrote: > I think it'd be good if we could guard against b) via an isolation > test. It's more painful to do that for a), due to the unreliability of > stats at the moment (we have some tests, but they take a long time). Thanks for your advise, and agreed. I added a new (but minimal) isolation test for the case of tid scan. (v12 and HEAD will be failed this test. v11 and HEAD with my patch will be passed) Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Attachment
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Fujii Masao
Date:
On 2020/02/06 15:04, Kasahara Tatsuhito wrote: > Hi, > > On Thu, Feb 6, 2020 at 11:48 AM Andres Freund <andres@anarazel.de> wrote: >> I think it'd be good if we could guard against b) via an isolation >> test. It's more painful to do that for a), due to the unreliability of >> stats at the moment (we have some tests, but they take a long time). > Thanks for your advise, and agreed. > > I added a new (but minimal) isolation test for the case of tid scan. > (v12 and HEAD will be failed this test. v11 and HEAD with my patch > will be passed) Isn't this test scenario a bit overkill? We can simply test that as follows, instead. CREATE TABLE test_tidscan AS SELECT 1 AS id; BEGIN ISOLATION LEVEL SERIALIZABLE; SELECT * FROM test_tidscan WHERE ctid = '(0,1)'; SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode = 'SIReadLock'; COMMIT; In the expected file, the result of query looking at pg_locks should be matched with the following. locktype | mode ----------+------------ tuple | SIReadLock BTW, in master branch, locktype in that query result is "relation" because of the issue. Regards, -- Fujii Masao NTT DATA CORPORATION Advanced Platform Technology Group Research and Development Headquarters
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
HI, On Thu, Feb 6, 2020 at 3:24 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > I added a new (but minimal) isolation test for the case of tid scan. > > (v12 and HEAD will be failed this test. v11 and HEAD with my patch > > will be passed) > > Isn't this test scenario a bit overkill? We can simply test that > as follows, instead. > CREATE TABLE test_tidscan AS SELECT 1 AS id; > BEGIN ISOLATION LEVEL SERIALIZABLE; > SELECT * FROM test_tidscan WHERE ctid = '(0,1)'; > SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode = 'SIReadLock'; > COMMIT; > > In the expected file, the result of query looking at pg_locks > should be matched with the following. > > locktype | mode > ----------+------------ > tuple | SIReadLock Thanks for your reply. Hmm, it's an simple and might be the better way than adding isolation test. I added above test case to regress/sql/tidscan.sql. Attach the patch. Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Attachment
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Masahiko Sawada
Date:
On Thu, 6 Feb 2020 at 19:12, Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote: > > HI, > > On Thu, Feb 6, 2020 at 3:24 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > I added a new (but minimal) isolation test for the case of tid scan. > > > (v12 and HEAD will be failed this test. v11 and HEAD with my patch > > > will be passed) > > > > Isn't this test scenario a bit overkill? We can simply test that > > as follows, instead. > > CREATE TABLE test_tidscan AS SELECT 1 AS id; > > BEGIN ISOLATION LEVEL SERIALIZABLE; > > SELECT * FROM test_tidscan WHERE ctid = '(0,1)'; > > SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode = 'SIReadLock'; > > COMMIT; > > > > In the expected file, the result of query looking at pg_locks > > should be matched with the following. > > > > locktype | mode > > ----------+------------ > > tuple | SIReadLock > Thanks for your reply. > Hmm, it's an simple and might be the better way than adding isolation test. > > I added above test case to regress/sql/tidscan.sql. > Attach the patch. > I've tested predicate locking including promotion cases with v3 patch and it works fine. +table_beginscan_tid(Relation rel, Snapshot snapshot, + int nkeys, struct ScanKeyData *key) +{ + uint32 flags = SO_ALLOW_STRAT | SO_ALLOW_SYNC | SO_ALLOW_PAGEMODE; IIUC setting SO_ALLOW_STRAT, SO_ALLOW_SYNC and SO_ALLOW_PAGEMODE has no meaning during tid scan. I think v11 also should be the same. Why did you remove SO_TYPE_TIDSCAN from the previous version patch? I think it's better to add it and then we can set only SO_TYPE_TIDSCAN to the scan option of tid scan. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
Hi, On Thu, Feb 6, 2020 at 11:01 PM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Thu, 6 Feb 2020 at 19:12, Kasahara Tatsuhito > <kasahara.tatsuhito@gmail.com> wrote: > I've tested predicate locking including promotion cases with v3 patch > and it works fine. > > +table_beginscan_tid(Relation rel, Snapshot snapshot, > + int nkeys, struct ScanKeyData *key) > +{ > + uint32 flags = SO_ALLOW_STRAT | SO_ALLOW_SYNC | SO_ALLOW_PAGEMODE; > > IIUC setting SO_ALLOW_STRAT, SO_ALLOW_SYNC and SO_ALLOW_PAGEMODE has > no meaning during tid scan. I think v11 also should be the same. Thanks for your check, and useful advise. I was wondering if I should keep these flags, but I confirmed that I can remove these from TidScan's flags. > Why did you remove SO_TYPE_TIDSCAN from the previous version patch? I > think it's better to add it and then we can set only SO_TYPE_TIDSCAN > to the scan option of tid scan. Because, currently SO_TYPE_TIDSCAN is not used anywhere. So I thought it might be better to avoid adding a new flag. However, as you said, this flag may be useful for the future tid scan feature (like [1]) Attach v4 patch. I re-added SO_TYPE_TIDSCAN to ScanOptions and table_beginscan_tid. And I remove unnecessary SO_ALLOW_* flags. Best regards, [1] https://www.postgresql.org/message-id/flat/CAKJS1f-%2BJJpm6B_NThUWzFv4007zAjObBXX1CBHE_bH9nOAvSw%40mail.gmail.com#1ae648acdc2df930b19218b6026135d3 -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Attachment
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kyotaro Horiguchi
Date:
At Fri, 7 Feb 2020 12:27:26 +0900, Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote in > > IIUC setting SO_ALLOW_STRAT, SO_ALLOW_SYNC and SO_ALLOW_PAGEMODE has > > no meaning during tid scan. I think v11 also should be the same. > Thanks for your check, and useful advise. > I was wondering if I should keep these flags, but I confirmed that I > can remove these from TidScan's flags. > > > Why did you remove SO_TYPE_TIDSCAN from the previous version patch? I > > think it's better to add it and then we can set only SO_TYPE_TIDSCAN > > to the scan option of tid scan. > Because, currently SO_TYPE_TIDSCAN is not used anywhere. > So I thought it might be better to avoid adding a new flag. > However, as you said, this flag may be useful for the future tid scan > feature (like [1]) > > Attach v4 patch. I re-added SO_TYPE_TIDSCAN to ScanOptions and > table_beginscan_tid. > And I remove unnecessary SO_ALLOW_* flags. +table_beginscan_tid(Relation rel, Snapshot snapshot, + int nkeys, struct ScanKeyData *key) +{ + uint32 flags = SO_TYPE_TIDSCAN; + + return rel->rd_tableam->scan_begin(rel, snapshot, nkeys, key, NULL, flags); It seems that nkeys and key are useless. Since every table_beginscan_* functions have distinct parameter sets, don't we remove them from table_beginscan_tid? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
Hi, On Fri, Feb 7, 2020 at 1:27 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > It seems that nkeys and key are useless. Since every table_beginscan_* > functions have distinct parameter sets, don't we remove them from > table_beginscan_tid? Yeah, actually, when calling table_beginscan_tid(), nkeys is set to 0 and * key is set to NULL, so these are not used at the moment. I removed unnecessary arguments from table_beginscan_tid(). Attache the v5 patch. -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Attachment
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Fujii Masao
Date:
On 2020/02/07 15:07, Kasahara Tatsuhito wrote: > Hi, > > On Fri, Feb 7, 2020 at 1:27 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: >> It seems that nkeys and key are useless. Since every table_beginscan_* >> functions have distinct parameter sets, don't we remove them from >> table_beginscan_tid? > Yeah, actually, when calling table_beginscan_tid(), nkeys is set to 0 > and * key is set to NULL, > so these are not used at the moment. > > I removed unnecessary arguments from table_beginscan_tid(). > > Attache the v5 patch. Thanks for updating the patch! The patch looks good to me. So barring any objection, I will push it and back-patch to v12 *soon* so that the upcoming minor version can contain it. BTW, commit 147e3722f7 causing the issue changed currtid_byreloid() and currtid_byrelname() so that they also call table_beginscan(). I'm not sure what those functions are, but probably we should fix them so that table_beginscan_tid() is called instead. Thought? Regards, -- Fujii Masao NTT DATA CORPORATION Advanced Platform Technology Group Research and Development Headquarters
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
Hi, On Fri, Feb 7, 2020 at 5:02 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > BTW, commit 147e3722f7 causing the issue changed currtid_byreloid() > and currtid_byrelname() so that they also call table_beginscan(). > I'm not sure what those functions are, but probably we should fix them > so that table_beginscan_tid() is called instead. Thought? +1, sorry, I overlooked it. Both functions are used to check whether a valid tid or not with a relation name (or oid), and both perform a tid scan internally. So, these functions should call table_beginscan_tid(). Perhaps unnecessary, I will attach a patch. Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Attachment
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kyotaro Horiguchi
Date:
At Fri, 7 Feb 2020 17:01:59 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2020/02/07 15:07, Kasahara Tatsuhito wrote: > > Hi, > > On Fri, Feb 7, 2020 at 1:27 PM Kyotaro Horiguchi > > <horikyota.ntt@gmail.com> wrote: > >> It seems that nkeys and key are useless. Since every table_beginscan_* > >> functions have distinct parameter sets, don't we remove them from > >> table_beginscan_tid? > > Yeah, actually, when calling table_beginscan_tid(), nkeys is set to 0 > > and * key is set to NULL, > > so these are not used at the moment. > > I removed unnecessary arguments from table_beginscan_tid(). > > Attache the v5 patch. > > Thanks for updating the patch! The patch looks good to me. > So barring any objection, I will push it and back-patch to v12 *soon* > so that the upcoming minor version can contain it. > > BTW, commit 147e3722f7 causing the issue changed currtid_byreloid() > and currtid_byrelname() so that they also call table_beginscan(). > I'm not sure what those functions are, but probably we should fix them > so that table_beginscan_tid() is called instead. Thought? At least they don't seem to need table_beginscan(), to me. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Fujii Masao
Date:
On 2020/02/07 17:28, Kasahara Tatsuhito wrote: > Hi, > > On Fri, Feb 7, 2020 at 5:02 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >> BTW, commit 147e3722f7 causing the issue changed currtid_byreloid() >> and currtid_byrelname() so that they also call table_beginscan(). >> I'm not sure what those functions are, but probably we should fix them >> so that table_beginscan_tid() is called instead. Thought? > +1, sorry, I overlooked it. > > Both functions are used to check whether a valid tid or not with a > relation name (or oid), > and both perform a tid scan internally. > So, these functions should call table_beginscan_tid(). > > Perhaps unnecessary, I will attach a patch. Pushed! Thanks! Regards, -- Fujii Masao NTT DATA CORPORATION Advanced Platform Technology Group Research and Development Headquarters
Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)
From
Kasahara Tatsuhito
Date:
On Fri, Feb 7, 2020 at 10:09 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > Pushed! Thanks! Thanks Fujii. -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com