Thread: Tid scan increments value of pg_stat_all_tables.seq_scan. (but not seq_tup_read)

Tid scan increments value of pg_stat_all_tables.seq_scan. (but not seq_tup_read)

From

Kasahara Tatsuhito

Date:

27 January 2020, 05:35:03

Hi, I noticed that since PostgreSQL 12, Tid scan increments value of
pg_stat_all_tables.seq_scan. (but not seq_tup_read)

The following is an example.

CREATE TABLE t (c int);
INSERT INTO t SELECT 1;
SET enable_seqscan to off;

(v12 -)
=# EXPLAIN ANALYZE SELECT * FROM t WHERE ctid = '(0,1)';
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Tid Scan on t  (cost=0.00..4.01 rows=1 width=4) (actual
time=0.034..0.035 rows=1 loops=1)
   TID Cond: (ctid = '(0,1)'::tid)
 Planning Time: 0.341 ms
 Execution Time: 0.059 ms
(4 rows)

=# SELECT seq_scan, seq_tup_read FROM pg_stat_user_tables WHERE relname = 't';
 seq_scan | seq_tup_read
----------+--------------
        1 |            0
(1 row)

(- v11)
=# EXPLAIN ANALYZE SELECT * FROM t WHERE ctid = '(0,1)';
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Tid Scan on t  (cost=0.00..4.01 rows=1 width=4) (actual
time=0.026..0.027 rows=1 loops=1)
   TID Cond: (ctid = '(0,1)'::tid)
 Planning Time: 1.003 ms
 Execution Time: 0.068 ms
(4 rows)

postgres=# SELECT seq_scan, seq_tup_read FROM pg_stat_user_tables
WHERE relname = 't';
 seq_scan | seq_tup_read
----------+--------------
        0 |            0
(1 row)


Exactly, this change occurred from following commit.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=147e3722f7e531f15ba389a4d518efe8cd0bd736)
I think, from this commit, TidListEval() came to call
table_beginscan() , so this increments would be happen.

I'm not sure this change whether intention or not, it can confuse some users.

Best regards,
--
NTT Open Source Software Center
Tatsuhito Kasahara

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

29 January 2020, 11:06:06

Hi.

Attached patch solve this problem.

This patch adds table_beginscan_tid() and call it in TidListEval()
instead of table_beginscan().
table_beginscan_tid() is the same as table_beginscan() but do not set
SO_TYPE_SEQSCAN to flags.

Although I'm not sure this behavior is really problem or not,
it seems to me that previous behavior is more prefer.

Is it worth to apply to HEAD and v12 branch ?

Best regards,

2020年1月27日(月) 14:35 Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com>:
>
> Hi, I noticed that since PostgreSQL 12, Tid scan increments value of
> pg_stat_all_tables.seq_scan. (but not seq_tup_read)
>
> The following is an example.
>
> CREATE TABLE t (c int);
> INSERT INTO t SELECT 1;
> SET enable_seqscan to off;
>
> (v12 -)
> =# EXPLAIN ANALYZE SELECT * FROM t WHERE ctid = '(0,1)';
>                                         QUERY PLAN
> -------------------------------------------------------------------------------------------
>  Tid Scan on t  (cost=0.00..4.01 rows=1 width=4) (actual
> time=0.034..0.035 rows=1 loops=1)
>    TID Cond: (ctid = '(0,1)'::tid)
>  Planning Time: 0.341 ms
>  Execution Time: 0.059 ms
> (4 rows)
>
> =# SELECT seq_scan, seq_tup_read FROM pg_stat_user_tables WHERE relname = 't';
>  seq_scan | seq_tup_read
> ----------+--------------
>         1 |            0
> (1 row)
>
> (- v11)
> =# EXPLAIN ANALYZE SELECT * FROM t WHERE ctid = '(0,1)';
>                                         QUERY PLAN
> -------------------------------------------------------------------------------------------
>  Tid Scan on t  (cost=0.00..4.01 rows=1 width=4) (actual
> time=0.026..0.027 rows=1 loops=1)
>    TID Cond: (ctid = '(0,1)'::tid)
>  Planning Time: 1.003 ms
>  Execution Time: 0.068 ms
> (4 rows)
>
> postgres=# SELECT seq_scan, seq_tup_read FROM pg_stat_user_tables
> WHERE relname = 't';
>  seq_scan | seq_tup_read
> ----------+--------------
>         0 |            0
> (1 row)
>
>
> Exactly, this change occurred from following commit.
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=147e3722f7e531f15ba389a4d518efe8cd0bd736)
> I think, from this commit, TidListEval() came to call
> table_beginscan() , so this increments would be happen.
>
> I'm not sure this change whether intention or not, it can confuse some users.
>
> Best regards,
> --
> NTT Open Source Software Center
> Tatsuhito Kasahara



--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Attachment

fix_tidscan_increments_seqscan_num.patch

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Fujii Masao

Date:

29 January 2020, 14:24:09


On 2020/01/29 20:06, Kasahara Tatsuhito wrote:
> Hi.
> 
> Attached patch solve this problem.
> 
> This patch adds table_beginscan_tid() and call it in TidListEval()
> instead of table_beginscan().
> table_beginscan_tid() is the same as table_beginscan() but do not set
> SO_TYPE_SEQSCAN to flags.
> 
> Although I'm not sure this behavior is really problem or not,
> it seems to me that previous behavior is more prefer.
> 
> Is it worth to apply to HEAD and v12 branch ?

I've not read the patch yet, but I agree that updating only seq_scan
but not seq_tup_read in Tid Scan sounds strange. IMO at least
both should be update together or neither should be updated.

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kyotaro Horiguchi

Date:

30 January 2020, 01:54:37

Hello.

At Wed, 29 Jan 2020 23:24:09 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> On 2020/01/29 20:06, Kasahara Tatsuhito wrote:
> > Hi.
> > Attached patch solve this problem.
> > This patch adds table_beginscan_tid() and call it in TidListEval()
> > instead of table_beginscan().
> > table_beginscan_tid() is the same as table_beginscan() but do not set
> > SO_TYPE_SEQSCAN to flags.
> > Although I'm not sure this behavior is really problem or not,
> > it seems to me that previous behavior is more prefer.
> > Is it worth to apply to HEAD and v12 branch ?
> 
> I've not read the patch yet, but I agree that updating only seq_scan
> but not seq_tup_read in Tid Scan sounds strange. IMO at least
> both should be update together or neither should be updated.

Basically agreed, but sample scan doesn't increment seq_scan but
increments seq_tup_read.

Aside from that fact, before 147e3722f7 TidScan didn't need a scan
descriptor so didn't call table_beginscan. table_beginscan didn't
increment the counter for bitmapscan and samplescan. The commit
changes TidScan to call beginscan but didn't change table_beginscan
not to increment the counter for tidscans.

From the view of the view pg_stat_*_tables, the counters moves as follows.

                                    increments
scan type    table_beginscan?, per scan, per tuple     , SO_TYPE flags
=============================================================================
seq scan   : yes             , seq_scan, seq_tup_read  , SO_TYPE_SEQSCAN
index scan : no              , idx_scan, idx_tup_fetch , <none>
bitmap scan: yes             , idx_scan, idx_tup_fetch , SO_TYPE_BITMAPSCAN
sample scan: yes             , <none>  , seq_tup_read  , SO_TYPE_SAMPLESCAN
TID scan   : yes             , seq_scan, <none>        , <none>

bitmap scan and sample scan are historically excluded by corresponding
flags is_bitmapscan and is_samplescan and the commit c3b23ae457 moved
the work to SO_TYPE_* flags.  After 147e3722f7, TID scan has the same
characteristics, that is, it calls table_beginscan but doesn't
increment seq_scan.  But it doesn't have corresponding flag value.

I'd rather think that whatever calls table_beginscan should have
corresponding SO_TYPE_* flags. (Note: index scan doesn't call it.)

It would be another issue what we should do for the sample scan case.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

30 January 2020, 04:30:56

Hi,

On Thu, Jan 30, 2020 at 10:55 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> At Wed, 29 Jan 2020 23:24:09 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
> > On 2020/01/29 20:06, Kasahara Tatsuhito wrote:
> > > Although I'm not sure this behavior is really problem or not,
> > > it seems to me that previous behavior is more prefer.
> > > Is it worth to apply to HEAD and v12 branch ?
> >
> > I've not read the patch yet, but I agree that updating only seq_scan
> > but not seq_tup_read in Tid Scan sounds strange. IMO at least
> > both should be update together or neither should be updated.
>
> Basically agreed, but sample scan doesn't increment seq_scan but
> increments seq_tup_read.
Yeah, sample scan's behavior is also bit annoying.


> From the view of the view pg_stat_*_tables, the counters moves as follows.
Thanks for your clarification.

> TID scan   : yes             , seq_scan, <none>        , <none>
Here is wrong, because TID scan came to have SO_TYPE_SEQSCAN flags
from commit 147e3722f7.

So, currently( v12 and HEAD) TID scan status as following

                                                 increments
scan type    table_beginscan?, per scan, per tuple     , SO_TYPE flags
=============================================================================
TID scan   : yes             , seq_scan, <none>        , SO_TYPE_SEQSCAN

And my patch change the status to following (same as -v11)

                                                 increments
scan type    table_beginscan?, per scan, per tuple     , SO_TYPE flags
=============================================================================
TID scan   : yes             , <none>, <none>        , <none>


> I'd rather think that whatever calls table_beginscan should have
> corresponding SO_TYPE_* flags. (Note: index scan doesn't call it.)
Agreed.
It may be better to add new flag such like SO_TYPE_TIDSCAN,
and handles some statistics updating and other things.
But it may be a bit overkill, since I want to revert to the previous
behavior this time.

Best regards,
-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kyotaro Horiguchi

Date:

30 January 2020, 04:48:38

At Thu, 30 Jan 2020 13:30:56 +0900, Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote in 
> > TID scan   : yes             , seq_scan, <none>        , <none>
> Here is wrong, because TID scan came to have SO_TYPE_SEQSCAN flags
> from commit 147e3722f7.

It is reflectings the discussion below, which means TID scan doesn't
have corresponding SO_TYPE_ value. Currently it is setting
SO_TYPE_SEQSCAN by accedent.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

01 February 2020, 07:05:04

On Thu, Jan 30, 2020 at 1:49 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> At Thu, 30 Jan 2020 13:30:56 +0900, Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote in
> > > TID scan   : yes             , seq_scan, <none>        , <none>
> > Here is wrong, because TID scan came to have SO_TYPE_SEQSCAN flags
> > from commit 147e3722f7.
>
> It is reflectings the discussion below, which means TID scan doesn't
> have corresponding SO_TYPE_ value. Currently it is setting
> SO_TYPE_SEQSCAN by accedent.
Ah, sorry I misunderstood..

Upon further investigation, the SO_TYPE_SEQSCAN flag was also used at
heap_beginscan() to determine whether a predicate lock was taken on
the entire relation.

    if (scan->rs_base.rs_flags & (SO_TYPE_SEQSCAN | SO_TYPE_SAMPLESCAN))
    {
        /*
         * Ensure a missing snapshot is noticed reliably, even if the
         * isolation mode means predicate locking isn't performed (and
         * therefore the snapshot isn't used here).
         */
        Assert(snapshot);
        PredicateLockRelation(relation, snapshot);
    }

Therefore, it can not simply remove the SO_TYPE_SEQSCAN flag from a TID scan.
To keep the old behavior, I think it would be better to add a new
SO_TYPE_TIDSCAN flag and take a predicate lock on the entire relation.

Attach the v2 patch which change the status to following. (same as
-v11 but have new SO_TYPE_TIDSCAN flag)

                                                 increments
scan type    table_beginscan?, per scan, per tuple     , SO_TYPE flags
=============================================================================
TID scan   : yes             , <none>, <none>        , SO_TYPE_TIDSCAN

Is it acceptable change for HEAD and v12?

Best regards,

-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Attachment

fix_tidscan_increments_seqscan_num_v2.patch

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Fujii Masao

Date:

03 February 2020, 07:22:34


On 2020/02/01 16:05, Kasahara Tatsuhito wrote:
> On Thu, Jan 30, 2020 at 1:49 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
>> At Thu, 30 Jan 2020 13:30:56 +0900, Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote in
>>>> TID scan   : yes             , seq_scan, <none>        , <none>
>>> Here is wrong, because TID scan came to have SO_TYPE_SEQSCAN flags
>>> from commit 147e3722f7.
>>
>> It is reflectings the discussion below, which means TID scan doesn't
>> have corresponding SO_TYPE_ value. Currently it is setting
>> SO_TYPE_SEQSCAN by accedent.
> Ah, sorry I misunderstood..
> 
> Upon further investigation, the SO_TYPE_SEQSCAN flag was also used at
> heap_beginscan() to determine whether a predicate lock was taken on
> the entire relation.
> 
>      if (scan->rs_base.rs_flags & (SO_TYPE_SEQSCAN | SO_TYPE_SAMPLESCAN))
>      {
>          /*
>           * Ensure a missing snapshot is noticed reliably, even if the
>           * isolation mode means predicate locking isn't performed (and
>           * therefore the snapshot isn't used here).
>           */
>          Assert(snapshot);
>          PredicateLockRelation(relation, snapshot);
>      }
> 
> Therefore, it can not simply remove the SO_TYPE_SEQSCAN flag from a TID scan.
> To keep the old behavior, I think it would be better to add a new
> SO_TYPE_TIDSCAN flag and take a predicate lock on the entire relation.

But in the old behavior, PredicateLockRelation() was not called in TidScan case
because its flag was not SO_TYPE_SEQSCAN. No?

Regards,


-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

03 February 2020, 07:39:51

On Mon, Feb 3, 2020 at 4:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> On 2020/02/01 16:05, Kasahara Tatsuhito wrote:
> >      if (scan->rs_base.rs_flags & (SO_TYPE_SEQSCAN | SO_TYPE_SAMPLESCAN))
> >      {
> >          /*
> >           * Ensure a missing snapshot is noticed reliably, even if the
> >           * isolation mode means predicate locking isn't performed (and
> >           * therefore the snapshot isn't used here).
> >           */
> >          Assert(snapshot);
> >          PredicateLockRelation(relation, snapshot);
> >      }
> >
> > Therefore, it can not simply remove the SO_TYPE_SEQSCAN flag from a TID scan.
> > To keep the old behavior, I think it would be better to add a new
> > SO_TYPE_TIDSCAN flag and take a predicate lock on the entire relation.
>
> But in the old behavior, PredicateLockRelation() was not called in TidScan case
> because its flag was not SO_TYPE_SEQSCAN. No?
No. Tid scan called PredicateLockRelation() both previous and current.

In the current (v12 and HEAD), Tid scan has SO_TYPE_SEQSCAN flag so
that PredicateLockRelation()is called in Tid scan.
In the Previous (- v11), in heap_beginscan_internal(), checks
is_bitmapscan flags.
If is_bitmapscan is set to false, calls PredicateLockRelation().

(- v11)
    if (!is_bitmapscan)
        PredicateLockRelation(relation, snapshot);

And in the Tid scan,  is_bitmapscan is set to false, so that
PredicateLockRelation()is called in Tid scan.

Best regards,

-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Fujii Masao

Date:

03 February 2020, 08:32:59


On 2020/02/03 16:39, Kasahara Tatsuhito wrote:
> On Mon, Feb 3, 2020 at 4:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>> On 2020/02/01 16:05, Kasahara Tatsuhito wrote:
>>>       if (scan->rs_base.rs_flags & (SO_TYPE_SEQSCAN | SO_TYPE_SAMPLESCAN))
>>>       {
>>>           /*
>>>            * Ensure a missing snapshot is noticed reliably, even if the
>>>            * isolation mode means predicate locking isn't performed (and
>>>            * therefore the snapshot isn't used here).
>>>            */
>>>           Assert(snapshot);
>>>           PredicateLockRelation(relation, snapshot);
>>>       }
>>>
>>> Therefore, it can not simply remove the SO_TYPE_SEQSCAN flag from a TID scan.
>>> To keep the old behavior, I think it would be better to add a new
>>> SO_TYPE_TIDSCAN flag and take a predicate lock on the entire relation.
>>
>> But in the old behavior, PredicateLockRelation() was not called in TidScan case
>> because its flag was not SO_TYPE_SEQSCAN. No?
> No. Tid scan called PredicateLockRelation() both previous and current.
> 
> In the current (v12 and HEAD), Tid scan has SO_TYPE_SEQSCAN flag so
> that PredicateLockRelation()is called in Tid scan.
> In the Previous (- v11), in heap_beginscan_internal(), checks
> is_bitmapscan flags.
> If is_bitmapscan is set to false, calls PredicateLockRelation().
> 
> (- v11)
>      if (!is_bitmapscan)
>          PredicateLockRelation(relation, snapshot);
> 
> And in the Tid scan,  is_bitmapscan is set to false, so that
> PredicateLockRelation()is called in Tid scan.

Thanks for explaining that! But heap_beginscan_internal() was really
called in TidScan case?

Regards,


-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

03 February 2020, 09:20:49

On Mon, Feb 3, 2020 at 5:33 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> Thanks for explaining that! But heap_beginscan_internal() was really
> called in TidScan case?
Oh, you are right.
Tid Scan started calling table_beginscan from v12 (commit 147e3722f7).
So previously(-v11), Tid Scan might never calls heap_beginscan_internal().

Therefore, from v12, Tid scan not only increases the value of
seq_scan, but also acquires a predicate lock.

Best regards,
-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

05 February 2020, 07:25:25

On Mon, Feb 3, 2020 at 6:20 PM Kasahara Tatsuhito
<kasahara.tatsuhito@gmail.com> wrote:
> Therefore, from v12, Tid scan not only increases the value of
> seq_scan, but also acquires a predicate lock.

Based on further investigation and Fujii's advice, I've summarized
this issue as follows.

From commit 147e3722f7, Tid Scan came to
(A) increments num of seq_scan on pg_stat_*_tables
and
(B) take a predicate lock on the entire relation.

(A) may be confusing to users, so I think it is better to fix it.
For (B), an unexpected serialization error has occurred as follows, so
I think it should be fix.

=========================================================================
[Preparation]
CREATE TABLE tid_test (c1 int, c2 int);
INSERT INTO  tid_test SELECT generate_series(1,1000), 0;

[Session-1:]
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
  [Session-2:]
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
[Session-1:]
SELECT * FROM tid_test WHERE ctid = '(0,1)';
  [Session-2:]
  SELECT * FROM tid_test WHERE ctid = '(1,1)';
[Session-1:]
INSERT INTO tid_test SELECT 1001, 10;
  [Session-2:]
  INSERT INTO tid_test SELECT 1001, 10;
[Session-1:]
COMMIT;
  [Session-2:]
  COMMIT;

Result:
(-v11): Both session could commit.
(v12-): Session-2 raised error as following because of taking a
predicate lock on the entire table...
--------
ERROR:  could not serialize access due to read/write dependencies
among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during
commit attempt.
HINT:  The transaction might succeed if retried.
--------
=========================================================================

Attached patch fix both (A) and (B), so that the behavior of Tid Scan
back to the same as before v11.
(As a result, this patch is the same as the one that first attached.)

Best regards,
-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Attachment

fix_tidscan_issues.patch

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Andres Freund

Date:

06 February 2020, 02:48:24

Hi,

On 2020-02-05 16:25:25 +0900, Kasahara Tatsuhito wrote:
> On Mon, Feb 3, 2020 at 6:20 PM Kasahara Tatsuhito
> <kasahara.tatsuhito@gmail.com> wrote:
> > Therefore, from v12, Tid scan not only increases the value of
> > seq_scan, but also acquires a predicate lock.
> 
> Based on further investigation and Fujii's advice, I've summarized
> this issue as follows.
> 
> From commit 147e3722f7, Tid Scan came to
> (A) increments num of seq_scan on pg_stat_*_tables
> and
> (B) take a predicate lock on the entire relation.
> 
> (A) may be confusing to users, so I think it is better to fix it.
> For (B), an unexpected serialization error has occurred as follows, so
> I think it should be fix.

I think it'd be good if we could guard against b) via an isolation
test. It's more painful to do that for a), due to the unreliability of
stats at the moment (we have some tests, but they take a long time).

Greetings,

Andres Freund

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

06 February 2020, 06:04:58

Hi,

On Thu, Feb 6, 2020 at 11:48 AM Andres Freund <andres@anarazel.de> wrote:
> I think it'd be good if we could guard against b) via an isolation
> test. It's more painful to do that for a), due to the unreliability of
> stats at the moment (we have some tests, but they take a long time).
Thanks for your advise, and agreed.

I added a new (but minimal)  isolation test for the case of  tid scan.
(v12 and HEAD will be failed this test. v11 and HEAD with my patch
will be passed)

Best regards,

-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Attachment

fix_tidscan_issues_v2.patch

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Fujii Masao

Date:

06 February 2020, 06:24:48


On 2020/02/06 15:04, Kasahara Tatsuhito wrote:
> Hi,
> 
> On Thu, Feb 6, 2020 at 11:48 AM Andres Freund <andres@anarazel.de> wrote:
>> I think it'd be good if we could guard against b) via an isolation
>> test. It's more painful to do that for a), due to the unreliability of
>> stats at the moment (we have some tests, but they take a long time).
> Thanks for your advise, and agreed.
> 
> I added a new (but minimal)  isolation test for the case of  tid scan.
> (v12 and HEAD will be failed this test. v11 and HEAD with my patch
> will be passed)

Isn't this test scenario a bit overkill? We can simply test that
as follows, instead.

CREATE TABLE test_tidscan AS SELECT 1 AS id;
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM test_tidscan WHERE ctid = '(0,1)';
SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode = 'SIReadLock';
COMMIT;

In the expected file, the result of query looking at pg_locks
should be matched with the following.

  locktype |    mode
----------+------------
  tuple    | SIReadLock

BTW, in master branch, locktype in that query result is "relation"
because of the issue.

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

06 February 2020, 10:11:46

HI,

On Thu, Feb 6, 2020 at 3:24 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> > I added a new (but minimal)  isolation test for the case of  tid scan.
> > (v12 and HEAD will be failed this test. v11 and HEAD with my patch
> > will be passed)
>
> Isn't this test scenario a bit overkill? We can simply test that
> as follows, instead.
> CREATE TABLE test_tidscan AS SELECT 1 AS id;
> BEGIN ISOLATION LEVEL SERIALIZABLE;
> SELECT * FROM test_tidscan WHERE ctid = '(0,1)';
> SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode = 'SIReadLock';
> COMMIT;
>
> In the expected file, the result of query looking at pg_locks
> should be matched with the following.
>
>   locktype |    mode
> ----------+------------
>   tuple    | SIReadLock
Thanks for your reply.
Hmm, it's an simple and  might be the better way than adding isolation test.

I added above test case to regress/sql/tidscan.sql.
Attach the patch.

Best regards,
-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Attachment

fix_tidscan_issues_v3.patch

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Masahiko Sawada

Date:

06 February 2020, 14:01:14

On Thu, 6 Feb 2020 at 19:12, Kasahara Tatsuhito
<kasahara.tatsuhito@gmail.com> wrote:
>
> HI,
>
> On Thu, Feb 6, 2020 at 3:24 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> > > I added a new (but minimal)  isolation test for the case of  tid scan.
> > > (v12 and HEAD will be failed this test. v11 and HEAD with my patch
> > > will be passed)
> >
> > Isn't this test scenario a bit overkill? We can simply test that
> > as follows, instead.
> > CREATE TABLE test_tidscan AS SELECT 1 AS id;
> > BEGIN ISOLATION LEVEL SERIALIZABLE;
> > SELECT * FROM test_tidscan WHERE ctid = '(0,1)';
> > SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode = 'SIReadLock';
> > COMMIT;
> >
> > In the expected file, the result of query looking at pg_locks
> > should be matched with the following.
> >
> >   locktype |    mode
> > ----------+------------
> >   tuple    | SIReadLock
> Thanks for your reply.
> Hmm, it's an simple and  might be the better way than adding isolation test.
>
> I added above test case to regress/sql/tidscan.sql.
> Attach the patch.
>

I've tested predicate locking including promotion cases with v3 patch
and it works fine.

+table_beginscan_tid(Relation rel, Snapshot snapshot,
+               int nkeys, struct ScanKeyData *key)
+{
+   uint32      flags = SO_ALLOW_STRAT | SO_ALLOW_SYNC | SO_ALLOW_PAGEMODE;

IIUC setting SO_ALLOW_STRAT, SO_ALLOW_SYNC and SO_ALLOW_PAGEMODE has
no meaning during tid scan. I think v11 also should be the same.

Why did you remove SO_TYPE_TIDSCAN from the previous version patch? I
think it's better to add it and then we can set only SO_TYPE_TIDSCAN
to the scan option of tid scan.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

07 February 2020, 03:27:26

Hi,

On Thu, Feb 6, 2020 at 11:01 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 6 Feb 2020 at 19:12, Kasahara Tatsuhito
> <kasahara.tatsuhito@gmail.com> wrote:
> I've tested predicate locking including promotion cases with v3 patch
> and it works fine.
>
> +table_beginscan_tid(Relation rel, Snapshot snapshot,
> +               int nkeys, struct ScanKeyData *key)
> +{
> +   uint32      flags = SO_ALLOW_STRAT | SO_ALLOW_SYNC | SO_ALLOW_PAGEMODE;
>
> IIUC setting SO_ALLOW_STRAT, SO_ALLOW_SYNC and SO_ALLOW_PAGEMODE has
> no meaning during tid scan. I think v11 also should be the same.
Thanks for your check, and useful advise.
I was wondering if I should keep these flags, but I confirmed that I
can remove these from TidScan's flags.

> Why did you remove SO_TYPE_TIDSCAN from the previous version patch? I
> think it's better to add it and then we can set only SO_TYPE_TIDSCAN
> to the scan option of tid scan.
Because, currently SO_TYPE_TIDSCAN is not used anywhere.
So I thought it might be better to avoid adding a new flag.
However, as you said, this flag may be useful for the future tid scan
feature (like [1])

Attach v4 patch. I re-added SO_TYPE_TIDSCAN to ScanOptions and
table_beginscan_tid.
And I remove unnecessary SO_ALLOW_* flags.

Best regards,

[1]

https://www.postgresql.org/message-id/flat/CAKJS1f-%2BJJpm6B_NThUWzFv4007zAjObBXX1CBHE_bH9nOAvSw%40mail.gmail.com#1ae648acdc2df930b19218b6026135d3

-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Attachment

fix_tidscan_issues_v4.patch

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kyotaro Horiguchi

Date:

07 February 2020, 04:26:15

At Fri, 7 Feb 2020 12:27:26 +0900, Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote in 
> > IIUC setting SO_ALLOW_STRAT, SO_ALLOW_SYNC and SO_ALLOW_PAGEMODE has
> > no meaning during tid scan. I think v11 also should be the same.
> Thanks for your check, and useful advise.
> I was wondering if I should keep these flags, but I confirmed that I
> can remove these from TidScan's flags.
> 
> > Why did you remove SO_TYPE_TIDSCAN from the previous version patch? I
> > think it's better to add it and then we can set only SO_TYPE_TIDSCAN
> > to the scan option of tid scan.
> Because, currently SO_TYPE_TIDSCAN is not used anywhere.
> So I thought it might be better to avoid adding a new flag.
> However, as you said, this flag may be useful for the future tid scan
> feature (like [1])
> 
> Attach v4 patch. I re-added SO_TYPE_TIDSCAN to ScanOptions and
> table_beginscan_tid.
> And I remove unnecessary SO_ALLOW_* flags.

+table_beginscan_tid(Relation rel, Snapshot snapshot,
+                int nkeys, struct ScanKeyData *key)
+{
+    uint32        flags = SO_TYPE_TIDSCAN;
+
+    return rel->rd_tableam->scan_begin(rel, snapshot, nkeys, key, NULL, flags);

It seems that nkeys and key are useless. Since every table_beginscan_*
functions have distinct parameter sets, don't we remove them from
table_beginscan_tid?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

07 February 2020, 06:07:51

Hi,

On Fri, Feb 7, 2020 at 1:27 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> It seems that nkeys and key are useless. Since every table_beginscan_*
> functions have distinct parameter sets, don't we remove them from
> table_beginscan_tid?
Yeah, actually, when calling table_beginscan_tid(), nkeys is set to 0
and * key is set to NULL,
so these are not used at the moment.

I removed unnecessary arguments from table_beginscan_tid().

Attache the v5 patch.


-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Attachment

fix_tidscan_issues_v5.patch

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Fujii Masao

Date:

07 February 2020, 08:01:59

On 2020/02/07 15:07, Kasahara Tatsuhito wrote:
> Hi,
> 
> On Fri, Feb 7, 2020 at 1:27 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
>> It seems that nkeys and key are useless. Since every table_beginscan_*
>> functions have distinct parameter sets, don't we remove them from
>> table_beginscan_tid?
> Yeah, actually, when calling table_beginscan_tid(), nkeys is set to 0
> and * key is set to NULL,
> so these are not used at the moment.
> 
> I removed unnecessary arguments from table_beginscan_tid().
> 
> Attache the v5 patch.

Thanks for updating the patch! The patch looks good to me.
So barring any objection, I will push it and back-patch to v12 *soon*
so that the upcoming minor version can contain it.

BTW, commit 147e3722f7 causing the issue changed currtid_byreloid()
and currtid_byrelname() so that they also call table_beginscan().
I'm not sure what those functions are, but probably we should fix them
so that table_beginscan_tid() is called instead. Thought?

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

07 February 2020, 08:28:43

Hi,

On Fri, Feb 7, 2020 at 5:02 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> BTW, commit 147e3722f7 causing the issue changed currtid_byreloid()
> and currtid_byrelname() so that they also call table_beginscan().
> I'm not sure what those functions are, but probably we should fix them
> so that table_beginscan_tid() is called instead. Thought?
+1, sorry, I overlooked it.

Both functions are used to check whether a valid tid or not with a
relation name (or oid),
and both perform a tid scan internally.
So, these functions should call table_beginscan_tid().

Perhaps unnecessary, I will attach a patch.

Best regards,
-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Attachment

fix_tidscan_issues_v6.patch

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kyotaro Horiguchi

Date:

07 February 2020, 08:34:29

At Fri, 7 Feb 2020 17:01:59 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2020/02/07 15:07, Kasahara Tatsuhito wrote:
> > Hi,
> > On Fri, Feb 7, 2020 at 1:27 PM Kyotaro Horiguchi
> > <horikyota.ntt@gmail.com> wrote:
> >> It seems that nkeys and key are useless. Since every table_beginscan_*
> >> functions have distinct parameter sets, don't we remove them from
> >> table_beginscan_tid?
> > Yeah, actually, when calling table_beginscan_tid(), nkeys is set to 0
> > and * key is set to NULL,
> > so these are not used at the moment.
> > I removed unnecessary arguments from table_beginscan_tid().
> > Attache the v5 patch.
> 
> Thanks for updating the patch! The patch looks good to me.
> So barring any objection, I will push it and back-patch to v12 *soon*
> so that the upcoming minor version can contain it.
> 
> BTW, commit 147e3722f7 causing the issue changed currtid_byreloid()
> and currtid_byrelname() so that they also call table_beginscan().
> I'm not sure what those functions are, but probably we should fix them
> so that table_beginscan_tid() is called instead. Thought?

At least they don't seem to need table_beginscan(), to me.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Fujii Masao

Date:

07 February 2020, 13:09:48


On 2020/02/07 17:28, Kasahara Tatsuhito wrote:
> Hi,
> 
> On Fri, Feb 7, 2020 at 5:02 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>> BTW, commit 147e3722f7 causing the issue changed currtid_byreloid()
>> and currtid_byrelname() so that they also call table_beginscan().
>> I'm not sure what those functions are, but probably we should fix them
>> so that table_beginscan_tid() is called instead. Thought?
> +1, sorry, I overlooked it.
> 
> Both functions are used to check whether a valid tid or not with a
> relation name (or oid),
> and both perform a tid scan internally.
> So, these functions should call table_beginscan_tid().
> 
> Perhaps unnecessary, I will attach a patch.

Pushed! Thanks!

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (butnot seq_tup_read)

From

Kasahara Tatsuhito

Date:

07 February 2020, 14:25:46

On Fri, Feb 7, 2020 at 10:09 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> Pushed! Thanks!
Thanks Fujii.


--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com