Thread: Tracking last scan time

Tracking last scan time

From
Dave Page
Date:
Often it is beneficial to review one's schema with a view to removing indexes (and sometimes tables) that are no longer required. It's very difficult to understand when that is the case by looking at the number of scans of a relation as, for example, an index may be used infrequently but may be critical in those times when it is used.

The attached patch against HEAD adds optional tracking of the last scan time for relations. It updates pg_stat_*_tables with new last_seq_scan and last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to help with this.

Due to the use of gettimeofday(), those values are only maintained if a new GUC, track_scans, is set to on. By default, it is off.

I did run a 12 hour test to see what the performance impact is. pgbench was run with scale factor 10000 and 75 users across 4 identical bare metal machines running Rocky 8 in parallel which showed roughly a -2% average performance penalty against HEAD with track_scans enabled. Machines were PowerEdge R7525's with 128GB RAM, dual 16C/32T AMD 7302 CPUs, with the data directory on 6 x 800GB 12Gb/s SSD SAS drives in RAID 0. Kernel time source is tsc.

           HEAD           track_scans    Penalty (%)
box1       19582.49735    19341.8881     -1.22869541
box2       19936.55513    19928.07479    -0.04253664659
box3       19631.78895    18649.64379    -5.002830696
box4       19810.86767    19420.67192    -1.969604525
Average    19740.42728    19335.06965    -2.05343896

Doc and test updates included.

--
Attachment

Re: Tracking last scan time

From
Greg Stark
Date:
On Tue, 23 Aug 2022 at 11:00, Dave Page <dpage@pgadmin.org> wrote:
>
> Often it is beneficial to review one's schema with a view to removing indexes (and sometimes tables) that are no
> longer required. It's very difficult to understand when that is the case by looking at the number of scans of a relation
> as, for example, an index may be used infrequently but may be critical in those times when it is used.

I think this is easy to answer in a prometheus/datadog/etc world since
you can consult the history of the count to see when it was last
incremented. (Or do effectively that continuously).

I guess that just reinforces the idea that it should be optional.
Perhaps there's room for some sort of general feature for controlling
various time series aggregates like max(), min(), sum(), or, uhm,
timeoflastchange() on whatever stats you want. That would let us
remove a bunch of stuff from pg_stat_statements and let users turn on
just the ones they want. And also let users enable things like time of
last rollback or conflict etc. But that's just something to think
about down the road.

> The attached patch against HEAD adds optional tracking of the last scan time for relations. It updates
> pg_stat_*_tables with new last_seq_scan and last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
> help with this.
>
> Due to the use of gettimeofday(), those values are only maintained if a new GUC, track_scans, is set to on. By
> default, it is off.

Bikeshedding warning -- "track_scans" could equally apply to almost
any stats about scans. I think the really relevant thing here is the
times, not the scans. I think the GUC should be "track_scan_times". Or
could that still be confused with scan durations? Maybe
"track_scan_timestamps"?

You could maybe make the gettimeofday cheaper by doing it less often.
Like, skipping the increment if the old timestamp is newer than 1s
before the transaction start time (I think that's available free if
some other GUC is enabled but I don't recall). Or isn't this callback
normally happening after transaction end? So xactStopTimestamp might
be available already?


--
greg



Re: Tracking last scan time

From
Dave Page
Date:
Hi

On Tue, 23 Aug 2022 at 13:07, Greg Stark <stark@mit.edu> wrote:
On Tue, 23 Aug 2022 at 11:00, Dave Page <dpage@pgadmin.org> wrote:
>
> Often it is beneficial to review one's schema with a view to removing indexes (and sometimes tables) that are no longer required. It's very difficult to understand when that is the case by looking at the number of scans of a relation as, for example, an index may be used infrequently but may be critical in those times when it is used.

I think this is easy to answer in a prometheus/datadog/etc world since
you can consult the history of the count to see when it was last
incremented. (Or do effectively that continuously).

Yes. But not every PostgreSQL instance is monitored in that way.
 

I guess that just reinforces the idea that it should be optional.
Perhaps there's room for some sort of general feature for controlling
various time series aggregates like max(), min(), sum(), or, uhm,
timeoflastchange() on whatever stats you want. That would let us
remove a bunch of stuff from pg_stat_statements and let users turn on
just the ones they want. And also let users enable things like time of
last rollback or conflict etc. But that's just something to think
about down the road.

It's certainly an interesting idea.
 

> The attached patch against HEAD adds optional tracking of the last scan time for relations. It updates pg_stat_*_tables with new last_seq_scan and last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to help with this.
>
> Due to the use of gettimeofday(), those values are only maintained if a new GUC, track_scans, is set to on. By default, it is off.

Bikeshedding warning -- "track_scans" could equally apply to almost
any stats about scans. I think the really relevant thing here is the
times, not the scans. I think the GUC should be "track_scan_times". Or
could that still be confused with scan durations? Maybe
"track_scan_timestamps"?

The latter seems reasonable.
 

You could maybe make the gettimeofday cheaper by doing it less often.
Like, skipping the increment if the old timestamp is newer than 1s
before the transaction start time (I think that's available free if
some other GUC is enabled but I don't recall). Or isn't this callback
normally happening after transaction end? So xactStopTimestamp might
be available already?

Something like:

if (pgstat_track_scan_timestamps && lstats->t_counts.t_numscans &&
    tabentry->lastscan + USECS_PER_SEC < GetCurrentTransactionStopTimestamp())
    tabentry->lastscan = GetCurrentTimestamp();
 
?

--

Re: Tracking last scan time

From
Bruce Momjian
Date:
On Tue, Aug 23, 2022 at 10:55:09AM +0100, Dave Page wrote:
> Often it is beneficial to review one's schema with a view to removing indexes
> (and sometimes tables) that are no longer required. It's very difficult to
> understand when that is the case by looking at the number of scans of a
> relation as, for example, an index may be used infrequently but may be critical
> in those times when it is used.
> 
> The attached patch against HEAD adds optional tracking of the last scan time
> for relations. It updates pg_stat_*_tables with new last_seq_scan and
> last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
> help with this.

Would it be simpler to allow the sequential and index scan columns to be
cleared so you can look later to see if it is non-zero?  Should we allow
arbitrary clearing of stat columns?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson




Re: Tracking last scan time

From
Dave Page
Date:


On Wed, 24 Aug 2022 at 15:18, Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Aug 23, 2022 at 10:55:09AM +0100, Dave Page wrote:
> Often it is beneficial to review one's schema with a view to removing indexes
> (and sometimes tables) that are no longer required. It's very difficult to
> understand when that is the case by looking at the number of scans of a
> relation as, for example, an index may be used infrequently but may be critical
> in those times when it is used.
>
> The attached patch against HEAD adds optional tracking of the last scan time
> for relations. It updates pg_stat_*_tables with new last_seq_scan and
> last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
> help with this.

Would it be simpler to allow the sequential and index scan columns to be
cleared so you can look later to see if it is non-zero?  Should we allow
arbitrary clearing of stat columns?

I don't think so, because then stat values wouldn't necessarily correlate with each other, and you wouldn't know when any of them were last reset unless we started tracking each individual reset. At least now you can see when they were all reset, and you know they were reset at the same time.
 
--

Re: Tracking last scan time

From
Bruce Momjian
Date:
On Wed, Aug 24, 2022 at 04:01:21PM +0100, Dave Page wrote:
> On Wed, 24 Aug 2022 at 15:18, Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Tue, Aug 23, 2022 at 10:55:09AM +0100, Dave Page wrote:
>     > Often it is beneficial to review one's schema with a view to removing
>     indexes
>     > (and sometimes tables) that are no longer required. It's very difficult
>     to
>     > understand when that is the case by looking at the number of scans of a
>     > relation as, for example, an index may be used infrequently but may be
>     critical
>     > in those times when it is used.
>     >
>     > The attached patch against HEAD adds optional tracking of the last scan
>     time
>     > for relations. It updates pg_stat_*_tables with new last_seq_scan and
>     > last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column
>     to
>     > help with this.
> 
>     Would it be simpler to allow the sequential and index scan columns to be
>     cleared so you can look later to see if it is non-zero?  Should we allow
> 
> I don't think so, because then stat values wouldn't necessarily correlate with
> each other, and you wouldn't know when any of them were last reset unless we
> started tracking each individual reset. At least now you can see when they were
> all reset, and you know they were reset at the same time.

Yeah, true.  I was more asking if these two columns are in some way
special or if people would want a more general solution, and if so, is
that something we want in core Postgres.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson




Re: Tracking last scan time

From
Dave Page
Date:


On Wed, 24 Aug 2022 at 16:03, Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Aug 24, 2022 at 04:01:21PM +0100, Dave Page wrote:
> On Wed, 24 Aug 2022 at 15:18, Bruce Momjian <bruce@momjian.us> wrote:
>
>     On Tue, Aug 23, 2022 at 10:55:09AM +0100, Dave Page wrote:
>     > Often it is beneficial to review one's schema with a view to removing
>     indexes
>     > (and sometimes tables) that are no longer required. It's very difficult
>     to
>     > understand when that is the case by looking at the number of scans of a
>     > relation as, for example, an index may be used infrequently but may be
>     critical
>     > in those times when it is used.
>     >
>     > The attached patch against HEAD adds optional tracking of the last scan
>     time
>     > for relations. It updates pg_stat_*_tables with new last_seq_scan and
>     > last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column
>     to
>     > help with this.
>
>     Would it be simpler to allow the sequential and index scan columns to be
>     cleared so you can look later to see if it is non-zero?  Should we allow
>
> I don't think so, because then stat values wouldn't necessarily correlate with
> each other, and you wouldn't know when any of them were last reset unless we
> started tracking each individual reset. At least now you can see when they were
> all reset, and you know they were reset at the same time.

Yeah, true.  I was more asking if these two columns are in some way
special or if people would want a more general solution, and if so, is
that something we want in core Postgres.

They're special in the sense that they're the ones you're most likely to look at to see how much a relation is used, I think (at least, I'd look at them rather than the tuple counts).

There are certainly other things for which a last usage value may be useful. Functions/procedures for example, or views. The benefits of removing unused objects of those types are far, far lower than for indexes or tables, of course.

There are other potential use cases for similar timestamps, such as object creation times (and creating user), but they are more useful for auditing than monitoring and optimisation.

--

Re: Tracking last scan time

From
David Rowley
Date:
On Thu, 25 Aug 2022 at 03:03, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Wed, Aug 24, 2022 at 04:01:21PM +0100, Dave Page wrote:
> > On Wed, 24 Aug 2022 at 15:18, Bruce Momjian <bruce@momjian.us> wrote:
> >     Would it be simpler to allow the sequential and index scan columns to be
> >     cleared so you can look later to see if it is non-zero?  Should we allow
> >
> > I don't think so, because then stat values wouldn't necessarily correlate with
> > each other, and you wouldn't know when any of them were last reset unless we
> > started tracking each individual reset. At least now you can see when they were
> > all reset, and you know they were reset at the same time.
>
> Yeah, true.  I was more asking if these two columns are in some way
> special or if people would want a more general solution, and if so, is
> that something we want in core Postgres.

Back when I used to do a bit of PostgreSQL DBA stuff, I had a nightly
job set up to record the state of pg_stat_all_tables and put that into
another table along with the current date. I then had a view that did
some calculations with col - LAG(col) OVER (PARTITION BY relid ORDER
BY date) to fetch the numerical values for each date.  I didn't ever
want to reset the stats because it messes with autovacuum. If you zero
out n_ins_since_vacuum more often than auto-vacuum would trigger, then
bad things happen over time (we should really warn about that in the
docs).

I don't have a particular opinion about the patch, I'm just pointing
out that there are other ways. Even just writing down the numbers on a
post-it note and coming back in a month to see if they've changed is
enough to tell if the table or index has been used.

We do also need to consider, now that stats are stored in shared
memory, that any fields we add are in RAM.

David



Re: Tracking last scan time

From
Dave Page
Date:
Hi

On Thu, 25 Aug 2022 at 01:44, David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 25 Aug 2022 at 03:03, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Wed, Aug 24, 2022 at 04:01:21PM +0100, Dave Page wrote:
> > On Wed, 24 Aug 2022 at 15:18, Bruce Momjian <bruce@momjian.us> wrote:
> >     Would it be simpler to allow the sequential and index scan columns to be
> >     cleared so you can look later to see if it is non-zero?  Should we allow
> >
> > I don't think so, because then stat values wouldn't necessarily correlate with
> > each other, and you wouldn't know when any of them were last reset unless we
> > started tracking each individual reset. At least now you can see when they were
> > all reset, and you know they were reset at the same time.
>
> Yeah, true.  I was more asking if these two columns are in some way
> special or if people would want a more general solution, and if so, is
> that something we want in core Postgres.

Back when I used to do a bit of PostgreSQL DBA stuff, I had a nightly
job set up to record the state of pg_stat_all_tables and put that into
another table along with the current date. I then had a view that did
some calculations with col - LAG(col) OVER (PARTITION BY relid ORDER
BY date) to fetch the numerical values for each date.  I didn't ever
want to reset the stats because it messes with autovacuum. If you zero
out n_ins_since_vacuum more often than auto-vacuum would trigger, then
bad things happen over time (we should really warn about that in the
docs).

I don't have a particular opinion about the patch, I'm just pointing
out that there are other ways. Even just writing down the numbers on a
post-it note and coming back in a month to see if they've changed is
enough to tell if the table or index has been used.

There are usually other ways to perform monitoring tasks, but there is something to be said for the convenience of having functionality built in and not having to rely on tools, scripts, or post-it notes :-)
 

We do also need to consider, now that stats are stored in shared
memory, that any fields we add are in RAM.

That is a fair point. I believe this is both minimal, and useful though.

I've attached a v2 patch that incorporates Greg's suggestions. 

--
Attachment

Re: Tracking last scan time

From
Bruce Momjian
Date:
On Fri, Aug 26, 2022 at 02:05:36PM +0100, Dave Page wrote:
> On Thu, 25 Aug 2022 at 01:44, David Rowley <dgrowleyml@gmail.com> wrote:
>     I don't have a particular opinion about the patch, I'm just pointing
>     out that there are other ways. Even just writing down the numbers on a
>     post-it note and coming back in a month to see if they've changed is
>     enough to tell if the table or index has been used.
> 
> 
> There are usually other ways to perform monitoring tasks, but there is
> something to be said for the convenience of having functionality built in and
> not having to rely on tools, scripts, or post-it notes :-)

Should we consider using something cheaper like time() so we don't need
a GUC to enable this?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson




Re: Tracking last scan time

From
Dave Page
Date:


On Tue, 30 Aug 2022 at 19:46, Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Aug 26, 2022 at 02:05:36PM +0100, Dave Page wrote:
> On Thu, 25 Aug 2022 at 01:44, David Rowley <dgrowleyml@gmail.com> wrote:
>     I don't have a particular opinion about the patch, I'm just pointing
>     out that there are other ways. Even just writing down the numbers on a
>     post-it note and coming back in a month to see if they've changed is
>     enough to tell if the table or index has been used.
>
>
> There are usually other ways to perform monitoring tasks, but there is
> something to be said for the convenience of having functionality built in and
> not having to rely on tools, scripts, or post-it notes :-)

Should we consider using something cheaper like time() so we don't need
a GUC to enable this?

Interesting idea, but on my mac at least, 100,000,000 gettimeofday() calls take about 2 seconds, whilst 100,000,000 time() calls take 14(!) seconds.

--

Re: Tracking last scan time

From
Bruce Momjian
Date:
On Wed, Aug 31, 2022 at 05:02:33PM +0100, Dave Page wrote:
> 
> 
> On Tue, 30 Aug 2022 at 19:46, Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Fri, Aug 26, 2022 at 02:05:36PM +0100, Dave Page wrote:
>     > On Thu, 25 Aug 2022 at 01:44, David Rowley <dgrowleyml@gmail.com> wrote:
>     >     I don't have a particular opinion about the patch, I'm just pointing
>     >     out that there are other ways. Even just writing down the numbers on
>     a
>     >     post-it note and coming back in a month to see if they've changed is
>     >     enough to tell if the table or index has been used.
>     >
>     >
>     > There are usually other ways to perform monitoring tasks, but there is
>     > something to be said for the convenience of having functionality built in
>     and
>     > not having to rely on tools, scripts, or post-it notes :-)
> 
>     Should we consider using something cheaper like time() so we don't need
>     a GUC to enable this?
> 
> 
> Interesting idea, but on my mac at least, 100,000,000 gettimeofday() calls
> takes about 2 seconds, whilst 100,000,000 time() calls takes 14(!) seconds.

Wow.  I was just thinking you need second-level accuracy, which must be
cheap somewhere.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson




Re: Tracking last scan time

From
Andres Freund
Date:
Hi,

On 2022-08-23 10:55:09 +0100, Dave Page wrote:
> Often it is beneficial to review one's schema with a view to removing
> indexes (and sometimes tables) that are no longer required. It's very
> difficult to understand when that is the case by looking at the number of
> scans of a relation as, for example, an index may be used infrequently but
> may be critical in those times when it is used.
> 
> The attached patch against HEAD adds optional tracking of the last scan
> time for relations. It updates pg_stat_*_tables with new last_seq_scan and
> last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
> help with this.
> 
> Due to the use of gettimeofday(), those values are only maintained if a new
> GUC, track_scans, is set to on. By default, it is off.
> 
> I did run a 12 hour test to see what the performance impact is. pgbench was
> run with scale factor 10000 and 75 users across 4 identical bare metal
> machines running Rocky 8 in parallel which showed roughly a -2% average
> performance penalty against HEAD with track_scans enabled. Machines were
> PowerEdge R7525's with 128GB RAM, dual 16C/32T AMD 7302 CPUs, with the data
> directory on 6 x 800GB 12Gb/s SSD SAS drives in RAID 0. Kernel time source
> is tsc.
> 
>                HEAD               track_scans      Penalty (%)
> box1       19582.49735    19341.8881      -1.22869541
> box2       19936.55513    19928.07479    -0.04253664659
> box3       19631.78895    18649.64379    -5.002830696
> box4       19810.86767    19420.67192    -1.969604525
> Average 19740.42728    19335.06965    -2.05343896

Based on the size of those numbers this was a r/w pgbench. If it has this
noticeable an impact for r/w, with a pretty low number of scans/sec, how's the
overhead for r/o (which can have 2 orders of magnitude more scans/sec)? It
must be quite bad.

I don't think we should accept this feature with this overhead - but I also
think we can do better, by accepting a bit less accuracy. For this to be
useful we don't need a perfectly accurate timestamp. The statement start time
is probably not accurate enough, but we could just have bgwriter or such
update one in shared memory every time we wake up? Or perhaps we could go to
an even lower granularity, by putting in the current LSN or such?

Greetings,

Andres Freund



Re: Tracking last scan time

From
Matthias van de Meent
Date:
On Wed, 31 Aug 2022 at 18:21, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2022-08-23 10:55:09 +0100, Dave Page wrote:
> > Often it is beneficial to review one's schema with a view to removing
> > indexes (and sometimes tables) that are no longer required. It's very
> > difficult to understand when that is the case by looking at the number of
> > scans of a relation as, for example, an index may be used infrequently but
> > may be critical in those times when it is used.
> >
> > The attached patch against HEAD adds optional tracking of the last scan
> > time for relations. It updates pg_stat_*_tables with new last_seq_scan and
> > last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
> > help with this.
> >
> > Due to the use of gettimeofday(), those values are only maintained if a new
> > GUC, track_scans, is set to on. By default, it is off.
> >
> > I did run a 12 hour test to see what the performance impact is. pgbench was
> > run with scale factor 10000 and 75 users across 4 identical bare metal
> > machines running Rocky 8 in parallel which showed roughly a -2% average
> > performance penalty against HEAD with track_scans enabled. Machines were
> > PowerEdge R7525's with 128GB RAM, dual 16C/32T AMD 7302 CPUs, with the data
> > directory on 6 x 800GB 12Gb/s SSD SAS drives in RAID 0. Kernel time source
> > is tsc.
> >
> >                HEAD               track_scans      Penalty (%)
> > box1       19582.49735    19341.8881      -1.22869541
> > box2       19936.55513    19928.07479    -0.04253664659
> > box3       19631.78895    18649.64379    -5.002830696
> > box4       19810.86767    19420.67192    -1.969604525
> > Average 19740.42728    19335.06965    -2.05343896
>
> Based on the size of those numbers this was a r/w pgbench. If it has this
> noticable an impact for r/w, with a pretty low number of scans/sec, how's the
> overhead for r/o (which can have 2 orders of magnitude more scans/sec)? It
> must be quite bad.
>
> I don't think we should accept this feature with this overhead - but I also
> think we can do better, by accepting a bit less accuracy. For this to be
> useful we don't need a perfectly accurate timestamp. The statement start time
> is probably not accurate enough, but we could just have bgwriter or such
> update one in shared memory every time we wake up? Or perhaps we could go to
> an even lower granularity, by putting in the current LSN or such?

I don't think that LSN is precise enough. For example, if you're in a
(mostly) read-only system, the system may go a long time without any
meaningful records being written.

As for having a lower granularity and preventing the
one-syscall-per-Relation issue, can't we reuse the query_start or
state_change timestamps that appear in pg_stat_activity (potentially
updated immediately before this stat flush), or some other per-backend
timestamp that is already maintained and considered accurate enough
for this use?
Regardless, with this patch as it is we get a new timestamp for each
relation processed, which I think is a waste of time (heh) even in
VDSO-enabled systems.

Apart from the above, I don't have any other meaningful opinion on
this patch - it might be a good addition, but I don't consume stats
often enough to make a good cost / benefit comparison.

Kind regards,

Matthias van de Meent



Re: Tracking last scan time

From
Bruce Momjian
Date:
On Wed, Aug 31, 2022 at 07:52:49PM +0200, Matthias van de Meent wrote:
> As for having a lower granularity and preventing the
> one-syscall-per-Relation issue, can't we reuse the query_start or
> state_change timestamps that appear in pg_stat_activity (potentially

Yeah, query start should be fine, but not transaction start time.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson




Re: Tracking last scan time

From
Andres Freund
Date:
Hi,

On 2022-08-31 19:52:49 +0200, Matthias van de Meent wrote:
> As for having a lower granularity and preventing the
> one-syscall-per-Relation issue, can't we reuse the query_start or
> state_change timestamps that appear in pg_stat_activity (potentially
> updated immediately before this stat flush), or some other per-backend
> timestamp that is already maintained and considered accurate enough
> for this use?

The problem is that it won't change at all for a query that runs for a week -
and we'll report the timestamp from a week ago when it finally ends.

But given this is done when stats are flushed, which only happens after the
transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
we got to flushing the transaction stats we'll already have computed that.


>      tabentry->numscans += lstats->t_counts.t_numscans;
> +    if (pgstat_track_scans && lstats->t_counts.t_numscans)
> +        tabentry->lastscan = GetCurrentTimestamp();

Besides replacing GetCurrentTimestamp() with
GetCurrentTransactionStopTimestamp(), this should then also check if
tabentry->lastscan is already newer than the new timestamp.

Greetings,

Andres Freund



Re: Tracking last scan time

From
Bruce Momjian
Date:
On Wed, Aug 31, 2022 at 11:56:29AM -0700, Andres Freund wrote:
> Hi,
> 
> On 2022-08-31 19:52:49 +0200, Matthias van de Meent wrote:
> > As for having a lower granularity and preventing the
> > one-syscall-per-Relation issue, can't we reuse the query_start or
> > state_change timestamps that appear in pg_stat_activity (potentially
> > updated immediately before this stat flush), or some other per-backend
> > timestamp that is already maintained and considered accurate enough
> > for this use?
> 
> The problem is that it won't change at all for a query that runs for a week -
> and we'll report the timestamp from a week ago when it finally ends.
> 
> But given this is done when stats are flushed, which only happens after the
> transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
> we got to flushing the transaction stats we'll already have computed that.

Oh, good point --- it is safer to show a more recent time than a too-old
time.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson




Re: Tracking last scan time

From
Dave Page
Date:


On Wed, 31 Aug 2022 at 17:13, Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Aug 31, 2022 at 05:02:33PM +0100, Dave Page wrote:
>
>
> On Tue, 30 Aug 2022 at 19:46, Bruce Momjian <bruce@momjian.us> wrote:
>
>     On Fri, Aug 26, 2022 at 02:05:36PM +0100, Dave Page wrote:
>     > On Thu, 25 Aug 2022 at 01:44, David Rowley <dgrowleyml@gmail.com> wrote:
>     >     I don't have a particular opinion about the patch, I'm just pointing
>     >     out that there are other ways. Even just writing down the numbers on
>     a
>     >     post-it note and coming back in a month to see if they've changed is
>     >     enough to tell if the table or index has been used.
>     >
>     >
>     > There are usually other ways to perform monitoring tasks, but there is
>     > something to be said for the convenience of having functionality built in
>     and
>     > not having to rely on tools, scripts, or post-it notes :-)
>
>     Should we consider using something cheaper like time() so we don't need
>     a GUC to enable this?
>
>
> Interesting idea, but on my mac at least, 100,000,000 gettimeofday() calls
> takes about 2 seconds, whilst 100,000,000 time() calls takes 14(!) seconds.

Wow.  I was just thinking you need second-level accuracy, which must be
cheap somewhere.

Second-level accuracy would indeed be fine for this. Frankly, for my use case just the date would be enough, but I can imagine people wanting greater accuracy than that. 

And yes, I was very surprised by the timing results I got as well. I guess it's a quirk of macOS - on a Linux box I get ~4s for gettimeofday() and ~1s for time().

--

Re: Tracking last scan time

From
Bruce Momjian
Date:
On Thu, Sep  1, 2022 at 09:46:59AM +0100, Dave Page wrote:
> On Wed, 31 Aug 2022 at 17:13, Bruce Momjian <bruce@momjian.us> wrote:
>     Wow.  I was just thinking you need second-level accuracy, which must be
>     cheap somewhere.
> 
> 
> Second-level accuracy would indeed be fine for this. Frankly, for my use case
> just the date would be enough, but I can imagine people wanting greater
> accuracy than that. 
> 
> And yes, I was very surprised by the timing results I got as well. I guess it's
> a quirk of macOS - on a Linux box I get ~4s for gettimeofday() and ~1s for time
> ().

I think we lose 95% of our users if we require it to be enabled, so let's
work to find a way it can be always enabled.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson




Re: Tracking last scan time

From
Dave Page
Date:


On Thu, 1 Sept 2022 at 13:04, Bruce Momjian <bruce@momjian.us> wrote:
On Thu, Sep  1, 2022 at 09:46:59AM +0100, Dave Page wrote:
> On Wed, 31 Aug 2022 at 17:13, Bruce Momjian <bruce@momjian.us> wrote:
>     Wow.  I was just thinking you need second-level accuracy, which must be
>     cheap somewhere.
>
>
> Second-level accuracy would indeed be fine for this. Frankly, for my use case
> just the date would be enough, but I can imagine people wanting greater
> accuracy than that. 
>
> And yes, I was very surprised by the timing results I got as well. I guess it's
> a quirk of macOS - on a Linux box I get ~4s for gettimeofday() and ~1s for time
> ().

I think we lose 95% of our users if we require it to be enabled, so let's
work to find a way it can be always enabled.

So based on Andres' suggestion, something like this seems like it might work:

if (pgstat_track_scan_timestamps && lstats->t_counts.t_numscans)
{
    TimestampTz t = GetCurrentTransactionStopTimestamp();

    if (t > tabentry->lastscan)
        tabentry->lastscan = t;
}

If that seems like a good option, I can run some more benchmarks (and then remove the GUC if it looks good).

--

Re: Tracking last scan time

From
Matthias van de Meent
Date:
On Wed, 31 Aug 2022 at 20:56, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2022-08-31 19:52:49 +0200, Matthias van de Meent wrote:
> > As for having a lower granularity and preventing the
> > one-syscall-per-Relation issue, can't we reuse the query_start or
> > state_change timestamps that appear in pg_stat_activity (potentially
> > updated immediately before this stat flush), or some other per-backend
> > timestamp that is already maintained and considered accurate enough
> > for this use?
>
> The problem is that it won't change at all for a query that runs for a week -
> and we'll report the timestamp from a week ago when it finally ends.

This earlier proposal to reuse pg_stat_activity values is also invalid
because those timestamps don't exist when you SET track_activities =
OFF.

> But given this is done when stats are flushed, which only happens after the
> transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
> we got to flushing the transaction stats we'll already have computed that.

I'm not entirely happy with that, as that would still add function
call overhead, and potentially still call GetCurrentTimestamp() in
this somewhat hot loop.

As an alternative, we could wire the `now` variable in
pgstat_report_stat (generated from
GetCurrentTransactionStopTimestamp() into pgstat_flush_pending_entries
and then into flush_pending_cb (or, store this in a static variable)
so that we can reuse that value, saving any potential function call
overhead.

> >       tabentry->numscans += lstats->t_counts.t_numscans;
> > +     if (pgstat_track_scans && lstats->t_counts.t_numscans)
> > +             tabentry->lastscan = GetCurrentTimestamp();
>
> Besides replacing GetCurrentTimestamp() with
> GetCurrentTransactionStopTimestamp(), this should then also check if
> tabentry->lastscan is already newer than the new timestamp.

I wonder how important that is. This value only gets set in a stats
flush, which may already skew the stat update by several seconds (up to
PGSTAT_MAX_INTERVAL). It is possible for a concurrent flush to carry an
older value, but I think it is extremely unlikely to matter when you
consider that these stat flushes are not expected to run for more than
1 second.

Kind regards,

Matthias van de Meent



Re: Tracking last scan time

From
Andres Freund
Date:
Hi,

On 2022-09-01 14:18:42 +0200, Matthias van de Meent wrote:
> On Wed, 31 Aug 2022 at 20:56, Andres Freund <andres@anarazel.de> wrote:
> > But given this is done when stats are flushed, which only happens after the
> > transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
> > we got to flushing the transaction stats we'll already have computed that.
> 
> I'm not entirely happy with that, as that would still add function
> call overhead, and potentially still call GetCurrentTimestamp() in
> this somewhat hot loop.

We already used GetCurrentTransactionStopTimestamp() (as you reference below)
before we get to this point, so I doubt that we'll ever call
GetCurrentTimestamp(). And it's hard to imagine that the function call
overhead of GetCurrentTransactionStopTimestamp() matters compared to acquiring
locks etc.


> As an alternative, we could wire the `now` variable in
> pgstat_report_stat (generated from
> GetCurrentTransactionStopTimestamp() into pgstat_flush_pending_entries
> and then into flush_pending_cb (or, store this in a static variable)
> so that we can reuse that value, saving any potential function call
> overhead.

Passing it in doesn't clearly seem an improvement, but I also don't have a
strong opinion on it. I am strongly against the static variable approach.


> > >       tabentry->numscans += lstats->t_counts.t_numscans;
> > > +     if (pgstat_track_scans && lstats->t_counts.t_numscans)
> > > +             tabentry->lastscan = GetCurrentTimestamp();
> >
> > Besides replacing GetCurrentTimestamp() with
> > GetCurrentTransactionStopTimestamp(), this should then also check if
> > tabentry->lastscan is already newer than the new timestamp.
> 
> I wonder how important that is. This value only gets set in a stats
> flush, which may already skew the stat update by several seconds (up to
> PGSTAT_MAX_INTERVAL). It is possible for a concurrent flush to carry an
> older value, but I think it is extremely unlikely to matter when you
> consider that these stat flushes are not expected to run for more than
> 1 second.

I think it'll be confusing if you have values going back and forth, even if
just by a little. And it's cheap to defend against, so why not just do that?


Greetings,

Andres Freund



Re: Tracking last scan time

From
Dave Page
Date:
Hi

On Thu, 1 Sept 2022 at 19:35, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2022-09-01 14:18:42 +0200, Matthias van de Meent wrote:
> On Wed, 31 Aug 2022 at 20:56, Andres Freund <andres@anarazel.de> wrote:
> > But given this is done when stats are flushed, which only happens after the
> > transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
> > we got to flushing the transaction stats we'll already have computed that.
>
> I'm not entirely happy with that, as that would still add function
> call overhead, and potentially still call GetCurrentTimestamp() in
> this somewhat hot loop.

We already used GetCurrentTransactionStopTimestamp() (as you reference below)
before we get to this point, so I doubt that we'll ever call
GetCurrentTimestamp(). And it's hard to imagine that the function call
overhead of GetCurrentTransactionStopTimestamp() matters compared to acquiring
locks etc.

Vik and I looked at this a little, and found that we actually don't generally have GetCurrentTransactionStopTimestamp() at this point - a simple 'select * from pg_class' will result in 9 passes of this code, none of which have xactStopTimestamp != 0.

After discussing it a little, we came to the conclusion that for the stated use case, xactStartTimestamp is actually accurate enough, provided that we only ever update it with a newer value. It would only likely be in extreme edge-cases where the difference between start and end transaction time would have any bearing on whether or not one might drop a table/index for lack of use.

Doing it this way also means we no longer need the GUC to enable the feature, which as Bruce notes, is likely to lose 95% of users.

Updated patch attached:

- GUC removed.
- The timestamp recorded is xactStartTimestamp.
- Docs updated to make it clear we're recording transaction start time.

--
Attachment

Re: Tracking last scan time

From
Andres Freund
Date:
Hi,

On 2022-09-06 14:15:56 +0100, Dave Page wrote:
> Vik and I looked at this a little, and found that we actually don't
> generally have GetCurrentTransactionStopTimestamp() at this point - a
> simple 'select * from pg_class' will result in 9 passes of this code, none
> of which have xactStopTimestamp != 0.

Huh, pgstat_report_stat() has used GetCurrentTransactionStopTimestamp()
for a long time. Wonder when that was broken. Looks like it's set only when
an xid is assigned. We should fix this.


> After discussing it a little, we came to the conclusion that for the stated
> use case, xactStartTimestamp is actually accurate enough, provided that we
> only ever update it with a newer value. It would only likely be in extreme
> edge-cases where the difference between start and end transaction time
> would have any bearing on whether or not one might drop a table/index for
> lack of use.

I don't at all agree with this. Since we already use
GetCurrentTransactionStopTimestamp() in this path we should fix it.

Greetings,

Andres Freund



Re: Tracking last scan time

From
Kyotaro Horiguchi
Date:
At Tue, 6 Sep 2022 08:53:25 -0700, Andres Freund <andres@anarazel.de> wrote in 
> Hi,
> 
> On 2022-09-06 14:15:56 +0100, Dave Page wrote:
> > Vik and I looked at this a little, and found that we actually don't
> > generally have GetCurrentTransactionStopTimestamp() at this point - a
> > simple 'select * from pg_class' will result in 9 passes of this code, none
> > of which have xactStopTimestamp != 0.
> 
> Huh, pgstat_report_stat() has used GetCurrentTransactionStopTimestamp()
> for a long time. Wonder when that was broken. Looks like it's set only when
> an xid is assigned. We should fix this.

/*
 *    GetCurrentTransactionStopTimestamp
 *
 * We return current time if the transaction stop time hasn't been set
 * (which can happen if we decide we don't need to log an XLOG record).

So, that seems intentional since 2007 (957d08c81f).  It seems to
me that the patch assumed that the only other use of the timestamp was
pgstats, and it didn't let GetCurrentTransactionStopTimestamp() set the
variable for future use.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Tracking last scan time

From
Dave Page
Date:
Hi

On Tue, 6 Sept 2022 at 16:53, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2022-09-06 14:15:56 +0100, Dave Page wrote:
> Vik and I looked at this a little, and found that we actually don't
> generally have GetCurrentTransactionStopTimestamp() at this point - a
> simple 'select * from pg_class' will result in 9 passes of this code, none
> of which have xactStopTimestamp != 0.

Huh, pgstat_report_stat() has used GetCurrentTransactionStopTimestamp()
for a long time. Wonder when that was broken. Looks like it's set only when
an xid is assigned. We should fix this.


> After discussing it a little, we came to the conclusion that for the stated
> use case, xactStartTimestamp is actually accurate enough, provided that we
> only ever update it with a newer value. It would only likely be in extreme
> edge-cases where the difference between start and end transaction time
> would have any bearing on whether or not one might drop a table/index for
> lack of use.

I don't at all agree with this. Since we already use
GetCurrentTransactionStopTimestamp() in this path we should fix it.

I just spent some time looking at this, and as far as I can see, we only set xactStopTimestamp if the transaction needs to be WAL logged (and in those cases, it is set before the stats callback runs). As you note though, we are already calling GetCurrentTransactionStopTimestamp() in the read-only case anyway, and thus already incurring the cost of gettimeofday().

Here's a v4 patch. This reverts to using GetCurrentTransactionStopTimestamp() for the last_scan times, and will set xactStopTimestamp the first time GetCurrentTransactionStopTimestamp() is called, thus avoiding multiple gettimeofday() calls. SetCurrentTransactionStopTimestamp() is removed, as is use of xactStopTimestamp (except when resetting it to 0).

--
Attachment

Re: Tracking last scan time

From
Vik Fearing
Date:
On 9/7/22 12:03, Dave Page wrote:
> Here's a v4 patch. This reverts to using
> GetCurrentTransactionStopTimestamp() for the last_scan times, and will
> set xactStopTimestamp the first time GetCurrentTransactionStopTimestamp()
> is called, thus avoiding multiple gettimeofday() calls.
> SetCurrentTransactionStopTimestamp() is removed, as is use
> of xactStopTimestamp (except when resetting it to 0).

This patch looks good to me and has much saner behavior than what it 
replaces.

As a matter of process, the oid for the new function should be in the 
8000-9000 range and the catversion should be bumped by the committer.

Marked as Ready for Committer.  Thanks for the patch!
-- 
Vik Fearing



Re: Tracking last scan time

From
Andres Freund
Date:
Hi,

On 2022-09-30 17:58:31 +0200, Vik Fearing wrote:
> On 9/7/22 12:03, Dave Page wrote:
> > Here's a v4 patch. This reverts to using
> > GetCurrentTransactionStopTimestamp() for the last_scan times, and will
> > set xactStopTimestamp the first time GetCurrentTransactionStopTimestamp()
> > is called, thus avoiding multiple gettimeofday() calls.
> > SetCurrentTransactionStopTimestamp() is removed, as is use
> > of xactStopTimestamp (except when resetting it to 0).
> 
> This patch looks good to me and has much saner behavior than what it
> replaces.

I agree. However, it seems like a significant enough behavioural change that
I'd rather commit it as a separate patch.  I agree with Vik's judgement that
the patch is otherwise ready. Happy to do that split myself, or you
can do it...

Greetings,

Andres Freund



Re: Tracking last scan time

From
Dave Page
Date:
Hi

On Fri, 30 Sept 2022 at 18:58, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2022-09-30 17:58:31 +0200, Vik Fearing wrote:
> On 9/7/22 12:03, Dave Page wrote:
> > Here's a v4 patch. This reverts to using
> > GetCurrentTransactionStopTimestamp() for the last_scan times, and will
> > set xactStopTimestamp the first time GetCurrentTransactionStopTimestamp()
> > is called, thus avoiding multiple gettimeofday() calls.
> > SetCurrentTransactionStopTimestamp() is removed, as is use
> > of xactStopTimestamp (except when resetting it to 0).
>
> This patch looks good to me and has much saner behavior than what it
> replaces.

I agree. However, it seems like a significant enough behavioural change that
I'd rather commit it as a separate patch.  I agree with Vik's judgement that
the patch is otherwise ready. Happy to do that split myself, or you
can do it...

Thanks. It's just the changes in xact.c, so it doesn't seem like it would cause you any more work either way, in which case, I'll leave it to you :-)

FYI, the OID I chose was simply the closest single value to those used for the other related functions (e.g. pg_stat_get_numscans). Seemed like a good way to use up one more random unused value, but I don't care if it gets changed to the 8000+ range.

--

Re: Tracking last scan time

From
Michael Paquier
Date:
On Mon, Oct 03, 2022 at 12:55:40PM +0100, Dave Page wrote:
> Thanks. It's just the changes in xact.c, so it doesn't seem like it would
> cause you any more work either way, in which case, I'll leave it to you :-)

Okay, I have just moved the patch to the next CF then, still marked as
ready for committer.  Are you planning to look at that?
--
Michael

Attachment

Re: Tracking last scan time

From
Dave Page
Date:


On Wed, 12 Oct 2022 at 07:40, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Oct 03, 2022 at 12:55:40PM +0100, Dave Page wrote:
> Thanks. It's just the changes in xact.c, so it doesn't seem like it would
> cause you any more work either way, in which case, I'll leave it to you :-)

Okay, I have just moved the patch to the next CF then, still marked as
ready for committer.  Are you planning to look at that?

Thanks. Was the question directed at me or Andres?
 
--

Re: Tracking last scan time

From
Michael Paquier
Date:
On Wed, Oct 12, 2022 at 09:09:46AM +0100, Dave Page wrote:
> On Wed, 12 Oct 2022 at 07:40, Michael Paquier <michael@paquier.xyz> wrote:
>> Okay, I have just moved the patch to the next CF then, still marked as
>> ready for committer.  Are you planning to look at that?
>
> Thanks. Was the question directed at me or Andres?

Apologies for the confusion.  This question is addressed to Andres.
--
Michael

Attachment

Re: Tracking last scan time

From
Andres Freund
Date:
Hi,

On 2022-10-12 15:40:21 +0900, Michael Paquier wrote:
> On Mon, Oct 03, 2022 at 12:55:40PM +0100, Dave Page wrote:
> > Thanks. It's just the changes in xact.c, so it doesn't seem like it would
> > cause you any more work either way, in which case, I'll leave it to you :-)
> 
> Okay, I have just moved the patch to the next CF then, still marked as
> ready for committer.  Are you planning to look at that?

Yep, doing so right now.

I think this should have at least a basic test in src/test/regress/sql/stats.sql. If
I can write one in a few minutes I'll go for that, otherwise will reply
detailing difficulties.


> +      <para>
> +       The time of the last sequential scan of this table, based on the
> +       most recent transaction stop time
> +      </para></entry>

Related rows seem to say "on this table".

Greetings,

Andres Freund



Re: Tracking last scan time

From
Andres Freund
Date:
Hi,

On 2022-10-12 12:50:31 -0700, Andres Freund wrote:
> I think this should have at least a basic test in src/test/regress/sql/stats.sql. If
> I can write one in a few minutes I'll go for that, otherwise will reply
> detailing difficulties.

Took a bit longer (+lunch). Attached.


In the attached 0001, the patch to make GetCurrentTransactionStopTimestamp()
set xactStopTimestamp, I added a few comment updates and an Assert() to ensure
that CurrentTransactionState->state is TRANS_(DEFAULT|COMMIT|ABORT|PREPARE). I
am worried that otherwise someone might end up using it in a
place before the end of the transaction, which'd then record the
wrong timestamp in the commit/abort record.


For 0002, the commit adding lastscan, I added catversion/stats version bumps
(because I was planning to commit it already...), a commit message, and that
minor docs change mentioned earlier.


0003 adds the tests mentioned above. I plan to merge them with 0002, but left
them separate for easier review for now.

To be able to compare timestamps for > not just >= we need to make sure that
two subsequent timestamps differ. The attached achieves this by sleeping for
100ms between those points - we do that in other places already. I'd started
out with 10ms, which I am fairly sure would suffice, but then decided to copy
the existing 100ms sleeps.

I verified tests pass under valgrind, debug_discard_caches and after I make
pgstat_report_stat() only flush when force is passed in.

Greetings,

Andres Freund

Attachment

Re: Tracking last scan time

From
Dave Page
Date:
Hi

On Wed, 12 Oct 2022 at 23:52, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2022-10-12 12:50:31 -0700, Andres Freund wrote:
> I think this should have at least a basic test in src/test/regress/sql/stats.sql. If
> I can write one in a few minutes I'll go for that, otherwise will reply
> detailing difficulties.

Took a bit longer (+lunch). Attached.


In the attached 0001, the patch to make GetCurrentTransactionStopTimestamp()
set xactStopTimestamp, I added a few comment updates and an Assert() to ensure
that CurrentTransactionState->state is TRANS_(DEFAULT|COMMIT|ABORT|PREPARE). I
am worried that otherwise someone might end up using it in a
place before the end of the transaction, which'd then record the
wrong timestamp in the commit/abort record.


For 0002, the commit adding lastscan, I added catversion/stats version bumps
(because I was planning to commit it already...), a commit message, and that
minor docs change mentioned earlier.


0003 adds the tests mentioned above. I plan to merge them with 0002, but left
them separate for easier review for now.

To be able to compare timestamps for > not just >= we need to make sure that
two subsequent timestamps differ. The attached achieves this by sleeping for
100ms between those points - we do that in other places already. I'd started
out with 10ms, which I am fairly sure would suffice, but then decided to copy
the existing 100ms sleeps.

I verified tests pass under valgrind, debug_discard_caches and after I make
pgstat_report_stat() only flush when force is passed in.

Thanks for that. It looks good to me, bar one comment (repeated 3 times in the sql and expected files):

fetch timestamps from before the next test 

"from " should be removed.

--

Re: Tracking last scan time

From
Andres Freund
Date:
Hi,

On 2022-10-13 14:38:06 +0100, Dave Page wrote:
> Thanks for that. It looks good to me, bar one comment (repeated 3 times in
> the sql and expected files):
> 
> fetch timestamps from before the next test
> 
> "from " should be removed.

I was trying to say something with that from, but clearly it wasn't
understandable :). Removed.

With that I pushed the changes and marked the CF entry as committed.

Thanks for the feature Dave and the reviews everyone.

Greetings,

Andres Freund



Re: Tracking last scan time

From
Dave Page
Date:


On Fri, 14 Oct 2022 at 19:16, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2022-10-13 14:38:06 +0100, Dave Page wrote:
> Thanks for that. It looks good to me, bar one comment (repeated 3 times in
> the sql and expected files):
>
> fetch timestamps from before the next test
>
> "from " should be removed.

I was trying to say something with that from, but clearly it wasn't
understandable :). Removed.

With that I pushed the changes and marked the CF entry as committed.

Thanks!


--

Re: Tracking last scan time

From
Robert Treat
Date:
On Fri, Oct 14, 2022 at 2:55 PM Dave Page <dpage@pgadmin.org> wrote:
> On Fri, 14 Oct 2022 at 19:16, Andres Freund <andres@anarazel.de> wrote:
>> On 2022-10-13 14:38:06 +0100, Dave Page wrote:
>> > Thanks for that. It looks good to me, bar one comment (repeated 3 times in
>> > the sql and expected files):
>> >
>> > fetch timestamps from before the next test
>> >
>> > "from " should be removed.
>>
>> I was trying to say something with that from, but clearly it wasn't
>> understandable :). Removed.
>>
>> With that I pushed the changes and marked the CF entry as committed.
>
>
> Thanks!
>

Hey folks,

I was looking at this a bit further (great addition btw) and noticed
the following behavior (this is a minimal reproducible example of the
original testing that uncovered this):

pagila=# select * from pg_stat_user_tables ;
(0 rows)

pagila=# create table x (xx int);
CREATE TABLE
Time: 2.145 ms
pagila=# select * from pg_stat_user_tables ;
-[ RECORD 1 ]-------+-------
relid               | 16392
schemaname          | public
relname             | x
seq_scan            | 0
last_seq_scan       | [null]
seq_tup_read        | 0
idx_scan            | [null]
last_idx_scan       | [null]
idx_tup_fetch       | [null]
n_tup_ins           | 0
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 0
n_dead_tup          | 0
n_mod_since_analyze | 0
n_ins_since_vacuum  | 0
last_vacuum         | [null]
last_autovacuum     | [null]
last_analyze        | [null]
last_autoanalyze    | [null]
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0

pagila=# insert into x select 1;
INSERT 0 1
pagila=# select * from pg_stat_user_tables ;
-[ RECORD 1 ]-------+-----------------------
relid               | 16392
schemaname          | public
relname             | x
seq_scan            | 0
last_seq_scan       | 1999-12-31 19:00:00-05
seq_tup_read        | 0
idx_scan            | [null]
last_idx_scan       | [null]
idx_tup_fetch       | [null]
n_tup_ins           | 1
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 1
n_dead_tup          | 0
n_mod_since_analyze | 1
n_ins_since_vacuum  | 1
last_vacuum         | [null]
last_autovacuum     | [null]
last_analyze        | [null]
last_autoanalyze    | [null]
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0

Normally we populate "last" columns with a NULL value when the
corresponding marker is zero, which seems correct in the first query,
but no longer matches in the second. I can see an argument that this
is a necessary exception to that rule (I'm not sure I agree with it,
but I see it) but even in that scenario, ISTM we should avoid
populating the table with a "special value", which generally goes
against observability best practices, and I believe we've been able to
avoid it elsewhere.

Beyond that, I also notice the behavior changes when adding a table
with a PK, though not necessarily better...

pagila=# drop table x;
DROP TABLE
Time: 2.896 ms
pagila=# select * from pg_stat_user_tables ;
(0 rows)

pagila=# create table x (xx int primary key) ;
CREATE TABLE

pagila=# select * from pg_stat_user_tables ;
-[ RECORD 1 ]-------+------------------------------
relid               | 16400
schemaname          | public
relname             | x
seq_scan            | 1
last_seq_scan       | 2022-10-23 15:53:04.935192-04
seq_tup_read        | 0
idx_scan            | 0
last_idx_scan       | [null]
idx_tup_fetch       | 0
n_tup_ins           | 0
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 0
n_dead_tup          | 0
n_mod_since_analyze | 0
n_ins_since_vacuum  | 0
last_vacuum         | [null]
last_autovacuum     | [null]
last_analyze        | [null]
last_autoanalyze    | [null]
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0

pagila=# insert into x select 1;
INSERT 0 1

pagila=# select * from pg_stat_user_tables ;
-[ RECORD 1 ]-------+------------------------------
relid               | 16400
schemaname          | public
relname             | x
seq_scan            | 1
last_seq_scan       | 2022-10-23 15:53:04.935192-04
seq_tup_read        | 0
idx_scan            | 0
last_idx_scan       | 1999-12-31 19:00:00-05
idx_tup_fetch       | 0
n_tup_ins           | 1
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 1
n_dead_tup          | 0
n_mod_since_analyze | 1
n_ins_since_vacuum  | 1
last_vacuum         | [null]
last_autovacuum     | [null]
last_analyze        | [null]
last_autoanalyze    | [null]
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0

This time the create table both increments the sequential scan count and
populates last_seq_scan with a real/correct value. However, an
insert into the table advances neither the seq_scan nor the
last_seq_scan values, which seems like different behavior from my
original example, with the added negative that last_idx_scan is
now populated with a special value :-(

I think the simplest fix, which should correspond to existing versions'
behavior, would be to just ensure that we replace any "special value"
timestamps with a real transaction timestamp, and then maybe note that
these fields may be advanced by operations which don't strictly show
up as a sequential or index scan.


Robert Treat
https://xzilla.net



Re: Tracking last scan time

From
Dave Page
Date:
FYI, this is not intentional, and I do plan to look into it; however, I've been somewhat busy with pgconfeu, and am travelling for the rest of this week as well.

On Sun, 23 Oct 2022 at 21:09, Robert Treat <rob@xzilla.net> wrote:
On Fri, Oct 14, 2022 at 2:55 PM Dave Page <dpage@pgadmin.org> wrote:
> On Fri, 14 Oct 2022 at 19:16, Andres Freund <andres@anarazel.de> wrote:
>> On 2022-10-13 14:38:06 +0100, Dave Page wrote:
>> > Thanks for that. It looks good to me, bar one comment (repeated 3 times in
>> > the sql and expected files):
>> >
>> > fetch timestamps from before the next test
>> >
>> > "from " should be removed.
>>
>> I was trying to say something with that from, but clearly it wasn't
>> understandable :). Removed.
>>
>> With that I pushed the changes and marked the CF entry as committed.
>
>
> Thanks!
>

Hey folks,

I was looking at this a bit further (great addition btw) and noticed
the following behavior (this is a minimal reproducible example of the
original testing that uncovered this):

pagila=# select * from pg_stat_user_tables ;
 relid | schemaname | relname | seq_scan | last_seq_scan | seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count
(0 rows)

pagila=# create table x (xx int);
CREATE TABLE
Time: 2.145 ms
pagila=# select * from pg_stat_user_tables ;
-[ RECORD 1 ]-------+-------
relid               | 16392
schemaname          | public
relname             | x
seq_scan            | 0
last_seq_scan       | [null]
seq_tup_read        | 0
idx_scan            | [null]
last_idx_scan       | [null]
idx_tup_fetch       | [null]
n_tup_ins           | 0
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 0
n_dead_tup          | 0
n_mod_since_analyze | 0
n_ins_since_vacuum  | 0
last_vacuum         | [null]
last_autovacuum     | [null]
last_analyze        | [null]
last_autoanalyze    | [null]
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0
(1 row)

pagila=# insert into x select 1;
INSERT 0 1
pagila=# select * from pg_stat_user_tables ;
-[ RECORD 1 ]-------+-----------------------
relid               | 16392
schemaname          | public
relname             | x
seq_scan            | 0
last_seq_scan       | 1999-12-31 19:00:00-05
seq_tup_read        | 0
idx_scan            | [null]
last_idx_scan       | [null]
idx_tup_fetch       | [null]
n_tup_ins           | 1
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 1
n_dead_tup          | 0
n_mod_since_analyze | 1
n_ins_since_vacuum  | 1
last_vacuum         | [null]
last_autovacuum     | [null]
last_analyze        | [null]
last_autoanalyze    | [null]
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0
(1 row)

Normally we populate "last" columns with a NULL value when the
corresponding marker is zero, which seems correct in the first query,
but no longer matches in the second. I can see an argument that this
is a necessary exception to that rule (I'm not sure I agree with it,
but I see it) but even in that scenario, ISTM we should avoid
populating the table with a "special value", which generally goes
against observability best practices, and I believe we've been able to
avoid it elsewhere.
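
To illustrate the observability concern: a monitoring query that relies on the documented NULL convention would silently miss rows where the sentinel leaks in. This is an illustrative sketch only, using the columns added by this patch:

```sql
-- Illustrative: find indexes that have never been scanned, relying on
-- the convention that last_idx_scan stays NULL until the first scan.
-- If a "special value" (epoch) timestamp leaks into last_idx_scan
-- instead of NULL, such indexes are silently missed by this query.
SELECT schemaname, relname, indexrelname
  FROM pg_stat_user_indexes
 WHERE idx_scan = 0
   AND last_idx_scan IS NULL;
```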

Beyond that, I also notice the behavior changes when adding a table
with a PK, though not necessarily better...

pagila=# drop table x;
DROP TABLE
Time: 2.896 ms
pagila=# select * from pg_stat_user_tables ;
 relid | schemaname | relname | seq_scan | last_seq_scan | seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count
(0 rows)

pagila=# create table x (xx int primary key) ;
CREATE TABLE

pagila=# select * from pg_stat_user_tables ;
-[ RECORD 1 ]-------+------------------------------
relid               | 16400
schemaname          | public
relname             | x
seq_scan            | 1
last_seq_scan       | 2022-10-23 15:53:04.935192-04
seq_tup_read        | 0
idx_scan            | 0
last_idx_scan       | [null]
idx_tup_fetch       | 0
n_tup_ins           | 0
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 0
n_dead_tup          | 0
n_mod_since_analyze | 0
n_ins_since_vacuum  | 0
last_vacuum         | [null]
last_autovacuum     | [null]
last_analyze        | [null]
last_autoanalyze    | [null]
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0
(1 row)

pagila=# insert into x select 1;
INSERT 0 1

pagila=# select * from pg_stat_user_tables ;
-[ RECORD 1 ]-------+------------------------------
relid               | 16400
schemaname          | public
relname             | x
seq_scan            | 1
last_seq_scan       | 2022-10-23 15:53:04.935192-04
seq_tup_read        | 0
idx_scan            | 0
last_idx_scan       | 1999-12-31 19:00:00-05
idx_tup_fetch       | 0
n_tup_ins           | 1
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 1
n_dead_tup          | 0
n_mod_since_analyze | 1
n_ins_since_vacuum  | 1
last_vacuum         | [null]
last_autovacuum     | [null]
last_analyze        | [null]
last_autoanalyze    | [null]
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0
(1 row)

This time the create table both advances the sequential scan count and
populates last_seq_scan with a real/correct value. However an
insert into the table neither advances the seq_scan nor the
last_seq_scan values, which seems like different behavior from my
original example, with the added negative that last_idx_scan is
now populated with a special value :-(

I think the simplest fix, which should match existing versions'
behavior, would be to ensure that we replace any "special value"
timestamps with a real transaction timestamp, and then note that
these fields may be advanced by operations which don't strictly show
up as a sequential or index scan.
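
Incidentally, the "special value" in the output above is recognizable: it appears to be internal timestamp zero, i.e. the PostgreSQL timestamp epoch of 2000-01-01 00:00:00+00, displayed from a UTC-5 zone. A quick sanity check:

```sql
-- The sentinel above is the PostgreSQL timestamp epoch (2000-01-01 UTC),
-- which displays as 1999-12-31 19:00:00 in US Eastern (UTC-5).
SELECT timestamptz '2000-01-01 00:00:00+00' AT TIME ZONE 'America/New_York';
-- → 1999-12-31 19:00:00
```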


Robert Treat
https://xzilla.net


--

Re: Tracking last scan time

From
Dave Page
Date:

On Mon, 31 Oct 2022 at 07:36, Dave Page <dpage@pgadmin.org> wrote:
FYI, this is not intentional, and I do plan to look into it; however, I've been somewhat busy with pgconfeu, and am travelling for the rest of this week as well.

Here's a patch to fix this issue. Many thanks to Peter Eisentraut who figured it out in a few minutes after I spent far too long looking down rabbit holes in entirely the wrong place.

Thanks for the bug report.
 



--
Attachment

Re: Tracking last scan time

From
Michael Paquier
Date:
On Thu, Nov 03, 2022 at 04:44:16PM -0400, Dave Page wrote:
> Here's a patch to fix this issue. Many thanks to Peter Eisentraut who
> figured it out in a few minutes after I spent far too long looking down
> rabbit holes in entirely the wrong place.

FWIW, all the other areas of pgstatfuncs.c manipulate timestamptz
fields with a style like the attached.  That's a nit, still in the
interest of consistency with the surroundings.

Anyway, it seems to me that a regression test is in order for the
state before any scan has happened, just after the relation creation;
the same problem shows up with last_idx_scan.
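
A regression test along those lines might look roughly like this (an illustrative sketch only; the table name is hypothetical and the committed test may differ):

```sql
-- Sketch: immediately after creation, and before any scan, the scan
-- counters should be zero and the last_*_scan columns NULL.
CREATE TABLE scan_probe (a int PRIMARY KEY);
SELECT seq_scan, last_seq_scan IS NULL AS seq_ts_is_null,
       idx_scan, last_idx_scan IS NULL AS idx_ts_is_null
  FROM pg_stat_user_tables
 WHERE relname = 'scan_probe';
DROP TABLE scan_probe;
```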
--
Michael

Attachment

Re: Tracking last scan time

From
Michael Paquier
Date:
On Mon, Nov 07, 2022 at 04:54:07PM +0900, Michael Paquier wrote:
> FWIW, all the other areas of pgstatfuncs.c manipulate timestamptz
> fields with a style like the attached.  That's a nit, still in the
> interest of consistency with the surroundings.
>
> Anyway, it seems to me that a regression test is in order for the
> state before any scan has happened, just after the relation creation;
> the same problem shows up with last_idx_scan.

Hearing nothing, done this way as of d7744d5.  Thanks for the report,
Robert.  And thanks for the patch, Dave.
--
Michael

Attachment

Re: Tracking last scan time

From
Dave Page
Date:


On Tue, 8 Nov 2022 at 04:10, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Nov 07, 2022 at 04:54:07PM +0900, Michael Paquier wrote:
> FWIW, all the other areas of pgstatfuncs.c manipulate timestamptz
> fields with a style like the attached.  That's a nit, still in the
> interest of consistency with the surroundings.
>
> Anyway, it seems to me that a regression test is in order for the
> state before any scan has happened, just after the relation creation;
> the same problem shows up with last_idx_scan.

Hearing nothing, done this way as of d7744d5.  Thanks for the report,
Robert.  And thanks for the patch, Dave.

Thank you! 

--