Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) - Mailing list pgsql-hackers

From Nitin Jadhav
Subject Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Date
Msg-id CAMm1aWbuJLEnAjT4_1s8eArmZq=H7oi0Jv=J2GnRpBhkbC_PVw@mail.gmail.com
Whole thread Raw
In response to Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
List pgsql-hackers
> > 3) Why do we need this extra calculation for start_lsn?
> > Do you ever see a negative LSN or something here?
> > +    ('0/0'::pg_lsn + (
> > +        CASE
> > +            WHEN (s.param3 < 0) THEN pow((2)::numeric, (64)::numeric)
> > +            ELSE (0)::numeric
> > +        END + (s.param3)::numeric)) AS start_lsn,
>
> Yes: LSN can take up all of an uint64; whereas the pgstat column is a
> bigint type; thus the signed int64. This cast is OK as it wraps
> around, but that means we have to take care to correctly display the
> LSN when it is > 0x7FFF_FFFF_FFFF_FFFF; which is what we do here using
> the special-casing for negative values.

Yes. The extra calculation is required here as we are storing unit64
value in the variable of type int64. When we convert uint64 to int64
then the bit pattern is preserved (so no data is lost). The high-order
bit becomes the sign bit and if the sign bit is set, both the sign and
magnitude of the value changes. To safely get the actual uint64 value
whatever was assigned, we need the above calculations.

> > 4) Can't you use timestamptz_in(to_char(s.param4))  instead of
> > pg_stat_get_progress_checkpoint_start_time? I don't quite understand
> > the reasoning for having this function and it's named as *checkpoint*
> > when it doesn't do anything specific to the checkpoint at all?
>
> I hadn't thought of using the types' inout functions, but it looks
> like timestamp IO functions use a formatted timestring, which won't
> work with the epoch-based timestamp stored in the view.

There is a variation of to_timestamp() which takes UNIX epoch (float8)
as an argument and converts it to timestamptz but we cannot directly
call this function with S.param4.

TimestampTz
GetCurrentTimestamp(void)
{
    TimestampTz result;
    struct timeval tp;

    gettimeofday(&tp, NULL);

    result = (TimestampTz) tp.tv_sec -
        ((POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * SECS_PER_DAY);
    result = (result * USECS_PER_SEC) + tp.tv_usec;

    return result;
}

S.param4 contains the output of the above function
(GetCurrentTimestamp()) which returns Postgres epoch but the
to_timestamp() expects UNIX epoch as input. So some calculation is
required here. I feel the SQL 'to_timestamp(946684800 +
(S.param4::float / 1000000)) AS start_time' works fine. The value
'946684800' is equal to ((POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) *
SECS_PER_DAY). I am not sure whether it is good practice to use this
way. Kindly share your thoughts.

Thanks & Regards,
Nitin Jadhav

On Mon, Feb 28, 2022 at 6:40 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
>
> On Sun, 27 Feb 2022 at 16:14, Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> > 3) Why do we need this extra calculation for start_lsn?
> > Do you ever see a negative LSN or something here?
> > +    ('0/0'::pg_lsn + (
> > +        CASE
> > +            WHEN (s.param3 < 0) THEN pow((2)::numeric, (64)::numeric)
> > +            ELSE (0)::numeric
> > +        END + (s.param3)::numeric)) AS start_lsn,
>
> Yes: LSN can take up all of an uint64; whereas the pgstat column is a
> bigint type; thus the signed int64. This cast is OK as it wraps
> around, but that means we have to take care to correctly display the
> LSN when it is > 0x7FFF_FFFF_FFFF_FFFF; which is what we do here using
> the special-casing for negative values.
>
> As to whether it is reasonable: Generating 16GB of wal every second
> (2^34 bytes /sec) is probably not impossible (cpu <> memory bandwidth
> has been > 20GB/sec for a while); and that leaves you 2^29 seconds of
> database runtime; or about 17 years. Seeing that a cluster can be
> `pg_upgrade`d (which doesn't reset cluster LSN) since PG 9.0 from at
> least version PG 8.4.0 (2009) (and through pg_migrator, from 8.3.0)),
> we can assume that clusters hitting LSN=2^63 will be a reasonable
> possibility within the next few years. As the lifespan of a PG release
> is about 5 years, it doesn't seem impossible that there will be actual
> clusters that are going to hit this naturally in the lifespan of PG15.
>
> It is also possible that someone fat-fingers pg_resetwal; and creates
> a cluster with LSN >= 2^63; resulting in negative values in the
> s.param3 field. Not likely, but we can force such situations; and as
> such we should handle that gracefully.
>
> > 4) Can't you use timestamptz_in(to_char(s.param4))  instead of
> > pg_stat_get_progress_checkpoint_start_time? I don't quite understand
> > the reasoning for having this function and it's named as *checkpoint*
> > when it doesn't do anything specific to the checkpoint at all?
>
> I hadn't thought of using the types' inout functions, but it looks
> like timestamp IO functions use a formatted timestring, which won't
> work with the epoch-based timestamp stored in the view.
>
> > Having 3 unnecessary functions that aren't useful to the users at all
> > in proc.dat will simply eatup the function oids IMO. Hence, I suggest
> > let's try to do without extra functions.
>
> I agree that (1) could be simplified, or at least fully expressed in
> SQL without exposing too many internals. If we're fine with exposing
> internals like flags and type layouts, then (2), and arguably (4), can
> be expressed in SQL as well.
>
> -Matthias



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: In-placre persistance change of a relation
Next
From: Julien Rouhaud
Date:
Subject: Re: Allow file inclusion in pg_hba and pg_ident files