Thread: Miscalculation in IsCheckpointOnSchedule()

Miscalculation in IsCheckpointOnSchedule()

From
ITAGAKI Takahiro
Date:
I run long DBT-2 with 8.3beta2 and saw checkpoint spikes periodically.
The progress against WAL segments consumption *jumped up* in such cases.

It seems to be a miscalculation in IsCheckpointOnSchedule().
xrecoff is uint32, therefore the difference of two xrecoffs could
be between -4G and +4G. We should not cast it to int32, that domain
is [-2G, +2G).

Here is a patch to fix it, casting xrecoff to double directly.

Index: src/backend/postmaster/bgwriter.c
===================================================================
--- src/backend/postmaster/bgwriter.c    (HEAD)
+++ src/backend/postmaster/bgwriter.c    (working copy)
@@ -718,7 +718,7 @@
     recptr = GetInsertRecPtr();
     elapsed_xlogs =
         (((double) (int32) (recptr.xlogid - ckpt_start_recptr.xlogid)) * XLogSegsPerFile +
-         ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
+         ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
         CheckPointSegments;

     if (progress < elapsed_xlogs)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Re: Miscalculation in IsCheckpointOnSchedule()

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> -         ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
> +         ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /

Surely this makes matters worse, not better.  What happens near a segment
boundary crossing?

            regards, tom lane

Re: Miscalculation in IsCheckpointOnSchedule()

From
ITAGAKI Takahiro
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> > -         ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
> > +         ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
>
> Surely this makes matters worse, not better.  What happens near a segment
> boundary crossing?

Here is the dumped progres information by the attached patch
(only for debug purpose).

         cur prog.  xlog prog. time prog.
[-400ms] 0.030503   0.019247   0.031158   (diff xlogid=0, xrecoff=82665472)
[-200ms] 0.031176   0.019957   0.031839   (diff xlogid=0, xrecoff=85712896)
[*]      0.031860   1.020706   0.032521   (diff xlogid=1, xrecoff=105709568)

> recptr.xrecoff - ckpt_start_recptr.xrecoff

recptr.xrecoff is reset to 0 or so when xlogid is bumped up. At that time,
if ckpt_start_recptr.xrecoff is greater than 2G, we cannot represent
the difference with int32 because the value is less than -2G.
Casting double after int32 does not help us.

We use ( xlogid * 255 * 16MB + xrecoff ) as a base value for the calculation.
If we interprets xrecoff=105709568 at [*] as "minus uint32", we can calcurate
it correctly as below:

         without fix  with fix
[-400ms]   82665472   82665472 (same)
[-200ms]   85712896   85712896 (same)
[*]      4383899648   88932353 (= 1*255*16MB - (0xFFFFFFFF - 105709568) )

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: Miscalculation in IsCheckpointOnSchedule()

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
>> -         ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
>> +         ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
>
> Surely this makes matters worse, not better.  What happens near a segment
> boundary crossing?

Hmm. There seems to be another little bug in there. XLogSegsPerFile is
defined as 0xffffffff/XLogSegSize, which is 255 with default settings.
It should be 256. That leads to negative elapsed_xlogs estimates at xlog
file boundaries. XLogCheckpointNeeded suffers from it too; the number of
segments consumed since last checkpoint is off by one per each xlogid,
so we trigger the checkpoint a little bit too late.

Otherwise, the patch looks good to me. I also tested the calculation
with a little C program (attached), and it seems to work on segment and
xlog file boundaries just fine.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
#include <stdio.h>
#include <stdlib.h>

typedef unsigned char uint8;    /* == 8 bits */
typedef unsigned short uint16;    /* == 16 bits */
typedef unsigned int uint32;    /* == 32 bits */

typedef signed char int8;        /* == 8 bits */
typedef signed short int16;        /* == 16 bits */
typedef signed int int32;        /* == 32 bits */

#define XLOG_SEG_SIZE    (16*1024*1024)
#define XLogSegSize        ((uint32) XLOG_SEG_SIZE)
#define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize)

int main(int argc, char **argv)
{
  double elapsed_xlogs;
  uint32 cur_xrecoff, cur_xlogid;
  uint32 ckpt_xrecoff, ckpt_xlogid;
  double a, b;

  cur_xlogid = strtoul(argv[1], NULL, 10);
  cur_xrecoff = strtoul(argv[2], NULL, 10);

  ckpt_xlogid = strtoul(argv[3], NULL, 10);
  ckpt_xrecoff = strtoul(argv[4], NULL, 10);

  //a = (cur_xlogid - ckpt_xlogid) * XLogSegsPerFile;

  //b = (((double) cur_xrecoff) - ((double) ckpt_xrecoff)) / ((double )XLogSegSize);

  //a = ((double) (int32) (cur_xlogid - ckpt_xlogid)) * XLogSegsPerFile;
  //b  = ((double) (int32) (cur_xrecoff - ckpt_xrecoff)) / XLogSegSize;
  a = (cur_xlogid - ckpt_xlogid) * XLogSegsPerFile;
  b = ((double) cur_xrecoff - (double) ckpt_xrecoff) / XLogSegSize;

  elapsed_xlogs = a + b;


  printf("XLogSegsPerFile: %d\n", XLogSegsPerFile);
  printf("xrecoff: %u %f\n", ckpt_xrecoff, (double) ckpt_xrecoff);
  printf("a: %f\n", a);
  printf("b: %f\n", b);
  printf("elapsed_xlogs = %f\n", elapsed_xlogs);
}


Re: Miscalculation in IsCheckpointOnSchedule()

From
Heikki Linnakangas
Date:
Heikki Linnakangas wrote:
> Tom Lane wrote:
>> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
>>> -         ((double) (int32) (recptr.xrecoff -
>>> ckpt_start_recptr.xrecoff)) / XLogSegSize) /
>>> +         ((double) recptr.xrecoff - (double)
>>> ckpt_start_recptr.xrecoff) / XLogSegSize) /
>>
>> Surely this makes matters worse, not better.  What happens near a segment
>> boundary crossing?
>
> Hmm. There seems to be another little bug in there. XLogSegsPerFile is
> defined as 0xffffffff/XLogSegSize, which is 255 with default settings.
> It should be 256. That leads to negative elapsed_xlogs estimates at xlog
> file boundaries. XLogCheckpointNeeded suffers from it too; the number of
> segments consumed since last checkpoint is off by one per each xlogid,
> so we trigger the checkpoint a little bit too late.

I'll take that back. We intentionally don't use the last possible
segment of each xlog file, to avoid overflows. Sorry for the noise.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Miscalculation in IsCheckpointOnSchedule()

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Surely this makes matters worse, not better.  What happens near a segment
>> boundary crossing?

> Here is the dumped progres information by the attached patch
> (only for debug purpose).

Oh, I take that back.  I was thinking that conversion of unsigned to
double usually does the Wrong Thing, but in this context it seems to be
the right thing.

            regards, tom lane