Re: [pgsql-hackers-win32] SRA Win32 sync() code - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: [pgsql-hackers-win32] SRA Win32 sync() code
Date
Msg-id 200311170546.hAH5kYn17867@candle.pha.pa.us
In response to Re: [pgsql-hackers-win32] SRA Win32 sync() code  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [pgsql-hackers-win32] SRA Win32 sync() code  (Shridhar Daithankar <shridhar_daithankar@persistent.co.in>)
Re: [pgsql-hackers-win32] SRA Win32 sync() code  (Kurt Roeckx <Q@ping.be>)
List pgsql-patches
Tom Lane wrote:
> > Do we know that having the background writer fsync a file that was
> > written by a backend cause all the data to fsync?  I think I could write
> > a program to test this by timing each of these tests:
>
> That might prove something about the particular platform you tested it
> on; but it would not speak to the real problem, which is what we can
> assume is true on every platform...

The attached program tests whether fsync on a reopened file descriptor
flushes data that was written through an earlier descriptor that has
since been closed.  I see:

    write                  0.000613
    write & fsync          0.001727
    write, close & fsync   0.001633

The last two timings are nearly the same, which suggests that fsync
flushes the data even after the file is closed and reopened.  I could
also test with the writes done by a subprocess, but I don't see how that
would be different, and it would mess up my timings.
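
For anyone who wants to try the subprocess variant anyway, here is a minimal sketch, not part of the attached program.  The helper names write_in_child() and fsync_by_path() are hypothetical, and the sketch assumes a POSIX system with fork() and waitpid().  A zero return from fsync() only says the call succeeded; whether it actually flushed pages dirtied by the other process still has to be judged from timings on each platform.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that writes nbytes single bytes to
 * path and closes the file without calling fsync.
 * Returns 0 if the child exited cleanly, -1 otherwise.
 */
int write_in_child(const char *path, int nbytes)
{
    pid_t pid;
    int status;

    assert(path != NULL);

    pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0)
    {
        /* child: write, close, exit; no fsync here */
        char charout = 44;
        int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
        int i;

        if (fd == -1)
            _exit(1);
        for (i = 0; i < nbytes; i++)
            if (write(fd, &charout, 1) != 1)
                _exit(1);
        if (close(fd) == -1)
            _exit(1);
        _exit(0);
    }

    /* parent: wait for the child to finish writing */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/*
 * Hypothetical helper: open a file this process never wrote to and
 * fsync it.  Returns 0 on success, -1 on any failure.
 */
int fsync_by_path(const char *path)
{
    int fd;
    int rc;

    assert(path != NULL);

    fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;
    rc = fsync(fd);
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

Calling write_in_child() and then fsync_by_path() on the same path, with gettimeofday() around the second call only, would keep the fork overhead out of the fsync timing.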

Anyway, if we find all our platforms can pass this test, we might be
able to allow backends to do their own writes and just record the file
name somewhere for the checkpointer to fsync.  The numbers also show
write plus fsync taking roughly 3x as long as a plain write (0.001727
vs. 0.000613).
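
A rough sketch of that "record the file name, fsync later" idea follows.  It is only an illustration under stated assumptions, not the actual backend or checkpointer code: the names record_pending_fsync() and fsync_pending() and the fixed-size table are hypothetical, and a real multi-process version would have to keep the list in shared memory with locking.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define MAX_PENDING   64    /* hypothetical table size */
#define MAX_PATH_LEN  256   /* hypothetical path length limit */

static char pending[MAX_PENDING][MAX_PATH_LEN];
static int  npending = 0;

/*
 * Called by a writer after write(): remember that path needs an fsync.
 * Returns 0 on success (or if the path is already recorded),
 * -1 if the path is too long or the table is full.
 */
int record_pending_fsync(const char *path)
{
    int i;

    assert(path != NULL);

    if (strlen(path) >= MAX_PATH_LEN)
        return -1;
    for (i = 0; i < npending; i++)
        if (strcmp(pending[i], path) == 0)
            return 0;           /* already recorded */
    if (npending >= MAX_PENDING)
        return -1;
    strcpy(pending[npending++], path);
    return 0;
}

/*
 * Called by the checkpointer: reopen every recorded file and fsync it.
 * Returns the number of files synced and empties the list, or returns
 * -1 on the first failure and leaves the list untouched for a retry.
 */
int fsync_pending(void)
{
    int i;
    int synced = 0;

    for (i = 0; i < npending; i++)
    {
        int fd = open(pending[i], O_RDWR);

        if (fd == -1)
            return -1;
        if (fsync(fd) == -1)
        {
            close(fd);
            return -1;
        }
        if (close(fd) == -1)
            return -1;
        synced++;
    }
    npending = 0;
    return synced;
}
```

This scheme is only safe on platforms where fsync on a freshly opened descriptor flushes data written through other descriptors, which is exactly what the timing test is trying to find out.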

Does anyone have a platform where the last duration is significantly
different from the middle timing?

I am keeping this discussion on patches because of the C program
attachment.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
/*
 *    test_fsync.c
 *        tests whether fsync on a reopened file descriptor flushes data
 *        written through an earlier descriptor that was already closed
 */

#include <sys/types.h>
#include <sys/time.h>   /* gettimeofday(), struct timeval */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <unistd.h>

void die(char *str);
void print_elapse(struct timeval start_t, struct timeval elapse_t);

int main(int argc, char *argv[])
{
    struct timeval start_t;
    struct timeval elapse_t;
    int tmpfile;
    int i;
    char charout = 44;

    /* write only */
    gettimeofday(&start_t, NULL);
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    for (i = 0; i < 200; i++)
        write(tmpfile, &charout, 1);
    close(tmpfile);
    gettimeofday(&elapse_t, NULL);
    unlink("/var/tmp/test_fsync.out");
    printf("write                  ");
    print_elapse(start_t, elapse_t);
    printf("\n");

    /* write & fsync */
    gettimeofday(&start_t, NULL);
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    for (i = 0; i < 200; i++)
        write(tmpfile, &charout, 1);
    fsync(tmpfile);
    close(tmpfile);
    gettimeofday(&elapse_t, NULL);
    unlink("/var/tmp/test_fsync.out");
    printf("write & fsync          ");
    print_elapse(start_t, elapse_t);
    printf("\n");

    /* write, close & fsync */
    gettimeofday(&start_t, NULL);
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    for (i = 0; i < 200; i++)
        write(tmpfile, &charout, 1);
    close(tmpfile);
    /* reopen file */
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    fsync(tmpfile);
    close(tmpfile);
    gettimeofday(&elapse_t, NULL);
    unlink("/var/tmp/test_fsync.out");
    printf("write, close & fsync   ");
    print_elapse(start_t, elapse_t);
    printf("\n");

    return 0;
}

void print_elapse(struct timeval start_t, struct timeval elapse_t)
{
    if (elapse_t.tv_usec < start_t.tv_usec)
    {
        elapse_t.tv_sec--;
        elapse_t.tv_usec += 1000000;
    }

    printf("%ld.%06ld", (long) (elapse_t.tv_sec - start_t.tv_sec),
                     (long) (elapse_t.tv_usec - start_t.tv_usec));
}

void die(char *str)
{
    fprintf(stderr, "%s\n", str);
    exit(1);
}
