Re: Bgwriter strategies - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Bgwriter strategies |
Date | |
Msg-id | 4694AA71.7040701@enterprisedb.com |
In response to | Re: Bgwriter strategies (Greg Smith <gsmith@gregsmith.com>) |
Responses | Re: Bgwriter strategies |
List | pgsql-hackers |
In the last couple of days, I've been running a lot of DBT-2 tests and smaller microbenchmarks with different bgwriter settings and experimental patches, but I have not been able to produce a repeatable test case where any of the bgwriter configurations performs better than not having bgwriter at all.

I encountered a strange phenomenon that I don't understand. I ran a small test case with DELETEs in random order, using an index, on a ~300MB table, with shared_buffers smaller than that. I expected that to be dominated by the speed at which postgres can swap pages in and out of the shared buffer cache, but surprisingly the test starts to block on the write I/O, even though the table fits completely in OS cache.

I was able to reproduce the phenomenon with a simple C program that writes 8k blocks in random order to a fixed size file. I've attached it along with the output of running it on my test server. The output shows how the writes start to periodically block after a while. I was able to reproduce the problem on my laptop as well.

Can anyone explain what's going on?

Anyone out there have a repeatable test case where bgwriter helps?
--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

    #include <stdio.h>
    #include <stdlib.h>     /* exit, atoi, random */
    #include <unistd.h>     /* write, lseek, fsync */
    #include <fcntl.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/time.h>
    #include <time.h>

    int main(int argc, char **argv)
    {
        int fd;
        off_t len;
        char buf[8192];     /* one 8k block; contents do not matter */
        int i;
        int size;
        struct timeval begin_t;

        if (argc != 3)
        {
            printf("Usage: writetest <filename> <size in MB>\n");
            exit(1);
        }

        fd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, S_IWUSR | S_IRUSR);
        if (fd == -1)
        {
            perror(NULL);
            exit(1);
        }

        /* Fill the file sequentially up to the requested size. */
        size = atoi(argv[2]) * 1024 * 1024;
        for (i = 0; i < size;)
            i += write(fd, buf, sizeof(buf));
        len = i;
        fsync(fd);

        gettimeofday(&begin_t, NULL);

        /* Overwrite random 8k blocks, reporting every 40000 writes. */
        for (i = 0; i < 10000000; i++)
        {
            lseek(fd, ((random() % (len / sizeof(buf)))) * sizeof(buf), SEEK_SET);
            write(fd, buf, sizeof(buf));

            if (i % 40000 == 0)
            {
                struct timeval t;
                long msecs;

                gettimeofday(&t, NULL);
                msecs = (t.tv_sec - begin_t.tv_sec) * 1000 +
                        (t.tv_usec - begin_t.tv_usec) / 1000;
                printf("%d blocks written, time=%ld ms\n", i, msecs);
                begin_t = t;
            }
        }
        return 0;
    }

    ./writetest /mnt/data/writetest-data 80
    0 blocks written, time=0 ms
    40000 blocks written, time=251 ms
    80000 blocks written, time=241 ms
    120000 blocks written, time=241 ms
    160000 blocks written, time=241 ms
    200000 blocks written, time=242 ms
    240000 blocks written, time=242 ms
    280000 blocks written, time=241 ms
    320000 blocks written, time=241 ms
    360000 blocks written, time=242 ms
    400000 blocks written, time=241 ms
    440000 blocks written, time=241 ms
    480000 blocks written, time=241 ms
    520000 blocks written, time=242 ms
    560000 blocks written, time=241 ms
    600000 blocks written, time=241 ms
    640000 blocks written, time=242 ms
    680000 blocks written, time=242 ms
    720000 blocks written, time=242 ms
    760000 blocks written, time=241 ms
    800000 blocks written, time=242 ms
    840000 blocks written, time=4579 ms
    880000 blocks written, time=244 ms
    920000 blocks written, time=242 ms
    960000 blocks written, time=4752 ms
    1000000 blocks written, time=241 ms
    1040000 blocks written, time=4618 ms
    1080000 blocks written, time=242 ms
    1120000 blocks written, time=4614 ms
    1160000 blocks written, time=246 ms
    1200000 blocks written, time=243 ms
    1240000 blocks written, time=4619 ms
    1280000 blocks written, time=242 ms
    1320000 blocks written, time=242 ms
    1360000 blocks written, time=4605 ms
    1400000 blocks written, time=242 ms
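To put the reported intervals in perspective, each one covers 40000 writes of 8192 bytes, or 312.5 MiB. The following is only a rough sketch for converting an interval time into a write rate. The helper names `elapsed_ms` and `interval_mib_per_sec` are made up for illustration and are not part of the attached program; `elapsed_ms` simply repeats the millisecond formula the program already uses.

```c
#include <assert.h>
#include <sys/time.h>

#define BLOCK_SIZE        8192    /* bytes per write in the test program */
#define BLOCKS_PER_REPORT 40000   /* writes between two progress lines */

/* Elapsed milliseconds between two timevals (same formula as the test program). */
static long elapsed_ms(struct timeval begin, struct timeval end)
{
    return (end.tv_sec - begin.tv_sec) * 1000 +
           (end.tv_usec - begin.tv_usec) / 1000;
}

/* Apparent write rate, in MiB/s, for one reporting interval of the given length. */
static double interval_mib_per_sec(long msecs)
{
    double mib = (double) BLOCKS_PER_REPORT * BLOCK_SIZE / (1024.0 * 1024.0);

    assert(msecs >= 0);           /* a negative interval would be a caller bug */
    if (msecs == 0)
        return 0.0;               /* e.g. the very first "time=0 ms" line */
    return mib / (msecs / 1000.0);
}
```

With these assumptions, a typical 241 ms interval works out to roughly 1297 MiB/s, while the 4579 ms interval works out to roughly 68 MiB/s. The arithmetic only restates the output above; it does not by itself explain why the slow intervals occur.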