Re: WAL: O_DIRECT and multipage-writer - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: WAL: O_DIRECT and multipage-writer |
Date | |
Msg-id | 200502142325.j1ENP4e19810@candle.pha.pa.us |
In response to | WAL: O_DIRECT and multipage-writer (ITAGAKI Takahiro <itagaki.takahiro@lab.ntt.co.jp>) |
List | pgsql-hackers |
This thread has been saved for the 8.1 release:

	http://momjian.postgresql.org/cgi-bin/pgpatches2

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> Hello, all.
>
> I think that there is room for improvement in WAL.
> Here is a patch for it.
>   - Multiple pages are written in one write() if it is contiguous.
>   - Add 'open_direct' to wal_sync_method.
>
> WAL writer writes one page in one write(). This is not efficient
> when wal_sync_method is 'open_sync', because the writer waits for
> IO completions at each write(). Multipage-writer can reduce syscalls
> and improve IO throughput.
>
> 'open_direct' uses O_DIRECT instead of O_SYNC. O_DIRECT implies synchronous
> writing, so it may show the tendency like open_sync. But maybe it can reduce
> memcpy() and save OS's disk cache memory.
>
> I benchmarked this patch with pgbench. It works well and
> improved 50% of tps on my machine. WAL seems to be bottle-neck
> on machines with poor disks.
>
> This patch has not yet tested enough. I would like it to be examined much
> and taken into PostgreSQL.
>
> There are still many TODOs:
>   * Is this logic really correct?
>     - O_DIRECT_BUFFER_ALIGN should be adjusted to runtime, not compile time.
>     - Consider to use writev() instead of write().
>       Buffers are noncontiguous when WAL ring buffer rotates.
>     - If wan_sync_method is not open_direct, XLOG_EXTRA_BUFFERS can be 0.
>
>
> Sincerely,
> ITAGAKI Takahiro
>
>
>
> -- pgbench result --
>
> $ ./pgbench -s 100 -c 50 -t 400
>
> - 8.0.0 default + fsync:
>   tps = 20.630632 (including connections establishing)
>   tps = 20.636768 (excluding connections establishing)
> - multipage-writer + open_direct:
>   tps = 33.761917 (including connections establishing)
>   tps = 33.778320 (excluding connections establishing)
>
> Environment:
>   OS     : Linux kernel 2.6.9
>   CPU    : Pentium 4 3GHz
>   disk   : ATA 5400rpm (Data and WAL are placed on same partition.)
>   memory : 1GB
>   config : shared_buffers=10000, wal_buffers=256,
>            XLOG_SEG_SIZE=256MB, checkpoint_segment=4
>
> ---
> ITAGAKI Takahiro <itagaki.takahiro@lab.ntt.co.jp>
> NTT Cyber Space Laboratories
> Nippon Telegraph and Telephone Corporation.

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
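
For readers following the writev() item in the quoted TODO list, here is a minimal sketch of the idea: a run of pages that wraps past the end of a ring buffer is handed to the kernel in one writev() call using two iovec entries. This is not code from the patch. The names flush_pages, PAGE_SZ and RING_PAGES are hypothetical and chosen only for illustration, and the sketch ignores O_DIRECT alignment requirements and partial writes.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define PAGE_SZ    8192   /* assumed page size for this sketch */
#define RING_PAGES 8      /* assumed number of pages in the ring buffer */

/*
 * Write npages pages, starting at ring slot `first`, with one writev().
 * If the run wraps past the end of the ring, the two noncontiguous
 * pieces are passed as two iovec entries instead of two write() calls.
 * Returns the byte count reported by writev(), or -1 on error.
 * A real caller would also have to handle short writes.
 */
static ssize_t
flush_pages(int fd, char *ring, size_t first, size_t npages)
{
    struct iovec iov[2];
    int          iovcnt = 1;
    size_t       tail;

    assert(first < RING_PAGES);
    assert(npages > 0 && npages <= RING_PAGES);

    tail = RING_PAGES - first;      /* pages available before the wrap */
    iov[0].iov_base = ring + first * PAGE_SZ;
    if (npages <= tail) {
        /* Contiguous run: a single segment is enough. */
        iov[0].iov_len = npages * PAGE_SZ;
    } else {
        /* Wrapped run: end of the ring first, then its beginning. */
        iov[0].iov_len  = tail * PAGE_SZ;
        iov[1].iov_base = ring;
        iov[1].iov_len  = (npages - tail) * PAGE_SZ;
        iovcnt = 2;
    }
    return writev(fd, iov, iovcnt);
}
```

Calling flush_pages(fd, ring, 6, 4) on an 8-page ring would submit pages 6, 7, 0 and 1 in that order with a single system call. Whether this actually helps under open_sync or O_DIRECT would have to be measured.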