Re: Direct I/O - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Direct I/O
Msg-id 20221102002128.yvq62q7eirwqmks6@awork3.anarazel.de
In response to Re: Direct I/O  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On 2022-11-01 15:54:02 -0700, Andres Freund wrote:
> On 2022-11-02 09:44:30 +1300, Thomas Munro wrote:
> > Oh, so BufFile is palloc'd and contains one of these.  BufFile is not
> > even using direct I/O, but by these rules it would need to be
> > palloc_io_align'd.  I will think about what to do about that...
>
> It might be worth having two different versions of the struct, so we don't
> impose unnecessarily high alignment everywhere?

Although it might actually be worth aligning fully everywhere - there's a
noticeable performance difference for buffered read IO.

I benchmarked this on my workstation and laptop.

I mmap'ed a buffer with 2 MiB alignment, MAP_ANONYMOUS | MAP_HUGETLB, and then
measured performance of reading 8192 bytes into the buffer at different
offsets. Each time I copied 16GiB in total.  Within a program invocation I
benchmarked each offset 4 times, threw away the worst measurement, and
averaged the rest. I then took the best of three program invocations.
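
For illustration, a minimal sketch of a measurement along these lines. It is
not the program that produced the numbers below. It assumes Linux, a 2 MiB
huge page size, and that the data is read with pread() from the start of a
small file that is already in the page cache; the function names and the
fallback for systems without huge pages are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define HUGE_ALIGN ((size_t) 2 * 1024 * 1024)   /* assumed huge page size */
#define BLOCK_SIZE ((size_t) 8192)
#define RUNS 4

/* Map len bytes (a multiple of HUGE_ALIGN) at a 2 MiB-aligned address. */
static char *
map_aligned_buffer(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    if (p != MAP_FAILED)
        return p;

    /* Assumption: without huge pages, over-allocate and round up instead. */
    p = mmap(NULL, len + HUGE_ALIGN, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    return (char *) (((uintptr_t) p + HUGE_ALIGN - 1) &
                     ~((uintptr_t) HUGE_ALIGN - 1));
}

/* Read BLOCK_SIZE bytes into dst until total_bytes were copied; GiB/s. */
static double
read_throughput(int fd, char *dst, size_t total_bytes)
{
    struct timespec start, end;
    double secs;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (size_t done = 0; done < total_bytes; done += BLOCK_SIZE)
    {
        if (pread(fd, dst, BLOCK_SIZE, 0) != (ssize_t) BLOCK_SIZE)
            return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    secs = (double) (end.tv_sec - start.tv_sec) +
        (double) (end.tv_nsec - start.tv_nsec) / 1e9;
    if (secs <= 0.0)
        return -1.0;
    return (double) total_bytes / secs / (1024.0 * 1024.0 * 1024.0);
}

/* Drop the worst (lowest) throughput and average the remaining runs. */
static double
average_without_worst(const double *runs, int n)
{
    double sum = 0.0;
    double worst = runs[0];

    assert(n > 1);
    for (int i = 0; i < n; i++)
    {
        sum += runs[i];
        if (runs[i] < worst)
            worst = runs[i];
    }
    return (sum - worst) / (n - 1);
}

/* Measure one destination offset RUNS times; returns GiB/s or -1.0. */
static double
bench_offset(const char *path, char *buf, size_t offset, size_t total_bytes)
{
    double runs[RUNS];
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1.0;
    for (int i = 0; i < RUNS; i++)
    {
        runs[i] = read_throughput(fd, buf + offset, total_bytes);
        if (runs[i] < 0.0)
        {
            close(fd);
            return -1.0;
        }
    }
    close(fd);
    return average_without_worst(runs, RUNS);
}
```

A caller would map 2 * HUGE_ALIGN bytes once, call bench_offset() for each
offset with total_bytes set to 16 GiB, and keep the best result across
three separate program invocations.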

workstation with dual Xeon Gold 5215:

         turbo on       turbo off
offset   GiB/s          GiB/s
0        18.358         13.528
8        15.361         11.472
9        15.330         11.418
32       17.583         13.097
512      17.707         13.229
513      15.890         11.852
4096     18.176         13.568
8192     18.088         13.566
2MiB     18.658         13.496


laptop with i9-9880H:

         turbo on       turbo off
offset   GiB/s          GiB/s
0        33.589         17.160
8        28.045         14.301
9        27.582         14.318
32       31.797         16.711
512      32.215         16.810
513      28.864         14.932
4096     32.503         17.266
8192     32.871         17.277
2MiB     32.657         17.262


Seems pretty clear that using 4096-byte alignment is worth it.
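
As a rough illustration of what that means for an 8192 byte buffer outside
of PostgreSQL's allocator, a sketch using posix_memalign(). The names
IO_ALIGN, IO_BLOCK_SIZE and alloc_io_block() are hypothetical, and this is
not the palloc_io_align mentioned upthread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define IO_ALIGN      4096      /* assumed alignment, per the numbers above */
#define IO_BLOCK_SIZE 8192

/* Allocate one I/O block on an IO_ALIGN boundary; release it with free(). */
static void *
alloc_io_block(void)
{
    void *p = NULL;

    if (posix_memalign(&p, IO_ALIGN, IO_BLOCK_SIZE) != 0)
        return NULL;
    assert(((uintptr_t) p % IO_ALIGN) == 0);
    return p;
}
```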

Greetings,

Andres Freund


