Re: data on devel code perf dip - Mailing list pgsql-hackers

From Tom Lane
Subject Re: data on devel code perf dip
Msg-id 5100.1124671078@sss.pgh.pa.us
In response to Re: data on devel code perf dip  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: data on devel code perf dip  ("Jeffrey W. Baker" <jwbaker@acm.org>)
Re: data on devel code perf dip  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> Mary Edie Meredith <maryedie@osdl.org> writes:
>> I have an example of runs that illustrate a performance 
>> problem that occurred between installing the 7/18 and 8/1 
>> development release codes.

> I dug through the CVS logs to see what had changed, and I'm afraid there
> is just one plausible-looking candidate:

>     * src/backend/access/transam/xlog.c: 
>     Use O_DIRECT if available when using O_SYNC for wal_sync_method.
>     Also, write multiple WAL buffers out in one write() operation.

I've been sniffing around that patch and not really finding any smoking
gun about why it would make things slower when you're not using O_DIRECT.
I noticed that it forces 8K alignment of the WAL buffers on any machine
that has O_DIRECT defined, whether you use O_DIRECT or not --- but it's
pretty hard to see why that would make things slower.  (Indeed, the
older code only guaranteed MAXALIGN alignment of the WAL buffers, and
we *know* that's wrong --- on Intel machines you want cache-line
alignment to make kernel-to-userspace transfers fast.  BTW, does anyone
know if the current definition of BUFFERALIGN == 32 bytes needs to be
increased for newer Intel machines?)
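
To make the alignment point concrete, here is a minimal sketch in C of rounding a buffer address up to a power-of-two boundary. This is not the code from xlog.c; the names align_up and alloc_aligned are hypothetical, and the 8192- and 32-byte figures are simply the 8K and BUFFERALIGN values mentioned above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + (align - 1)) & ~(align - 1);
}

/*
 * Hypothetical helper: over-allocate by `align` bytes, then return a pointer
 * inside that allocation that sits on an `align`-byte boundary.  *raw
 * receives the original malloc() result, which is what must be passed to
 * free() later.
 */
char *
alloc_aligned(size_t size, size_t align, void **raw)
{
    *raw = malloc(size + align);
    if (*raw == NULL)
        return NULL;
    return (char *) align_up((uintptr_t) *raw, (uintptr_t) align);
}
```

Calling alloc_aligned(4 * 8192, 8192, &raw) returns an address that is a multiple of 8192; the caller frees raw, not the aligned pointer. Forcing the larger alignment costs at most align extra bytes per allocation, which is why it is hard to see it hurting performance.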

I've got a bunch of minor gripes about the coding style of the
gather-write part of the patch, but I really can't see that it's
causing any performance issue.  AFAICS it just replaces several
closely spaced write() syscalls with one larger one.  It's not
possible that the kernel you're using is less efficient for larger
transfers than smaller ones, is it?
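
As a rough illustration of what "several closely spaced write() syscalls" versus "one larger one" means, the sketch below writes the same contiguous pages both ways. It is not the patch itself: WAL_PAGE_SZ, write_all, write_pages_separately and write_pages_gathered are hypothetical names, the 8192-byte page size is an assumption for the example, and EINTR handling is left out for brevity.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define WAL_PAGE_SZ 8192        /* assumed page size, for illustration only */

/* Write exactly len bytes, retrying after short writes; returns 0 or -1. */
int
write_all(int fd, const char *buf, size_t len)
{
    while (len > 0)
    {
        ssize_t n = write(fd, buf, len);

        if (n <= 0)
            return -1;          /* error (or no progress): give up */
        buf += n;
        len -= (size_t) n;
    }
    return 0;
}

/* One write() call per page. */
int
write_pages_separately(int fd, const char *pages, int npages)
{
    for (int i = 0; i < npages; i++)
    {
        if (write_all(fd, pages + (size_t) i * WAL_PAGE_SZ, WAL_PAGE_SZ) < 0)
            return -1;
    }
    return 0;
}

/* Pages that are contiguous in memory go out in a single larger write(). */
int
write_pages_gathered(int fd, const char *pages, int npages)
{
    return write_all(fd, pages, (size_t) npages * WAL_PAGE_SZ);
}
```

Both functions leave the same bytes in the file; the second normally needs fewer syscalls. Timing the two against the same file on the affected kernel would be one way to test whether larger transfers really are slower there.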

The whole thing's pretty bizarre.
        regards, tom lane

