Re: Patch: add timing of buffer I/O requests - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Patch: add timing of buffer I/O requests
Date
Msg-id 466f1deac38f415fd47e51756acf322d.squirrel@sq.gransy.com
Whole thread Raw
In response to Re: Patch: add timing of buffer I/O requests  (Greg Stark <stark@mit.edu>)
Responses Re: Patch: add timing of buffer I/O requests  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Re: Patch: add timing of buffer I/O requests  (Jim Nasby <jim@nasby.net>)
List pgsql-hackers
On 28 Listopad 2011, 15:40, Greg Stark wrote:
> On Nov 28, 2011 8:55 AM, "Greg Smith" <greg@2ndquadrant.com> wrote:
>>
>> On 11/27/2011 04:39 PM, Ants Aasma wrote:
>>>
>>> On the AMD I saw about 3% performance drop with timing enabled. On the
>>> Intel machine I couldn't measure any statistically significant change.
>>
>>
>> Oh no, it's party pooper time again.  Sorry I have to be the one to do
>> it
> this round.  The real problem with this whole area is that we know there
> are systems floating around where the amount of time taken to grab
> timestamps like this is just terrible.
>
> I believe on most systems on modern linux kernels gettimeofday an its ilk
> will be a vsyscall and nearly as fast as a regular function call.

AFAIK a vsyscall should be faster than a regular syscall. It does not need
to switch to kernel space at all, it "just" reads the data from a shared
page. The problem is that this is Linux-specific - for example FreeBSD
does not have vsyscall at all (it's actually one of the Linux-isms
mentioned here: http://wiki.freebsd.org/AvoidingLinuxisms).

There's also something called VDSO, that (among other things) uses
vsyscall if availabe, or the best implementation available. So there are
platforms that do not provide vsyscall, and in that case it'd be just as
slow as a regular syscall :(

I wouldn't expect a patch that works fine on Linux but not on other
platforms to be accepted, unless there's a compile-time configure switch
(--with-timings) that'd allow to disable that.

Another option would be to reimplement the vsyscall, even on platforms
that don't provide it. The principle is actually quite simple - allocate a
shared memory, store there a current time and update it whenever a clock
interrupt happens. This is basically what Greg suggested in one of the
previous posts, where "regularly" means "on every interrupt". Greg was
worried about the precision, but this should be just fine I guess. It's
the precision you get on Linux, anyway ...

>> I recall a patch similar to this one was submitted by Greg Stark some
> time ago.  It used the info for different reasons--to try and figure out
> whether reads were cached or not--but I believe it withered rather than
> being implemented mainly because it ran into the same fundamental
> roadblocks here.  My memory could be wrong here, there were also concerns
> about what the data would be used for.

The difficulty when distinguishing whether the reads were cached or not is
the price we pay for using filesystem cache instead of managing our own.
Not sure if this can be solved just by measuring the latency - with
spinners it's quite easy, the differences are rather huge (and it's not
difficult to derive that even from pgbench log). But with SSDs, multiple
tablespaces on different storage, etc. it gets much harder.

Tomas



pgsql-hackers by date:

Previous
From: Mikko Tiihonen
Date:
Subject: Add minor version to v3 protocol to allow changes without breaking backwards compatibility
Next
From: Tom Lane
Date:
Subject: Re: Patch: add timing of buffer I/O requests