Re: Patch: add timing of buffer I/O requests - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: Patch: add timing of buffer I/O requests
Date	November 28, 2011 11:30:08
Msg-id	466f1deac38f415fd47e51756acf322d.squirrel@sq.gransy.com
In response to	Re: Patch: add timing of buffer I/O requests (Greg Stark <stark@mit.edu>)
Responses	Re: Patch: add timing of buffer I/O requests Re: Patch: add timing of buffer I/O requests
List	pgsql-hackers

On 28 November 2011, 15:40, Greg Stark wrote:
> On Nov 28, 2011 8:55 AM, "Greg Smith" <greg@2ndquadrant.com> wrote:
>>
>> On 11/27/2011 04:39 PM, Ants Aasma wrote:
>>>
>>> On the AMD I saw about 3% performance drop with timing enabled. On the
>>> Intel machine I couldn't measure any statistically significant change.
>>
>>
>> Oh no, it's party pooper time again.  Sorry I have to be the one to do
>> it this round.  The real problem with this whole area is that we know
>> there are systems floating around where the amount of time taken to grab
>> timestamps like this is just terrible.
>
> I believe on most systems on modern linux kernels gettimeofday and its ilk
> will be a vsyscall and nearly as fast as a regular function call.

AFAIK a vsyscall should be faster than a regular syscall. It does not need
to switch to kernel space at all, it "just" reads the data from a shared
page. The problem is that this is Linux-specific - for example FreeBSD
does not have vsyscall at all (it's actually one of the Linux-isms
mentioned here: http://wiki.freebsd.org/AvoidingLinuxisms).

There's also something called VDSO, which (among other things) uses
vsyscall if available, and otherwise falls back to the best implementation
the platform offers. So on platforms that do not provide vsyscall, it'd be
just as slow as a regular syscall :(
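
To get a rough feel for how expensive reading the clock is on a given
machine, something like the following can be run. It is only a sketch:
per_call_cost_ns is a made-up helper name, and the number it reports
includes the Python loop and call overhead, so it is an upper bound rather
than the raw cost of the underlying call. A tight C loop around
gettimeofday() would be the more faithful test.

```python
import time


def per_call_cost_ns(clock, calls=200_000):
    """Average cost of one clock() call in nanoseconds (hypothetical helper).

    The result includes interpreter overhead, so treat it as an upper bound.
    """
    start = time.perf_counter_ns()
    for _ in range(calls):
        clock()
    elapsed = time.perf_counter_ns() - start
    return elapsed / calls


if __name__ == "__main__":
    # Compare two clock sources that the standard library exposes.
    for name, clock in [("time.time", time.time),
                        ("time.monotonic", time.monotonic)]:
        print(f"{name}: {per_call_cost_ns(clock):.0f} ns per call")
```

Running the same script on different platforms or clock sources should make
large differences visible, even though the absolute numbers are inflated.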

I wouldn't expect a patch that works fine on Linux but not on other
platforms to be accepted, unless there's a compile-time configure switch
(--with-timings) that makes it possible to disable the timing.

Another option would be to reimplement the vsyscall, even on platforms
that don't provide it. The principle is actually quite simple - allocate a
shared memory segment, store the current time there and update it whenever
a clock interrupt happens. This is basically what Greg suggested in one of
the previous posts, where "regularly" means "on every interrupt". Greg was
worried about the precision, but this should be just fine I guess. It's
the precision you get on Linux, anyway ...
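
The idea of a cached timestamp can be sketched in user space as follows.
This is an illustration only, not what the patch does: CachedClock and
tick_seconds are invented names, a background thread stands in for the
clock interrupt, and an instance attribute stands in for the shared memory
segment. The precision of now() is limited to the tick interval.

```python
import threading
import time


class CachedClock:
    """Hypothetical helper: a timestamp refreshed by a background ticker."""

    def __init__(self, tick_seconds=0.001):
        self._tick = tick_seconds
        self._now = time.monotonic()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Stand-in for the clock interrupt: refresh the cached value each tick.
        while not self._stop.wait(self._tick):
            self._now = time.monotonic()

    def start(self):
        self._thread.start()
        return self

    def stop(self):
        self._stop.set()
        self._thread.join()

    def now(self):
        # Readers only load the cached value; no clock call on this path.
        return self._now
```

A caller would do clock = CachedClock().start(), take clock.now() before and
after the I/O request, and call clock.stop() on shutdown. Anything shorter
than one tick shows up as zero elapsed time, which is the precision concern
mentioned above.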

>> I recall a patch similar to this one was submitted by Greg Stark some
>> time ago.  It used the info for different reasons--to try and figure out
>> whether reads were cached or not--but I believe it withered rather than
>> being implemented mainly because it ran into the same fundamental
>> roadblocks here.  My memory could be wrong here, there were also concerns
>> about what the data would be used for.

The difficulty of distinguishing whether the reads were cached or not is
the price we pay for using the filesystem cache instead of managing our own.
I'm not sure this can be solved just by measuring the latency - with
spinning disks it's quite easy, the differences are rather huge (and it's
not difficult to derive that even from a pgbench log). But with SSDs,
multiple tablespaces on different storage, etc. it gets much harder.
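
For the easy case (spinning disks), the latency-based split could be as
simple as a fixed cutoff. The sketch below assumes exactly that:
split_by_latency and threshold_ms are invented for illustration, and the
cutoff has to be picked per storage device, which is precisely why a single
value stops working with SSDs or mixed storage.

```python
def split_by_latency(latencies_ms, threshold_ms):
    """Split read latencies into (probably cached, probably physical).

    Hypothetical helper: threshold_ms is an assumed per-device cutoff.
    """
    cached = [x for x in latencies_ms if x < threshold_ms]
    physical = [x for x in latencies_ms if x >= threshold_ms]
    return cached, physical
```

With made-up inputs such as split_by_latency([0.02, 0.05, 6.0, 9.5], 1.0),
the first two values land in the cached list and the last two in the
physical list. Real measurements from fast storage would not separate this
cleanly.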

Tomas
