Re: fstat vs. lseek - Mailing list pgsql-hackers

From Andres Freund
Subject Re: fstat vs. lseek
Msg-id 2366521.2k2cV9r50e@alap2
In response to Re: fstat vs. lseek  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: fstat vs. lseek
List pgsql-hackers
On Monday, August 08, 2011 11:33:29 Robert Haas wrote:

> On Mon, Aug 8, 2011 at 10:49 AM, Andres Freund <andres@anarazel.de> wrote:
> > I don't think it's a good idea to replace lseek with fstat in the long
> > run. The likelihood that the lockless generic_file_llseek will get
> > included seems rather high to me. In contrast to that, fstat will always
> > be more expensive, as it goes through a security check and
> > then the fs' getattr implementation (which actually takes a lock on
> > some fs).
> *scratches head*  I understand that stat() would need a security
> check, but why would fstat()?
I am not totally sure about that either. I guess Kaigai might know more 
about it.
My guess is that a forked process may no longer be allowed to access the 
information through an inherited file descriptor. I also think a process 
can change its permissions at runtime.
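
For readers following along, here is a minimal sketch of the two ways of 
asking the kernel for a file's size that this thread compares. The function 
names size_via_lseek and size_via_fstat are made up for illustration and are 
not PostgreSQL code; only the lseek() and fstat() calls themselves are 
standard POSIX.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask for the file size by seeking to the end.
 * Side effect: the file offset of fd is moved to end-of-file.
 * Returns the size in bytes, or (off_t) -1 on error.
 */
off_t
size_via_lseek(int fd)
{
    return lseek(fd, 0, SEEK_END);
}

/*
 * Hypothetical helper: ask for the file size via the inode attributes.
 * Does not touch the file offset, but the kernel has to fill in the
 * whole struct stat, not just the size.
 * Returns the size in bytes, or (off_t) -1 on error.
 */
off_t
size_via_fstat(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return (off_t) -1;
    return st.st_size;
}
```

Both return the same number for a regular file; the difference discussed 
here is only in what the kernel does to produce it.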

> I think both of you raise good points.  I wasn't too enthusiastic
> about this approach either.  It's not very appealing to adopt an
> approach where the right performance decision is going to depend on
> operating system, file system, kernel version, core count, and
> workload.  We could add a GUC, but it would be pretty annoying to have
> a setting that won't matter for most people at all, except
> occasionally when it makes a huge difference.
> 
> I wasn't aware that there was any current activity around this on the Linux
> side.  But Andres' comments made me Google it again, and now I see
> this:
> 
> https://lkml.org/lkml/2011/6/16/800
> 
> Andres, any idea what the status of that patch is?  I'm not clear on
> how Linux works in terms of things getting upstreamed.
There doesn't seem to have been any activity to include it in 3.1. The merge 
window for 3.1 just ended. The next one will open for about a week after the 
release.
It's also not yet included in linux-next, which is a "preview" for the 
currently worked-on release + 1. A release takes roughly 3 months.

For upstreaming, somebody needs to be persistent enough to convince one of the 
maintainers of the particular area to include the code so that Linus can then 
pull it.
I guess citing your numbers would go a long way in that direction. Naturally 
it would be even better to include results with the patch applied.
The largest machine I can reboot often enough to test such a thing has only 
two sockets (4-core E5520s). I guess you cannot easily reboot your loaned 
machine with a new kernel?

Greetings, 
Andres

