Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date
Msg-id 52D54C3D.5020700@2ndQuadrant.com
In response to Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Claudio Freire <klaussfreire@gmail.com>)
Responses Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Claudio Freire <klaussfreire@gmail.com>)
Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (James Bottomley <James.Bottomley@HansenPartnership.com>)
List pgsql-hackers
On 01/14/2014 09:39 AM, Claudio Freire wrote:
> On Tue, Jan 14, 2014 at 5:08 AM, Hannu Krosing <hannu@2ndquadrant.com> wrote:
>> Again, as said above the linux file system is doing fine. What we
>> want is a few ways to interact with it to let it do even better when
>> working with postgresql by telling it some stuff it otherwise would
>> have to second guess and by sometimes giving it back some cache
>> pages which were copied away for potential modifying but ended
>> up clean in the end.
> You don't need new interfaces. Only a slight modification of what
> fadvise DONTNEED does.
>
> This insistence in injecting pages from postgres to kernel is just a
> bad idea. 
Do you think it would be possible to map copy-on-write pages
from the Linux page cache into the PostgreSQL cache?

This would be a step in the direction of solving the double RAM
usage of pages which have not been read from the system cache into
the PostgreSQL cache, without sacrificing Linux read-ahead (which I
assume does not happen when reads bypass the system cache).

And could we write back the copy at the point when it is safe (from
PostgreSQL's perspective) to let the system write it back?

Do you think it is possible to make it work with good performance
for a few million 8 kB pages?
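
As a rough illustration of the copy-on-write idea, here is a minimal
sketch using what exists today: a private (MAP_PRIVATE) file mapping,
which Python exposes as mmap.ACCESS_COPY. It is not how PostgreSQL's
buffer manager works. The function name cow_demo and the single 8 kB
block are assumptions made for the example, and the explicit pwrite()
only stands in for "the point when it is safe".

```python
import mmap
import os
import tempfile

PAGE = 8192  # PostgreSQL's default block size (8 kB)


def cow_demo(path):
    """Map one block copy-on-write, modify it, then write it back by hand.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        # ACCESS_COPY is a private mapping: reads are served from the page
        # cache, and the first store makes a private copy that the kernel
        # never writes back to the file on its own.
        buf = mmap.mmap(fd, PAGE, access=mmap.ACCESS_COPY)
        buf[0:5] = b"dirty"

        # The file itself still holds the old bytes at this point.
        on_disk_before = os.pread(fd, 5, 0)

        # Stand-in for "when it is safe": the application writes the
        # private copy back explicitly.
        os.pwrite(fd, buf[:], 0)
        on_disk_after = os.pread(fd, 5, 0)

        buf.close()
        return on_disk_before, on_disk_after
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * PAGE)
    print(cow_demo(f.name))
    os.unlink(f.name)
```

This only shows the semantics for one block. Whether the same approach
performs well for a few million separately managed 8 kB pages is exactly
the open question; per-process mapping limits such as vm.max_map_count
would be one thing to check.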

> At the very least, it still needs postgres to know too much
> of the filesystem (block layout) to properly work. Ie: pg must be
> required to put entire filesystem-level blocks into the page cache,
> since that's how the page cache works. 
I was thinking more of a simple write() interface with extra
flags/sysctls to tell the kernel that "we already have this on disk".
> At the very worst, it may
> introduce serious security and reliability implications, when
> applications can destroy the consistency of the page cache (even if
> full access rights are checked, there's still the possibility this
> inconsistency might be exploitable).
If you allow a write() which just writes clean pages, I cannot see
what extra security concerns there are beyond what a normal
write() can do.
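
For reference, the existing call that the quoted "fadvise DONTNEED"
suggestion refers to can be sketched as below. This shows only the
current interface, not any modified behaviour and not the write() flag
discussed above, which does not exist. The helper name
drop_cached_range is an assumption made for the example.

```python
import os


def drop_cached_range(path, offset, length):
    """Hint that a byte range of a file is no longer needed in the cache.

    Hypothetical helper: the kernel treats this as advice and may drop
    clean cached pages in the range; it is not a guarantee.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

A caller would pass a relation file and a block-aligned range, for
example drop_cached_range(path, 0, 8192) for the first 8 kB block.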


Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ



