Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance - Mailing list pgsql-hackers

From Greg Stark
Subject Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date
Msg-id CAM-w4HMk9GwHurUpuBm+djJoTSjuKe23Ab9Od=rrujgpPvXdgw@mail.gmail.com
In response to Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (James Bottomley <James.Bottomley@HansenPartnership.com>)
Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Dave Chinner <david@fromorbit.com>)
List pgsql-hackers
On Mon, Jan 13, 2014 at 9:12 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> For one, postgres doesn't use mmap for files (and can't without major
> new interfaces). Frequently mmap()/madvise()/munmap()ing 8kb chunks has
> horrible consequences for performance/scalability - very quickly you
> contend on locks in the kernel.


I may as well dump this in this thread. We've discussed this in person
a few times, including at least once with Ted Ts'o when he visited
Dublin last year.

The fundamental conflict is that the kernel better understands the
hardware and the other software sharing the same resources, while
Postgres better understands its own access patterns. We need to either
add interfaces so Postgres can teach the kernel what it needs to know
about those access patterns, or add interfaces so Postgres can find out
what it needs to know about the hardware context.

The more ambitious and interesting direction is to let Postgres tell
the kernel what it needs to know to manage everything. To do that we
would need the ability to control when pages are flushed out. This is
absolutely necessary to maintain consistency: a dirty data page must
not reach disk before the journal records describing its changes.
Postgres would need to be able to mark pages as unflushable until some
point in the future when the journal is flushed. We discussed various
ways that interface could work, but it would be tricky to keep its
overhead low enough to be workable.
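
To make the ordering constraint concrete, here is a toy model of the
rule such an interface would have to enforce. It is only a sketch: no
such kernel interface exists, and every name in it (ToyPageCache,
mark_dirty, journal_flushed, writeback) is hypothetical. It assumes
each dirty page carries the journal position (LSN) of the last record
that touched it, and that a page becomes flushable only once the
journal is durable up to that position.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyPageCache:
    """Hypothetical model, not a real kernel API."""

    # Highest journal position known to be durable on disk.
    journal_flushed_lsn: int = 0
    # page number -> LSN of the last journal record that modified it
    dirty: Dict[int, int] = field(default_factory=dict)

    def mark_dirty(self, page: int, lsn: int) -> None:
        # The page stays unflushable until the journal reaches `lsn`.
        self.dirty[page] = max(lsn, self.dirty.get(page, 0))

    def journal_flushed(self, lsn: int) -> None:
        # Called after the journal has been made durable up to `lsn`.
        self.journal_flushed_lsn = max(self.journal_flushed_lsn, lsn)

    def writeback(self) -> List[int]:
        # Write out only pages whose journal records are already durable.
        ready = sorted(
            page
            for page, lsn in self.dirty.items()
            if lsn <= self.journal_flushed_lsn
        )
        for page in ready:
            del self.dirty[page]
        return ready
```

The hard part discussed above is not this rule but doing the
bookkeeping cheaply enough per page.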

The less exciting, more conservative option would be to add kernel
interfaces to teach Postgres about things like RAID geometries. Then
Postgres could use direct I/O and decide how much prefetching to do
based on the RAID geometry, how much I/O bandwidth and how many IOPS
are available, etc.
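
As a rough illustration of the kind of decision this would enable,
here is a sketch of sizing a sequential prefetch window from RAID
geometry. The function name and parameters are made up for the
example, and it assumes the chunk size and number of data disks have
already been obtained somehow. The 8 kB default matches the Postgres
block size mentioned in the quoted text.

```python
import math


def prefetch_blocks(chunk_bytes: int, data_disks: int,
                    block_bytes: int = 8192,
                    per_disk_queue_depth: int = 1) -> int:
    """Blocks to keep in flight so every data disk has work queued.

    Hypothetical helper: one full stripe (chunk size times data disks)
    per unit of queue depth, expressed in database blocks.
    """
    if min(chunk_bytes, data_disks, block_bytes, per_disk_queue_depth) <= 0:
        raise ValueError("all arguments must be positive")
    stripe_bytes = chunk_bytes * data_disks
    return math.ceil(stripe_bytes * per_disk_queue_depth / block_bytes)


# 64 kB chunks across 4 data disks is a 256 kB stripe, i.e. 32 blocks.
window = prefetch_blocks(chunk_bytes=64 * 1024, data_disks=4)
```

A real policy would also have to weigh available bandwidth and IOPS,
which this sketch ignores.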

Reimplementing inside Postgres the I/O schedulers and all the other
work the kernel does for us just seems outside our competency, and
it's not something any of us is really excited about doing.

-- 
greg


