Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance - Mailing list pgsql-hackers

From Dave Chinner
Subject Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date
Msg-id 20140117003120.GD18112@dastard
In response to Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On Thu, Jan 16, 2014 at 03:58:56PM -0800, Jeff Janes wrote:
> On Thu, Jan 16, 2014 at 3:23 PM, Dave Chinner <david@fromorbit.com> wrote:
> 
> > On Wed, Jan 15, 2014 at 06:14:18PM -0600, Jim Nasby wrote:
> > > On 1/15/14, 12:00 AM, Claudio Freire wrote:
> > > >My completely unproven theory is that swapping is overwhelmed by
> > > >near-misses. Ie: a process touches a page, and before it's
> > > >actually swapped in, another process touches it too, blocking on
> > > >the other process' read. But the second process doesn't account
> > > >for that page when evaluating predictive models (ie: read-ahead),
> > > >so the next I/O by process 2 is unexpected to the kernel. Then
> > > >the same with 1. Etc... In essence, swap, by a fluke of its
> > > >implementation, fails utterly to predict the I/O pattern, and
> > > >results in far sub-optimal reads.
> > > >
> > > >Explicit I/O is free from that effect, all read calls are
> > > >accountable, and that makes a difference.
> > > >
> > > >Maybe, if the kernel could be fixed in that respect, you could
> > > >consider mmap'd files as a suitable form of temporary storage.
> > > >But that would depend on the success and availability of such a
> > > >fix/patch.
> > >
> > > Another option is to consider some of the more "radical" ideas in
> > > this thread, but only for temporary data. Our write sequencing and
> > > other needs are far less stringent for this stuff.  -- Jim C.
> >
> > I suspect that a lot of the temporary data issues can be solved by
> > using tmpfs for temporary files....
> >
> 
> Temp files can collectively reach hundreds of gigs.

So unless you have terabytes of RAM you're going to have to write
them back to disk.

But there's something here that I'm not getting - you're talking
about a data set that you want to keep cache resident that is at
least an order of magnitude larger than the cyclic 5-15 minute WAL
dataset that ongoing operations need to manage to avoid IO storms.
Where do these temporary files fit into this picture, how fast do
they grow and why do they need to be so large in comparison to
the ongoing modifications being made to the database?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


