Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance - Mailing list pgsql-hackers

From Jeff Layton
Subject Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date 2014-01-16 08:20:05
Msg-id 20140116082005.68e865ac@tlielax.poochiereds.net
In response to Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, 15 Jan 2014 21:37:16 -0500
Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Jan 15, 2014 at 8:41 PM, Jan Kara <jack@suse.cz> wrote:
> > On Wed 15-01-14 10:12:38, Robert Haas wrote:
> >> On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara <jack@suse.cz> wrote:
> >> > Filesystems could in theory provide a facility like atomic writes (at
> >> > least up to a certain size, say in the MB range), but it's not so easy,
> >> > and when there are no strong use cases, fs people are reluctant to make
> >> > their code more complex unnecessarily. OTOH, without widespread atomic
> >> > write support, I understand application developers have a similar
> >> > stance. So it's kind of a chicken-and-egg problem. BTW, ext3/4, for
> >> > example, has quite a bit of the infrastructure in place due to its
> >> > data=journal mode, so if someone on the PostgreSQL side wanted to
> >> > research this, knitting together some experimental ext4 patches should
> >> > be doable.
> >>
> >> Atomic 8kB writes would improve performance for us quite a lot.  Full
> >> page writes to WAL are very expensive.  I don't remember what
> >> percentage of write-ahead log traffic that accounts for, but it's not
> >> small.
> >   OK, and do you need atomic writes on a per-IO basis, or is per-file
> > enough? It basically boils down to: is all or most of the IO to a file
> > going to be atomic, or is it a smaller fraction?
> 
> The write-ahead log wouldn't need it, but data file writes would.  So
> we'd need it a lot, but not for absolutely everything.
> 
> For any given file, we'd either care about writes being atomic, or we wouldn't.
> 

Just getting caught up on this thread. One thing that you're just now
getting to here is that the different types of files in the DB have
different needs.

It might be good to outline each type of file (WAL, data files, tmp
files), what sort of I/O patterns are typically done to them, and what
sort of "special needs" they have (atomicity or whatever). Then we
could treat each file type as a separate problem, which may make some
of these problems easier to solve.

For instance, WAL writes are typically fairly sequential I/O, whereas
access to the data files is almost certainly random. It may make sense to
consider DIO (direct I/O) for some of these use cases, even if it's not
suitable everywhere.

For tempfiles, it may make sense to consider housing those on tmpfs.
They wouldn't go to disk at all that way, but if there is mem pressure
they could get swapped out (maybe this is standard practice already --
I don't know).
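
As a quick sanity check on that setup, a small program like this (again
just a sketch; the directory path is made up) can tell you whether a given
temp directory is actually backed by tmpfs:

#include <sys/vfs.h>            /* statfs() */
#include <linux/magic.h>        /* TMPFS_MAGIC */
#include <stdio.h>

int main(void)
{
    struct statfs sb;

    /* hypothetical temp directory; substitute the real one */
    if (statfs("/var/lib/pgsql/tmp", &sb) != 0) {
        perror("statfs");
        return 1;
    }

    printf("backed by tmpfs: %s\n",
           sb.f_type == TMPFS_MAGIC ? "yes" : "no");
    return 0;
}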

> > As Dave notes, unless there is HW support (which is coming with the
> > newest solid state drives), ext4/xfs will have to implement this by
> > writing data to a filesystem journal and, after transaction commit,
> > checkpointing it to a final location. That is exactly what you do with
> > your WAL logs, so it's not clear it will be a performance win. But it
> > is easy enough to code for ext4 that I'm willing to try...
> 
> Yeah, hardware support would be great.
> 


-- 
Jeff Layton <jlayton@redhat.com>


