Re: high io BUT huge amount of free memory - Mailing list pgsql-hackers

From Andres Freund
Subject Re: high io BUT huge amount of free memory
Date
Msg-id 20130502230949.GD5998@awork2.anarazel.de
In response to Re: high io BUT huge amount of free memory  (Shaun Thomas <sthomas@optionshouse.com>)
Responses Re: high io BUT huge amount of free memory  (Greg Stark <stark@mit.edu>)
Re: high io BUT huge amount of free memory  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers
On 2013-05-02 16:13:42 -0500, Shaun Thomas wrote:
> On 05/02/2013 12:04 PM, Josh Berkus wrote:
> >Yeah, this is why I want to go to Linux Plumbers this year.  The
> >Kernel.org engineers are increasingly doing things which makes Linux
> >unsuitable for applications which depend on the filesystem.

Uh. Yea.

> >There is a good, but sad, reason for this: IBM and Oracle and their
> >partners are the largest employers of people hacking on core Linux
> >memory/IO functionality, and both of those companies use DirectIO
> >extensively in their products.
> 
> I never thought of that. Somehow I figured all the Redhat engineers would
> somehow counterbalance that kind of influence.

I think the reason you never thought of that is that it doesn't have
much to do with reality. Calling the Linux direct IO implementation well
maintained and well performing is a rather bad joke. Sorry, I can't find
a friendlier description. And no, that's not my opinion. That's the
opinion of the people maintaining it. Google it if you don't believe me.
Also, IBM and Oracle - which afaik was never really up there - haven't
been at the top of the contributing companies list for a while. Like
several years.

I can only repeat myself: The blame game against the Linux kernel played
here on the lists is neither an accurate description of reality nor
helpful. On the only two recent occasions I can remember where postgres
people reached out to lkml, the reported problems got fixed in a
reasonable amount of time. One was the lseek(2) scalability issue
discovered by Robert which, after some prodding by yours truly, got
solved entirely by Andi Kleen. The other was a major performance
regression in a development (!) kernel that was made visible by pg and
got fixed before the final release was made.
Note well that they *do* regularly test development kernels with various
versions of postgres. We don't do the reverse in any way that is remotely
systematic.

Report the problems you find instead of whining! And when you measure
the performance of a several-year-old kernel, remember how we react when
somebody complains too loudly about performance problems in 8.3. Yes, it
sucks majorly to update your kernel. But quite often it's far easier than
updating the postgres major version. And way easier to roll back.

> But that brings up an interesting question. How hard / feasible would it be
> to add DIO functionality to PG itself?

I don't think there is too much chance of that - but I also don't really
see the point in trying to do it. We should start by improving postgres'
buffer writeout, which isn't that great, especially with big shared
buffers. We would have to invest quite a lot of work in how our
buffering and writeout work to make DIO perform nicely.
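
To give a rough idea of what DIO demands from the application side, here
is a minimal sketch, assuming Linux and glibc. It is not postgres code:
alloc_aligned(), write_block_direct() and the 4096-byte BLOCK_ALIGN are
hypothetical names and values chosen for illustration, and the real
alignment requirement depends on the device and filesystem. The point is
that the buffer address, the length and the file offset all have to be
aligned, and that the kernel no longer buffers anything for you.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* assumption: degrade to buffered IO where unavailable */
#endif

#define BLOCK_ALIGN 4096        /* assumed alignment, not a universal constant */

/* Allocate a zeroed buffer whose address and length are multiples of BLOCK_ALIGN. */
void *alloc_aligned(size_t len)
{
    void *buf = NULL;

    assert(len > 0 && len % BLOCK_ALIGN == 0);
    if (posix_memalign(&buf, BLOCK_ALIGN, len) != 0)
        return NULL;
    memset(buf, 0, len);
    return buf;
}

/*
 * Write one aligned buffer at an aligned offset, bypassing the page cache.
 * Returns 0 on success, -1 on failure (some filesystems reject O_DIRECT).
 */
int write_block_direct(const char *path, const void *buf, size_t len, off_t offset)
{
    int     fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
    ssize_t written;

    if (fd < 0)
        return -1;
    written = pwrite(fd, buf, len, offset);
    close(fd);
    return (written == (ssize_t) len) ? 0 : -1;
}
```

With the page cache out of the picture, everything it otherwise does
(write combining, readahead, deciding when to flush) would have to
happen in the application's own buffering and writeout code.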

> I've already heard chatter (Robert
> Haas?) about converting the shared memory allocation to an anonymous block,
> so could we simultaneously open up a DMA relationship?

We've got that in 9.3, which is absolutely fabulous! But that's not
related to doing DMA, which you cannot (and should not!) do from
userspace.
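
For reference, the 9.3 change means the main shared memory segment comes
from an anonymous shared mmap() instead of a large System V segment. A
minimal sketch of that mechanism follows, assuming a POSIX system with
MAP_ANONYMOUS (or MAP_ANON). create_shared_region() and
roundtrip_through_child() are hypothetical helper names for this example,
not postgres functions. It only shows that a region mapped this way is
shared with processes forked afterwards. It involves no DMA.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON  /* older BSD-style spelling */
#endif

/* Map an anonymous region that children created by fork() will share. */
void *create_shared_region(size_t size)
{
    void *p;

    assert(size > 0);
    p = mmap(NULL, size, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/*
 * Fork a child that stores `value` in the shared region, then return what
 * the parent reads back after the child has exited. Returns -1 on error.
 */
int roundtrip_through_child(int value)
{
    int   *shared = create_shared_region(sizeof(int));
    int    seen;
    pid_t  pid;

    if (shared == NULL)
        return -1;
    *shared = 0;

    pid = fork();
    if (pid < 0)
    {
        munmap(shared, sizeof(int));
        return -1;
    }
    if (pid == 0)
    {
        *shared = value;        /* child writes into the shared mapping */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0)
    {
        munmap(shared, sizeof(int));
        return -1;
    }
    seen = *shared;             /* parent sees the child's write */
    munmap(shared, sizeof(int));
    return seen;
}
```

Calling roundtrip_through_child(42) from the parent should hand back the
value the child wrote, since both processes see the same pages.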


I hate to be so harsh, but this topic has been getting on my nerves for
quite a while now and it's constantly getting worse.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


