Thread: concurrent IO in postgres?

From
Przemek Wozniak
Date:
When testing the IO performance of ioSAN storage device from FusionIO
(650GB MLC version) one of the things I tried is a set of IO intensive
operations in Postgres: bulk data loads, updates, and queries calling
for random IO. So far I cannot make Postgres take advantage of this
tremendous IO capacity. I can squeeze a factor of a few here and there
when caching cannot be utilized, but this hardware can do a lot more.

Low level testing with fio shows on average x10 speedups over disk for
sequential IO and x500-800 for random IO. With enough threads I can get
IOPS in the 100-200K range and 1-1.5GB/s bandwidth, basically what's
advertised. But not with Postgres.

Is this because the Postgres backend is essentially single threaded and
in general does not perform asynchronous IO, or am I missing something?
I found out that the effective_io_concurrency parameter only takes
effect for bitmap index scans.

Also, is there any work going on to allow concurrent IO in the backend
and adapt Postgres to the capabilities of Flash?

Will appreciate any comments, experiences, etc.

Przemek Wozniak




Re: concurrent IO in postgres?

From
Scott Marlowe
Date:
On Thu, Dec 23, 2010 at 10:37 AM, Przemek Wozniak <wozniak@lanl.gov> wrote:
> When testing the IO performance of ioSAN storage device from FusionIO
> (650GB MLC version) one of the things I tried is a set of IO intensive
> operations in Postgres: bulk data loads, updates, and queries calling
> for random IO. So far I cannot make Postgres take advantage of this

So, were you running a lot of these at once?  Or just single threaded?

I get very good io concurrency with lots of parallel postgresql
connections on a 34 disk SAS setup with a battery backed controller.

Re: concurrent IO in postgres?

From
John W Strange
Date:
Typically my problem is that the large queries are simply CPU bound. Do you
have sar/top output you can share? I'm currently setting up two FusionIO DUO
@640GB in an lvm stripe to do some testing with; I will publish the results
after I'm done.
 

If anyone has some tests/suggestions they would like to see done please let me know.

- John

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: concurrent IO in postgres?

From
"Kevin Grittner"
Date:
John W Strange <john.w.strange@jpmchase.com> wrote:

> Typically my problem is that the large queries are simply CPU
> bound.

Well, if your bottleneck is CPU, then you're obviously not going to
be driving another resource (like disk) to its limit.  First,
though, I want to confirm that your "CPU bound" case isn't in the
"I/O Wait" category of CPU time.  What does `vmstat 1` show while
you're CPU bound?

If it's not I/O Wait time, then you need to try to look at the
queries involved.  If you're not hitting the disk because most of
the active data is cached, that would normally be a good thing.
What kind of throughput are you seeing?  Do you need better?

http://wiki.postgresql.org/wiki/SlowQueryQuestions

-Kevin

Re: concurrent IO in postgres?

From
Przemek Wozniak
Date:
On Thu, 2010-12-23 at 11:24 -0700, Scott Marlowe wrote:
> On Thu, Dec 23, 2010 at 10:37 AM, Przemek Wozniak <wozniak@lanl.gov> wrote:
> > When testing the IO performance of ioSAN storage device from FusionIO
> > (650GB MLC version) one of the things I tried is a set of IO intensive
> > operations in Postgres: bulk data loads, updates, and queries calling
> > for random IO. So far I cannot make Postgres take advantage of this
>
> So, were you running a lot of these at once?  Or just single threaded?

> I get very good io concurrency with lots of parallel postgresql
> connections on a 34 disk SAS setup with a battery backed controller.

In one test I was running between 1 and 32 clients simultaneously
writing lots of data using copy binary. The problem is that with a large
RAM buffer it all goes there, and then the background writer, a single
postgres process, will issue write requests one at a time I suspect.
So the actual IO is effectively serialized by the backend.



Re: concurrent IO in postgres?

From
Andy
Date:
--- On Thu, 12/23/10, John W Strange <john.w.strange@jpmchase.com> wrote:

> Typically my problem is that the
> large queries are simply CPU bound..  do you have a
> sar/top output that you see. I'm currently setting up two
> FusionIO DUO @640GB in a lvm stripe to do some testing with,
> I will publish the results after I'm done.
>
> If anyone has some tests/suggestions they would like to see
> done please let me know.
>
> - John

Somewhat tangential to the current topics, I've heard that FusionIO uses
internal cache and hence is not crash-safe, and if the cache is turned off
performance will take a big hit. Is that your experience?




Re: concurrent IO in postgres?

From
Ben Chobot
Date:
On Dec 23, 2010, at 11:58 AM, Andy wrote:

>
> Somewhat tangential to the current topics, I've heard that FusionIO uses
> internal cache and hence is not crash-safe, and if the cache is turned off
> performance will take a big hit. Is that your experience?

It does use an internal cache, but it also has onboard battery power. The
driver needs to put its house in order when restarting after an unclean
shutdown, however, and that can take up to 30 minutes per card.

Re: concurrent IO in postgres?

From
John Cagle
Date:
On Dec 23, 2010, at 13:22:32, Ben Chobot wrote:
>
> On Dec 23, 2010, at 11:58 AM, Andy wrote:
> >
> > Somewhat tangential to the current topics, I've heard that FusionIO
> > uses internal cache and hence is not crash-safe, and if the cache is
> > turned off performance will take a big hit. Is that your experience?
>
> It does use an internal cache, but it also has onboard battery power. The
> driver needs to put its house in order when restarting after an unclean
> shutdown, however, and that can take up to 30 minutes per card.

Sorry to intrude here, but I'd like to clarify the behavior of the
Fusion-io devices.  Unlike SSDs, we do not use an internal cache nor do
we use batteries.

(We *do* have a small internal FIFO (with capacitive hold-up) that is
100% guaranteed to be written to our persistent storage in the event of
unexpected power failure.)

When a write() to a Fusion-io device has been acknowledged, the data is
guaranteed to be stored safely.  This is a strict requirement for any
enterprise-ready storage device.

Thanks,
John Cagle
Fusion-io, Inc.



Re: concurrent IO in postgres?

From
Josh Berkus
Date:
John,

> When a write() to a Fusion-io device has been acknowledged, the data is
> guaranteed to be stored safely.  This is a strict requirement for any
> enterprise-ready storage device.

Thanks for the clarification!

While you're here, any general advice on configuring fusionIO devices
for database access, or vice-versa?

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: concurrent IO in postgres?

From
"Pierre C"
Date:
I wonder how the OP configured effective_io_concurrency ; even on a single
drive with command queuing the fadvise() calls that result do make a
difference...
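
For readers unfamiliar with the fadvise() mechanism mentioned here: as far
as I understand it, effective_io_concurrency makes a bitmap heap scan hint
upcoming blocks to the kernel with posix_fadvise(POSIX_FADV_WILLNEED) before
reading them. The following is only a rough Python sketch of that idea, not
PostgreSQL's code; prefetch_then_read, make_demo_file and the depth parameter
are made-up names for illustration, and it assumes a Unix system where
os.posix_fadvise is available.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # PostgreSQL's default page size


def prefetch_then_read(path, block_numbers, depth=4):
    """Read blocks in the given order, hinting up to `depth` blocks ahead."""
    fd = os.open(path, os.O_RDONLY)
    try:
        blocks = list(block_numbers)
        out = []
        hinted = 0  # how many entries of `blocks` were already hinted
        for i, blk in enumerate(blocks):
            # Keep the hint window `depth` blocks ahead of the read position,
            # so the kernel can fetch several blocks concurrently.
            while hinted < min(i + 1 + depth, len(blocks)):
                os.posix_fadvise(fd, blocks[hinted] * BLOCK_SIZE, BLOCK_SIZE,
                                 os.POSIX_FADV_WILLNEED)
                hinted += 1
            # The read itself is still synchronous, one block at a time.
            out.append(os.pread(fd, BLOCK_SIZE, blk * BLOCK_SIZE))
        return out
    finally:
        os.close(fd)


def make_demo_file(n_blocks):
    """Create a temp file where block k is filled with the byte value k % 256."""
    f = tempfile.NamedTemporaryFile(delete=False)
    with f:
        for k in range(n_blocks):
            f.write(bytes([k % 256]) * BLOCK_SIZE)
    return f.name
```

The process still issues one read at a time; any concurrency comes from the
kernel and device working on the hinted blocks in the background, which is
why a larger hint depth should matter more on storage with deep queues.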

Re: concurrent IO in postgres?

From
Jeff Janes
Date:
On Thu, Dec 23, 2010 at 11:46 AM, Przemek Wozniak <wozniak@lanl.gov> wrote:

> In one test I was running between 1 and 32 clients simultaneously
> writing lots of data using copy binary.

Are you by-passing WAL?  If not, you are likely serializing on that.
Not so much the writing, but the lock.

> The problem is that with a large
> RAM buffer it all goes there, and then the background writer, a single
> postgres process, will issue write requests one at a time I suspect.

But those "writes" are probably just copies of 8K into kernel's RAM,
and so very fast.

> So the actual IO is effectively serialized by the backend.

If the background writer cannot keep up, then the individual backends
start doing writes as well, so it isn't really serialized..
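
One way to check how much of this is happening is to compare the write
counters in the pg_stat_bgwriter view. A small sketch, assuming the
9.0-era columns buffers_checkpoint, buffers_clean and buffers_backend (newer
releases may expose these elsewhere); write_shares is a hypothetical helper
and running the query is left to whatever client you use.

```python
# Query to run against the server; column names assume a 9.0-era
# pg_stat_bgwriter view.
QUERY = (
    "SELECT buffers_checkpoint, buffers_clean, buffers_backend "
    "FROM pg_stat_bgwriter"
)


def write_shares(buffers_checkpoint, buffers_clean, buffers_backend):
    """Return the fraction of buffer writes done by each kind of writer."""
    total = buffers_checkpoint + buffers_clean + buffers_backend
    if total == 0:
        return {"checkpoint": 0.0, "bgwriter": 0.0, "backend": 0.0}
    return {
        "checkpoint": buffers_checkpoint / total,  # written at checkpoints
        "bgwriter": buffers_clean / total,         # written by the LRU cleaner
        "backend": buffers_backend / total,        # written by backends themselves
    }
```

A high "backend" share would suggest the background writer is not keeping
up and client backends are writing their own dirty buffers.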

Cheers,

Jeff

Re: concurrent IO in postgres?

From
Mladen Gogala
Date:
Jeff Janes wrote:
> If the background writer cannot keep up, then the individual backends
> start doing writes as well, so it isn't really serialized..
>
>
Is there any parameter governing that behavior? Can you tell me where in
the code (version 9.0.2) can I find that? Thanks.

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com


Re: concurrent IO in postgres?

From
Jeff Janes
Date:
On 12/25/10, Mladen Gogala <mladen.gogala@vmsinfo.com> wrote:
> Jeff Janes wrote:
>> If the background writer cannot keep up, then the individual backends
>> start doing writes as well, so it isn't really serialized..
>>
>>
> Is there any parameter governing that behavior?

No, it is automatic.

There are parameters governing how likely it is that bgwriter falls
behind in the first place, though.

http://www.postgresql.org/docs/9.0/static/runtime-config-resource.html

In particular bgwriter_lru_maxpages could be made bigger and/or
bgwriter_delay smaller.
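
To put rough numbers on that: the two settings together cap how many
buffers the background writer's LRU cleaning can write per second. The
sketch below is back-of-the-envelope arithmetic only, assuming the 9.0
defaults (bgwriter_delay = 200ms, bgwriter_lru_maxpages = 100), 8 kB pages,
and ignoring the time each round itself takes; bgwriter_lru_ceiling is a
made-up helper name.

```python
def bgwriter_lru_ceiling(lru_maxpages=100, delay_ms=200, block_size=8192):
    """Upper bound on LRU cleaning: (pages per second, MiB per second)."""
    rounds_per_sec = 1000.0 / delay_ms           # one round per bgwriter_delay
    pages_per_sec = lru_maxpages * rounds_per_sec
    mib_per_sec = pages_per_sec * block_size / (1024 * 1024)
    return pages_per_sec, mib_per_sec


if __name__ == "__main__":
    print(bgwriter_lru_ceiling())          # defaults
    print(bgwriter_lru_ceiling(1000, 50))  # example of more aggressive settings
```

With the defaults this works out to 500 pages/s, a little under 4 MiB/s,
which is tiny next to what the hardware in this thread can absorb.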

But bulk copy binary might use a nondefault allocation strategy, and I
don't know enough about that part of the code to assess the
interaction of that with bgwriter.

> Can you tell me where in
> the code (version 9.0.2) can I find that? Thanks.

bufmgr.c (src/backend/storage/buffer/bufmgr.c), specifically BufferAlloc.

Cheers,

Jeff

Re: concurrent IO in postgres?

From
Greg Smith
Date:
Jeff Janes wrote:
> There are parameters governing how likely it is that bgwriter falls
> behind in the first place, though.
>
> http://www.postgresql.org/docs/9.0/static/runtime-config-resource.html
>
> In particular bgwriter_lru_maxpages could be made bigger and/or
> bgwriter_delay smaller.
>

Also, one of the structures used for caching the list of fsync requests
the background writer is handling, the thing that results in backend
writes when it can't keep up, is proportional to the size of
shared_buffers on the server.  Setting that tunable to a reasonable size
and lowering bgwriter_delay are two things that help most for the
background writer to keep up with overall load rather than having
backends write their own buffers.  And the way checkpoints in PostgreSQL
work, having more backend writes is generally not a performance
improving change, even though it does have the property that it gets
more processes writing at once.

The thread opening post here really didn't discuss if any PostgreSQL
server tuning or OS tuning was done to try and optimize performance.
The usual list at
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server is
normally a help.

At the kernel level, the #1 thing I find necessary to get decent bulk
performance in a lot of situations is proper read-ahead.  On Linux for
example, you must get the OS doing readahead to compensate for the fact
that PostgreSQL is issuing requests in a serial sequence.  It's going to
ask for block #1, then block #2, then block #3, etc.  If the OS doesn't
start picking up on that pattern and reading blocks 4, 5, 6, etc. before
the server asks for them, to keep the disk fully occupied and return the
database data fast from the kernel buffers, you'll never reach the full
potential even of a regular hard drive.  And the default readahead on
Linux is far too low for modern hardware.
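
A quick way to see what a device is currently set to is the sysfs file
/sys/block/<device>/queue/read_ahead_kb; blockdev --getra reports the same
setting in 512-byte sectors. A minimal sketch for reading it, where
read_ahead_kb and sectors_to_kb are hypothetical helper names and the
sysfs_root parameter exists only so the function can be pointed at a test
directory:

```python
import os


def read_ahead_kb(device, sysfs_root="/sys/block"):
    """Return the current readahead of a block device in kB (Linux sysfs)."""
    path = os.path.join(sysfs_root, device, "queue", "read_ahead_kb")
    with open(path) as f:
        return int(f.read().strip())


def sectors_to_kb(sectors, sector_size=512):
    """Convert a `blockdev --getra` value (512-byte sectors) to kB."""
    return sectors * sector_size // 1024
```

For example, a blockdev value of 256 sectors corresponds to 128 kB, and
4096 sectors to 2048 kB. What value is right depends on the storage, so
treat any number as something to benchmark rather than a recommendation.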

> But bulk copy binary might use a nondefault allocation strategy, and I
> don't know enough about that part of the code to assess the
> interaction of that with bgwriter.
>

It's documented pretty well in src/backend/storage/buffer/README,
specifically the "Buffer Ring Replacement Strategy" section.  Sequential
scan reads, VACUUM, COPY IN, and CREATE TABLE AS SELECT are the
operations that get one of the more specialized buffer replacement
strategies.  These all use the same basic approach, which is to re-use a
ring of data rather than running rampant over the whole buffer cache.
The main thing different between them is the size of the ring.  Inside
freelist.c the GetAccessStrategy code lets you see the size you get in
each of these modes.
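
To illustrate why the ring matters, here is a toy simulation. It is not
PostgreSQL's algorithm (the real buffer manager uses a clock sweep, and the
real ring sizes are in GetAccessStrategy); it uses plain LRU, a made-up pool
of 100 pages and a made-up ring of 8 purely to show the effect. The function
names are hypothetical.

```python
from collections import OrderedDict


def survivors_plain_lru(hot_pages, pool_size, scan_length):
    """Count hot pages left after a bulk scan that uses the whole pool."""
    pool = OrderedDict((p, None) for p in hot_pages)  # oldest entry first
    for i in range(scan_length):
        if len(pool) >= pool_size:
            pool.popitem(last=False)  # evict the least recently used page
        pool[("scan", i)] = None
    return sum(1 for p in hot_pages if p in pool)


def survivors_with_ring(hot_pages, pool_size, scan_length, ring_size):
    """Count hot pages left when the scan recycles its own small ring."""
    pool = OrderedDict((p, None) for p in hot_pages)
    ring = []  # pages currently occupying the scan's ring slots
    for i in range(scan_length):
        page = ("scan", i)
        if len(ring) < ring_size:
            # Ring not full yet: take a buffer from the shared pool.
            if len(pool) >= pool_size:
                pool.popitem(last=False)
            ring.append(page)
        else:
            # Ring full: reuse the oldest ring slot instead of the pool.
            slot = i % ring_size
            del pool[ring[slot]]
            ring[slot] = page
        pool[page] = None
    return sum(1 for p in hot_pages if p in pool)
```

With 100 hot pages in a 100-page pool and a 1000-page scan, the plain
version leaves none of the hot pages cached, while the ring version evicts
only as many pages as the ring holds.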

Since PostgreSQL reads and writes through the OS buffer cache in
addition to its own shared_buffers pool, this whole ring buffer thing
doesn't protect the OS cache from being trashed by a big bulk
operation.  Your only real defense there is to make shared_buffers large
enough that it retains a decent chunk of data even in the wake of that.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support        www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books