Thread: Need to tune for Heavy Write
Dear all,

For the last few days I have been researching PostgreSQL performance tuning because of the slow speed of my server. My application selects about 100,000 rows from a MySQL database, processes them, and inserts them into 2 Postgres tables using about 45 connections.

I set my PostgreSQL parameters in postgresql.conf as below (OS: Ubuntu, RAM: 16 GB, Postgres: 8.4.2):

max_connections = 80
shared_buffers = 2048MB
work_mem = 32MB
maintenance_work_mem = 512MB
fsync = off
full_page_writes = off
synchronous_commit = off
checkpoint_segments = 32
checkpoint_completion_target = 0.7
effective_cache_size = 4096MB

After this I moved my pg_xlog directory to a separate directory outside the data directory by symlinking. The application issues INSERT statements through PostgreSQL connections only.

Please let me know if I am missing any other important configuration.

Thanks
Hi Adarsh,
Have you set checkpoint_segments and checkpoint_completion_target the right way?
Tuning these parameters is a MUST if you want good write performance.
See this link for more information: http://www.postgresql.org/docs/current/static/runtime-config-wal.html
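For example, a purely illustrative starting point for a write-heavy load (the right values depend on your disks and workload, so treat these as assumptions to benchmark, not recommendations):

checkpoint_segments = 64             # the default of 3 is far too low for heavy writes
checkpoint_completion_target = 0.9   # spread checkpoint I/O over more of the interval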
Cheers,
Dusan
On Thu, Aug 4, 2011 at 6:56 AM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:
> After this I change my pg_xlog directory to a separate directory other than
> data directory by symlinking.
> (...)
> Please let me know if I missing any other important configuration.

Moving the pg_xlog to a different directory only helps when that directory is on a different hard disk (or whatever I/O device).

HTH,
WBL
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
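(For illustration only, a sketch of the relocation Adarsh describes, assuming a hypothetical mount point /mnt/wal_disk on a separate device and a default Ubuntu 8.4 data directory; the server must be stopped first.)

# with PostgreSQL stopped, move the WAL directory to the dedicated device
mv /var/lib/postgresql/8.4/main/pg_xlog /mnt/wal_disk/pg_xlog
# symlink it back so the server finds it in the usual place
ln -s /mnt/wal_disk/pg_xlog /var/lib/postgresql/8.4/main/pg_xlog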
To put it simply, you need to set checkpoint_segments way higher than your current value!
Link: http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
On Aug 4, 2011 6:57 AM, "Adarsh Sharma" <adarsh.sharma@orkash.com> wrote:
On Thu, Aug 4, 2011 at 2:34 AM, Willy-Bas Loos <willybas@gmail.com> wrote:
> On Thu, Aug 4, 2011 at 6:56 AM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:
>> After this I change my pg_xlog directory to a separate directory other than
>> data directory by symlinking.
>> (...)
>> Please let me know if I missing any other important configuration.
>
> Moving the pg_xlog to a different directory only helps when that
> directory is on a different harddisk (or whatever I/O device).

Not entirely true. Simply being on a different mounted file system moves the fsync calls on the pg_xlog directories off of the same file system as the main data store. Previous testing has shown performance improvements from just using a different file system.

That said, the only real solution to a heavy write load is a heavy-duty I/O subsystem, with lots of drives and battery-backed cache.
Scott is right; his answer solves the problem in the long run. Even if your write load increases, it will perform fast enough.
For now, try increasing checkpoint_segments, restart Postgres for the new setting to take effect, and try again with your write load.
If you are still not satisfied with the write speed, then it is time to upgrade your storage system, i.e. increase I/O performance.
On Aug 4, 2011 10:46 AM, "Scott Marlowe" <scott.marlowe@gmail.com> wrote:
> On Thu, Aug 4, 2011 at 2:34 AM, Willy-Bas Loos <willybas@gmail.com> wrote:
>> On Thu, Aug 4, 2011 at 6:56 AM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:
>>> After this I change my pg_xlog directory to a separate directory other than
>>> data directory by symlinking.
>>>(...)
>>> Please let me know if I missing any other important configuration.
>>
>> Moving the pg_xlog to a different directory only helps when that
>> directory is on a different harddisk (or whatever I/O device).
>
> Not entirely true. By simply being on a different mounted file
> system this moves the fsync calls on the pg_xlog directories off of
> the same file system as the main data store. Previous testing has
> shown improvements in performance from just using a different file
> system.
>
> That said, the only real solution to a heavy write load is a heavy
> duty IO subsystem, with lots of drives and battery backed cache.
>
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Scott Marlowe
> Sent: Thursday, August 04, 2011 4:46 AM
> To: Willy-Bas Loos
> Cc: Adarsh Sharma; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Need to tune for Heavy Write
>
>> Moving the pg_xlog to a different directory only helps when that
>> directory is on a different harddisk (or whatever I/O device).
>
> Not entirely true. By simply being on a different mounted file
> system this moves the fsync calls on the pg_xlog directories off of
> the same file system as the main data store. Previous testing has
> shown improvements in performance from just using a different file
> system.

Is this still the case for xfs or ext4, where fsync properly flushes only the correct blocks to disk, or was this referring to the good old ext3 flush-everything-on-fsync issue?

Brad.
On 4/08/2011 12:56 PM, Adarsh Sharma wrote:
> Dear all,
>
> From the last few days, I researched a lot on Postgresql Performance
> Tuning due to slow speed of my server.
> My application selects data from mysql database about 100000 rows ,
> process it & insert into postgres 2 tables by making about 45 connections.

Why 45? Depending on your disk subsystem, that may be way too many for optimum throughput. Or too few, for that matter.

Also, how are you doing your inserts? Are they being done in a single big transaction per connection, or at least in reasonable chunks? If you're doing stand-alone INSERTs autocommit-style you'll see pretty shoddy performance.

Have you looked into using COPY to bulk load your data? Possibly using the libpq or JDBC copy APIs, or possibly using server-side COPY?

> fsync=off full_page_writes=off synchronous_commit=off

!!!! I hope you don't want to KEEP that data if you have a hardware fault or power loss. Setting fsync=off is pretty much saying "I don't mind if you eat my data". Keep. Really. Really. Good. Backups.

--
Craig Ringer
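(To illustrate the COPY options Craig mentions, a minimal sketch; the table, columns, and file names are invented for the example and use PostgreSQL 8.4 syntax.)

-- server-side COPY: the file must be readable by the postgres server process
COPY my_table (id, payload) FROM '/srv/load/rows.csv' WITH CSV;

-- or stream from the client with psql's \copy, which works over any connection:
-- \copy my_table (id, payload) FROM 'rows.csv' WITH CSV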
Adarsh Sharma <adarsh.sharma@orkash.com> wrote:
> Postgres : 8.4.2

You should definitely update to a more recent bug patch level:
http://www.postgresql.org/support/versioning

> RAM : 16 GB
> effective_cache_size = 4096MB

That should probably be more like 12GB to 15GB. It probably won't affect the load time here, but could affect other queries.

> My application selects data from mysql database about 100000
> rows process it & insert into postgres 2 tables by making about 45
> connections.

How many cores do you have? How many disk spindles, in what sort of array, with what sort of controller? Quite possibly you can improve performance dramatically by not turning loose a "thundering herd" of competing processes.

Can you load the target table without indexes and then build the indexes? Can you use the COPY command (or at least prepared statements) for the inserts to minimize parse/plan time?

An important setting you're missing is:

wal_buffers = 16MB

-Kevin
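(A hedged sketch of the "load without indexes" idea Kevin raises; the table, index, and file names are assumptions made up for the example.)

-- drop (or simply defer creating) secondary indexes before the bulk load
DROP INDEX IF EXISTS my_table_payload_idx;

-- bulk load with COPY to avoid per-row parse/plan overhead
COPY my_table (id, payload) FROM '/srv/load/rows.csv' WITH CSV;

-- rebuild the index once at the end; a single sort is far cheaper than
-- maintaining the index incrementally for every inserted row
CREATE INDEX my_table_payload_idx ON my_table (payload);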
On Thu, Aug 4, 2011 at 7:57 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> RAM : 16 GB
>
>> effective_cache_size = 4096MB
>
> That should probably be more like 12GB to 15GB. It probably won't
> affect the load time here, but could affect other queries.

Actually, on a heavily written database a large effective cache size makes things slower.
On Thu, Aug 4, 2011 at 6:41 AM, Nicholson, Brad (Toronto, ON, CA) <bnicholson@hp.com> wrote:
>> Not entirely true. By simply being on a different mounted file
>> system this moves the fsync calls on the pg_xlog directories off of
>> the same file system as the main data store. Previous testing has
>> shown improvements in performance from just using a different file
>> system.
>
> Is this still the case for xfs or ext4, where fsync properly flushes
> only the correct blocks to disk, or was this referring to the good old
> ext3 flush-everything-on-fsync issue?

Good question. One I do not know the answer to. Since I run my dbs with separate pg_xlog drive sets I've never been able to test that.
On Wed, Aug 3, 2011 at 9:56 PM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:
> Dear all,
>
> From the last few days, I researched a lot on Postgresql Performance
> Tuning due to slow speed of my server.
> My application selects data from mysql database about 100000 rows ,
> process it & insert into postgres 2 tables by making about 45 connections.

It's already been mentioned, but is worth reinforcing, that if you are inserting 100,000 rows in 100,000 transactions, you'll see a huge performance improvement by doing many more inserts per transaction. Try doing at least 500 inserts in each transaction (though you can possibly go quite a bit higher than that without any issues, depending upon what other traffic the database is handling in parallel).

You almost certainly don't need 45 connections in order to insert only 100,000 rows. I've got a crappy VM with 2GB of RAM in which inserting 100,000 relatively narrow rows requires less than 10 seconds if I do it in a single transaction on a single connection. Probably much less than 10 seconds, but the code I just tested with does other work while doing the inserts, so I don't have a pure test at hand.
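(A minimal sketch of the batching idea, assuming a hypothetical table; the values are placeholders.)

BEGIN;
INSERT INTO my_table VALUES (1, 'row 1');
INSERT INTO my_table VALUES (2, 'row 2');
-- ... continue for roughly 500 rows per transaction ...
COMMIT;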
On 05/08/11 05:40, Samuel Gendler wrote:
Also worth mentioning: doing those 500 inserts in *fewer* than 500 INSERT operations is likely to be a huge improvement, e.g.:
INSERT INTO table VALUES (....),(....);
instead of
INSERT INTO table VALUES (....);
INSERT INTO table VALUES (....);
I'd be tempted to do all 500 row insertions in one INSERT statement as above. You might find that 1 connection doing this is fast enough (it would then take only 200 actual INSERT calls to put in 100,000 rows).
regards
Mark
On Aug 4, 2011, at 10:07 AM, Scott Marlowe wrote:
> On Thu, Aug 4, 2011 at 7:57 AM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>>> RAM : 16 GB
>>
>>> effective_cache_size = 4096MB
>>
>> That should probably be more like 12GB to 15GB. It probably won't
>> affect the load time here, but could affect other queries.
>
> Actually on a heavily written database a large effective cache size
> makes things slower.

effective_cache_size or shared_buffers? I can see why a large shared_buffers could cause problems, but what effect does effective_cache_size have on a write workload?

--
Jim C. Nasby, Database Architect   jim@nasby.net
512.569.9461 (cell)                http://jim.nasby.net