Update on high concurrency OLTP application and Postgres 8 tuning

From: Cosimo Streppone
Date: Wed, 20 Sep 2006 11:09:23 +0200
Hi all,

I was searching for tips to speed up / reduce the load on a Pg 8 app.
Thank you for all your suggestions on the matter.
Thread is archived here:

http://www.mail-archive.com/pgsql-performance@postgresql.org/msg18342.html

After intensive application profiling and database workload analysis,
I managed to reduce CPU load with application-level changes.

For database overload in the presence of many concurrent
transactions, I found that simply running an "ANALYZE" on the
relevant relations makes the situation better.

I scheduled a cron job every hour or so that runs an analyze on the
4-5 most heavily used relations and sleeps 30 seconds between each
analyze.
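
For illustration, the hourly job boils down to something like this
(the relation names are made up; the 30-second pause is done with a
plain "sleep 30" in the wrapper shell script):

   ANALYZE orders;        -- sleep 30 in the wrapper, then...
   ANALYZE order_items;   -- sleep 30
   ANALYZE sessions;      -- sleep 30
   ANALYZE audit_log;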

This has improved db response times when many clients run
concurrently. I wanted to report this, as it may be helpful for
others out there... :-)

--
Cosimo


Re: Update on high concurrency OLTP application and Postgres 8 tuning

From: Andrew Sullivan
On Wed, Sep 20, 2006 at 11:09:23AM +0200, Cosimo Streppone wrote:
>
> I scheduled a cron job every hour or so that runs an analyze on the
> 4-5 most heavily used relations and sleeps 30 seconds between each
> analyze.
>
> This has improved db response times when many clients run
> concurrently. I wanted to report this, as it may be helpful for
> others out there... :-)

This suggests to me that your statistics need a lot of updating.  You
_might_ find that setting the statistics to a higher number on some
columns of some of your tables will allow you to analyse less
frequently.  That's a good thing just because ANALYSE will impose an
I/O load.
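
For example, with placeholder table and column names, and 200 as an
arbitrary value above the default:

   ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 200;
   ANALYZE orders;

You'd pick the columns that actually drive your plans (join keys,
heavily filtered columns).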

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
        --Alexander Hamilton

Re: Update on high concurrency OLTP application and Postgres 8 tuning

From: Cosimo Streppone
Andrew wrote:

> On Wed, Sep 20, 2006 at 11:09:23AM +0200, Cosimo Streppone wrote:
>> I scheduled a cron job every hour or so that runs an analyze on the
>> 4-5 most heavily used relations and sleeps 30 seconds between each
>> analyze.
>
> This suggests to me that your statistics need a lot of updating.

Agreed.

> You _might_ find that setting the statistics to a higher number on some
> columns of some of your tables will allow you to analyse less
> frequently.

At the moment, my rule of thumb is to check out the ANALYZE VERBOSE
messages to see if all table pages are being scanned.

   INFO: "mytable": scanned xxx of yyy pages, containing ...

If xxx = yyy, then I keep statistics at the current level.
When xxx is way less than yyy, I increase the numbers a bit
and retry.

It's probably primitive, but it seems to work well.
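
For reference, the total page count (the "yyy" above) can also be
checked without running an analyze, with something like:

   SELECT relname, relpages
     FROM pg_class
    WHERE relkind = 'r'
    ORDER BY relpages DESC
    LIMIT 10;

relpages is only refreshed by VACUUM/ANALYZE, so it's an estimate,
but it's enough to see which tables are the big ones.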

 > [...] ANALYSE will impose an I/O load.

In my case, analyze execution doesn't impact performance
in any noticeable way. YMMV of course.

--
Cosimo


Re: Update on high concurrency OLTP application and Postgres 8 tuning

From: Cosimo Streppone
Christian Storm wrote:

>>At the moment, my rule of thumb is to check out the ANALYZE VERBOSE
>>messages to see if all table pages are being scanned.
>>
>>   INFO: "mytable": scanned xxx of yyy pages, containing ...
>>
>>If xxx = yyy, then I keep statistics at the current level.
>>When xxx is way less than yyy, I increase the numbers a bit
>>and retry.
>>
>>It's probably primitive, but it seems to work well.
 >
> What heuristic do you use to up the statistics for such a table?

No heuristics, just try and see.
For tables of ~ 10k pages, I set statistics to 100-200.
For ~ 100k pages, I set them to 500 or more.
I don't know the exact relationship.

> Once you've changed it, what metric do you use to
 > see if it helps or was effective?

I rerun an analyze and look at the results... :-)
If you mean measuring whether it was effective, I can only tell
under heavy load, by checking whether particular db queries still
run in the order of a few milliseconds.

If I see normal queries taking longer and longer, or even starting
to appear in the server's log (> 500 ms), then I know an analyze
is needed, or the statistics targets should be set higher.
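
For anyone who wants the same kind of slow-query logging, it is
something along these lines in postgresql.conf:

   # log every statement that runs for more than 500 ms
   log_min_duration_statement = 500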

--
Cosimo



Re: Update on high concurrency OLTP application and Postgres 8 tuning

From: Jim Nasby
Have you ever done any testing to see if just setting
default_statistics_target to 500 has a negative impact on the system?
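
Something along these lines, for instance (the SET is per-session,
but the statistics that ANALYZE writes are visible to everyone; 500
is just the value under discussion):

   SET default_statistics_target = 500;
   ANALYZE;   -- re-gather statistics for all tables in the database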

On Sep 22, 2006, at 4:48 PM, Cosimo Streppone wrote:

> Christian Storm wrote:
>
>>> At the moment, my rule of thumb is to check out the ANALYZE VERBOSE
>>> messages to see if all table pages are being scanned.
>>>
>>>   INFO: "mytable": scanned xxx of yyy pages, containing ...
>>>
>>> If xxx = yyy, then I keep statistics at the current level.
>>> When xxx is way less than yyy, I increase the numbers a bit
>>> and retry.
>>>
>>> It's probably primitive, but it seems to work well.
> >
>> What heuristic do you use to up the statistics for such a table?
>
> No heuristics, just try and see.
> For tables of ~ 10k pages, I set statistics to 100-200.
> For ~ 100k pages, I set them to 500 or more.
> I don't know the exact relationship.
>
>> Once you've changed it, what metric do you use to
> > see if it helps or was effective?
>
> I rerun an analyze and look at the results... :-)
> If you mean measuring whether it was effective, I can only tell
> under heavy load, by checking whether particular db queries still
> run in the order of a few milliseconds.
>
> If I see normal queries taking longer and longer, or even starting
> to appear in the server's log (> 500 ms), then I know an analyze
> is needed, or the statistics targets should be set higher.

--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)