Thread: Database size Vs performance degradation
Morning folks,

Long time listener, first time poster. Having an interesting problem related to performance which I'll try and describe below and hopefully get some enlightenment. First the environment:

Postgres 8.1.8
shared_buffers = 2000
max_fsm_pages = 400000
Redhat Enterprise 4
Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
Also running on the server is a tomcat web server and other ancillaries

Now, the problem. We have an application that continually writes a bunch of data to a few tables which is then deleted by a batch job each night. We're adding around 80,000 rows to one table per day and removing around 75,000 that are deemed to be "unimportant". The problem we see is that after a period of time, database access becomes very 'slow' and the load avg on the machine gets up around 5. When this happens, the application using the DB basically grinds to a halt. Checking the stats, the DB size is around 7.5GB; no tables or indexes look to be 'bloated' (we have been using Postgres since 7.3 with the classic index bloat problem) and the auto-vac has been running solidly.

We had this problem around a month ago and again yesterday. Because the application needs reasonably high availability, we couldn't full vacuum, so what we did was a dump and load to another system. What I found here was that after the load, the DB size was around 2.7GB - a decrease of 5GB. Re-loading this back onto the main system, and the world is good.

One observation I've made on the DB system is that the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads). I've also observed that the OS is using all of the remaining memory for its disk cache (around 3GB), which seems about the best it can do with the available memory.

Now, clearly we need to examine the need for the application to write and remove so much data, but my main question is:

Why does the size of the database, with so much "un-used" space, seem to impact performance so much? If (in this case) the extra 5GB of space is essentially "unallocated", does it factor into any of the caching or performance metrics that the DBMS uses? And if so, would I be better having a higher shared_buffers rather than relying so much on the OS cache?

Yes, I know we need to upgrade to 8.3 but that's going to take some time :)

Many thanks in advance.

Dave

___
Dave North
dnorth@signiant.com
Signiant - Making Media Move
Visit Signiant at: www.signiant.com <http://www.signiant.com/>
Dave North wrote:
> Morning folks,
> Long time listener, first time poster.

Hi Dave

> Postgres 8.1.8
> shared_buffers = 2000
> max_fsm_pages = 400000
> Redhat Enterprise 4
> Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
> Also running on the server is a tomcat web server and other ancillaries

The value of 2000 seems a bit low for shared_buffers perhaps. Oh, and 8.1.13 seems to be the latest bugfix release for 8.1 too.

> Now, the problem. We have an application that continually writes a bunch of data to a few tables which is then deleted by a batch job each night. We're adding around 80,000 rows to one table per day and removing around 75,000 that are deemed to be "unimportant".

[snip]

> We had this problem around a month ago and again yesterday. Because the application needs reasonably high availability, we couldn't full vacuum, so what we did was a dump and load to another system. What I found here was that after the load, the DB size was around 2.7GB - a decrease of 5GB. Re-loading this back onto the main system, and the world is good.

Well, that's pretty much the definition of bloat. Are you sure you're vacuuming enough? I don't have an 8.1 to hand at the moment, but a "vacuum verbose" in 8.2+ gives some details at the end about how many free-space slots need to be tracked. Presumably you're not tracking enough of them, or your vacuuming isn't actually taking place.

Check the size of your database every night. It will rise from 2.7GB, but it should stay roughly static (apart from whatever data you add, of course). If you can keep it so that most of the working set of your database fits in RAM, speed will stay just fine.

> Yes, I know we need to upgrade to 8.3 but that's going to take some time :)

I think you'll like some of the improvements, but it's probably more important to get 8.1.13 installed soon-ish.

--
Richard Huxton
Archonet Ltd
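For anyone wanting to script that nightly size check on 8.1, something along these lines should work (the database name 'mydb' is a placeholder):

  -- total on-disk size of the database
  SELECT pg_size_pretty(pg_database_size('mydb'));

  -- the ten largest tables, the usual bloat suspects
  SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
  FROM pg_class
  WHERE relkind = 'r'
  ORDER BY pg_relation_size(oid) DESC
  LIMIT 10;

Logged nightly (say, from the same batch job that does the deletes), a sudden jump in either number is an early warning that vacuum has fallen behind.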
On Wed, 30 Jul 2008, Dave North wrote:
> Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1

> Checking the stats, the DB size is around 7.5GB;

Doesn't fit in RAM.

> ...after the load, the DB size was around 2.7GB

Does fit in RAM.

> One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads).

That's incredibly slow in this day and age, especially from 10krpm HDDs. Definitely worth investigating. However, I think vacuuming more aggressively is going to be your best win at the moment.

Matthew

--
Patron: "I am looking for a globe of the earth."
Librarian: "We have a table-top model over here."
Patron: "No, that's not good enough. Don't you have a life-size?"
Librarian: (pause) "Yes, but it's in use right now."
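To sanity-check hdparm's numbers, a simple sequential-read test looks something like this (the device path is an assumption - on a DL380 with a Smart Array controller the array often shows up as /dev/cciss/c0d0 rather than /dev/sda - and the file path is a placeholder):

  # timed device and cache reads
  hdparm -tT /dev/cciss/c0d0

  # ~1GB sequential read through the filesystem
  dd if=/path/to/some/large/file of=/dev/null bs=8k count=131072

A healthy pair of 10krpm drives should manage several tens of MB/s sequentially; single-digit results point at the controller, its cache settings, or a degraded array.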
-----Original Message-----
From: Richard Huxton [mailto:dev@archonet.com]
Sent: July 30, 2008 8:28 AM
To: Dave North
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Database size Vs performance degradation

Dave North wrote:
> Morning folks,
> Long time listener, first time poster.

Hi Dave

> Postgres 8.1.8
> shared_buffers = 2000
> max_fsm_pages = 400000
> Redhat Enterprise 4
> Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
> Also running on the server is a tomcat web server and other ancillaries

The value of 2000 seems a bit low for shared_buffers perhaps. Oh, and 8.1.13 seems to be the latest bugfix release for 8.1 too.

DN: Yeah, I was thinking the same. I spent several hours reading info on this list and other places and it's highly inconclusive about having high or low shared buffs Vs letting the OS disk cache handle it.

> Now, the problem. We have an application that continually writes a bunch of data to a few tables which is then deleted by a batch job each night. We're adding around 80,000 rows to one table per day and removing around 75,000 that are deemed to be "unimportant".

[snip]

> We had this problem around a month ago and again yesterday. Because the application needs reasonably high availability, we couldn't full vacuum, so what we did was a dump and load to another system. What I found here was that after the load, the DB size was around 2.7GB - a decrease of 5GB. Re-loading this back onto the main system, and the world is good.

Well, that's pretty much the definition of bloat. Are you sure you're vacuuming enough?

DN: Well, the auto-vac is kicking off pretty darn frequently...around once every 2 minutes. However, you just made me think of the obvious - is it actually doing anything?! The app is pretty darn write intensive so I wonder if it's actually able to vacuum the tables?

I don't have an 8.1 to hand at the moment, but a "vacuum verbose" in 8.2+ gives some details at the end about how many free-space slots need to be tracked. Presumably you're not tracking enough of them, or your vacuuming isn't actually taking place.

DN: I think you've hit it. Now the next obvious problem is how to make the vac actually vac while maintaining a running system?

Check the size of your database every night. It will rise from 2.7GB, but it should stay roughly static (apart from whatever data you add, of course). If you can keep it so that most of the working set of your database fits in RAM, speed will stay just fine.

DN: Yep, I'm just implementing a size tracker now to keep track of it. It grew from 2.5GB to 7GB in around a month, so it's pretty easy to see big jumps I'd say. Does the auto-vac log its results somewhere by any chance, do you know?

Fantastic post, thanks so much.

Dave

> Yes, I know we need to upgrade to 8.3 but that's going to take some time :)

I think you'll like some of the improvements, but it's probably more important to get 8.1.13 installed soon-ish.

--
Richard Huxton
Archonet Ltd
-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Matthew Wakeling
Sent: July 30, 2008 8:37 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Database size Vs performance degradation

On Wed, 30 Jul 2008, Dave North wrote:
> Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
> Checking the stats, the DB size is around 7.5GB;

Doesn't fit in RAM.

> ...after the load, the DB size was around 2.7GB

Does fit in RAM.

> One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads).

That's incredibly slow in this day and age, especially from 10krpm HDDs. Definitely worth investigating.

DN: Yeah, I was thinking the same thing. Unlike the folks here, I'm no performance whiz but it did seem crazy slow. Given the 10K disks, it seems to me there is most likely something on the RAID array itself that is set sub-optimally. Next thing to look at.

However, I think vacuuming more aggressively is going to be your best win at the moment.

DN: As I just replied to the last (very helpful) chap, I think I need to go see what exactly the vac is vac'ing (autovac that is) because although it's running super frequently, the big question is "is it doing anything" :)

Cheers

Dave
I am guessing that you are using DELETE to remove the 75,000 unimportant rows. Change your batch job to CREATE a new table consisting only of the 5,000 important ones. You can use the "CREATE TABLE table_name AS select_statement" command. Then drop the old table. After that you can use ALTER TABLE to change the name of the new table to that of the old one.

I am not an expert, but if this is a viable solution for you then I think doing it this way will rid you of your bloating problem.

Regards,
Val

--- On Wed, 30/7/08, Dave North <DNorth@signiant.com> wrote:

> From: Dave North <DNorth@signiant.com>
> Subject: [PERFORM] Database size Vs performance degradation
> To: pgsql-performance@postgresql.org
> Date: Wednesday, 30 July, 2008, 1:09 PM
> Morning folks,
> Long time listener, first time poster. Having an interesting problem related to performance which I'll try and describe below and hopefully get some enlightenment. First the environment:
>
> Postgres 8.1.8
> shared_buffers = 2000
> max_fsm_pages = 400000
> Redhat Enterprise 4
> Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
> Also running on the server is a tomcat web server and other ancillaries
>
> Now, the problem. We have an application that continually writes a bunch of data to a few tables which is then deleted by a batch job each night. We're adding around 80,000 rows to one table per day and removing around 75,000 that are deemed to be "unimportant". Now, the problem we see is that after a period of time, the database access becomes very 'slow' and the load avg on the machine gets up around 5. When this happens, the application using the DB basically grinds to a halt. Checking the stats, the DB size is around 7.5GB; no tables or indexes look to be 'bloated' (we have been using Postgres since 7.3 with the classic index bloat problem) and the auto-vac has been running solidly.
>
> We had this problem around a month ago and again yesterday. Because the application needs reasonably high availability, we couldn't full vacuum so what we did was a dump and load to another system. What I found here was that after the load, the DB size was around 2.7GB - a decrease of 5GB. Re-loading this back onto the main system, and the world is good.
>
> One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads). I've also observed that the OS cache seems to be using all of the remaining memory for its cache (around 3GB) which seems about the best it can do with the available memory.
>
> Now, clearly we need to examine the need for the application to write and remove so much data but my main question is:
>
> Why does the size of the database with so much "un-used" space seem to impact performance so much? If (in this case) the extra 5GB of space is essentially "unallocated", does it factor into any of the caching or performance metrics that the DBMS uses? And if so, would I be better having a higher shared_buffers rather than relying so much on OS cache?
>
> Yes, I know we need to upgrade to 8.3 but that's going to take some time :)
>
> Many thanks in advance.
>
> Dave
>
> ___
> Dave North
> dnorth@signiant.com
> Signiant - Making Media Move
> Visit Signiant at: www.signiant.com <http://www.signiant.com/>
Thank you for the suggestion..much appreciated. Alas, I don't think this will be possible without a change to the application, but it's a good idea nonetheless.

Where I am now is looking at the autovac tuning parameters. I strongly suspect that the 2 tables that are "frequent changers" are just not getting enough cleaning. It's hard to tell, though, because the AV messages are only logged at debug2 and setting debug2 on this server would be a killer. However, just from running the math using the autovac limit and how it's a percentage of table size, I'm pretty sure that we're not vac'ing enough and that reducing the multiplier down from 0.4 would make a significant difference.

My main question was answered though, I think - the growth is NOT normal, which was the stimulus I needed to investigate further.

Thanks again

Dave

> -----Original Message-----
> From: Valentin Bogdanov [mailto:valiouk@yahoo.co.uk]
> Sent: July 30, 2008 10:58 AM
> To: pgsql-performance@postgresql.org; Dave North
> Subject: Re: [PERFORM] Database size Vs performance degradation
>
> I am guessing that you are using DELETE to remove the 75,000 unimportant rows. Change your batch job to CREATE a new table consisting only of the 5,000 important ones. You can use the "CREATE TABLE table_name AS select_statement" command. Then drop the old table. After that you can use ALTER TABLE to change the name of the new table to that of the old one.
>
> I am not an expert, but if this is a viable solution for you then I think doing it this way will rid you of your bloating problem.
>
> Regards,
> Val
>
> --- On Wed, 30/7/08, Dave North <DNorth@signiant.com> wrote:
>
> > From: Dave North <DNorth@signiant.com>
> > Subject: [PERFORM] Database size Vs performance degradation
> > To: pgsql-performance@postgresql.org
> > Date: Wednesday, 30 July, 2008, 1:09 PM
> > Morning folks,
> > Long time listener, first time poster. Having an interesting problem related to performance which I'll try and describe below and hopefully get some enlightenment. First the environment:
> >
> > Postgres 8.1.8
> > shared_buffers = 2000
> > max_fsm_pages = 400000
> > Redhat Enterprise 4
> > Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
> > Also running on the server is a tomcat web server and other ancillaries
> >
> > Now, the problem. We have an application that continually writes a bunch of data to a few tables which is then deleted by a batch job each night. We're adding around 80,000 rows to one table per day and removing around 75,000 that are deemed to be "unimportant". Now, the problem we see is that after a period of time, the database access becomes very 'slow' and the load avg on the machine gets up around 5. When this happens, the application using the DB basically grinds to a halt. Checking the stats, the DB size is around 7.5GB; no tables or indexes look to be 'bloated' (we have been using Postgres since 7.3 with the classic index bloat problem) and the auto-vac has been running solidly.
> >
> > We had this problem around a month ago and again yesterday. Because the application needs reasonably high availability, we couldn't full vacuum so what we did was a dump and load to another system. What I found here was that after the load, the DB size was around 2.7GB - a decrease of 5GB. Re-loading this back onto the main system, and the world is good.
> >
> > One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads). I've also observed that the OS cache seems to be using all of the remaining memory for its cache (around 3GB) which seems about the best it can do with the available memory.
> >
> > Now, clearly we need to examine the need for the application to write and remove so much data but my main question is:
> >
> > Why does the size of the database with so much "un-used" space seem to impact performance so much? If (in this case) the extra 5GB of space is essentially "unallocated", does it factor into any of the caching or performance metrics that the DBMS uses? And if so, would I be better having a higher shared_buffers rather than relying so much on OS cache?
> >
> > Yes, I know we need to upgrade to 8.3 but that's going to take some time :)
> >
> > Many thanks in advance.
> >
> > Dave
> >
> > ___
> > Dave North
> > dnorth@signiant.com
> > Signiant - Making Media Move
> > Visit Signiant at: www.signiant.com <http://www.signiant.com/>
"Dave North" <DNorth@signiant.com> writes: > From: Richard Huxton [mailto:dev@archonet.com] >> Well, that's pretty much the definition of bloat. Are you sure you're >> vacuuming enough? > DN: Well, the auto-vac is kicking off pretty darn frequently...around > once every 2 minutes. However, you just made me think of the obvious - > is it actually doing anything?! The app is pretty darn write intensive > so I wonder if it's actually able to vacuum the tables? IIRC, the default autovac parameters in 8.1 were pretty darn unaggressive. You should also check for long-running transactions that might be preventing vacuum from removing recently-dead rows. One of the reasons for updating off 8.1 is that finding out what autovac is really doing is hard :-(. I think it does log, but at level DEBUG2 or so, which means that the only way to find out is to accept a huge amount of useless chatter in the postmaster log. Newer releases have a saner logging scheme. regards, tom lane
Dave North wrote:
> Thank you for the suggestion..much appreciated. Alas, I don't think this will be possible without a change to the application but it's a good idea nonetheless.

I assume you mean the "create table as select ..." suggestion (don't forget to include a little quoted material so we'll know what you are replying to :-)

You don't have to change the application. One of the great advantages of Postgres is that even table creation, dropping and renaming are transactional. So you can do the select / drop / rename as a transaction by an external app, and your main application will be none the wiser. In pseudo-SQL:

  begin
  create table new_table as (select * from old_table);
  create index ... on new_table ... (as needed)
  drop table old_table
  alter table new_table rename to old_table
  commit

You should be able to just execute this by hand on a running system, and see if some of your bloat goes away.

Craig
On Wed, 2008-07-30 at 10:02 -0500, Dave North wrote:
> Thank you for the suggestion..much appreciated. Alas, I don't think this will be possible without a change to the application but it's a good idea nonetheless.

Affirmative, Dave. I read you.

If I were in your situation (not having access/desire to change the base application), I'd write a sql script that does something like this:

- Create __new_table__ from old_table     # Read lock on old table
- Rename old_table to __old_table__       # Access Exclusive Lock
- Rename __new_table__ to old_table       # Access Exclusive Lock
- Commit                                  # Now the application can write to the new table
- Sync newly written changes to the new table (these would be written between the creation and the access exclusive lock).
- Drop/Vacuum full/Archive old_table

Well, it would at least let you get the benefits of the rename approach without actually altering the application. Additionally, the application's writes would only be blocked for the duration of the rename itself.

This makes the assumption that these writes aren't strictly necessary immediately (such as via a find or insert construct). If this assumption is false, you would need to lock the table and block the application from writing while you create the temporary table. This has the advantage of not requiring the final sync step.

Sorry if all of this seems redundant, but best of luck!

-Mark
On Wed, 30 Jul 2008, Craig James wrote:
> You don't have to change the application. One of the great advantages of Postgres is that even table creation, dropping and renaming are transactional. So you can do the select / drop / rename as a transaction by an external app, and your main application will be none the wiser. In pseudo-SQL:
>
>   begin
>   create table new_table as (select * from old_table);
>   create index ... on new_table ... (as needed)
>   drop table old_table
>   alter table new_table rename to old_table
>   commit

I believe this SQL snippet could cause data loss, because there is a period during which writes can be made to the old table that will not be copied to the new table.

On a side note, I would be interested to know what happens with locks when renaming tables. For example, if we were to alter the above SQL, and add a "LOCK TABLE old_table IN ACCESS EXCLUSIVE" line, would this fix the problem? What I mean is, if the application tries to run "INSERT INTO old_table ...", and blocks on the lock, when the old_table is dropped, will it resume trying to insert into the dropped table and fail, or will it redirect its attentions to the new table that has been renamed into place?

Also, if a lock is taken on a table, and the table is renamed, does the lock follow the table, or does it stay attached to the table name?

Anyway, surely it's much safer to just run VACUUM manually?

Matthew

--
Change is inevitable, except from vending machines.
On Wed, 30 Jul 2008, Dave North wrote:
> One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads).

This is typically what happens when you are not buffering enough of the right information in RAM, such that there are lots of small reads and writes to the disk that involve lots of seeking. You'll only get a couple of MB/s out of a disk if it has to move all over the place to retrieve the blocks you asked for.

Setting shared_buffers too low makes this more likely to happen, because PostgreSQL has to constantly read and write out random blocks to make space to read new ones in its limited work area. The OS buffers some of that, but not as well as if the database server has a bit more RAM for itself, because then the blocks it most uses won't leave that area.

> And if so, would I be better having a higher shared_buffers rather than relying so much on OS cache?

The main situation where making shared_buffers too high is a problem on 8.1 involves checkpoints writing out too much information at once. You didn't mention changing checkpoint_segments on your system; if it's at its default of 3, your system is likely continuously doing tiny checkpoints, which might be another reason why you're seeing so much scattered seek behavior above. Something >30 would be more appropriate for checkpoint_segments on your server.

I'd suggest re-tuning as follows:

1) Increase shared_buffers to 10,000, test. Things should be a bit faster.

2) Increase checkpoint_segments to 30, test. What you want to watch for here is whether there are periods where the server seems to freeze for a couple of seconds. That's a "checkpoint spike". If this happens, reduce checkpoint_segments to some sort of middle ground; some people never get above 10 before it's a problem.

3) Increase shared_buffers in larger chunks, as long as you don't see any problematic spikes; you might usefully keep going until it's set to at least 100,000 before improvements level off.

> I spent several hours reading info on this list and other places and it's highly inconclusive about having high or low shared buffs Vs letting the OS disk cache handle it.

A lot of the material floating around the 'net was written circa PostgreSQL 8.0 or earlier, and you need to ignore any advice in this area from those articles. I think if you rescan everything with that filter in place you'll find it's not so inconclusive that increasing shared_buffers is a win, so long as it doesn't trigger checkpoint spikes (on your platform at least; there are still Windows issues). Check out my "Inside the PostgreSQL Buffer Cache" presentation at http://www.westnet.com/~gsmith/content/postgresql for an excess of detail on this topic. Unfortunately the usage_count recommendations given there are impractical for use on 8.1 because pg_buffercache doesn't include that info, but the general "shared_buffers vs. OS cache" theory and suggestions apply.

The other parameter I hope you're setting correctly for your system is effective_cache_size, which should be at least 2GB for your server (exact sizing depends on how much RAM is left over after the Tomcat app is running).

All this is something to consider in parallel with the vacuum investigation you're doing.
It looks like your autovacuum isn't anywhere close to aggressive enough for your workload, which is not unusual at all for 8.1, and that may actually be the majority of your problem.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
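As a concrete sketch of steps 1 and 2 in postgresql.conf form (values taken straight from the advice above and intended as starting points to test, not gospel; the effective_cache_size figure assumes roughly 2GB is left for the OS cache):

  shared_buffers = 10000          # step 1: 10000 x 8KB = ~80MB
  checkpoint_segments = 30        # step 2: watch for checkpoint spikes
  effective_cache_size = 250000   # ~2GB at 8KB per page

Note that, as discussed further down the thread, shared_buffers and checkpoint_segments need a postmaster restart rather than a reload.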
On Wed, 2008-07-30 at 17:16 +0100, Matthew Wakeling wrote:
> I believe this SQL snippet could cause data loss, because there is a period during which writes can be made to the old table that will not be copied to the new table.

It could indeed cause data loss.

> On a side note, I would be interested to know what happens with locks when renaming tables. For example, if we were to alter the above SQL, and add a "LOCK TABLE old_table IN ACCESS EXCLUSIVE" line, would this fix the problem? What I mean is, if the application tries to run "INSERT INTO old_table ...", and blocks on the lock, when the old_table is dropped, will it resume trying to insert into the dropped table and fail, or will it redirect its attentions to the new table that has been renamed into place?

Yes, that would resolve the issue. It would also block the application's writes for however long the process takes (this could be unacceptable).

> Also, if a lock is taken on a table, and the table is renamed, does the lock follow the table, or does it stay attached to the table name?

The lock will follow the table itself (rather than the table name).

> Anyway, surely it's much safer to just run VACUUM manually?

Generally, you would think so. The problem comes from Vacuum blocking the application process' writes.

-Mark
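Combining Matthew's lock idea with Craig's earlier snippet, the locked variant would look roughly like this (the predicate marking which rows to keep is a placeholder):

  BEGIN;
  LOCK TABLE old_table IN ACCESS EXCLUSIVE MODE;
  CREATE TABLE new_table AS
      SELECT * FROM old_table WHERE important = true;  -- placeholder predicate
  -- recreate indexes and grants on new_table here
  DROP TABLE old_table;
  ALTER TABLE new_table RENAME TO old_table;
  COMMIT;

One caveat worth testing before relying on this: since locks attach to the table's OID rather than its name, a statement that was already waiting on the lock has generally resolved "old_table" to the old OID, so when the DROP commits it may fail with a "relation does not exist"-style error instead of retrying against the renamed table.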
Dave North wrote:
> -----Original Message-----
> From: Richard Huxton [mailto:dev@archonet.com]
>
> Well, that's pretty much the definition of bloat. Are you sure you're vacuuming enough?
>
> DN: Well, the auto-vac is kicking off pretty darn frequently...around once every 2 minutes. However, you just made me think of the obvious - is it actually doing anything?! The app is pretty darn write intensive so I wonder if it's actually able to vacuum the tables?

If you've got a big batch delete, it can't hurt to manually vacuum that table immediately afterwards.

--
Richard Huxton
Archonet Ltd
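For example, tacked onto the end of the nightly batch job (the table name is assumed):

  VACUUM VERBOSE ANALYZE my_batch_table;

The VERBOSE output also reports how many dead row versions were removed, which is a quick way to confirm the vacuum actually achieved something.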
> -----Original Message-----
> From: Greg Smith [mailto:gsmith@gregsmith.com]
> Sent: July 30, 2008 12:48 PM
> To: Dave North
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Database size Vs performance degradation
>
> On Wed, 30 Jul 2008, Dave North wrote:
>
> > One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads).
>
> This is typically what happens when you are not buffering enough of the right information in RAM, such that there are lots of small reads and writes to the disk that involve lots of seeking. You'll only get a couple of MB/s out of a disk if it has to move all over the place to retrieve the blocks you asked for.

I could totally see that, except on another identical server the MAX rate I was able to get under no load was 20MB/sec, which just seems awfully low for 10K rpm disks to me (but granted, I'm not a performance analysis expert by any stretch).

> Setting shared_buffers too low makes this more likely to happen, because PostgreSQL has to constantly read and write out random blocks to make space to read new ones in its limited work area. The OS buffers some of that, but not as well as if the database server has a bit more RAM for itself, because then the blocks it most uses won't leave that area.

OK, this makes sense: a "specialist" cache will provide more benefits than a "general" cache. Got it.

> > And if so, would I be better having a higher shared_buffers rather than relying so much on OS cache?
>
> The main situation where making shared_buffers too high is a problem on 8.1 involves checkpoints writing out too much information at once. You didn't mention changing checkpoint_segments on your system; if it's at its default of 3, your system is likely continuously doing tiny checkpoints, which might be another reason why you're seeing so much scattered seek behavior above. Something >30 would be more appropriate for checkpoint_segments on your server.

It appears ours is currently set to 12, but this is something I'll have a play with as well.

> I'd suggest re-tuning as follows:
>
> 1) Increase shared_buffers to 10,000, test. Things should be a bit faster.
>
> 2) Increase checkpoint_segments to 30, test. What you want to watch for here is whether there are periods where the server seems to freeze for a couple of seconds. That's a "checkpoint spike". If this happens, reduce checkpoint_segments to some sort of middle ground; some people never get above 10 before it's a problem.
>
> 3) Increase shared_buffers in larger chunks, as long as you don't see any problematic spikes; you might usefully keep going until it's set to at least 100,000 before improvements level off.

Do you happen to know if these are "reload" or "restart" tunable parameters? I think I've read somewhere before that they are restart parameters (assuming I've set SHMMAX high enough of course).

> > I spent several hours reading info on this list and other places and it's highly inconclusive about having high or low shared buffs Vs letting the OS disk cache handle it.
>
<SNIP good reading info>
>
> The other parameter I hope you're setting correctly for your system is effective_cache_size, which should be at least 2GB for your server (exact sizing depends on how much RAM is left over after the Tomcat app is running).

Now, this is interesting.
I'm seeing just from top and vmstat that the OS cache is around 2-3GB pretty consistently with everything running under full load. So it seems I should be able to pretty safely set this to 2GB as you suggest.

> All this is something to consider in parallel with the vacuum investigation you're doing. It looks like your autovacuum isn't anywhere close to aggressive enough for your workload, which is not unusual at all for 8.1, and that may actually be the majority of your problem.

Yeah, I've pretty well convinced myself that it is. We have 2 tables that see this add/remove pattern, where we add something like 100,000 rows per table and then delete around 75,000 per night...what I've just done is added entries into pg_autovacuum to change the vac_scale_factor down to 0.2 for both of these tables. By my calcs this will lower the vac threshold for these tables from 221,000 to 111,000 tuples each. Even this may be too high with max_fsm_pages at 400,000, but I can go lower if needed. As per other threads, my only real metric to measure this is the overall database size, since the better AV messaging is an 8.3 enhancement.

I'm starting with just changing the autovac parameters to see how that affects things. I'm reluctant to change multiple parameters in one shot (although they all make sense logically) just in case one goes awry ;)

I have to say, I've learnt a whole load from you folks here this morning...very enlightening. I'm now moving on to your site Greg! :)

Cheers

Dave
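For reference, an 8.1 pg_autovacuum entry of the sort Dave describes might look like this (the table name and the base thresholds are placeholders; a negative value in a column means "fall back to the global setting"):

  INSERT INTO pg_autovacuum
      (vacrelid, enabled,
       vac_base_thresh, vac_scale_factor,
       anl_base_thresh, anl_scale_factor,
       vac_cost_delay, vac_cost_limit)
  VALUES
      ('my_busy_table'::regclass, true,
       1000, 0.2,
       500, 0.1,
       -1, -1);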
Mark Roberts <mailing_lists@pandapocket.com> writes:
> On Wed, 2008-07-30 at 17:16 +0100, Matthew Wakeling wrote:
>> Anyway, surely it's much safer to just run VACUUM manually?

> Generally, you would think so. The problem comes from Vacuum blocking the application process' writes.

Huh? Vacuum doesn't block writes.

			regards, tom lane
Dave North wrote:
> [...]
>> I'd suggest re-tuning as follows:
>>
>> 1) Increase shared_buffers to 10,000, test. Things should be a bit faster.
>>
>> 2) Increase checkpoint_segments to 30, test. What you want to watch for here is whether there are periods where the server seems to freeze for a couple of seconds. That's a "checkpoint spike". If this happens, reduce checkpoint_segments to some sort of middle ground; some people never get above 10 before it's a problem.
>>
>> 3) Increase shared_buffers in larger chunks, as long as you don't see any problematic spikes; you might usefully keep going until it's set to at least 100,000 before improvements level off.
>
> Do you happen to know if these are "reload" or "restart" tunable parameters? I think I've read somewhere before that they are restart parameters (assuming I've set SHMMAX high enough of course)

shared_buffers and checkpoint_segments both need a restart.

> [...]
> I have to say, I've learnt a whole load from you folks here this morning...very enlightening. I'm now moving on to your site Greg! :)

There's much to learn from Greg's site. I was kinda impressed by all the good articles in it.

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com
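If you'd rather query than memorize which is which, pg_settings shows it; a context of 'postmaster' means a full restart is required, while 'sighup' means a reload is enough:

  SELECT name, setting, context
  FROM pg_settings
  WHERE name IN ('shared_buffers', 'checkpoint_segments', 'effective_cache_size');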
On Wed, 2008-07-30 at 13:51 -0400, Tom Lane wrote:
> Huh? Vacuum doesn't block writes.
>
> 			regards, tom lane

Of course, you are correct. I was thinking of Vacuum full, which is recommended for use when you're deleting the majority of rows in a table.

http://www.postgresql.org/docs/8.1/interactive/sql-vacuum.html

-Mark
Valentin Bogdanov <valiouk@yahoo.co.uk> wrote:
> I am guessing that you are using DELETE to remove the 75,000 unimportant rows. Change your batch job to CREATE a new table consisting only of the 5,000 important ones. You can use the "CREATE TABLE table_name AS select_statement" command. Then drop the old table. After that you can use ALTER TABLE to change the name of the new table to that of the old one.

I have a similar, but different situation, where I TRUNCATE a table with 60k rows every hour, and refill it with new rows. Would it be better (concerning bloat) to just DROP the table every hour and recreate it, than to TRUNCATE it? Or does TRUNCATE take care of the bloat as well as a DROP and CREATE would?

I am running 8.3.3 in a 48 MB RAM Xen, so performance matters much.

--
Miernik
http://miernik.name/
On Wed, 2008-07-30 at 23:58 +0200, Miernik wrote:
> I have a similar, but different situation, where I TRUNCATE a table with 60k rows every hour, and refill it with new rows. Would it be better (concerning bloat) to just DROP the table every hour and recreate it, than to TRUNCATE it? Or does TRUNCATE take care of the bloat as well as a DROP and CREATE would?
>
> I am running 8.3.3 in a 48 MB RAM Xen, so performance matters much.

I've successfully used truncate for this purpose (server 8.2.6):

-----------------------------
psql=> select pg_relation_size(oid) from pg_class where relname = 'asdf';
 pg_relation_size
------------------
            32768
(1 row)

Time: 0.597 ms

psql=> truncate asdf;
TRUNCATE TABLE
Time: 1.069 ms

psql=> select pg_relation_size(oid) from pg_class where relname = 'asdf';
 pg_relation_size
------------------
                0
(1 row)

-Mark
Mark Roberts wrote:
> On Wed, 2008-07-30 at 13:51 -0400, Tom Lane wrote:
>
>> Huh? Vacuum doesn't block writes.
>>
>> 			regards, tom lane
>
> Of course, you are correct. I was thinking of Vacuum full, which is recommended for use when you're deleting the majority of rows in a table.
>
> http://www.postgresql.org/docs/8.1/interactive/sql-vacuum.html

Maybe I'm wrong, but if this "bulk insert and delete" process is cyclical then you don't need vacuum full. Released tuples will fill up again with fresh data the next day - after a regular vacuum. I have such a situation at work: the size of the database on disk is 60GB and is stable.

--
Andrzej Zawadzki
On Thu, 31 Jul 2008, Andrzej Zawadzki wrote:
> Maybe I'm wrong, but if this "bulk insert and delete" process is cyclical then you don't need vacuum full. Released tuples will fill up again with fresh data the next day - after a regular vacuum.

Yes, a regular manual vacuum will prevent the table from growing more than it needs to. However, a vacuum full is required to actually reduce the size of the table from 7.5G to 2.7G if that hasn't been done on the production system already.

Matthew

--
It's one of those irregular verbs - "I have an independent mind," "You are an eccentric," "He is round the twist."
			-- Bernard Woolly, Yes Prime Minister
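If you do take that route during a maintenance window, a sketch of the one-off shrink looks like this (the table name is assumed; VACUUM FULL takes an exclusive lock, and it tends to leave the indexes bloated, hence the REINDEX):

  VACUUM FULL VERBOSE my_table;
  REINDEX TABLE my_table;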
2008/8/1 Matthew Wakeling <matthew@flymine.org>:
> On Thu, 31 Jul 2008, Andrzej Zawadzki wrote:
>> Maybe I'm wrong, but if this "bulk insert and delete" process is cyclical then you don't need vacuum full. Released tuples will fill up again with fresh data the next day - after a regular vacuum.
>
> Yes, a regular manual vacuum will prevent the table from growing more than it needs to. However, a vacuum full is required to actually reduce the size of the table from 7.5G to 2.7G if that hasn't been done on the production system already.

One good option is to upgrade to pg 8.3 to fix the problem. Autovacuum plus HOT brought a significant performance win compared with 8.2 and earlier versions. :)

Kind Regards,

--
Fernando Ike
http://www.midstorm.org/~fike/weblog
Hello to all,

I have a Linux 2.6.24.2-xxxx-std-ipv4-64 #3 SMP Tue Feb 12 12:27:47 CET 2008 x86_64 Intel(R) Xeon(R) CPU X5355 @ 2.66GHz GenuineIntel GNU/Linux with 8Gb of memory, using SATA II disks in RAID 1 (I know that is bad, but I will change it as quickly as I can).

I have a database of 38Go and take 6Go per week. I have a lot of update and insert, especially in 8 tables. 2 tables are used for temporary storage, so I write something like 15000 requests per 2 minutes and empty it into 10 min. I'm making some update or select on tables including more than 20 millions of entrance.

I have the following postgresql.conf settings:

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload". Some settings, such as listen_addresses, require
# a postmaster shutdown and restart to take effect.

#---------------------------------------------------------------------------
# FILE LOCATIONS
#---------------------------------------------------------------------------

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'           # use data in another directory
#hba_file = 'ConfigDir/pg_hba.conf'     # host-based authentication file
#ident_file = 'ConfigDir/pg_ident.conf' # IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
#external_pid_file = '(none)'           # write an extra pid file

#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

listen_addresses = 'xxx.xxx.xxx.xxx'    # what IP address(es) to listen on;
                                        # comma-separated list of addresses;
                                        # defaults to 'localhost', '*' = all
port = 5432
max_connections = 624
# note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction). You
# might also need to raise shared_buffers to support more connections.
#superuser_reserved_connections = 2
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777         # octal
#bonjour_name = ''                      # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60            # 1-600, in seconds
#ssl = off
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''
#krb_srvname = 'postgres'
#krb_server_hostname = ''               # empty string matches any keytab entry
#krb_caseins_users = off

# - TCP Keepalives -
# see 'man 7 tcp' for details

tcp_keepalives_idle = 300               # TCP_KEEPIDLE, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_interval = 0            # TCP_KEEPINTVL, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_count = 0               # TCP_KEEPCNT;
                                        # 0 selects the system default

#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 250000                 # min 16 or max_connections*2, 8KB each
temp_buffers = 500                      # min 100, 8KB each
max_prepared_transactions = 200         # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 9000                         # min 64, size in KB
maintenance_work_mem = 5000             # min 1024, size in KB
max_stack_depth = 8192                  # min 100, size in KB

# - Free Space Map -

max_fsm_pages = 100000                  # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 5000                # min 100, ~70 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000           # min 25
#preload_libraries = ''

# - Cost-Based Vacuum Delay -

vacuum_cost_delay = 5                   # 0-1000 milliseconds
vacuum_cost_page_hit = 10               # 0-10000 credits
vacuum_cost_page_miss = 100             # 0-10000 credits
vacuum_cost_page_dirty = 20             # 0-10000 credits
vacuum_cost_limit = 500                 # 0-10000 credits

# - Background writer -

bgwriter_delay = 50                     # 10-10000 milliseconds between rounds
bgwriter_lru_percent = 1.0              # 0-100% of LRU buffers scanned/round
bgwriter_lru_maxpages = 25              # 0-1000 buffers max written/round
bgwriter_all_percent = 0.333            # 0-100% of all buffers scanned/round
bgwriter_all_maxpages = 50              # 0-1000 buffers max written/round

#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = on                             # turns forced synchronization on or off
#wal_sync_method = fsync                # the default is the first option
                                        # supported by the operating system:
                                        #   open_datasync
                                        #   fdatasync
                                        #   fsync
                                        #   fsync_writethrough
                                        #   open_sync
#full_page_writes = on                  # recover from partial page writes
wal_buffers = 16                        # min 4, 8KB each
commit_delay = 500                      # range 0-100000, in microseconds
commit_siblings = 50                    # range 1-1000

# - Checkpoints -

checkpoint_segments = 50                # in logfile segments, min 1, 16MB each
checkpoint_timeout = 1800               # range 30-3600, in seconds
checkpoint_warning = 180                # in seconds, 0 is off

# - Archiving -

#archive_command = ''                   # command to use to archive a logfile
                                        # segment

#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

effective_cache_size = 625000           # typically 8KB each
random_page_cost = 3                    # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01                  # (same)
#cpu_index_tuple_cost = 0.001           # (same)
#cpu_operator_cost = 0.0025             # (same)

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5                        # range 1-10
#geqo_pool_size = 0                     # selects default based on effort
#geqo_generations = 0                   # selects default based on effort
#geqo_selection_bias = 2.0              # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10         # range 1-1000
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8                # 1 disables collapsing of explicit JOINs

#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr'             # Valid values are combinations of
                                        # stderr, syslog and eventlog,
                                        # depending on platform.

# This is used when logging to stderr:
#redirect_stderr = off                  # Enable capturing of stderr into log files

# These are only used if redirect_stderr is on:
#log_directory = 'pg_log'               # Directory where log files are written
                                        # Can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
                                        # Log file name pattern.
                                        # Can include strftime() escapes
#log_truncate_on_rotation = off         # If on, any existing log file of the same
                                        # name as the new log file will be truncated
                                        # rather than appended to. But such truncation
                                        # only occurs on time-driven rotation, not on
                                        # restarts or size-driven rotation. Default is
                                        # off, meaning append to existing files
                                        # in all cases.
#log_rotation_age = 1440                # Automatic rotation of logfiles will happen
                                        # after so many minutes. 0 to disable.
#log_rotation_size = 10240              # Automatic rotation of logfiles will happen
                                        # after so many kilobytes of log output.
                                        # 0 to disable.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'

# - When to Log -

#client_min_messages = notice           # Values, in order of decreasing detail:
                                        #   debug5, debug4, debug3, debug2, debug1,
                                        #   log, notice, warning, error
#log_min_messages = notice              # Values, in order of decreasing detail:
                                        #   debug5, debug4, debug3, debug2, debug1,
                                        #   info, notice, warning, error, log,
                                        #   fatal, panic
#log_error_verbosity = default          # terse, default, or verbose messages
#log_min_error_statement = panic        # Values in order of increasing severity:
                                        #   debug5, debug4, debug3, debug2, debug1,
                                        #   info, notice, warning, error, panic(off)
#log_min_duration_statement = -1        # -1 is disabled, 0 logs all statements
                                        # and their durations, in milliseconds.
#silent_mode = off                      # DO NOT USE without syslog or redirect_stderr

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_line_prefix = ''                   # Special values:
                                        #   %u = user name
                                        #   %d = database name
                                        #   %r = remote host and port
                                        #   %h = remote host
                                        #   %p = PID
                                        #   %t = timestamp (no milliseconds)
                                        #   %m = timestamp with milliseconds
                                        #   %i = command tag
                                        #   %c = session id
                                        #   %l = session line number
                                        #   %s = session start timestamp
                                        #   %x = transaction id
                                        #   %q = stop here in non-session processes
                                        #   %% = '%'
                                        # e.g. '<%u%%%d> '
#log_statement = 'none'                 # none, mod, ddl, all
#log_hostname = off

#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off

# - Query/Index Statistics Collector -

stats_start_collector = on
#stats_command_string = off
#stats_block_level = off
stats_row_level = on
#stats_reset_on_server_start = off

#---------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#---------------------------------------------------------------------------

autovacuum = on                         # enable autovacuum subprocess?
autovacuum_naptime = 180                # time between autovacuum runs, in secs
autovacuum_vacuum_threshold = 100000    # min # of tuple updates before vacuum
autovacuum_analyze_threshold = 9000     # min # of tuple updates before analyze
autovacuum_vacuum_scale_factor = 0.4    # fraction of rel size before vacuum
autovacuum_analyze_scale_factor = 0.2   # fraction of rel size before analyze
autovacuum_vacuum_cost_delay = -1       # default vacuum cost delay for autovac,
                                        # -1 means use vacuum_cost_delay
autovacuum_vacuum_cost_limit = -1       # default vacuum cost limit for autovac,
                                        # -1 means use vacuum_cost_limit

#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '$user,public'           # schema names
#default_tablespace = ''                # a tablespace name, '' uses the default
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#statement_timeout = 0                  # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown                     # actually, defaults to TZ environment setting
#australian_timezones = off
#extra_float_digits = 0                 # min -15, max 2
#client_encoding = sql_ascii            # actually, defaults to database encoding

# These settings are initialized by initdb -- they might be changed
#lc_messages = 'fr_FR@euro'             # locale for system error message strings
#lc_monetary = 'fr_FR@euro'             # locale for monetary formatting
#lc_numeric = 'fr_FR@euro'              # locale for number formatting
#lc_time = 'fr_FR@euro'                 # locale for time formatting
lc_messages = 'fr_FR.UTF-8'             # locale for system error message strings
lc_monetary = 'fr_FR.UTF-8'             # locale for monetary formatting
lc_numeric = 'fr_FR.UTF-8'              # locale for number formatting
lc_time = 'fr_FR.UTF-8'                 # locale for time formatting

# - Other Defaults -

#explain_pretty_print = on
#dynamic_library_path = '$libdir'

#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1000                # in milliseconds
#max_locks_per_transaction = 64         # min 10
# note: each lock table slot uses ~220 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.
#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = off
#backslash_quote = safe_encoding        # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = off
#regex_flavor = advanced                # advanced, extended, or basic
#sql_inheritance = on

# - Other Platforms & Clients -

#transform_null_equals = off

#---------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#---------------------------------------------------------------------------

#custom_variable_classes = ''           # list of custom variable class names

I'm sure that it could be more optimised. I don't know anything about WAL, autovacuum, fsm, bgwriter, kernel process, geqo or planner cost settings.

Thanks to you all in advance for your precious help.

Regards

David
On Wed, Aug 6, 2008 at 6:12 PM, dforum <dforums@vieonet.com> wrote:
> Hello to all,
>
> I have a Linux 2.6.24.2-xxxx-std-ipv4-64 #3 SMP Tue Feb 12 12:27:47 CET 2008 x86_64 Intel(R) Xeon(R) CPU X5355 @ 2.66GHz GenuineIntel GNU/Linux with 8Gb of memory, using SATA II disks in RAID 1 (I know that is bad, but

<snip>

this is likely your problem...with fsync on (as you have it), you will be lucky to get a couple of hundred transactions/sec out of the database. you are probably just exceeding the operational capabilities of the hardware, so you probably need to upgrade or turn off fsync (which means data corruption in the event of a hard crash).

merlin
Tx for your reply.

You mean that RAID uses the fsync method for keeping a copy of the data?

So you're inviting me to deactivate fsync to increase the performance?

Deactivating fsync, my second disk will not be up to date, so if the machine crashes, I won't be able to get the server working quickly??? But if I use a second machine to replicate the database, I escape this problem, isn't it?

If I understand right, could you tell me how to deactivate fsync please?

Best regards

David

Merlin Moncure wrote:
> On Wed, Aug 6, 2008 at 6:12 PM, dforum <dforums@vieonet.com> wrote:
>> Hello to all,
>>
>> I have a Linux 2.6.24.2-xxxx-std-ipv4-64 #3 SMP Tue Feb 12 12:27:47 CET 2008 x86_64 Intel(R) Xeon(R) CPU X5355 @ 2.66GHz GenuineIntel GNU/Linux with 8Gb of memory, using SATA II disks in RAID 1 (I know that is bad, but
>
> <snip>
>
> this is likely your problem...with fsync on (as you have it), you will be lucky to get a couple of hundred transactions/sec out of the database. you are probably just exceeding the operational capabilities of the hardware, so you probably need to upgrade or turn off fsync (which means data corruption in the event of a hard crash).
>
> merlin
dforum wrote:
> Tx for your reply.
>
> You mean that RAID uses the fsync method for keeping a copy of the data?

No, Merlin means PostgreSQL will issue a sync to force WAL to actual disk.

> So you're inviting me to deactivate fsync to increase the performance?

He means you might have to if you can't afford new hardware. Is disk activity the problem? Have you looked at the output of "vmstat" to check?

> Deactivating fsync, my second disk will not be up to date,

No - the RAID stuff is happening in the operating system.

> so if the machine crashes, I won't be able to get the server working quickly???

Not "quickly", perhaps not "at all".

> But if I use a second machine to replicate the database, I escape this problem, isn't it?

You reduce the chance of a single failure causing disaster.

> If I understand right, could you tell me how to deactivate fsync please?

There's an "fsync = on" setting in your postgresql.conf, but don't change it yet.

> I have a database of 38Go and take 6Go per week.

What do you mean by "take 6Go per week"? You update/delete that much data? It's growing by that amount each week?

> I have a lot of update and insert, especially in 8 tables. 2 tables are used for temporary storage, so I write something like 15000 requests per 2 minutes and empty it into 10 min.

I'm not sure what "15000 requests per 2 minutes and empty it into 10 min" means. Do you have 7500 requests per minute? Are these updates? To the "temporary storage"? What is this "temporary storage" - an ordinary table?

> I'm making some update or select on tables including more than 20 millions of entrance.

Again, I'm not sure what this means.

Oh - *important* - which version of PostgreSQL are you running? Is an upgrade practical?

Looking at your postgresql.conf settings:

  max_connections = 624

That's an odd number. Do you usually have that many connections? What are they doing? They can't all be active; the machine you've got wouldn't cope.

  shared_buffers = 250000
  work_mem = 9000
  temp_buffers = 500

These three are important. The shared_buffers are workspace shared between all backends, and you've allocated about 2GB. You've also set work_mem=9MB, which is how much each backend can use for a single sort. That means it can use double or triple that in a complex query. If you're using temporary tables, then you'll want to make sure the temp_buffers setting is correct.

I can't say whether these figures are good or bad without knowing how the database is being used.

  effective_cache_size = 625000

That's around 5GB - is that roughly the amount of memory used for caching (what does free -m say for buffers/cache)?

  max_prepared_transactions = 200

Do you use a lot of prepared transactions in two-phase commit? I'm guessing that you don't.

> I'm sure that it could be more optimised. I don't know anything about WAL, autovacuum, fsm, bgwriter, kernel process, geqo or planner cost settings.

If you run a "vacuum verbose" it will recommend fsm settings at the end of its output. I think you probably need to make your autovacuum more aggressive, but that's something you'll be able to tell by monitoring your database.

It's quite likely that Merlin's right, and you need better hardware to cope with the number of updates you're making - that's something where you need fast disks. However, he's just guessing because you've not told us enough to tell where the problem really lies.

--
Richard Huxton
Archonet Ltd
Richard Huxton wrote:
> dforum wrote:
>> Thanks for your reply.
>>
>> You mean that the RAID uses fsync to keep the data copies in sync?
>
> No, Merlin means PostgreSQL will issue a sync to force the WAL to actual
> disk.
>
>> So you are suggesting I disable fsync to increase performance?
>
> He means you might have to if you can't afford new hardware. Is disk
> activity the problem? Have you looked at the output of "vmstat" to check?

vmstat is giving:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff    cache   si   so    bi    bo   in   cs  us sy id wa
 0  2   1540  47388  41684  7578976    0    0   131   259    0    1   9  3 82  7

>> If I disable fsync, my second disk will not be up to date,
>
> No - the RAID stuff is happening in the operating system.
>
>> so if the machine crashes, I won't be able to get the server working
>> again quickly?
>
> Not "quickly", perhaps not "at all".

Oops

>> But if I use a second machine to replicate the database, I avoid this
>> problem, don't I?
>
> You reduce the chance of a single failure causing disaster.

That reply is not clear to me. It scares me...

>> If I understand right, could you tell me how to disable fsync, please?
>
> There's an "fsync = on" setting in your postgresql.conf, but don't
> change it yet.

OK

>> I have a database of 38GB and it takes 6GB per week.
>
> What do you mean by "takes 6GB per week"? You update/delete that much
> data? It's growing by that amount each week?

Yes.

>> I have a lot of updates and inserts, especially in 8 tables. 2 tables are
>> used for temporary storage, so I write something like 15000 requests per
>> 2 minutes and empty them every 10 minutes.
>
> I'm not sure what "15000 requests per 2 minutes and empty them every 10
> minutes" means.

I insert 15000 rows every 2 minutes and delete 15000 every 10 minutes in those tables.

> Do you have 7500 requests per minute?

It should be about that, but in fact I'm not processing the data in real time; I buffer the data and push it into the database every 2 minutes.

> Are these updates?

During the delete, the data are aggregated into other tables, which causes updates.

> To the "temporary storage"?
> What is this "temporary storage" - an ordinary table?

Yes. I tried to use temporary tables, but I was never able to access those tables from 2 different sessions/connections; that seems to be how PostgreSQL works, or a misunderstanding on my part.

>> I'm making updates and selects on tables containing more than 20
>> million entries.
>
> Again, I'm not sure what this means.

To aggregate the data, I have to check for the presence of other information stored in 2 tables containing 24 million entries.

> Oh - *important* - which version of PostgreSQL are you running?

8.1.11

> Is an upgrade practical?

We are working on upgrading to 8.3.3, but we are not yet ready for such a migration.

> Looking at your postgresql.conf settings:
>
> max_connections = 624
>
> That's an odd number.

We could decrease this number; it's not really needed now. We could decrease it to 350.

> Do you usually have that many connections?
> What are they doing? They can't all be active; the machine you've got
> wouldn't cope.
>
> shared_buffers = 250000
> work_mem = 9000
> temp_buffers = 500
>
> These three are important. The shared_buffers are workspace shared
> between all backends, and you've allocated about 2GB. You've also set
> work_mem = 9MB, which is how much each backend can use for a single sort.
> That means it can use double or triple that in a complex query.

(I know about that.)
> If you're using temporary tables, then you'll want to make sure the
> temp_buffers setting is correct.

I need help with that; I don't know what it should be.

> I can't say whether these figures are good or bad without knowing how
> the database is being used.
>
> effective_cache_size = 625000
>
> That's around 5GB - is that roughly the amount of memory used for
> caching (what does free -m say for buffers/cache)?

             total       used       free     shared    buffers     cached
Mem:          7984       7828        156          0         38       7349
-/+ buffers/cache:        440       7544
Swap:          509          1        508

> max_prepared_transactions = 200
>
> Do you use a lot of prepared transactions in two-phase commit?
> I'm guessing that you don't.

I don't.

>> I'm sure that it could be more optimised. I don't know anything about
>> WAL, autovacuum, fsm, bgwriter, kernel processes, geqo or planner cost
>> settings.
>
> If you run a "vacuum verbose" it will recommend fsm settings at the end
> of its output. I think you probably need to make your autovacuum more
> aggressive, but that's something you'll be able to tell by monitoring
> your database.
>
> It's quite likely that Merlin's right, and you need better hardware to
> cope with the number of updates you're making - that's something where
> you need fast disks. However, he's just guessing because you've not told
> us enough to tell where the problem really lies.

I hope this new information will help you to help me.

Regards

david

--
David Bigand
Président Directeur Général
51 chemin des moulins
73000 CHAMBERY - FRANCE

Web: http://www.1st-affiliation.fr
Email: david@1st-affiliation.com
Tel.: +33 479 696 685
Mob.: +33 666 583 836
Skype: firstaffiliation_support
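On the temporary-table point above: what David saw is expected behaviour, since a temporary table is only visible to the session (connection) that created it. A minimal illustration, with a hypothetical table name not taken from the thread:

-- session 1 (one connection):
CREATE TEMPORARY TABLE staging (id integer, payload text);
INSERT INTO staging VALUES (1, 'buffered row');
SELECT count(*) FROM staging;   -- returns 1

-- session 2 (any other connection):
SELECT count(*) FROM staging;   -- ERROR: relation "staging" does not exist

A staging area shared between connections therefore has to be an ordinary table, which is what this application uses.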
dforums wrote:
> vmstat is giving:
>
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff    cache   si   so    bi    bo   in   cs  us sy id wa
>  0  2   1540  47388  41684  7578976    0    0   131   259    0    1   9  3 82  7

This system is practically idle. Either you're not measuring it at a useful time, or there isn't a performance problem.

>>> But if I use a second machine to replicate the database, I avoid this
>>> problem, don't I?
>> You reduce the chance of a single failure causing disaster.
> That reply is not clear to me. It scares me...

If server A fails, you still have server B. If server A fails so that replication stops working and you don't notice, server B won't help any more.

>> What do you mean by "takes 6GB per week"? You update/delete that much
>> data? It's growing by that amount each week?
> Yes.

That wasn't a yes/no question. Please choose one of:
Are you updating 6GB per week?
Are you adding 6GB per week?

>> I'm not sure what "15000 requests per 2 minutes and empty them every 10
>> minutes" means.
> I insert 15000 rows every 2 minutes and delete 15000 every 10 minutes in
> those tables.
>
>> Do you have 7500 requests per minute?
> It should be about that, but in fact I'm not processing the data in real
> time; I buffer the data and push it into the database every 2 minutes.
>> Are these updates?
> During the delete, the data are aggregated into other tables, which
> causes updates.

OK, so every 2 minutes you run one big query that adds 15000 rows. Every 10 minutes you run one big query that deletes 15000 rows.

>> To the "temporary storage"?
>> What is this "temporary storage" - an ordinary table?
> Yes. I tried to use temporary tables, but I was never able to access
> those tables from 2 different sessions/connections; that seems to be how
> PostgreSQL works, or a misunderstanding on my part.

That's correct - temporary tables are private to a backend (connection).

>>> I'm making updates and selects on tables containing more than 20
>>> million entries.
>>
>> Again, I'm not sure what this means.
> To aggregate the data, I have to check for the presence of other
> information stored in 2 tables containing 24 million entries.

OK. I assume you're happy with the plans you are getting on these queries, since you've not provided any information about them.

>> Oh - *important* - which version of PostgreSQL are you running?
> 8.1.11
>> Is an upgrade practical?
> We are working on upgrading to 8.3.3, but we are not yet ready for such a
> migration.

OK

>> Looking at your postgresql.conf settings:
>>
>> max_connections = 624
>>
>> That's an odd number.
> We could decrease this number; it's not really needed now. We could
> decrease it to 350.

I don't believe you've got 350 active connections either. It will be easier to help if you can provide some useful information.

>> effective_cache_size = 625000
>>
>> That's around 5GB - is that roughly the amount of memory used for
>> caching (what does free -m say for buffers/cache)?
>              total       used       free     shared    buffers     cached
> Mem:          7984       7828        156          0         38       7349
> -/+ buffers/cache:        440       7544
> Swap:          509          1        508

Not far off - free is showing 7349MB cached. You're not running 350 clients there though - you're only using 440MB of RAM.

I don't see anything to show a performance problem from these emails.

--
Richard Huxton
Archonet Ltd
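A note on that vmstat output: the first (and here only) line vmstat prints is an average since boot, so a short burst of I/O wait during the 2-minute batch can disappear into it. To catch the problem in the act, one would sample repeatedly while the batch runs - a hypothetical invocation:

vmstat 5
# prints a fresh line every 5 seconds; ignore the first line (boot averages)
# and watch the 'wa' (I/O wait) and 'b' (blocked processes) columns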
The performance problem is really only on the insertion, and even more on the processing done for the aggregation.

Processing the 3000 entries and inserting into, or updating, the tables takes 10 minutes.

As I told you, I inject 14000 queries every 2 minutes, and it takes 10 minutes to process 3000 of those queries. As you can easily understand, it's a big bottleneck.

I'm not doing the processing in one go, because I can't; it is all managed by a stored procedure.

> That wasn't a yes/no question. Please choose one of:
> Are you updating 6GB per week?

Mostly updates.

> Are you adding 6GB per week?

Less from insertion. Which it is depends on whether the data are already present in the database.

> OK. I assume you're happy with the plans you are getting on these
> queries, since you've not provided any information about them.

The plan seems OK, as it uses an index as well. Here is the plan:

explain analyse SELECT "insertUpdateTracks"(137,2605, 852, ('2008-08-06 19:28:54'::text)::date,3,'dailydisplay',2,NULL);
INFO:  method 1
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=1.151..1.151 rows=1 loops=1)
 Total runtime: 1.160 ms
(2 rows)

As you can see, that is the runtime for one update through this table. Multiplied by 10000, it is too long.

regards

david
dforums wrote:
> The performance problem is really only on the insertion, and even more
> on the processing done for the aggregation.
>
> Processing the 3000 entries and inserting into, or updating, the tables
> takes 10 minutes.
>
> As I told you, I inject 14000 queries every 2 minutes, and it takes 10
> minutes to process 3000 of those queries.

Sorry - I still don't understand. What is this "processing" you are doing?

>> OK. I assume you're happy with the plans you are getting on these
>> queries, since you've not provided any information about them.
>
> The plan seems OK, as it uses an index as well. Here is the plan:
>
> explain analyse SELECT "insertUpdateTracks"(137,2605, 852, ('2008-08-06
> 19:28:54'::text)::date,3,'dailydisplay',2,NULL);
> INFO:  method 1
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=1.151..1.151
>  rows=1 loops=1)
>  Total runtime: 1.160 ms

There's nothing to do with an index here - this is a function call.

> As you can see, that is the runtime for one update through this table.
>
> Multiplied by 10000, it is too long.

So - are you calling this function 14000 times to inject your data? You're doing this in one transaction, yes?

--
Richard Huxton
Archonet Ltd
On Thu, Aug 7, 2008 at 9:30 AM, dforums <dforums@vieonet.com> wrote:
> The performance problem is really only on the insertion, and even more on
> the processing done for the aggregation.
>
> Processing the 3000 entries and inserting into, or updating, the tables
> takes 10 minutes.
>
> As I told you, I inject 14000 queries every 2 minutes, and it takes 10
> minutes to process 3000 of those queries.
>
> As you can easily understand, it's a big bottleneck.
>
> I'm not doing the processing in one go, because I can't; it is all
> managed by a stored procedure.
>
>> That wasn't a yes/no question. Please choose one of:
>> Are you updating 6GB per week?
>
> Mostly updates.
>
>> Are you adding 6GB per week?
>
> Less from insertion. Which it is depends on whether the data are already
> present in the database.
>
>> OK. I assume you're happy with the plans you are getting on these
>> queries, since you've not provided any information about them.
>
> The plan seems OK, as it uses an index as well. Here is the plan:
>
> explain analyse SELECT "insertUpdateTracks"(137,2605, 852, ('2008-08-06
> 19:28:54'::text)::date,3,'dailydisplay',2,NULL);
> INFO:  method 1
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=1.151..1.151
>  rows=1 loops=1)
>  Total runtime: 1.160 ms
> (2 rows)
>
> As you can see, that is the runtime for one update through this table.
>
> Multiplied by 10000, it is too long.

please don't top-post (put your reply after the previous comments).

With fsync on, you are lucky to get 10k inserts in 10 minutes on a single SATA drive. The basic issue is that each time the function runs, PostgreSQL tells the disk drive to flush, guaranteeing data safety. You have a few different options here:

*) group multiple inserts into a single transaction
*) modify the function to take multiple 'update' records at once
*) disable fsync (extremely unsafe, as noted)
*) upgrade to 8.3 and disable synchronous_commit (the 'fsync lite', a good compromise between fsync on/off)

merlin
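A sketch of Merlin's first option, reusing the function call from the EXPLAIN output earlier in the thread; the second call's argument values are invented for illustration. The point is that the WAL flush then happens once at COMMIT rather than once per statement:

BEGIN;
SELECT "insertUpdateTracks"(137, 2605, 852, ('2008-08-06 19:28:54'::text)::date, 3, 'dailydisplay', 2, NULL);
SELECT "insertUpdateTracks"(138, 2606, 853, ('2008-08-06 19:29:12'::text)::date, 3, 'dailydisplay', 2, NULL);
-- ... the rest of the 2-minute batch of calls ...
COMMIT;  -- the only point at which the WAL must reach the disk

On 8.3, the per-session alternative would be "SET synchronous_commit = off;", which risks losing the last few seconds of committed transactions after a crash, but does not corrupt the database the way fsync = off can.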
dforums wrote:
> The delete is global; the procedure is called for each line/track.
>
>> So - are you calling this function 14000 times to inject your data?
>> You're doing this in one transaction, yes?
>
> No, I have to make it 14000 times because I use some of the inserted
> information for other inserts, to make links between the data.

Why does that stop you putting all 14000 calls in one transaction?

--
Richard Huxton
Archonet Ltd
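This is the crux of Richard's question: within a single transaction, each statement sees the rows written by the statements before it, so inserts that depend on earlier inserts can still be batched. A minimal illustration with hypothetical table names, not taken from the thread:

BEGIN;
INSERT INTO tracks (id, label) VALUES (1, 'first');
-- the row above is already visible here, inside the same
-- transaction, even though nothing has been committed yet:
INSERT INTO track_links (track_id)
    SELECT id FROM tracks WHERE label = 'first';
COMMIT;  -- both rows become durable with a single WAL flush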