Thread: performance of insert/delete/update

performance of insert/delete/update

From
Wei Weng
Date:
There had been a great deal of discussion of how to improve the
performance of select/sorting on this list, what about
insert/delete/update?

Are there any rules of thumb we need to follow? What are the parameters
we should tweak to whip the horse to go faster?

Thanks

--
Wei Weng
Network Software Engineer
KenCast Inc.



Re: performance of insert/delete/update

From
"Josh Berkus"
Date:
Wei,

> There had been a great deal of discussion of how to improve the
> performance of select/sorting on this list, what about
> insert/delete/update?
>
> Is there any rules of thumb we need to follow? What are the
> parameters
> we should tweak to whip the horse to go faster?

yes, lots of rules.   Wanna be more specific?   You wondering about
query structure, hardware, memory config, what?

-Josh Berkus

Re: performance of insert/delete/update

From
Wei Weng
Date:
On Thu, 2002-11-21 at 16:23, Josh Berkus wrote:
> Wei,
>
> > There had been a great deal of discussion of how to improve the
> > performance of select/sorting on this list, what about
> > insert/delete/update?
> >
> > Is there any rules of thumb we need to follow? What are the
> > parameters
> > we should tweak to whip the horse to go faster?
>
> yes, lots of rules.   Wanna be more specific?   You wondering about
> query structure, hardware, memory config, what?
I am most concerned about the software side, that is query structures
and postgresql config.

Thanks

--
Wei Weng
Network Software Engineer
KenCast Inc.



Re: performance of insert/delete/update

From
"scott.marlowe"
Date:
On 21 Nov 2002, Wei Weng wrote:

> On Thu, 2002-11-21 at 16:23, Josh Berkus wrote:
> > Wei,
> >
> > > There had been a great deal of discussion of how to improve the
> > > performance of select/sorting on this list, what about
> > > insert/delete/update?
> > >
> > > Is there any rules of thumb we need to follow? What are the
> > > parameters
> > > we should tweak to whip the horse to go faster?
> >
> > yes, lots of rules.   Wanna be more specific?   You wondering about
> > query structure, hardware, memory config, what?
> I am most concerned about the software side, that is query structures
> and postgresql config.

The absolutely most important thing to do to speed up inserts and updates
is to squeeze as many as you can into one transaction.  Within reason, of
course.  There's no great gain in putting more than a few thousand
together at a time.  If your application is only doing one or two updates
in a transaction, it's going to be slower in terms of records written per
second than an application that is updating 100 rows in a transaction.
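
As a minimal sketch (table name and values are made up), the only
difference between the slow and fast patterns is the BEGIN/COMMIT pair:

-- one transaction per statement: every INSERT waits for its own commit
INSERT INTO log (id, msg) VALUES (1, 'a');
INSERT INTO log (id, msg) VALUES (2, 'b');
INSERT INTO log (id, msg) VALUES (3, 'c');

-- a few hundred to a few thousand statements per transaction: one commit
BEGIN;
INSERT INTO log (id, msg) VALUES (1, 'a');
INSERT INTO log (id, msg) VALUES (2, 'b');
INSERT INTO log (id, msg) VALUES (3, 'c');
COMMIT;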

Reducing triggers and foreign keys on the inserted tables to a minimum
helps.

Inserting into temporary holding tables and then having a regular process
that migrates the data into the main tables is sometimes necessary if
you're putting a lot of smaller inserts into a very large dataset.  You can
then use a unioned view to show the two tables as one.
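
A rough sketch of that arrangement (all the names here are illustrative):

-- big, heavily indexed main table and a lean staging table
CREATE TABLE events_main    (id integer, payload text);
CREATE TABLE events_staging (id integer, payload text);  -- no triggers, minimal indexes

-- readers see both tables as one
CREATE VIEW events AS
    SELECT id, payload FROM events_main
    UNION ALL
    SELECT id, payload FROM events_staging;

-- fast inserts go to the staging table
INSERT INTO events_staging VALUES (1, 'abc');

-- a periodic job migrates the staged rows into the main table
BEGIN;
INSERT INTO events_main SELECT * FROM events_staging;
DELETE FROM events_staging;
COMMIT;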

Putting WAL (e.g. $PGDATA/pg_xlog directory) on its own drive(s); a
relocation sketch follows below.

Putting indexes that have to be updated during inserts onto their own
drive(s).

Performing regular vacuums on heavily updated tables.

Also, if your hardware is reliable, you can turn off fsync in
postgresql.conf.  That can increase performance by anywhere from 2 to 10
times, depending on your application.
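
For the pg_xlog suggestion above, the usual recipe in this era is a symlink;
a rough sketch (paths are illustrative, and the postmaster must be stopped
first):

pg_ctl -D /var/lib/pgsql/data stop
mv /var/lib/pgsql/data/pg_xlog /mnt/wal_disk/pg_xlog      # move WAL to its own spindle
ln -s /mnt/wal_disk/pg_xlog /var/lib/pgsql/data/pg_xlog   # leave a symlink behind
pg_ctl -D /var/lib/pgsql/data start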



Re: performance of insert/delete/update

From
"Josh Berkus"
Date:
Scott,

> The absolutely most important thing to do to speed up inserts and
> updates
> is to squeeze as many as you can into one transaction.  Within
> reason, of
> course.  There's no great gain in putting more than a few thousand
> together at a time.  If your application is only doing one or two
> updates
> in a transaction, it's going to be slower in terms of records written
> per
> second than an application that is updating 100 rows in a
> transaction.

This only works up to the limit of the memory you have available for
Postgres.  If the updates in one transaction exceed your available
memory, you'll see a lot of swapping to disk that will slow things
down by a factor of 10-50.

> Reducing triggers and foreign keys on the inserted tables to a
> minimum
> helps.

... provided that this will not jeopardize your data integrity.  If you
have indispensable triggers in PL/pgSQL, re-writing them in C will make
them, and thus updates on their tables, faster.

Also, for foreign keys, it speeds up inserts and updates on parent
tables with many child records if the foreign key column in the child
table is indexed.
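
For example (parent/child are illustrative names; the point is that
PostgreSQL does not create this index for you):

CREATE TABLE parent (id integer PRIMARY KEY);
CREATE TABLE child (
    id        integer PRIMARY KEY,
    parent_id integer REFERENCES parent (id),
    note      text
);

-- without this index, every UPDATE or DELETE on parent has to scan child
-- to check the foreign key; with it, the check is a quick index lookup
CREATE INDEX child_parent_id_idx ON child (parent_id);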

> Putting WAL (e.g. $PGDATA/pg_xlog directory) on it's own drive(s).
>
> Putting indexes that have to be updated during inserts onto their own
> drive(s).
>
> Performing regular vacuums on heavily updated tables.
>
> Also, if your hardware is reliable, you can turn off fsync in
> postgresql.conf.  That can increase performance by anywhere from 2 to 10
> times, depending on your application.

It can be dangerous though ... in the event of a power outage, for
example, your database could be corrupted and difficult to recover.  So
... "at your own risk".

I've found that switching from fsync to fdatasync on Linux yields a
marginal performance gain ... about 10-20%.

Also, if you are doing large updates (many records at once) you may
want to increase WAL_FILES and CHECKPOINT_BUFFER in postgresql.conf to
allow for large transactions.
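
As a rough postgresql.conf sketch (values are purely illustrative and
parameter names vary by version; the knobs in this area are the
checkpoint/WAL settings, which is presumably what CHECKPOINT_BUFFER above
refers to):

wal_buffers = 16             # WAL buffers in shared memory
checkpoint_segments = 8      # let more WAL accumulate between checkpoints
checkpoint_timeout = 300     # seconds between forced checkpoints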

Finally, you want to structure your queries so that you do the minimum
number of update or insert writes possible.  For example, a
procedure that inserts a row, does some calculations, and then modifies
several fields in that row is going to slow stuff down significantly
compared to doing the calculations as variables and only a single
insert.  Certainly don't hit a table with 8 updates, each updating one
field, instead of a single update statement.
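
A small sketch of the difference (the items table and its serial id are
made up for illustration):

-- slower: insert a row, then immediately modify it (two row versions written)
INSERT INTO items (qty, price) VALUES (3, 9.99);
UPDATE items SET total = 3 * 9.99 WHERE id = currval('items_id_seq');

-- faster: compute the value up front and write the row only once
INSERT INTO items (qty, price, total) VALUES (3, 9.99, 3 * 9.99);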

-Josh Berkus

Re: performance of insert/delete/update

From
"scott.marlowe"
Date:
On Thu, 21 Nov 2002, Josh Berkus wrote:

> Scott,
>
> > The absolutely most important thing to do to speed up inserts and
> > updates
> > is to squeeze as many as you can into one transaction.  Within
> > reason, of
> > course.  There's no great gain in putting more than a few thousand
> > together at a time.  If your application is only doing one or two
> > updates
> > in a transaction, it's going to be slower in terms of records written
> > per
> > second than an application that is updating 100 rows in a
> > transaction.
>
> This only works up to the limit of the memory you have available for
> Postgres.  If the updates in one transaction exceed your available
> memory, you'll see a lot of swaps to disk log that will slow things
> down by a factor of 10-50 times.

Sorry, but that isn't true.  MVCC means we don't have to hold all the data
in memory; we can have multiple versions of the same tuples on disk, and
use memory for what it's meant for: buffering.

The performance gain comes from the fact that postgresql doesn't have to
perform the data consistency checks needed during an insert until after all
the rows are inserted, and it can "gang check" them.

> > Reducing triggers and foreign keys on the inserted tables to a
> > minimum
> > helps.
>
> ... provided that this will not jeapordize your data integrity.  If you
> have indispensable triggers in PL/pgSQL, re-qriting them in C will make
> them, and thus updates on their tables, faster.

Agreed.  But you've probably seen the occasional "I wasn't sure if we
needed that check or not, so I threw it in just in case" kind of database
design.  :-)

I definitely don't advocate just tossing all your FKs to make it run
faster.

Also note that many folks have replaced foreign keys with triggers and
gained in performance, as fks in pgsql still have some deadlock issues to
be worked out.

> Also, for foriegn keys, it speeds up inserts and updates on parent
> tables with many child records if the foriegn key column in the child
> table is indexed.

Absolutely.

> > Putting WAL (e.g. $PGDATA/pg_xlog directory) on it's own drive(s).
> >
> > Putting indexes that have to be updated during inserts onto their own
> > drive(s).
> >
> > Performing regular vacuums on heavily updated tables.
> >
> > Also, if your hardware is reliable, you can turn off fsync in
> > postgresql.conf.  That can increase performance by anywhere from 2 to 10
> > times, depending on your application.
>
> It can be dangerous though ... in the event of a power outage, for
> example, your database could be corrupted and difficult to recover.  So
> ... "at your own risk".

No, the database will not be corrupted, at least not in my experience.
However, you MAY lose data from transactions that you thought were
committed.  I think Tom posted something about this a few days back.

> I've found that switching from fsync to fdatasync on Linux yields
> marginal performance gain ... about 10-20%.

I'll have to try that.

> Also, if you are doing large updates (many records at once) you may
> want to increase WAL_FILES and CHECKPOINT_BUFFER in postgresql.conf to
> allow for large transactions.

Actually, postgresql will create more WAL files if it needs to in order to
handle the size of a transaction.  BUT, it won't create extra ones for
heavier parallel load without being told to.  I've inserted 100,000 rows at
a time with no problem on a machine with only 1 WAL file specified, and it
didn't burp.  It does run faster with multiple WAL files when under
parallel load.

> Finally, you want to structure your queries so that you do the minimum
> number of update writes possible, or insert writes.  For example, a
> procedure that inserts a row, does some calculations, and then modifies
> several fields in that row is going to slow stuff down significantly
> compared to doing the calculations as variables and only a single
> insert.  Certainly don't hit a table with 8 updates, each updating one
> field instead of a single update statement.

This is critical, and bites many people coming from a row-level-locking
database to an MVCC database.  In MVCC every update creates a new on-disk
tuple.  I think someone on the list a while back was updating their
database something like this:

update table set field1='abc' where id=1;
update table set field2='def' where id=1;
update table set field3='ghi' where id=1;
update table set field4='jkl' where id=1;
update table set field5='mno' where id=1;
update table set field6='pqr' where id=1;

and they had to vacuum something like every 5 minutes.

Also, things like:

update table set field1=field1+1

are killers in an MVCC database as well.
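
The fix for a pattern like the one above is simply to collapse the
statements into a single UPDATE, so only one new row version is created
(keeping the placeholder names from the example):

UPDATE table
   SET field1 = 'abc',
       field2 = 'def',
       field3 = 'ghi',
       field4 = 'jkl',
       field5 = 'mno',
       field6 = 'pqr'
 WHERE id = 1;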


Re: performance of insert/delete/update

From
Josh Berkus
Date:
Scott,

> > This only works up to the limit of the memory you have available for
> > Postgres.  If the updates in one transaction exceed your available
> > memory, you'll see a lot of swaps to disk log that will slow things
> > down by a factor of 10-50 times.
>
> Sorry, but that isn't true.  MVCC means we don't have to hold all the data
> in memory, we can have multiple versions of the same tuples on disk, and
> use memory for what it's meant for, buffering.

Sorry, you're absolutely correct.  I don't know what I was thinking of; that's
the problem with an off-the-cuff response.

Please disregard the previous quote.   Instead:

Doing several large updates in a single transaction can lower performance if
the number of updates is sufficient to affect index usability and a VACUUM is
really needed between them.  For example, a series of large data
transformation statements on a single table or set of related tables should
have VACUUM statements between them, thus preventing you from putting them
in a single transaction.

Example, the series:
1. INSERT 10,000 ROWS INTO table_a;
2. UPDATE 100,000 ROWS IN table_a WHERE table_b;
3. UPDATE 100,000 ROWS IN table_c WHERE table_a;

Will almost certainly need a VACUUM or even VACUUM FULL table_a after 2),
requiring you to split the update series into 2 transactions.  Otherwise, the
"where table_a" condition in step 3) will be extremely slow.

> Also note that many folks have replaced foreign keys with triggers and
> gained in performance, as fks in pgsql still have some deadlock issues to
> be worked out.

Yeah.   I think Neil Conway is overhauling FKs, which everyone considers a bit
of a hack in the current implementation, including Jan who wrote it.

> > It can be dangerous though ... in the event of a power outage, for
> > example, your database could be corrupted and difficult to recover.  So
> > ... "at your own risk".
>
> No, the database will not be corrupted, at least not in my experience.
> however, you MAY lose data from transactions that you thought were
> committed.  I think Tom posted something about this a few days back.

Hmmm ... have you done this?  I'd like the performance gain, but I don't want
to risk my data integrity.  I've seen some awful things in databases (such as
duplicate primary keys) from yanking a power cord repeatedly.

> update table set field1=field1+1
>
> are killers in an MVCC database as well.

Yeah -- don't I know it.


--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: performance of insert/delete/update

From
"scott.marlowe"
Date:
On Thu, 21 Nov 2002, Josh Berkus wrote:

> Doing several large updates in a single transaction can lower performance if
> the number of updates is sufficient to affect index usability and a VACUUM is
> really needed between them.  For example, a series of large data
> transformation statements on a single table or set of related tables should
> have VACCUUM statements between them, thus preventing you from putting them
> in a single transaction.
>
> Example, the series:
> 1. INSERT 10,000 ROWS INTO table_a;
> 2. UPDATE 100,000 ROWS IN table_a WHERE table_b;
> 3. UPDATE 100,000 ROWS IN table_c WHERE table_a;
>
> WIll almost certainly need a VACUUM or even VACUUM FULL table_a after 2),
> requiring you to split the update series into 2 transactions.  Otherwise, the
> "where table_a" condition in step 3) will be extremely slow.

Very good point.  One that points out the different mind set one needs
when dealing with pgsql.

> > > It can be dangerous though ... in the event of a power outage, for
> > > example, your database could be corrupted and difficult to recover.  So
> > > ... "at your own risk".
> >
> > No, the database will not be corrupted, at least not in my experience.
> > however, you MAY lose data from transactions that you thought were
> > committed.  I think Tom posted something about this a few days back.
>
> Hmmm ... have you done this?  I'd like the performance gain, but I don't want
> to risk my data integrity.  I've seen some awful things in databases (such as
> duplicate primary keys) from yanking a power cord repeatedly.

I have, with killall -9 postmaster, on several occasions during testing
under heavy parallel load.  I've never had 7.2.x fail because of this.


Re: performance of insert/delete/update

From
"Josh Berkus"
Date:
Scott,

> > > The absolutely most important thing to do to speed up inserts and
> > > updates
> > > is to squeeze as many as you can into one transaction.

I was discussing this on IRC, and nobody could verify this assertion.
  Do you have an example of bundling multiple writes into a transaction
giving a performance gain?

-Josh

Re: performance of insert/delete/update

From
Ron Johnson
Date:
On Fri, 2002-11-22 at 22:18, Josh Berkus wrote:
> Scott,
>
> > > > The absolutely most important thing to do to speed up inserts and
> > > > updates
> > > > is to squeeze as many as you can into one transaction.
>
> I was discussing this on IRC, and nobody could verify this assertion.
>   Do you have an example of bunlding multiple writes into a transaction
> giving a performance gain?

Unfortunately, I missed the beginning of this thread, but I do
know that eliminating as many indexes as possible is the answer.
If I'm going to insert "lots" of rows in an off-line situation,
then I'll drop *all* of the indexes, load the data, then re-index.
If deleting "lots", then I'll drop all but the 1 relevant index,
then re-index afterwards.

As for bundling multiple statements into a transaction to increase
performance, I think the questions are:
- how much disk IO does one BEGIN TRANSACTION do?  If it *does*
  do disk IO, then "bundling" *will* be more efficient, since
  less disk IO will be performed.
- are, for example, 500 COMMITs of small amounts of data more or
  less efficient than 1 COMMIT of a large chunk of data?  On the
  proprietary database that I use at work, efficiency goes up,
  then levels off at ~100 inserts per transaction.

Ron
--
+------------------------------------------------------------+
| Ron Johnson, Jr.     mailto:ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson  |
|                                                            |
| "they love our milk and honey, but preach about another    |
|  way of living"                                            |
|    Merle Haggard, "The Fighting Side Of Me"                |
+------------------------------------------------------------+


Re: performance of insert/delete/update

From
Josh Berkus
Date:
Ron,

> As for bundling multiple statements into a transaction to increase
> performance, I think the questions are:
> - how much disk IO does one BEGIN TRANSACTION do?  If it *does*
>   do disk IO, then "bundling" *will* be more efficient, since
>   less disk IO will be performed.
> - are, for example, 500 COMMITs of small amounts of data more or
>   less efficient than 1 COMMIT of a large chunk of data?  On the
>   proprietary database that I use at work, efficiency goes up,
>   then levels off at ~100 inserts per transaction.

That's because some commercial databases (MS SQL, Sybase) use an "unwinding
transaction log" method of updating.  That is, during a transaction, changes
are written only to the transaction log, and those changes are "played" to
the database only on a COMMIT.  It's an approach that is more efficient for
large transactions, but has the unfortunate side effect of *requiring* read
and write row locks for the duration of the transaction.

In Postgres, with MVCC, changes are written to the database immediately with a
new transaction ID and the new rows are "activated" on COMMIT.  So the
changes are written to the database as the statements are executed,
regardless.   This is less efficient for large transactions than the
"unwinding log" method, but has the advantage of eliminating read locks
entirely and most deadlock situations.

Under MVCC, then, I am not convinced that bundling a bunch of writes into one
transaction is faster until I see it demonstrated.  I certainly see no
performance gain on my system.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: performance of insert/delete/update

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Under MVCC, then, I am not convinced that bundling a bunch of writes into one
> transaction is faster until I see it demonstrated.  I certainly see no
> performance gain on my system.

Are you running with fsync off?

The main reason for bundling updates into larger transactions is that
each transaction commit requires an fsync on the WAL log.  If you have
fsync enabled, it is physically impossible to commit transactions faster
than one per revolution of the WAL disk, no matter how small the
transactions. (*) So it pays to make the transactions larger, not smaller.
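
As a rough worked example (assuming a 7200 RPM drive holding the WAL): 7200
revolutions per minute is 120 revolutions per second, so single-row
transactions with fsync on top out at roughly 120 commits per second, while
a transaction wrapping 1000 such rows pays that rotational wait only once.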

On my machine I see a sizable difference (more than 2x) in the rate at
which simple INSERT statements are processed as separate transactions
and as large batches --- if I have fsync on.  With fsync off, nearly no
difference.

            regards, tom lane

(*) See recent proposals from Curtis Faith in pgsql-hackers about how
we might circumvent that limit ... but it's there today.

Re: performance of insert/delete/update

From
Josh Berkus
Date:
Tom,

> On my machine I see a sizable difference (more than 2x) in the rate at
> which simple INSERT statements are processed as separate transactions
> and as large batches --- if I have fsync on.  With fsync off, nearly no
> difference.

I'm using fdatasync, which *does* perform faster than fsync on my system.
Could this make the difference?

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: performance of insert/delete/update

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
>> On my machine I see a sizable difference (more than 2x) in the rate at
>> which simple INSERT statements are processed as separate transactions
>> and as large batches --- if I have fsync on.  With fsync off, nearly no
>> difference.

> I'm using fdatasych, which *does* perform faster than fsych on my system.
> Could this make the difference?

No; you still have to write the data and wait for the disk to spin.
(FWIW, PG defaults to wal_sync_method = open_datasync on my system,
and that's what I used in checking the speed just now.  So I wasn't
actually executing any fsync() calls either.)

On lots of PC hardware, the disks are configured to lie and report write
complete as soon as they've accepted the data into their internal
buffers.  If you see very little difference between fsync on and fsync
off, or if you are able to benchmark transaction rates in excess of your
disk's RPM, you should suspect that your disk drive is lying to you.

As an example: in testing INSERT speed on my old HP box just now,
I got measured rates of about 16000 inserts/minute with fsync off, and
5700/min with fsync on (for 1 INSERT per transaction).  Knowing that my
disk drive is 6000 RPM, the latter number is credible.  On my PC I get
numbers way higher than the disk rotation rate :-(

            regards, tom lane

Re: performance of insert/delete/update

From
Josh Berkus
Date:
Tom,

> As an example: in testing INSERT speed on my old HP box just now,
> I got measured rates of about 16000 inserts/minute with fsync off, and
> 5700/min with fsync on (for 1 INSERT per transaction).  Knowing that my
> disk drive is 6000 RPM, the latter number is credible.  On my PC I get
> numbers way higher than the disk rotation rate :-(

Thanks for the info.  As long as I have your ear, what's your opinion on the
risk level of running with fsync off on a production system?   I've seen a
lot of posts on this list downplaying the danger, but I'm a bit paranoid.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: performance of insert/delete/update

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Thanks for the info.  As long as I have your ear, what's your opinion on the
> risk level of running with fsynch off on a production system?

Depends on how much you trust your hardware, kernel, and power source.

Fsync off does not introduce any danger from Postgres crashes --- we
always write data out of userspace to the kernel before committing.
The question is whether writes can be relied on to get to disk once
the kernel has 'em.

There is a definite risk of data corruption (not just lost transactions,
but actively inconsistent database contents) if you suffer a
system-level crash while running with fsync off.  The theory of WAL
(which remember means write *ahead* log) is that it protects you from
data corruption as long as WAL records always hit disk before the
associated changes in database data files do.  Then after a crash you
can replay the WAL to make sure you have actually done all the changes
described by each readable WAL record, and presto you're consistent up
to the end of the readable WAL.  But if data file writes can get to disk
in advance of their WAL record, you could have a situation where some
but not all changes described by a WAL record are in the database after
a system crash and recovery.  This could mean incompletely applied
transactions, broken indexes, or who knows what.

When you get right down to it, what we use fsync for is to force write
ordering --- Unix kernels do not guarantee write ordering any other way.
We use it to ensure WAL records hit disk before data file changes do.

Bottom line: I wouldn't run with fsync off in a mission-critical
database.  If you're prepared to accept a risk of having to restore from
your last backup after a system crash, maybe it's okay.

            regards, tom lane

Re: performance of insert/delete/update

From
Josh Berkus
Date:
Tom,

> When you get right down to it, what we use fsync for is to force write
> ordering --- Unix kernels do not guarantee write ordering any other way.
> We use it to ensure WAL records hit disk before data file changes do.
>
> Bottom line: I wouldn't run with fsync off in a mission-critical
> database.  If you're prepared to accept a risk of having to restore from
> your last backup after a system crash, maybe it's okay.

Thanks for that overview.  Sadly, even with fsync on, I was forced to restore
from backup because the data needs to be 100% reliable and the crash was due
to a disk lockup on a checkpoint ... beyond the ability of WAL to deal with,
I think.

One last, last question:  I was just asked a question on IRC, and I can't find
docs defining fsync, fdatasync, open_sync, and open_datasync beyond section
11.3, which just says that they are all sync methods.   Are there docs?


--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: performance of insert/delete/update

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> One last, last question: I was just asked a question on IRC, and I
> can't find docs defining fsync, fdatasync, open_sync, and
> open_datasync beyond section 11.3 which just says that they are all
> sync methods.  Are there docs?

Section 11.3 of what?

The only mention of open_datasync that I see in the docs is in the
Admin Guide chapter 3:
http://developer.postgresql.org/docs/postgres/runtime-config.html#RUNTIME-CONFIG-WAL

which saith

WAL_SYNC_METHOD (string)

     Method used for forcing WAL updates out to disk. Possible values
     are FSYNC (call fsync() at each commit), FDATASYNC (call
     fdatasync() at each commit), OPEN_SYNC (write WAL files with open()
     option O_SYNC), or OPEN_DATASYNC (write WAL files with open()
     option O_DSYNC). Not all of these choices are available on all
     platforms. This option can only be set at server start or in the
     postgresql.conf file.

This may not help you much to decide which to use :-(, but it does tell
you what they are.
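
Changing it is a one-line edit to postgresql.conf, e.g. (the value is just
an example; which method is fastest and even available depends on the
platform):

# postgresql.conf
wal_sync_method = fdatasync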

            regards, tom lane

Re: performance of insert/delete/update

From
"scott.marlowe"
Date:
On Fri, 22 Nov 2002, Josh Berkus wrote:

> Scott,
>
> > > > The absolutely most important thing to do to speed up inserts and
> > > > updates
> > > > is to squeeze as many as you can into one transaction.
>
> I was discussing this on IRC, and nobody could verify this assertion.
>   Do you have an example of bunlding multiple writes into a transaction
> giving a performance gain?

Yes, my own experience.

It's quite easy to test if you have a database with a large table to play
with: use pg_dump to dump a table with the -d switch (which makes the dump
use insert statements).  Then make two versions of the dump, one which has a
begin;end; pair around all the inserts and one that doesn't, then use psql
-e to restore both dumps.  The difference is HUGE: around 10 to 20 times
faster with the begin/end pairs.
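
For reference, a sketch of that test as a shell session (database and table
names are made up; -d is the old pg_dump switch for INSERT-style dumps):

pg_dump -d -t bigtable mydb > plain.sql
( echo "BEGIN;"; cat plain.sql; echo "COMMIT;" ) > wrapped.sql

createdb test_restore
time psql -e -f plain.sql test_restore      # one transaction per INSERT
dropdb test_restore; createdb test_restore
time psql -e -f wrapped.sql test_restore    # all INSERTs in one transaction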

I'd think that anyone who's used postgresql for more than a few months
could corroborate my experience.


Re: performance of insert/delete/update

From
Josh Berkus
Date:
Scott,

> It's quite easy to test if you have a database with a large table to play
> with, use pg_dump to dump a table with the -d switch (makes the dump use
> insert statements.)  Then, make two versions of the dump, one which has a
> begin;end; pair around all the inserts and one that doesn't, then use psql
> -e to restore both dumps.  The difference is HUGE.  Around 10 to 20 times
> faster with the begin end pairs.
>
> I'd think that anyone who's used postgresql for more than a few months
> could corroborate my experience.

Ouch!

No need to get testy about it.

Your test works as you said; the way I tried testing it before was different.
Good to know.   However, this approach is only useful if you are doing
rapid-fire updates or inserts coming off a single connection.  But then it is
*very* useful.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: performance of insert/delete/update

From
"scott.marlowe"
Date:
On Mon, 25 Nov 2002, Josh Berkus wrote:

> Scott,
>
> > It's quite easy to test if you have a database with a large table to play
> > with, use pg_dump to dump a table with the -d switch (makes the dump use
> > insert statements.)  Then, make two versions of the dump, one which has a
> > begin;end; pair around all the inserts and one that doesn't, then use psql
> > -e to restore both dumps.  The difference is HUGE.  Around 10 to 20 times
> > faster with the begin end pairs.
> >
> > I'd think that anyone who's used postgresql for more than a few months
> > could corroborate my experience.
>
> Ouch!
>
> No need to get testy about it.
>
> Your test works as you said; the way I tried testing it before was different.
> Good to know.   However, this approach is only useful if you are doing
> rapidfire updates or inserts coming off a single connection.  But then it is
> *very* useful.

I didn't mean that in a testy way, it's just that after you've sat through
a fifteen-minute wait while 1000 records are inserted, you pretty
quickly switch to the method of inserting them all in one big
transaction.  That's all.

Note that the opposite is what really gets people in trouble.  I've seen
folks inserting rather large amounts of data, say into ten or 15 tables,
and their web servers were crawling under parallel load.  Then, they put
them into a single transaction and they just flew.

The funny thing is, they've often avoided transactions because they
figured they'd be slower than just inserting the rows, and you kinda have
to make them sit down first before you show them the performance increase
from putting all those inserts into a single transaction.

No offense meant, really.  It's just that you seemed to really doubt that
putting things into one transaction helped, and putting things into one
big transaction is like the very first postgresql lesson a lot of
newcomers learn. :-)


Re: performance of insert/delete/update

From
Tim Gardner
Date:
>The funny thing it, they've often avoided transactions because they
>figured they'd be slower than just inserting the rows, and you kinda have
>to make them sit down first before you show them the performance increase
>from putting all those inserts into a single transaction.
>
>No offense meant, really.  It's just that you seemed to really doubt that
>putting things into one transaction helped, and putting things into one
>big transaction if like the very first postgresql lesson a lot of
>newcomers learn. :-)

Scott,

I'm new to postgresql, and as you suggested, this is
counter-intuitive to me.  I would have thought that having to store
all the inserts to be able to roll them back would take longer.  Is
my thinking wrong or not relevant?  Why is this not the case?

Thanks,
Tim

Re: performance of insert/delete/update

From
Rod Taylor
Date:
> I'm new to postgresql, and as you suggested, this is
> counter-intuitive to me.  I would have thought that having to store
> all the inserts to be able to roll them back would take longer.  Is
> my thinking wrong or not relevant?  Why is this not the case?

Typically that is the case.  But Postgresql switches it around a little
bit.  Different trade-offs.  No rollback log, but other processes are
forced to go through your left-over garbage (hence 'vacuum').

It's still kinda slow with hundreds of connections (as compared to
Oracle) -- but close enough that a license fee -> hardware purchase
funds transfer more than makes up for it.

Get yourself a 1GB battery-backed ramdisk on its own SCSI chain for WAL
and it'll fly no matter what size of transaction you use ;)

--
Rod Taylor <rbt@rbt.ca>


Re: performance of insert/delete/update

From
"scott.marlowe"
Date:
On Mon, 25 Nov 2002, Tim Gardner wrote:

> >The funny thing it, they've often avoided transactions because they
> >figured they'd be slower than just inserting the rows, and you kinda have
> >to make them sit down first before you show them the performance increase
> >from putting all those inserts into a single transaction.
> >
> >No offense meant, really.  It's just that you seemed to really doubt that
> >putting things into one transaction helped, and putting things into one
> >big transaction if like the very first postgresql lesson a lot of
> >newcomers learn. :-)
>
> Scott,
>
> I'm new to postgresql, and as you suggested, this is
> counter-intuitive to me.  I would have thought that having to store
> all the inserts to be able to roll them back would take longer.  Is
> my thinking wrong or not relevant?  Why is this not the case?

Your thinking on this is wrong, and it is counter-intuitive to think that
a transaction would speed things up.  Postgresql is very different from
other databases.

Postgresql was designed from day one as a transactional database.  Which
is why it was so bothersome that an Oracle marketroid recently was telling
the .org folks why they shouldn't use Postgresql because it didn't have
transactions.  Postgresql may have a few warts here and there, but not
supporting transactions has NEVER been a problem for it.

There are two factors that make Postgresql so weird in regards to
transactions.  One is that everything happens in a transaction (we won't
mention truncate for a while, it's the only exception I know of.)

The next factor that makes for fast inserts of large amounts of data in a
transaction is MVCC.  With Oracle and many other databases, transactions
are written into a seperate log file, and when you commit, they are
inserted into the database as one big group.  This means you write your
data twice, once into the transaction log, and once into the database.

With Postgresql's implementation of MVCC, all your data are inserted in
real time, with a transaction date that makes the other clients ignore
them (mostly, other read committed transactions may or may not see them.)

If there are indexes to update, they are updated in the same "invisible
until committed" way.

All this means that your inserts don't block anyone else's reads as well.

This means that when you commit, all postgresql does is make them visible.

In the event you roll back a transaction, the tuples are all just marked
as dead and they get ignored.

It's interesting when you work with folks who came from other databases.
My coworker, who's been using Postgresql for about 2 years now, had an
interesting experience when he first started here.  He was inserting
something like 10,000 rows.  He comes over and tells me there must be
something wrong with the database, as his inserts have been running for 10
minutes, and he's not even halfway through.  So I had him stop the
inserts, clean out the rows (it was a new table for a new project) and
wrap all 10,000 inserts into a transaction.  What had been running for 10
minutes now ran in about 30 seconds.

He was floored.

Well, good luck on using postgresql, and definitely keep in touch with the
performance and general mailing lists.  They're a wealth of useful info.


Re: performance of insert/delete/update

From
"scott.marlowe"
Date:
On 25 Nov 2002, Rod Taylor wrote:

> > I'm new to postgresql, and as you suggested, this is
> > counter-intuitive to me.  I would have thought that having to store
> > all the inserts to be able to roll them back would take longer.  Is
> > my thinking wrong or not relevant?  Why is this not the case?
>
> Typically that is the case.  But Postgresql switches it around a little
> bit.  Different trade-offs.  No rollback log, but other processes are
> forced to go through you're left over garbage (hence 'vacuum').

Yeah, which means you always need to do a vacuum on a table after a lot of
updates/deletes.  And analyze after a lot of inserts/updates/deletes.
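
In the pre-autovacuum world that usually means a cron job; an illustrative
crontab entry (path and schedule are assumptions to adjust):

# vacuum and analyze the busy database once an hour
0 * * * *  /usr/local/pgsql/bin/vacuumdb --analyze --quiet mydb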

> It's still kinda slow with hundreds of connections (as compared to
> Oracle) -- but close enough that a license fee -> hardware purchase
> funds transfer more than makes up for it.

Ain't it amazing how much hardware a typical Oracle license can buy? ;^)

Heck, even the license cost of MS-SQL Server is enough to buy a nice quad
Xeon with all the trimmings nowadays.  Then you can probably still have
enough left over for one of the pgsql developers to fly out and train your
folks on it.


Re: performance of insert/delete/update

From
Tim Gardner
Date:
>With Postgresql's implementation of MVCC, all your data are inserted in
>real time, with a transaction date that makes the other clients ignore
>them (mostly, other read committed transactions may or may not see them.)
>
>If there are indexes to update, they are updated in the same "invisible
>until committed" way.
>
>All this means that your inserts don't block anyone else's reads as well.
>
>This means that when you commit, all postgresql does is make them visible.

scott,

Exactly the kind of explanation/understanding I was hoping for!

Thank you!

Tim

Re: performance of insert/delete/update

From
"scott.marlowe"
Date:
On Mon, 25 Nov 2002, scott.marlowe wrote:

> On Mon, 25 Nov 2002, Tim Gardner wrote:
>
> > I'm new to postgresql, and as you suggested, this is
> > counter-intuitive to me.  I would have thought that having to store
> > all the inserts to be able to roll them back would take longer.  Is
> > my thinking wrong or not relevant?  Why is this not the case?
>
> Your thinking on this is wrong, and it is counter-intuitive to think that
> a transaction would speed things up.  Postgresql is very different from
> other databases.

Sorry that came out like that; I meant to write:

I meant to add in there that I thought the same way at first, and only
after a little trial and much error did I realize that I was thinking in
terms of how other databases did things.  I.e. most people make the same
mistake when starting out with pgsql.


Re: performance of insert/delete/update

From
"Rich Scott"
Date:
I have seen a number of real-world situations where bundling inserts into
transactions made a considerable difference - sometimes as much as a 100x
speed-up, and not just in Postgresql databases, but also in commercial
systems (my experience is in Oracle & Sybase).  I've often used an idiom of
building up rows until I hit some high-water mark, and then inserting those
rows in one fell swoop - it's almost always measurably faster than
one-at-a-time.

Sidebar: a number of years ago, while contracting at a regional telephone
company, I became peripherally enmired in a gigantic billing-system-makeover
fiasco.  Upon initial deployment, the system was so slow at processing that
it was taking about 30 hours for each day of billing data.  After a week or
so, when it became apparent that fundamental Cash Flow was threatened, there
were multi-hour conference calls, in which various VPs called for massive
h/w upgrades and/or lawsuits against Oracle.  An astute cohort of mine asked
to see some of the code, and found out that the original developers (at the
telco) had created a bloated and slow control system in C++, using
semaphores or somesuch, to *serialize* inserts/updates/deletes, and so they
had gigantic home-built queues of insert jobs.  Not only were they not
bundling updates in transactions, they were only ever doing one transaction
at a time.  (Apparently, they never learned RDBMS fundamentals.)  He told
them to rip out all that code, and let Oracle (like any other decent RDBMS)
handle the update ordering.  The resultant speed-up factor was several
hundred times.

-R


On Mon, 25 Nov 2002 15:59:16 -0700 (MST), "scott.marlowe"
<scott.marlowe@ihs.com> said:
> On Mon, 25 Nov 2002, Josh Berkus wrote:
>
> > Scott,
> >
> > > It's quite easy to test if you have a database with a large table to play
> > > with, use pg_dump to dump a table with the -d switch (makes the dump use
> > > insert statements.)  Then, make two versions of the dump, one which has a
> > > begin;end; pair around all the inserts and one that doesn't, then use psql
> > > -e to restore both dumps.  The difference is HUGE.  Around 10 to 20 times
> > > faster with the begin end pairs.
> > >
> > > I'd think that anyone who's used postgresql for more than a few months
> > > could corroborate my experience.
> >
> > Ouch!
> >
> > No need to get testy about it.
> >
> > Your test works as you said; the way I tried testing it before was different.
> > Good to know.   However, this approach is only useful if you are doing
> > rapidfire updates or inserts coming off a single connection.  But then it is
> > *very* useful.
>
> I didn't mean that in a testy way, it's just that after you've sat
> through
> a fifteen minute wait while a 1000 records are inserted, you pretty
> quickly switch to the method of inserting  them all in one big
> transaction.  That's all.
>
> Note that the opposite is what really gets people in trouble.  I've seen
> folks inserting rather large amounts of data, say into ten or 15 tables,
> and their web servers were crawling under parallel load.  Then, they put
> them into a single transaction and they just flew.
>
> The funny thing it, they've often avoided transactions because they
> figured they'd be slower than just inserting the rows, and you kinda have
> to make them sit down first before you show them the performance increase
> from putting all those inserts into a single transaction.
>
> No offense meant, really.  It's just that you seemed to really doubt that
> putting things into one transaction helped, and putting things into one
> big transaction if like the very first postgresql lesson a lot of
> newcomers learn. :-)
>

Re: performance of insert/delete/update

From
Josh Berkus
Date:
Scott,

> No offense meant, really.  It's just that you seemed to really doubt that
> putting things into one transaction helped, and putting things into one
> big transaction if like the very first postgresql lesson a lot of
> newcomers learn. :-)

Not so odd, if you think about it.  After all, this approach is only useful
for a series of small update/insert statements on a single connection.
Thinking about it, I frankly never do this except as part of a stored
procedure ... which, in Postgres, is automatically a transaction.

I'm lucky enough that my data loads have all been adaptable to COPY
statements, which bypasses this issue completely.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: performance of insert/delete/update

From
Ron Johnson
Date:
On Mon, 2002-11-25 at 18:23, scott.marlowe wrote:
> On Mon, 25 Nov 2002, Tim Gardner wrote:
>
[snip]
>
> There are two factors that make Postgresql so weird in regards to
> transactions.  One it that everything happens in a transaction (we won't
> mention truncate for a while, it's the only exception I know of.)

Why is this so weird?  Do I use the /other/ weird RDBMS?  (Rdb/VMS)

> The next factor that makes for fast inserts of large amounts of data in a
> transaction is MVCC.  With Oracle and many other databases, transactions
> are written into a seperate log file, and when you commit, they are
> inserted into the database as one big group.  This means you write your
> data twice, once into the transaction log, and once into the database.

You are just deferring the pain.  Whereas others must flush from log
to "database files", they do not have to VACUUM or VACUUM ANALYZE.

> With Postgresql's implementation of MVCC, all your data are inserted in
> real time, with a transaction date that makes the other clients ignore
> them (mostly, other read committed transactions may or may not see them.)

Is this unusual?  (Except that Rdb/VMS uses a 64-bit integer (a
Transaction Sequence Number) instead of a timestamp, because Rdb,
coming from VAX/VMS, is natively cluster-aware, and it's not
guaranteed that all nodes have the exact same timestamp.)

[snip]
> In the event you roll back a transaction, the tuples are all just marked
> as dead and they get ignored.

What if you are in a 24x365 environment?  Doing a VACUUM ANALYZE would
really slow down the nightly operations.

> It's interesting when you work with folks who came from other databases.
> My coworker, who's been using Postgresql for about 2 years now, had an
> interesting experience when he first started here.  He was inserting
> something like 10,000 rows.  He comes over and tells me there must be
> something wrong with the database, as his inserts have been running for 10
> minutes, and he's not even halfway through.  So I had him stop the
> inserts, clean out the rows (it was a new table for a new project) and
> wrap all 10,000 inserts into a transaction.  What had been running for 10
> minutes now ran in about 30 seconds.

Again, why is this so unusual?????

--
+------------------------------------------------------------+
| Ron Johnson, Jr.     mailto:ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson  |
|                                                            |
| "they love our milk and honey, but preach about another    |
|  way of living"                                            |
|    Merle Haggard, "The Fighting Side Of Me"                |
+------------------------------------------------------------+


Re: performance of insert/delete/update

From
Ron Johnson
Date:
On Mon, 2002-11-25 at 18:43, Rich Scott wrote:
[snip]
> upgrades and/or lawsuits against Oracle. An astute cohort of mine asked
> to
> see some of the code, and found out that the original developers (at the
> telco)
> had created a bloated and slow control system in C++, using semaphores or
> somesuch,
> to *serialize* inserts/updates/deletes, and so they had gigantic
> home-built
> queues of insert jobs. Not only were they not bundling updates in
> transactions,
> they were only ever doing one transaction at a time. (Apparently, they
> never
> learned RDBMS fundamentals.) He told them to rip out all that code, and
> let
> Oracle (like any other decent RDBMS) handle the update ordering. The
> resultant
> speed-up factor was several hundred times.

Just goes to show the difference between RDBMSs.  Even with the biggest
Alphas possible at the time, our Customer Service Center app was
crawling because of lock conflicts (each transaction affected ~10-12
tables).  Rdb/VMS works best with a TP monitor (specifically ACMS).
We weren't using one, and weren't handling lock conflicts in the best
way.  Thus, the slowdowns.

The developers finally wrote a home-grown dedicated TP-like server,
using DEC Message Queue and /serialized/ data flow.  Without all
of the lock conflicts, performance warped forward.

The reason for this, I think, is that Oracle serializes things itself,
whereas Rdb/VMS expects ACMS to do the serialization.  (Since both
Rdb & ACMS were both written by DEC, that's understandable I think.)

--
+------------------------------------------------------------+
| Ron Johnson, Jr.     mailto:ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson  |
|                                                            |
| "they love our milk and honey, but preach about another    |
|  way of living"                                            |
|    Merle Haggard, "The Fighting Side Of Me"                |
+------------------------------------------------------------+


Re: performance of insert/delete/update

From
Tom Lane
Date:
Ron Johnson <ron.l.johnson@cox.net> writes:
> On Mon, 2002-11-25 at 18:23, scott.marlowe wrote:
>> The next factor that makes for fast inserts of large amounts of data in a
>> transaction is MVCC.  With Oracle and many other databases, transactions
>> are written into a seperate log file, and when you commit, they are
>> inserted into the database as one big group.  This means you write your
>> data twice, once into the transaction log, and once into the database.

> You are just deferring the pain.  Whereas others must flush from log
> to "database files", they do not have to VACUUM or VACUUM ANALYZE.

Sure, it's just shuffling the housekeeping work from one place to
another.  The thing that I like about Postgres' approach is that we
put the housekeeping in a background task (VACUUM) rather than in the
critical path of foreground transaction commit.

            regards, tom lane

Re: performance of insert/delete/update

From
Tom Lane
Date:
Tim Gardner <tgardner@codeHorse.com> writes:
>> All this means that your inserts don't block anyone else's reads as well.
>> This means that when you commit, all postgresql does is make them visible.

> Exactly the kind of explanation/understanding I was hoping for!

There's another point worth making.  What Scott was pointing out is that
whether you commit or roll back a transaction costs about the same, in
Postgres, as far as tuple update processing is concerned.  At the end of
a transaction, we have both new (inserted/updated) and old
(deleted/replaced) tuples laying about in the database.  Commit marks
the transaction committed in pg_clog; abort marks it aborted instead;
neither one lifts a finger to touch the tuples.  (Subsequent visitors
to the tuples will mark them "good" or "dead" based on consulting
pg_clog, but we don't try to do that during transaction commit.)

But having said all that, transaction commit is more expensive than
transaction abort, because we have to flush the transaction commit
WAL record to disk before we can report "transaction successfully
committed".  That means waiting for the disk to spin.  Transaction abort
doesn't have to wait --- that's because if there's a crash and the abort
record never makes it to disk, the default assumption on restart will be
that the transaction aborted, anyway.

So the basic reason that it's worth batching multiple updates into one
transaction is that you only wait for the commit record flush once,
not once per update.  This makes no difference worth mentioning if your
updates are big, but on modern hardware you can update quite a few
individual rows in the time it takes the disk to spin once.

(BTW, if you set fsync = off, then the performance difference goes away,
because we don't wait for the commit record to flush to disk ... but
then you become vulnerable to problems after a system crash.)

            regards, tom lane

Re: performance of insert/delete/update

From
Rod Taylor
Date:
On Mon, 2002-11-25 at 19:30, scott.marlowe wrote:
> On 25 Nov 2002, Rod Taylor wrote:
>
> > > I'm new to postgresql, and as you suggested, this is
> > > counter-intuitive to me.  I would have thought that having to store
> > > all the inserts to be able to roll them back would take longer.  Is
> > > my thinking wrong or not relevant?  Why is this not the case?
> >
> > Typically that is the case.  But Postgresql switches it around a little
> > bit.  Different trade-offs.  No rollback log, but other processes are
> > forced to go through you're left over garbage (hence 'vacuum').
>
> Yeah, which means you always need to do a vacuum on a table after a lot of
> updates/deletes.  And analyze after a lot of inserts/updates/deletes.

A good auto-vacuum daemon will help that out :)  Not really any
different than an OO db's garbage-collection process -- except PG's vacuum
is several orders of magnitude faster.
--
Rod Taylor <rbt@rbt.ca>


Re: performance of insert/delete/update

From
Ron Johnson
Date:
On Mon, 2002-11-25 at 21:30, Tom Lane wrote:
> Ron Johnson <ron.l.johnson@cox.net> writes:
> > On Mon, 2002-11-25 at 18:23, scott.marlowe wrote:
> >> The next factor that makes for fast inserts of large amounts of data in a
> >> transaction is MVCC.  With Oracle and many other databases, transactions
> >> are written into a seperate log file, and when you commit, they are
> >> inserted into the database as one big group.  This means you write your
> >> data twice, once into the transaction log, and once into the database.
>
> > You are just deferring the pain.  Whereas others must flush from log
> > to "database files", they do not have to VACUUM or VACUUM ANALYZE.
>
> Sure, it's just shuffling the housekeeping work from one place to
> another.  The thing that I like about Postgres' approach is that we
> put the housekeeping in a background task (VACUUM) rather than in the
> critical path of foreground transaction commit.

If you have a quiescent point somewhere in the middle of the night...

It's all about differing philosophies, though, and there's no way
that Oracle will re-write Rdb/VMS (they bought it from DEC in 1997
for its high-volume OLTP technologies) and you all won't re-write
Postgres...

--
+------------------------------------------------------------+
| Ron Johnson, Jr.     mailto:ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson  |
|                                                            |
| "they love our milk and honey, but preach about another    |
|  way of living"                                            |
|    Merle Haggard, "The Fighting Side Of Me"                |
+------------------------------------------------------------+


[HACKERS] Realtime VACUUM, was: performance of insert/delete/update

From
"Curtis Faith"
Date:
tom lane wrote:
> Sure, it's just shuffling the housekeeping work from one place to
> another.  The thing that I like about Postgres' approach is that we
> put the housekeeping in a background task (VACUUM) rather than in the
> critical path of foreground transaction commit.

Thinking with my marketing hat on, MVCC would be a much bigger win if VACUUM
was not required (or was done automagically). The need for periodic VACUUM
just gives ammunition to the PostgreSQL opponents who can claim we are
deferring work but that it amounts to the same thing.

A fully automatic background VACUUM will significantly reduce but will not
eliminate this perceived weakness.

However, it always seemed to me there should be some way to reuse the space
more dynamically and quickly than a background VACUUM, thereby reducing the
percentage of tuples that are expired in heavy-update cases.  If only a very
tiny number of tuples on the disk are expired, this will reduce the aggregate
performance/space penalty of MVCC into insignificance for the majority of
uses.

Couldn't we reuse tuple and index space as soon as there are no transactions
that depend on the old tuple or index values?  I have imagined that this was
always part of the long-term master plan.

Couldn't we keep a list of dead tuples in shared memory and look in the list
first when deciding where to place new values for inserts or updates so we
don't have to rely on VACUUM (even a background one)? If there are expired
tuple slots in the list these would be used before allocating a new slot from
the tuple heap.

The only issue is determining the lowest transaction ID for in-process
transactions which seems relatively easy to do (if it's not already done
somewhere).

In the normal shutdown and startup case, a tuple VACUUM could be performed
automatically. This would normally be very fast since there would not be many
tuples in the list.

Index slots would be handled differently since these cannot be substituted
one for another. However, these could be recovered as part of every index
page update. Pages would be scanned before being written and any expired
slots that had transaction ID's lower than the lowest active slot would be
removed. This could be done for non-leaf pages as well and would result in
only reorganizing a page that is already going to be written thereby not
adding much to the overall work.

I don't think it will cause a problem if internal pages contain pointers to
values that, because of this partial expired-entry elimination, are no longer
present in the leaf nodes further down the tree, since searches and scans
will still work fine.

Does VACUUM do something that could not be handled in this realtime manner?

- Curtis



Re: performance of insert/delete/update

From
Andrew Sullivan
Date:
On Mon, Nov 25, 2002 at 05:30:00PM -0700, scott.marlowe wrote:
>
> Ain't it amazing how much hardware a typical Oracle license can buy? ;^)

Not to mention the hardware budget for a typical Oracle installation.

A
--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: performance of insert/delete/update

From
Andrew Sullivan
Date:
On Mon, Nov 25, 2002 at 07:41:03PM -0600, Ron Johnson wrote:
>
> What if you are in a 24x365 environment?  Doing a VACUUM ANALYZE would
> really slow down the nightly operations.

Why?  After upgrading to 7.2, we find it a good idea to do frequent
vacuum analyse on frequently-changed tables.  It doesn't block, and
if you vacuum frequently enough, it goes real fast.
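
If you want to convince yourself, just time a vacuum of one of your hot
tables; the database and table names below are only placeholders:

# how long does a frequent vacuum of a single busy table really take?
time vacuumdb -z -t hot_table mydb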

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: performance of insert/delete/update

From
"scott.marlowe"
Date:
On Tue, 26 Nov 2002, Andrew Sullivan wrote:

> On Mon, Nov 25, 2002 at 07:41:03PM -0600, Ron Johnson wrote:
> >
> > What if you are in a 24x365 environment?  Doing a VACUUM ANALYZE would
> > really slow down the nightly operations.
>
> Why?  After upgrading to 7.2, we find it a good idea to do frequent
> vacuum analyse on frequently-changed tables.  It doesn't block, and
> if you vacuum frequently enough, it goes real fast.

For example, I just ran pgbench -c 20 -t 200 (20 concurrent clients, 200
transactions each) with a script in the background that looked like this:

#!/bin/bash
while true; do
        vacuumdb -z postgres
done

(i.e. run vacuumdb with analyze against the database continuously.)

Output of top:

71 processes: 63 sleeping, 8 running, 0 zombie, 0 stopped
CPU0 states: 66.2% user, 25.1% system,  0.0% nice,  8.1% idle
CPU1 states: 79.4% user, 18.3% system,  0.0% nice,  1.2% idle
Mem:   254660K av,  249304K used,    5356K free,   26736K shrd,   21720K buff
Swap: 3084272K av,    1300K used, 3082972K free                  142396K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
21381 postgres  11   0  1304 1304   868 S    10.8  0.5   0:00 pgbench
21393 postgres  14   0  4832 4832  4116 R     8.4  1.8   0:00 postmaster
21390 postgres   9   0  4880 4880  4164 S     7.8  1.9   0:00 postmaster
21385 postgres  14   0  4884 4884  4168 R     6.7  1.9   0:00 postmaster
21399 postgres   9   0  4768 4768  4076 S     6.3  1.8   0:00 postmaster
21402 postgres   9   0  4776 4776  4076 S     6.1  1.8   0:00 postmaster
21383 postgres  14   0  4828 4828  4112 R     5.9  1.8   0:00 postmaster
21386 postgres  14   0  4872 4872  4156 R     5.9  1.9   0:00 postmaster
21392 postgres   9   0  4820 4820  4104 S     5.9  1.8   0:00 postmaster
21409 postgres  11   0  4600 4600  3544 R     5.8  1.8   0:00 postmaster
21387 postgres   9   0  4824 4824  4108 S     5.4  1.8   0:00 postmaster
21394 postgres   9   0  4808 4808  4092 S     5.4  1.8   0:00 postmaster
21391 postgres   9   0  4816 4816  4100 S     5.0  1.8   0:00 postmaster
21398 postgres   9   0  4796 4796  4088 S     5.0  1.8   0:00 postmaster
21384 postgres   9   0  4756 4756  4040 R     4.8  1.8   0:00 postmaster
21389 postgres   9   0  4788 4788  4072 S     4.8  1.8   0:00 postmaster
21397 postgres   9   0  4772 4772  4056 S     4.6  1.8   0:00 postmaster
21388 postgres   9   0  4780 4780  4064 S     4.4  1.8   0:00 postmaster
21396 postgres   9   0  4756 4756  4040 S     4.3  1.8   0:00 postmaster
21395 postgres  14   0  4760 4760  4044 S     4.1  1.8   0:00 postmaster
21401 postgres  14   0  4736 4736  4036 R     4.1  1.8   0:00 postmaster
21400 postgres   9   0  4732 4732  4028 S     2.9  1.8   0:00 postmaster
21403 postgres   9   0  1000 1000   820 S     2.4  0.3   0:00 vacuumdb
21036 postgres   9   0  1056 1056   828 R     2.0  0.4   0:27 top
18615 postgres   9   0  1912 1912  1820 S     1.1  0.7   0:01 postmaster
21408 postgres   9   0   988  988   804 S     0.7  0.3   0:00 psql

So, pgbench is the big eater of CPU at about 10%, each postmaster uses about
5%, and vacuumdb uses 2.4%.  Note that after a second or so vacuumdb's CPU use
drops off to 0% until the current run finishes and the next one starts.  To be
fair, the output of pgbench without vacuumdb running (but with top still
running) was:

number of clients: 20
number of transactions per client: 200
number of transactions actually processed: 4000/4000
tps = 54.428632 (including connections establishing)
tps = 54.847276 (excluding connections establishing)

While the output with the vacuumdb running continuously was:

number of clients: 20
number of transactions per client: 200
number of transactions actually processed: 4000/4000
tps = 52.114343 (including connections establishing)
tps = 52.873435 (excluding connections establishing)

So, the difference in performance was around 4% slower.

I'd hardly consider that a big hit against the database.

Note that in every test I've made up and run, the difference is at most 5%
with vacuumdb -z running continuously in the background.  Big text fields,
lots of math, lots of fks, etc...

Yes, vacuum WAS a problem long ago, but since 7.2 came out it's only a
"problem" in terms of remembering to run it.


Re: performance of insert/delete/update

From
Andrew Sullivan
Date:
On Tue, Nov 26, 2002 at 11:06:47AM -0700, scott.marlowe wrote:
> So, the difference in performance was around 4% slower.
>
> I'd hardly consider that a big hit against the database.
>
> Note that in every test I've made up and run, the difference is at most 5%
> with vacuumdb -z running continuously in the background.  Big text fields,
> lots of math, lots of fks, etc...

Also, it's important to remember that you may see a considerable
improvement in efficiency of some queries if you vacuum often (it's
partly dependent on the turnover in your database -- if it never
changes, you don't need to vacuum often).  So a 5% hit in regular
performance may be worth it over the long haul, if certain queries
are way cheaper to run.  (That is, while you may get 4% slower
performance overall, if the really slow queries are much faster, the
fast queries running slower may well be worth it.  In my case,
certainly, I think it is.)

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: performance of insert/delete/update

From
"scott.marlowe"
Date:
On Tue, 26 Nov 2002, Andrew Sullivan wrote:

> On Tue, Nov 26, 2002 at 11:06:47AM -0700, scott.marlowe wrote:
> > So, the difference in performance was around 4% slower.
> >
> > I'd hardly consider that a big hit against the database.
> >
> > Note that in every test I've made up and run, the difference is at most 5%
> > with vacuumdb -z running continuously in the background.  Big text fields,
> > lots of math, lots of fks, etc...
>
> Also, it's important to remember that you may see a considerable
> improvement in efficiency of some queries if you vacuum often, (it's
> partly dependent on the turnover in your database -- if it never
> changes, you don't need to vacuum often).  So a 5% hit in regular
> performance may be worth it over the long haul, if certain queries
> are way cheaper to run.  (That is, while you may get 4% slower
> performance overall, if the really slow queries are much faster, the
> fast queries running slower may well be worth it.  In my case,
> certainly, I think it is.)

Agreed.  We used to run vacuumdb only at night when we were running 7.1,
and we had a script to detect if it had hung or anything.  I.e. vacuuming
was still a semi-dangerous activity.  I now have it set to run every hour
(the -z and -a switches to vacuumdb).  I'd run it more often but we just
don't have enough load to warrant it.
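
For anyone setting this up, something along these lines in the postgres
user's crontab is all it takes (the schedule and log path are just an
example):

# vacuum-analyse every database once an hour
0 * * * *  vacuumdb -z -a >> /var/log/pgsql/vacuumdb.log 2>&1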


Re: [HACKERS] Realtime VACUUM, was: performance of insert/delete/update

From
Bruce Momjian
Date:
Good ideas.  I think the master solution is to hook the statistics
daemon information into an automatic vacuum that could _know_ which
tables need attention.
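
Just to illustrate the idea (a rough sketch, not a design), a cron-driven
script could approximate it today by reading the stats views, assuming a
release that has pg_stat_user_tables and pg_stat_reset(); the database name
and threshold are placeholders, and schema qualification and quoting are
ignored for brevity:

#!/bin/sh
# vacuum only the tables with enough update/delete traffic since the last sweep
DB=mydb
THRESHOLD=1000

TABLES=$(psql -t -A -d "$DB" -c "
    SELECT relname
    FROM   pg_stat_user_tables
    WHERE  n_tup_upd + n_tup_del > $THRESHOLD;")

for t in $TABLES; do
    vacuumdb -z -t "$t" "$DB"
done

# reset the counters so the next sweep measures fresh activity
psql -d "$DB" -c "SELECT pg_stat_reset();"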

---------------------------------------------------------------------------

Curtis Faith wrote:
> tom lane wrote:
> > Sure, it's just shuffling the housekeeping work from one place to
> > another.  The thing that I like about Postgres' approach is that we
> > put the housekeeping in a background task (VACUUM) rather than in the
> > critical path of foreground transaction commit.
>
> Thinking with my marketing hat on, MVCC would be a much bigger win if VACUUM
> was not required (or was done automagically). The need for periodic VACUUM
> just gives ammunition to the PostgreSQL opponents who can claim we are
> deferring work but that it amounts to the same thing.
>
> A fully automatic background VACUUM will significantly reduce but will not
> eliminate this perceived weakness.
>
> However, it always seemed to me there should be some way to reuse the space
> more dynamically and quickly than a background VACUUM thereby reducing the
> percentage of tuples that are expired in heavy update cases. If only a very
> tiny number of tuples on the disk are expired this will reduce the aggregate
> performance/space penalty of MVCC into insignificance for the majority of
> uses.
>
> Couldn't we reuse tuple and index space as soon as there are no transactions
> that depend on the old tuple or index values. I have imagined that this was
> always part of the long-term master plan.
>
> Couldn't we keep a list of dead tuples in shared memory and look in the list
> first when deciding where to place new values for inserts or updates so we
> don't have to rely on VACUUM (even a background one)? If there are expired
> tuple slots in the list these would be used before allocating a new slot from
> the tuple heap.
>
> The only issue is determining the lowest transaction ID for in-process
> transactions which seems relatively easy to do (if it's not already done
> somewhere).
>
> In the normal shutdown and startup case, a tuple VACUUM could be performed
> automatically. This would normally be very fast since there would not be many
> tuples in the list.
>
> Index slots would be handled differently since these cannot be substituted
> one for another. However, these could be recovered as part of every index
> page update. Pages would be scanned before being written and any expired
> slots that had transaction ID's lower than the lowest active slot would be
> removed. This could be done for non-leaf pages as well and would result in
> only reorganizing a page that is already going to be written thereby not
> adding much to the overall work.
>
> I don't think that internal pages that contain pointers to values in nodes
> further down the tree that are no longer in the leaf nodes because of this
> partial expired entry elimination will cause a problem since searches and
> scans will still work fine.
>
> Does VACUUM do something that could not be handled in this realtime manner?
>
> - Curtis
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: performance of insert/delete/update

From
Robert Treat
Date:
On Mon, 2002-11-25 at 23:27, Ron Johnson wrote:
> On Mon, 2002-11-25 at 21:30, Tom Lane wrote:
> > Ron Johnson <ron.l.johnson@cox.net> writes:
> > > On Mon, 2002-11-25 at 18:23, scott.marlowe wrote:
> > >> The next factor that makes for fast inserts of large amounts of data in a
> > >> transaction is MVCC.  With Oracle and many other databases, transactions
> > >> are written into a seperate log file, and when you commit, they are
> > >> inserted into the database as one big group.  This means you write your
> > >> data twice, once into the transaction log, and once into the database.
> >
> > > You are just deferring the pain.  Whereas others must flush from log
> > > to "database files", they do not have to VACUUM or VACUUM ANALYZE.
> >
> > Sure, it's just shuffling the housekeeping work from one place to
> > another.  The thing that I like about Postgres' approach is that we
> > put the housekeeping in a background task (VACUUM) rather than in the
> > critical path of foreground transaction commit.
>
> If you have a quiescent point somewhere in the middle of the night...
>

You seem to be implying that running vacuum analyze causes some large
performance issues, but it's just not the case.  I run a 24x7 operation,
and I have a few tables that "turn over" within 15 minutes. On these
tables I run vacuum analyze every 5 - 10 minutes and really there is
little/no performance penalty.
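
Something like the following, kicked off from cron every five minutes or
so, is typically all it takes (the table and database names here are made
up):

#!/bin/sh
# vacuum-analyse only the high-turnover tables
for t in queue_items session_data; do
    vacuumdb -z -t "$t" mydb
done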

Robert Treat



Re: [HACKERS] Realtime VACUUM, was: performance of insert/delete/update

From
Tom Lane
Date:
"Curtis Faith" <curtis@galtair.com> writes:
> tom lane wrote:
>> Sure, it's just shuffling the housekeeping work from one place to
>> another.  The thing that I like about Postgres' approach is that we
>> put the housekeeping in a background task (VACUUM) rather than in the
>> critical path of foreground transaction commit.

> Couldn't we reuse tuple and index space as soon as there are no transactions
> that depend on the old tuple or index values. I have imagined that this was
> always part of the long-term master plan.
> Couldn't we keep a list of dead tuples in shared memory and look in the list
> first when deciding where to place new values for inserts or updates so we
> don't have to rely on VACUUM (even a background one)?

ISTM that either of these ideas would lead to pushing VACUUM overhead
into the foreground transactions, which is exactly what we don't want to
do.  Keep in mind also that shared memory is finite ... *very* finite.
It's bad enough trying to keep per-page status in there (cf FSM) ---
per-tuple status is right out.

I agree that automatic background VACUUMing would go a long way towards
reducing operational problems.

            regards, tom lane

Re: [HACKERS] Realtime VACUUM, was: performance of insert/delete/update

From
"Nicolai Tufar"
Date:
I always wondered if VACUUM is the right name for the process. Now that
PostgreSQL is actively competing in the enterprise space, it might be a good
idea to give it a more enterprise-like name. Think how it looks to an outsider
to see us database professionals holding lengthy discussions about the ways we
vacuum a database. Why should you need to vacuum a database? Is it
dirty? In my personal opinion, something like "space reclaiming daemon",
"free-list organizer", "tuple recycle job" or "segment coalesce process"
would sound more business-like.

Regards,
Nick


----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
To: "Curtis Faith" <curtis@galtair.com>
Cc: "Tom Lane" <tgl@sss.pgh.pa.us>; "Ron Johnson" <ron.l.johnson@cox.net>;
"PgSQL Performance ML" <pgsql-performance@postgresql.org>;
<pgsql-hackers@postgresql.org>
Sent: Tuesday, November 26, 2002 9:09 PM
Subject: Re: [PERFORM] [HACKERS] Realtime VACUUM, was: performance of
insert/delete/update


>
> Good ideas.  I think the master solution is to hook the statistics
> daemon information into an automatic vacuum that could _know_ which
> tables need attention.
>


Re: [HACKERS] Realtime VACUUM, was: performance of insert/delete/update

From
Jim Beckstrom
Date:
Just for the humor of it, as well as to confirm Nick's perspective: years ago,
on our in-house-developed Burroughs mainframe DBMS, we had a process called
"garbage collect".

Nicolai Tufar wrote:

>I always wondered if VACUUM is the right name for the process. Now that
>PostgreSQL is actively competing in the enterprise space, it might be a good
>idea to give it a more enterprise-like name. Think how it looks to an
>outsider to see us database professionals holding lengthy discussions about
>the ways we vacuum a database. Why should you need to vacuum a database? Is
>it dirty? In my personal opinion, something like "space reclaiming daemon",
>"free-list organizer", "tuple recycle job" or "segment coalesce process"
>would sound more business-like.
>
>Regards,
>Nick





Re: [HACKERS] Realtime VACUUM, was: performance of

From
"scott.marlowe"
Date:
In a similar vein, setting the wayback machine to the mid-'80s when I was
in the USAF teaching the computer subsystem of the A-10 INS test station:
we had old reclaimed Sperry 1650 computers (the precursor to the 1750) that
had come out of 1960s-era fire control systems on battleships like the
Missouri and whatnot.

When the OS went south, it would put up a message that said "System Crash
at address XXXXXXX" or something very similar.  A colonel saw that and
insisted that the folks who wrote the OS change the word crash, since in
the Air Force crash (as in plane crash) had such bad connotations.  So it
got changed to "System Fault at address xxxxxxxxx."  For the first month or
two after that happened, folks would ask what a system fault was and what to
do with it.  They knew that a crash meant the machine needed to be power cycled
but didn't know what to do with a system fault.  Shortly after that, the
manual for the test station had a little section added to it that
basically said a system fault was a crash.  :-)

On Wed, 27 Nov 2002, Jim Beckstrom wrote:

> Just for the humor of it, as well as to confirm Nick's perspective,
> years ago on our inhouse developed Burroughs mainframe dbms, we had a
> process called "garbage collect".
>
> Nicolai Tufar wrote:
>
> >I always wondered if VACUUM is the right name for the process. Now that
> >PostgreSQL is actively competing in the enterprise space, it might be a
> >good idea to give it a more enterprise-like name. Think how it looks to an
> >outsider to see us database professionals holding lengthy discussions
> >about the ways we vacuum a database. Why should you need to vacuum a
> >database? Is it dirty? In my personal opinion, something like "space
> >reclaiming daemon", "free-list organizer", "tuple recycle job" or "segment
> >coalesce process" would sound more business-like.
> >