Thread: How does the transaction buffer work?

How does the transaction buffer work?

From: Veikko Mäkinen

Hey,

How does Postgres (8.0.x) buffer changes to a database within a
transaction? I need to insert/update more than a thousand rows (maybe
even more than 10000 rows, ~100 bytes/row) in a table but the changes
must not be visible to other users/transactions before every row is
updated. One way of doing this that I thought of was to start a
transaction, delete everything and then just dump new data in (copy
perhaps). The old data would be usable to other transactions until I
commit my insert. This would be the fastest way, but how much memory
would this use? Will this cause performance issues on a heavily loaded
server with too little memory even to begin with :)
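
Roughly what I have in mind (table and file names here are just
placeholders):

  BEGIN;
  DELETE FROM mytable;
  COPY mytable FROM '/path/to/new_rows.txt';  -- or \copy from psql
  COMMIT;  -- other transactions keep seeing the old rows until this point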


-veikko

Re: How does the transaction buffer work?

From: John A Meinel

Veikko Mäkinen wrote:

> Hey,
>
> How does Postgres (8.0.x) buffer changes to a database within a
> transaction? I need to insert/update more than a thousand rows (maybe
> even more than 10000 rows, ~100 bytes/row) in a table but the changes
> must not be visible to other users/transactions before every row is
> updated. One way of doing this that I thought of was to start a
> transaction, delete everything and then just dump new data in (copy
> perhaps). The old data would be usable to other transactions until I
> commit my insert. This would be the fastest way, but how much memory
> would this use? Will this cause performance issues on a heavily loaded
> server with too little memory even to begin with :)
>
Postgres does indeed keep track of who can see what, so changes won't
be seen until the final commit.
If you are trying to insert bulk data, definitely consider COPY.

But UPDATE should also be invisible until the commit. So if you are only
changing data, there really isn't any reason to do a DELETE and INSERT.
Especially since you'll have problems with foreign keys at the DELETE stage.
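
A minimal sketch of what I mean (table and column names are made up):

  BEGIN;
  UPDATE mytable SET amount = amount * 1.1;
  -- other sessions still see the old values at this point
  COMMIT;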

John
=:->

>
> -veikko





Re: How does the transaction buffer work?

From: PFC

>> transaction, delete everything and then just dump new data in (copy
>> perhaps). The old data would be usable to other transactions until I
>> commit my insert. This would be the fastest way, but how much memory
>> would this use? Will this cause performance issues on a heavily loaded
>> server with too little memory even to begin with :)

    Well.

    If you DELETE everything in your table and then COPY in new rows, it will
be fast, and the old rows will still be visible until the COMMIT. I hope you
don't have anything referencing this table with ON DELETE CASCADE on it, or
else you might delete more than you think.
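
    A made-up schema just to show the cascade trap:

  CREATE TABLE prices (id int PRIMARY KEY, amount numeric);
  CREATE TABLE orders (id int PRIMARY KEY,
                       price_id int REFERENCES prices(id) ON DELETE CASCADE);
  DELETE FROM prices;  -- silently deletes the referencing rows in orders too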

    Also you have to consider locking.

    You could TRUNCATE the table instead of deleting, but then anyone trying
to SELECT from it will block until the updater transaction is finished.
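
    Rough sketch of the TRUNCATE variant (table name made up). TRUNCATE takes
an ACCESS EXCLUSIVE lock, so concurrent SELECTs wait until the COMMIT:

  BEGIN;
  TRUNCATE mytable;
  COPY mytable FROM '/path/to/new_rows.txt';
  COMMIT;  -- blocked readers continue only now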

    If you DELETE, you could also VACUUM afterwards.

    You could also COPY your rows to a temporary table and use a joined
UPDATE to update your table in place. This might well be the most elegant
solution, and the only workable one if other tables have foreign keys
pointing at the table you are updating.
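
    For instance, a sketch along those lines (table and column names are
invented):

  BEGIN;
  CREATE TEMP TABLE new_rows (LIKE mytable);
  COPY new_rows FROM '/path/to/new_rows.txt';
  UPDATE mytable SET amount = n.amount
      FROM new_rows n
      WHERE mytable.id = n.id;
  COMMIT;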


Re: How does the transaction buffer work?

From: Josh Berkus

Veikko,

> One way of doing this that I thought of was to start a
> transaction, delete everything and then just dump new data in (copy
> perhaps). The old data would be usable to other transactions until I
> commit my insert. This would be the fastest way, but how much memory
> would this use?

Starting a transaction doesn't use any more memory than without one.
Unlike Some Other Databases, PostgreSQL's transactions occur in WAL and on
data pages, not in RAM.

> Will this cause performance issues on a heavily loaded
> server with too little memory even to begin with :)

Quite possibly, but the visibility issue won't be the problem.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

Re: How does the transaction buffer work?

From: Tom Lane

Veikko Mäkinen <veikko.makinen@ecom.fi> writes:
> How does Postgres (8.0.x) buffer changes to a database within a
> transaction? I need to insert/update more than a thousand rows (maybe
> even more than 10000 rows, ~100 bytes/row) in a table but the changes
> must not be visible to other users/transactions before every row is
> updated.

There were some other responses already, but I wanted to add this:
there isn't any "transaction buffer" in Postgres.  The above scenario
won't cause us any noticeable problem, because what we do is mark
each row with its originating transaction ID, and then readers compare
that to the set of transaction IDs that they think are "in the past".
The number of rows affected by a transaction is not really a factor
at all.
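
If you're curious, you can see that marking yourself: xmin is a system
column on every row (the table name below is just an example).

  SELECT xmin, * FROM mytable LIMIT 5;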

Now of course this isn't Nirvana, you must pay somewhere ;-) and our
weak spot is the need for VACUUM.  But you have no need to fear large
individual transactions.

            regards, tom lane

Re: How does the transaction buffer work?

From: Rod Taylor

> Now of course this isn't Nirvana, you must pay somewhere ;-) and our
> weak spot is the need for VACUUM.  But you have no need to fear large
> individual transactions.

No need to fear long-running transactions other than their ability to
stop VACUUM from doing what it's supposed to do, thus possibly
impacting performance.
--