Thread: How does the transaction buffer work?
Hey,

How does Postgres (8.0.x) buffer changes to a database within a transaction? I need to insert/update more than a thousand rows (maybe even more than 10000 rows, ~100 bytes/row) in a table, but the changes must not be visible to other users/transactions before every row is updated.

One way of doing this that I thought of was to start a transaction, delete everything and then just dump the new data in (COPY perhaps). The old data would remain usable to other transactions until I commit my insert. This would be the fastest way, but how much memory would it use? Will this cause performance issues on a heavily loaded server with too little memory even to begin with? :)

-veikko
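A minimal sketch of the delete-and-reload approach described above, assuming a hypothetical table items(id, payload) and a server-side data file; none of these names come from the thread:

    -- Replace the table contents in one transaction; other sessions keep
    -- seeing the old rows until the COMMIT.
    BEGIN;

    DELETE FROM items;                 -- old row versions stay visible to others

    COPY items (id, payload)           -- bulk-load the replacement rows
        FROM '/tmp/items.dat';         -- hypothetical server-side file

    COMMIT;                            -- the new rows become visible here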
Veikko Mäkinen wrote:
> How does Postgres (8.0.x) buffer changes to a database within a
> transaction? I need to insert/update more than a thousand rows (maybe
> even more than 10000 rows, ~100 bytes/row) in a table but the changes
> must not be visible to other users/transactions before every row is
> updated. One way of doing this that I thought of was start a
> transaction, delete everything and then just dump new data in (copy
> perhaps). The old data would be usable to other transactions until I
> commit my insert. This would be the fastest way, but how much memory
> would this use? Will this cause performance issues on a heavily loaded
> server with too little memory even to begin with :)
>
> -veikko

Postgres does indeed keep track of who can see what, so changes won't be seen until the final commit.

If you are trying to insert bulk data, definitely consider COPY. But an UPDATE should also be invisible until the commit, so if you are only changing data there really isn't any reason to do a DELETE and INSERT, especially since you'll have problems with foreign keys at the DELETE stage.

John =:->
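If only existing rows change, the plain in-place UPDATE suggested above could look like the following sketch (same hypothetical items table; the SET expression is just an example). Concurrent readers continue to see the pre-update row versions until the COMMIT:

    BEGIN;

    -- Rewrite every row inside one transaction; a SELECT from another
    -- session still returns the old values until the COMMIT below.
    UPDATE items
       SET payload = payload || ' (revised)';

    COMMIT;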
>> transaction, delete everything and then just dump new data in (copy
>> perhaps). The old data would be usable to other transactions until I
>> commit my insert. This would be the fastest way, but how much memory
>> would this use? Will this cause performance issues on a heavily loaded
>> server with too little memory even to begin with :)

Well, if you DELETE everything in your table and then COPY in new rows, it will be fast, and the old rows will still be visible until the COMMIT. I hope you don't have anything referencing this table with ON DELETE CASCADE on it, or else you might delete more stuff than you think.

Also, you have to consider locking. You could TRUNCATE the table instead of deleting, but then anyone trying to SELECT from it will block until the updating transaction is finished. If you DELETE, you could also VACUUM afterwards.

You could also COPY your rows to a temporary table and use a joined UPDATE to update your table in place. This may well be the more elegant solution, and it is the only one if the updated table has foreign key references pointing to it.
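A sketch of the temporary-table plus joined-UPDATE idea, again with invented names (items, items_incoming) and an invented file path:

    BEGIN;

    -- Stage the incoming data in a temp table only this session can see.
    CREATE TEMP TABLE items_incoming (id integer, payload text);

    COPY items_incoming (id, payload) FROM '/tmp/items.dat';

    -- Joined update: rewrite the matching rows in place, so foreign keys
    -- pointing at items are never disturbed by a DELETE.
    UPDATE items
       SET payload = i.payload
      FROM items_incoming i
     WHERE items.id = i.id;

    COMMIT;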
Veikko,

> One way of doing this that I thought of was start a
> transaction, delete everything and then just dump new data in (copy
> perhaps). The old data would be usable to other transactions until I
> commit my insert. This would be the fastest way, but how much memory
> would this use?

Starting a transaction doesn't use any more memory than not using one. Unlike Some Other Databases, PostgreSQL's transactions occur in the WAL and on data pages, not in RAM.

> Will this cause performance issues on a heavily loaded
> server with too little memory even to begin with :)

Quite possibly, but the visibility issue won't be the problem.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
Veikko Mäkinen <veikko.makinen@ecom.fi> writes:
> How does Postgres (8.0.x) buffer changes to a database within a
> transaction? I need to insert/update more than a thousand rows (maybe
> even more than 10000 rows, ~100 bytes/row) in a table but the changes
> must not be visible to other users/transactions before every row is
> updated.

There were some other responses already, but I wanted to add this: there isn't any "transaction buffer" in Postgres. The above scenario won't cause us any noticeable problem, because what we do is mark each row with its originating transaction ID, and then readers compare that to the set of transaction IDs that they consider "in the past". The number of rows affected by a transaction is not really a factor at all.

Now of course this isn't Nirvana, you must pay somewhere ;-) and our weak spot is the need for VACUUM. But you have no need to fear large individual transactions.

regards, tom lane
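The per-row transaction ID Tom describes can be inspected through the hidden xmin/xmax system columns, for example (table and column names hypothetical):

    -- xmin is the ID of the transaction that created each row version;
    -- readers compare it against their snapshot to decide visibility.
    SELECT xmin, xmax, id, payload
      FROM items
     LIMIT 5;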
> Now of course this isn't Nirvana, you must pay somewhere ;-) and our
> weak spot is the need for VACUUM. But you have no need to fear large
> individual transactions.

No need to fear long-running transactions, other than their ability to stop VACUUM from doing what it's supposed to be doing, thus possibly impacting performance.

--
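For completeness, the cleanup step after such a bulk DELETE/UPDATE might simply be (table name hypothetical):

    -- Reclaim the dead row versions left behind by the bulk change and
    -- refresh planner statistics. Note that a still-open long-running
    -- transaction can prevent VACUUM from removing those dead rows.
    VACUUM ANALYZE items;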