Re: One tuple per transaction - Mailing list pgsql-performance

From Qingqing Zhou
Subject Re: One tuple per transaction
Date
Msg-id d12qcr$a1r$1@news.hub.org
Whole thread Raw
In response to One tuple per transaction  ("Tambet Matiisen" <t.matiisen@aprote.ee>)
List pgsql-performance
""Tambet Matiisen"" <t.matiisen@aprote.ee> writes
> Hi!
>
> In one of our applications we have a database function, which
> recalculates COGS (cost of good sold) for certain period. This involves
> deleting bunch of rows from one table, inserting them again in correct
> order and updating them one-by-one (sometimes one row twice) to reflect
> current state. The problem is, that this generates an enormous amount of
> tuples in that table.
>
> If I'm correct, the dead tuples must be scanned also during table and
> index scan, so a lot of dead tuples slows down queries considerably,
> especially when the table doesn't fit into shared buffers any more. And
> as I'm in transaction, I can't VACUUM to get rid of those tuples. In one
> occasion the page count for a table went from 400 to 22000 at the end.

Not exactly. The dead tuple in the index will be scanned the first time (and
its pointed heap tuple as well), then we will mark it dead, then next time
we came here, we will know that the index tuple actually points to a uesless
tuple, so we will not scan its pointed heap tuple.

>
> All this made me wonder, why is new tuple created after every update?
> One tuple per transaction should be enough, because you always commit or
> rollback transaction as whole. And my observations seem to indicate,
> that new index tuple is created after column update even if this column
> is not indexed.

This is one cost of MVCC. A good thing of MVCC is there is no conflict
between read and write - maybe some applications need this.

A reference could be found here:

http://www.postgresql.org/docs/8.0/static/storage-page-layout.html#HEAPTUPLEHEADERDATA-TABLE

>
> One tuple per transaction would save a loads of I/O bandwidth, so I
> believe there must be a reason why it isn't implemented as such. Or were
> my assumptions wrong, that dead tuples must be read from disk?
>
>   Tambet
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>



pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: How to read query plan
Next
From: Daniel Schuchardt
Date:
Subject: Re: cpu_tuple_cost