Re: Tuple concurrency issue in large objects - Mailing list pgsql-general

From Justin
Subject Re: Tuple concurrency issue in large objects
Date
Msg-id CALL-XeOqXXd30e6Yo5_7ZURxtoMLn6SWVOA=U9SqoL1SDywB-A@mail.gmail.com
In response to Re: Tuple concurrency issue in large objects  ("Daniel Verite" <daniel@manitou-mail.org>)
Responses Re: Tuple concurrency issue in large objects
List pgsql-general
I have a question from reading through this email chain. Does the large objects table, accessed through these functions, work like normal MVCC, where there can be two versions of a large object in pg_largeobject? My gut says no, as moving/copying potentially 4 TB of data would kill any I/O.

I cannot find any documentation discussing how these functions actually work with respect to transaction isolation, MVCC, and snapshots.
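
For reference, the usage I have in mind is the server-side functions, along these lines (a minimal sketch with a made-up oid and data; lo_from_bytea/lo_put require 9.4 or later):

  -- create a large object from a bytea value; returns its oid, e.g. 16403
  select lo_from_bytea(0, '\xdeadbeef'::bytea);

  -- later, in another transaction, overwrite a few bytes at offset 0
  begin;
  select lo_put(16403, 0, '\xcafebabe'::bytea);
  commit;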

On Wed, Dec 18, 2019 at 10:05 AM Daniel Verite <daniel@manitou-mail.org> wrote:
        Shalini wrote:

> Could you also please state the reason why it is happening in the case
> of large objects? Concurrent transactions are handled very well
> for other data types, but the same is not happening for
> lobs. Is it because the former are stored in a toast table and there is
> no support for concurrent transactions in the pg_largeobject table?

Keeping in mind that large objects are not a datatype, but rather a
functionality that is built on top of the bytea and oid datatypes plus
a set of functions, I wouldn't say that concurrent writes would be
better handled if you had a table: document(id serial, contents bytea)
with "contents" being indeed toastable.

To illustrate with a basic example: transactions Tx1 and Tx2
want to update the contents of the same document concurrently,
with this order of execution:

Tx1: begin
Tx1: update document set contents=... where id=...
Tx2: begin
Tx2: update the same document (gets blocked)
Tx1: commit
Tx2: commit

If using the read committed isolation level, Tx2 will be made to wait
until Tx1 commits, and then the update by Tx1 will be overwritten by
Tx2. That's a well-known anomaly called a "lost update", and
probably not what you want.
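
Spelled out against that sample table, under read committed (the default):

  -- session 1 (Tx1)
  begin;
  update document set contents = '\x01' where id = 1;

  -- session 2 (Tx2)
  begin;
  update document set contents = '\x02' where id = 1;   -- blocks on Tx1's row lock

  -- session 1
  commit;

  -- session 2: the update resumes against the newly committed row and succeeds
  commit;
  -- final state: contents = '\x02'; Tx1's write is silently lost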

If using a better isolation level (repeatable read or serializable),
the update by Tx2 will be rejected with a serialization failure,
which, to me, seems the moral equivalent of the "Tuple concurrently
updated" error you're reporting with large objects.
When this occurs, your application can fetch the latest value in a new
transaction and see how to apply its change to that new value,
retrying again if another conflict arises, and so on.
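
Under repeatable read, the same sequence in session 2 ends differently, and the retry happens in a fresh transaction (the merged value '\x03' is just a placeholder):

  -- session 2 (Tx2)
  begin isolation level repeatable read;
  update document set contents = '\x02' where id = 1;   -- blocks; then, once Tx1 commits:
  -- ERROR:  could not serialize access due to concurrent update
  rollback;

  -- retry: start over, read the latest committed value, re-apply the change to it
  begin isolation level repeatable read;
  select contents from document where id = 1;            -- now sees Tx1's value
  update document set contents = '\x03' where id = 1;
  commit;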

In short, the best the database can do in case of conflicting writes
is to inform the application. It can't know which write should be
prioritized or if the changes should be merged before being written.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

