Thread: INSERT/UPDATEs cycles and lack of phantom locking

INSERT/UPDATEs cycles and lack of phantom locking

From
Florian Weimer
Date:
I've got several tables where I need to either insert new records, or
update existing ones (identified based on the primary key).  For
performance reasons, I want to do this in batches, so I plan to use
something like this:

BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED; -- SERIALIZABLE
...
CREATE TEMPORARY TABLE tmp (key TEXT NOT NULL, new_val INTEGER NOT NULL);
COPY tmp (key, new_val) FROM STDIN;
...
\.
-- SAVEPOINT tmp_created;  -- (see below)

CREATE TEMPORARY TABLE tmp2 AS SELECT tmp.key, new_val, real.val AS old_val FROM tmp LEFT OUTER JOIN real ON tmp.key =
real.key;
UPDATE real SET val = new_val + old_val FROM tmp2 WHERE old_val IS NOT NULL AND tmp2.key = real.key;
INSERT INTO real SELECT key, new_val FROM tmp2 WHERE old_val IS NULL;

If this is run concurrently, the INSERT may fail.  In this case, I
rerun the transaction.  Actually, I want to rollback to the
tmp_created checkpoint, but I don't think this will pick up the new
rows in the "real" table, and the INSERT will fail again.

Usually, the batch size is small enough that the necessary data is
still cached, and concurrent updates aren't the norm, so this approach
(complete transaction rollback) is not completely infeasible.

However, I still wonder if there is a more straightforward solution.
Serializing the updates isn't one, I think.  Is there some form of
table-based advisory locking which I could use?  This way, I wouldn't
lock out ordinary readers (which is crucial), but the reading part of
an updating transaction would be blocked.  For bonus points, deadlocks
would be automatically detected by PostgreSQL (although I would order
the locks properly in the usual case, but I can't guarantee this for
all codepaths due to the modularity of the application).

Florian
-- 
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Durlacher Allee 47            tel: +49-721-96201-1
D-76131 Karlsruhe             fax: +49-721-96201-99


Re: INSERT/UPDATEs cycles and lack of phantom locking

From
Tom Lane
Date:
Florian Weimer <fweimer@bfk.de> writes:
> BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED; -- SERIALIZABLE
> ...
> -- SAVEPOINT tmp_created;  -- (see below)

> CREATE TEMPORARY TABLE tmp2 AS SELECT tmp.key, new_val, real.val AS old_val
>   FROM tmp LEFT OUTER JOIN real ON tmp.key = real.key;
> UPDATE real SET val = new_val + old_val FROM tmp2
>   WHERE old_val IS NOT NULL AND tmp2.key = real.key;
> INSERT INTO real SELECT key, new_val FROM tmp2 WHERE old_val IS NULL;

> If this is run concurrently, the INSERT may fail.  In this case, I
> rerun the transaction.  Actually, I want to rollback to the
> tmp_created checkpoint, but I don't think this will pick up the new
> rows in the "real" table, and the INSERT will fail again.

Why do you think that?  If you're running in READ COMMITTED mode then
each statement takes a new snapshot.
        regards, tom lane


Re: INSERT/UPDATEs cycles and lack of phantom locking

From
Florian Weimer
Date:
* Tom Lane:

> Why do you think that?  If you're running in READ COMMITTED mode then
> each statement takes a new snapshot.

Ah, I should have explained that.  I might need the SERIALIZABLE
isolation level in the future (this code doesn't need it, but other
things in the same transaction might require it).

In addition, it occurred to me that I get the INSERT failure only if
there is a suitable PRIMARY KEY/UNIQUE constraint on the table.  I
haven't got that in all cases, so I need that advisory locking anyway,
I fear.

-- 
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Durlacher Allee 47            tel: +49-721-96201-1
D-76131 Karlsruhe             fax: +49-721-96201-99


Re: INSERT/UPDATEs cycles and lack of phantom locking

From
Florian Weimer
Date:
* Florian Weimer:

> In addition, it occurred to me that I get the INSERT failure only if
> there is a suitable PRIMARY KEY/UNIQUE constraint on the table.  I
> haven't got that in all cases, so I need that advisory locking anyway,

It seems that LOCK TABLE ... IN EXCLUSIVE MODE does exactly what I
need: it locks out itself (and write access), but not read access to
the table.  And deadlocks are detected as well.  Yay!

-- 
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Durlacher Allee 47            tel: +49-721-96201-1
D-76131 Karlsruhe             fax: +49-721-96201-99