Home > mailing lists

Re: Cross-database SERIALIZABLE safe snapshots - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Cross-database SERIALIZABLE safe snapshots
Date	July 10, 2023 09:17:50
Msg-id	0974d29d-d047-36fb-d466-93b933411442@iki.fi Whole thread Raw
In response to	Cross-database SERIALIZABLE safe snapshots (Thomas Munro <thomas.munro@gmail.com>)
Responses	Re: Cross-database SERIALIZABLE safe snapshots
List	pgsql-hackers

Tree view

On 09/03/2023 07:34, Thomas Munro wrote:
> Here is a feature idea that emerged from a pgsql-bugs thread[1] that I
> am kicking into the next commitfest.  Example:
> 
> s1: \c db1
> s1: CREATE TABLE t (i int);
> s1: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> s1: INSERT INTO t VALUES (42);
> 
> s2: \c db2
> s2: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
> s2: SELECT * FROM x;
> 
> I don't know of any reason why s2 should have to wait, or
> alternatively, without DEFERRABLE, why it shouldn't immediately drop
> from SSI to SI (that is, opt out of predicate locking and go faster).
> This change relies on the fact that PostgreSQL doesn't allow any kind
> of cross-database access, except for shared catalogs, and all catalogs
> are already exempt from SSI.  I have updated this new version of the
> patch to explain that very clearly at the place where that exemption
> happens, so that future hackers would see that we rely on that fact
> elsewhere if reconsidering that.

Makes sense.

> @@ -1814,7 +1823,17 @@ GetSerializableTransactionSnapshotInt(Snapshot snapshot,
>                 {
>                         othersxact = dlist_container(SERIALIZABLEXACT, xactLink, iter.cur);
>  
> -                       if (!SxactIsCommitted(othersxact)
> +                       /*
> +                        * We can't possibly have an unsafe conflict with a transaction in
> +                        * another database.  The only possible overlap is on shared
> +                        * catalogs, but we don't support SSI for shared catalogs.  The
> +                        * invalid database case covers 2PC, because we don't yet record
> +                        * database OIDs in the 2PC information.  We also filter out doomed
> +                        * transactions as they can't possibly commit.
> +                        */
> +                       if ((othersxact->database == InvalidOid ||
> +                                othersxact->database == MyDatabaseId)
> +                               && !SxactIsCommitted(othersxact)
>                                 && !SxactIsDoomed(othersxact)
>                                 && !SxactIsReadOnly(othersxact))
>                         {

Why don't we set the database OID in 2PC transactions? We actually do 
set it correctly - or rather we never clear it - when a transaction is 
prepared. But you set it to invalid when recovering a prepared 
transaction on system startup. So the comment is a bit misleading: the 
optimization doesn't apply to 2PC transactions recovered after restart, 
other 2PC transactions are fine.

I'm sure it's not a big deal in practice, but it's also not hard to fix. 
We do store the database OID in the twophase state. The caller of 
predicatelock_twophase_recover() has it, we just need a little plumbing 
to pass it down.

Attached patches:

1. Rebased version of your patch, just trivial pgindent conflict fixes
2. Some comment typo fixes and improvements
3. Set the database ID on recovered 2PC transactions

A test for this would be nice. isolationtester doesn't support 
connecting to different databases, restarting the server to test the 2PC 
recovery, but a TAP test could do it.

-- 
Heikki Linnakangas
Neon (https://neon.tech)

Attachment

pgsql-hackers by date:

From: Daniel Gustafsson
Date: 10 July 2023, 09:14:21
Subject: Re: Split index and table statistics into different types of stats

From: "Wen Yi"
Date: 10 July 2023, 09:20:23
Subject: [Question] Can someone provide some links related to the MemoryContext?

Re: Cross-database SERIALIZABLE safe snapshots - Mailing list pgsql-hackers

Attachment

Previous

Next