Thread: Serializable Transaction Anomoly
The following documentation comment has been logged on the website: Page: https://www.postgresql.org/docs/17/transaction-iso.html Description: I discovered an oddity in Serializable Transaction behavior and while referencing the current docs there is a possible contradiction and I'm not sure if this is a bug or expected behavior. At minimum there seems to be a contradiction in the Transaction Isolation page of the docs. 1. At the top "serialization anomoly" is defined as "The result of successfully committing a group of transactions is inconsistent with *all possible* orderings of running those transactions one at a time." (emphasis mine). 2. In the first paragraph of 13.2.3, sentence 4 states "In fact, this isolation level works exactly the same as Repeatable Read except that it also monitors for conditions which could make execution of a concurrent set of serializable transactions behave in a manner inconsistent with *all possible* serial (one at a time) executions of those transactions." (again I'm emphasizing 'all possible") 3. In the first large paragraph above the unordered list at the bottom it states "While PostgreSQL's Serializable transaction isolation level only allows concurrent transactions to commit if it can prove there is a serial order of execution that would produce the same effect, it doesn't always prevent errors from being raised that would not occur in true serial execution." - "if it can prove there is a serial order" implies if it can find a serial execution of statements that would have the same effect - that seems at odd with 1. and 2.? The example I found is caused by poor application code design but based on the docs I would expect the serialization anomaly detection to report a concurrent modification. The example I'm looking at assumes there is a `example` table with id and name. Serializable Transaction 1: INSERT INTO example (name) VALUES ('test1') RETURNING id; -- assume it returns id: 10 -- Don't commit Serializable Transaction 2: SELECT * from example WHERE id = 10 FOR UPDATE; -- Other databases block here, postgreSQL does not and returns 0 rows UPDATE example SET name = 'test2' WHERE id = 10; -- updates 0 rows because insert wasn't committed Serializable Transaction 1: COMMIT; -- example record with id 10 now exists in the database Serializable Transaction 2: COMMIT; -- I expected 40001 error but instead transaction committed without updating name. I understand that with Snapshot Isolation the new record doesn't exist when either SELECT FOR UPDATE or UPDATE execute in the 2nd transaction and the docs do specify "Predicate locks in PostgreSQL, like in most other database systems, are based on data actually accessed by a transaction." which implies if transaction 2 can't see the data it can't predicate lock the data, And I believe the application code should not have been triggering a background process (Transaction 2) before Transaction 1 commits because it could rollback. This seems to be in alignment with #3 but at odds with #1 and #2. Based on the docs I referenced above, our observed behavior seems to be in alignment with some of the documentation but in conflict with others. Am I misreading #1 and 2? Thanks, Daniel
On Tue, 2024-11-05 at 15:05 +0000, PG Doc comments form wrote: > I discovered an oddity in Serializable Transaction behavior and while > referencing the current docs there is a possible contradiction and I'm not > sure if this is a bug or expected behavior. At minimum there seems to be a > contradiction in the Transaction Isolation page of the docs. > > 1. At the top "serialization anomoly" is defined as "The result of > successfully committing a group of transactions is inconsistent with *all > possible* orderings of running those transactions one at a time." (emphasis > mine). > 2. In the first paragraph of 13.2.3, sentence 4 states "In fact, this > isolation level works exactly the same as Repeatable Read except that it > also monitors for conditions which could make execution of a concurrent set > of serializable transactions behave in a manner inconsistent with *all > possible* serial (one at a time) executions of those transactions." (again > I'm emphasizing 'all possible") > 3. In the first large paragraph above the unordered list at the bottom it > states "While PostgreSQL's Serializable transaction isolation level only > allows concurrent transactions to commit if it can prove there is a serial > order of execution that would produce the same effect, it doesn't always > prevent errors from being raised that would not occur in true serial > execution." - "if it can prove there is a serial order" implies if it can > find a serial execution of statements that would have the same effect - that > seems at odd with 1. and 2.? I don't see a contradiction. #1 defines what an anomaly is. #2 says that if there would be an anomaly with SERIALIZABLE isolation, you will get a serialization error. So there cannot be any false negatives. #3 says that it is possible to get false positive serialization errors, that is, serialization errors that occur even though the transactions really would be serializable. > The example I found is caused by poor application code design but based on > the docs I would expect the serialization anomaly detection to report a > concurrent modification. The example I'm looking at assumes there is a > `example` table with id and name. > > Serializable Transaction 1: > INSERT INTO example (name) VALUES ('test1') RETURNING id; -- assume it > returns id: 10 > -- Don't commit > > Serializable Transaction 2: > SELECT * from example WHERE id = 10 FOR UPDATE; -- Other databases block > here, postgreSQL does not and returns 0 rows > UPDATE example SET name = 'test2' WHERE id = 10; -- updates 0 rows because > insert wasn't committed > > Serializable Transaction 1: > COMMIT; -- example record with id 10 now exists in the database > > Serializable Transaction 2: > COMMIT; -- I expected 40001 error but instead transaction committed without > updating name. > > I understand that with Snapshot Isolation the new record doesn't exist when > either SELECT FOR UPDATE or UPDATE execute in the 2nd transaction and the > docs do specify "Predicate locks in PostgreSQL, like in most other database > systems, are based on data actually accessed by a transaction." which > implies if transaction 2 can't see the data it can't predicate lock the > data, And I believe the application code should not have been triggering a > background process (Transaction 2) before Transaction 1 commits because it > could rollback. The transactions you show above are serializable: if you execute transaction 2 strictly before transaction 1, you would end up with the same result. So there is an equivalent serial execution. Yours, Laurenz Albe
Thank you for the prompt response, I’m probably misreading the documentation and how it relates to the example we ran into.
The way I interpreted the documentation, the example I ran into was a false negative according to the definition of a serialization anomaly, because it’s serial in one ordering but not the other which seems incorrect with “all possible”.
I think where I don’t fully understand is the documentation seems to imply all serial orderings must be valid to commit a SERIALIZABLE transaction but it seems like just one serial ordering must be valid?
I appreciate your assistance and responses,
Daniel Bickler
From: Laurenz Albe <laurenz.albe@cybertec.at>
Date: Tuesday, November 5, 2024 at 12:21 PM
To: Daniel Bickler <daniel.bickler@goprominent.com>, pgsql-docs@lists.postgresql.org <pgsql-docs@lists.postgresql.org>
Subject: Re: Serializable Transaction Anomoly
IRONSCALES couldn't recognize this email as this is the first time you received an email from this sender laurenz.albe @ cybertec.at |
[You don't often get email from laurenz.albe@cybertec.at. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
On Tue, 2024-11-05 at 15:05 +0000, PG Doc comments form wrote:
> I discovered an oddity in Serializable Transaction behavior and while
> referencing the current docs there is a possible contradiction and I'm not
> sure if this is a bug or expected behavior. At minimum there seems to be a
> contradiction in the Transaction Isolation page of the docs.
>
> 1. At the top "serialization anomoly" is defined as "The result of
> successfully committing a group of transactions is inconsistent with *all
> possible* orderings of running those transactions one at a time." (emphasis
> mine).
> 2. In the first paragraph of 13.2.3, sentence 4 states "In fact, this
> isolation level works exactly the same as Repeatable Read except that it
> also monitors for conditions which could make execution of a concurrent set
> of serializable transactions behave in a manner inconsistent with *all
> possible* serial (one at a time) executions of those transactions." (again
> I'm emphasizing 'all possible")
> 3. In the first large paragraph above the unordered list at the bottom it
> states "While PostgreSQL's Serializable transaction isolation level only
> allows concurrent transactions to commit if it can prove there is a serial
> order of execution that would produce the same effect, it doesn't always
> prevent errors from being raised that would not occur in true serial
> execution." - "if it can prove there is a serial order" implies if it can
> find a serial execution of statements that would have the same effect - that
> seems at odd with 1. and 2.?
I don't see a contradiction.
#1 defines what an anomaly is.
#2 says that if there would be an anomaly with SERIALIZABLE isolation,
you will get a serialization error.
So there cannot be any false negatives.
#3 says that it is possible to get false positive serialization errors,
that is, serialization errors that occur even though the transactions really
would be serializable.
> The example I found is caused by poor application code design but based on
> the docs I would expect the serialization anomaly detection to report a
> concurrent modification. The example I'm looking at assumes there is a
> `example` table with id and name.
>
> Serializable Transaction 1:
> INSERT INTO example (name) VALUES ('test1') RETURNING id; -- assume it
> returns id: 10
> -- Don't commit
>
> Serializable Transaction 2:
> SELECT * from example WHERE id = 10 FOR UPDATE; -- Other databases block
> here, postgreSQL does not and returns 0 rows
> UPDATE example SET name = 'test2' WHERE id = 10; -- updates 0 rows because
> insert wasn't committed
>
> Serializable Transaction 1:
> COMMIT; -- example record with id 10 now exists in the database
>
> Serializable Transaction 2:
> COMMIT; -- I expected 40001 error but instead transaction committed without
> updating name.
>
> I understand that with Snapshot Isolation the new record doesn't exist when
> either SELECT FOR UPDATE or UPDATE execute in the 2nd transaction and the
> docs do specify "Predicate locks in PostgreSQL, like in most other database
> systems, are based on data actually accessed by a transaction." which
> implies if transaction 2 can't see the data it can't predicate lock the
> data, And I believe the application code should not have been triggering a
> background process (Transaction 2) before Transaction 1 commits because it
> could rollback.
The transactions you show above are serializable: if you execute transaction 2
strictly before transaction 1, you would end up with the same result.
So there is an equivalent serial execution.
Yours,
Laurenz Albe
On Tue, 2024-11-05 at 18:41 +0000, Daniel Bickler wrote: > The way I interpreted the documentation, the example I ran into was a false negative > according to the definition of a serialization anomaly, because it’s serial in one > ordering but not the other which seems incorrect with “all possible”. > > I think where I don’t fully understand is the documentation seems to imply all serial > orderings must be valid to commit a SERIALIZABLE transaction but it seems like just > one serial ordering must be valid? You seem to think that transactions are serializable if their result is consistent with all possible serial execution orders. But that is not so. What the documentation says is: It is an serialization anomaly (that is, not serializable) if the execution is *in*consistent will all possible serial executions. That implies: It is serializable if the execution is consistent with one serial execution. Yours, Laurenz Albe
You are correct, that is where I made the mistake.
Thank you for the clarification; I understand much better now and the behavior we experienced makes sense with the correct reading of the documentation.
Daniel Bickler
From: Laurenz Albe <laurenz.albe@cybertec.at>
Date: Tuesday, November 5, 2024 at 2:17 PM
To: Daniel Bickler <daniel.bickler@goprominent.com>, pgsql-docs@lists.postgresql.org <pgsql-docs@lists.postgresql.org>
Subject: Re: Serializable Transaction Anomoly
[You don't often get email from laurenz.albe@cybertec.at. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
On Tue, 2024-11-05 at 18:41 +0000, Daniel Bickler wrote:
> The way I interpreted the documentation, the example I ran into was a false negative
> according to the definition of a serialization anomaly, because it’s serial in one
> ordering but not the other which seems incorrect with “all possible”.
>
> I think where I don’t fully understand is the documentation seems to imply all serial
> orderings must be valid to commit a SERIALIZABLE transaction but it seems like just
> one serial ordering must be valid?
You seem to think that transactions are serializable if their result is consistent
with all possible serial execution orders. But that is not so.
What the documentation says is:
It is an serialization anomaly (that is, not serializable) if the execution is
*in*consistent will all possible serial executions.
That implies: It is serializable if the execution is consistent with one serial
execution.
Yours,
Laurenz Albe