Thread: Re: Too many serialization errors in production

Re: Too many serialization errors in production

From: pavan95

Hi Community,

We are facing the error "*could not serialize access due to concurrent update*"
too many times on our production server.

The errors occur because of the "Repeatable Read" isolation level. I know
that the database default_transaction_isolation is "Read Committed", under
which we don't get these errors.

But when I monitored the queries, I saw that the application (ODOO) user is
issuing "SET default_transaction_isolation to Repeatable Read", and that is
the cause of these errors. I explored the ODOO application side, where I got
somewhat lost, but I did learn that these errors are common with ODOO.

In one of the ODOO application files, named "sql_db.py", I found the code
where the isolation level is set to "Repeatable Read".
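
For context, psycopg2 code that forces this level looks roughly like the
sketch below (a simplified illustration, not the exact contents of
"sql_db.py"; the connection string is a placeholder):

    # Illustrative sketch only -- not the actual Odoo source.
    import psycopg2
    from psycopg2.extensions import ISOLATION_LEVEL_REPEATABLE_READ

    conn = psycopg2.connect("dbname=odoo user=odoo")   # placeholder DSN
    # From here on, every transaction opened on this connection runs at
    # REPEATABLE READ, so an UPDATE of a row that another transaction has
    # already updated and committed raises
    # "could not serialize access due to concurrent update".
    conn.set_isolation_level(ISOLATION_LEVEL_REPEATABLE_READ)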

My concern here is: if I change this isolation level to "Read Committed",
will that suffice to stop these errors?

Otherwise, another approach would be to stop the application user from
executing the session-level command "SET default_transaction_isolation to
Repeatable Read".

Could you please suggest how to stop any user from changing
default_transaction_isolation in PostgreSQL?

Looking forward to hearing from you.

Regards,
Pavan,
9841380956






Re: Too many serialization errors in production

From: Tom Lane

pavan95 <pavan.postgresdba@gmail.com> writes:
> We are facing the error "*could not serialize access due to concurrent
> update*" too many times on our production server.

Why do you think there are too many?  For an app that's depending on
isolation levels above Read Committed, that's an expected behavior.

> My concern here is: if I change this isolation level to "Read Committed",
> will that suffice to stop these errors?

You wouldn't see any more of those errors, but almost certainly you'd
have broken the application.  Presumably, it's depending on retrying
serialization failures to ensure consistency across concurrent updates.
Without the retries, such cases would likely soon lead to inconsistent
data.
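
To illustrate, the usual pattern is to run the whole transaction inside a
retry loop, roughly like the sketch below (psycopg2, with a made-up table,
DSN and function name; this is not Odoo's actual code):

    import psycopg2
    from psycopg2 import errors                     # psycopg2 >= 2.8
    from psycopg2.extensions import ISOLATION_LEVEL_REPEATABLE_READ

    def decrement_qty_with_retry(dsn, move_id, max_retries=5):
        conn = psycopg2.connect(dsn)
        conn.set_isolation_level(ISOLATION_LEVEL_REPEATABLE_READ)
        for _ in range(max_retries):
            try:
                with conn.cursor() as cur:
                    # "stock_move" is just an example table name
                    cur.execute("UPDATE stock_move SET qty = qty - 1 WHERE id = %s",
                                (move_id,))
                conn.commit()
                return
            except errors.SerializationFailure:
                # SQLSTATE 40001: another transaction won the race.  Roll back
                # and re-run the whole transaction from the start.
                conn.rollback()
        raise RuntimeError("transaction still failing after %d retries" % max_retries)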

> Could you please suggest how to stop any user from changing
> default_transaction_isolation in PostgreSQL?

That's a spectacularly bad idea.

            regards, tom lane



Re: Too many serialization errors in production

From: pavan95

Hi Tom,

Thank you so much for your timely response. 

So you are suggesting that changing the isolation level from the ODOO
application side, or limiting a user from executing "set
default_transaction_isolation='Repeatable Read'", is a bad idea, correct?

My question is: in my error log I get the query text for the corresponding
"could not serialize access due to concurrent update" failures, but can I
also get the other transaction, the one that committed and is responsible
for this serialization failure?

Looking forward to hearing from you.

Regards,
Pavanteja.A,
9841380956






Re: Too many serialization errors in production

From: Alvaro Herrera

On 2019-Apr-22, pavan95 wrote:

> So you are suggesting that changing the isolation level from the ODOO
> application side, or limiting a user from executing "set
> default_transaction_isolation='Repeatable Read'", is a bad idea, correct?

Yeah.  Changing the isolation level is a one-line change in terms of
source code, but the implications for the correctness of the
application are very serious, and might cause your data to become
invalid eventually.

I wonder if this problem is related to this bugfix, which came out of a
report from Odoo.  If you're running 9.5.3 or earlier (or 9.4.8 or
earlier, or 9.3.13 or earlier), the fix might just be to update to a
newer minor.

Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Branch: master Release: REL9_6_BR [533e9c6b0] 2016-07-15 14:17:20 -0400
Branch: REL9_5_STABLE Release: REL9_5_4 [649dd1b58] 2016-07-15 14:17:20 -0400
Branch: REL9_4_STABLE Release: REL9_4_9 [166873dd0] 2016-07-15 14:17:20 -0400
Branch: REL9_3_STABLE Release: REL9_3_14 [6c243f90a] 2016-07-15 14:17:20 -0400

    Avoid serializability errors when locking a tuple with a committed update
    
    When key-share locking a tuple that has been not-key-updated, and the
    update is a committed transaction, in some cases we raised
    serializability errors:
        ERROR:  could not serialize access due to concurrent update
    
    Because the key-share doesn't conflict with the update, the error is
    unnecessary and inconsistent with the case that the update hasn't
    committed yet.  This causes problems for some usage patterns, even if it
    can be claimed that it's sufficient to retry the aborted transaction:
    given a steady stream of updating transactions and a long locking
    transaction, the long transaction can be starved indefinitely despite
    multiple retries.
    
    To fix, we recognize that HeapTupleSatisfiesUpdate can return
    HeapTupleUpdated when an updating transaction has committed, and that we
    need to deal with that case exactly as if it were a non-committed
    update: verify whether the two operations conflict, and if not, carry on
    normally.  If they do conflict, however, there is a difference: in the
    HeapTupleBeingUpdated case we can just sleep until the concurrent
    transaction is gone, while in the HeapTupleUpdated case this is not
    possible and we must raise an error instead.
    
    Per trouble report from Olivier Dony.
    
    In addition to a couple of test cases that verify the changed behavior,
    I added a test case to verify the behavior that remains unchanged,
    namely that errors are raised when a update that modifies the key is
    used.  That must still generate serializability errors.  One
    pre-existing test case changes behavior; per discussion, the new
    behavior is actually the desired one.
    
    Discussion: https://www.postgresql.org/message-id/560AA479.4080807@odoo.com
      https://www.postgresql.org/message-id/20151014164844.3019.25750@wrigleys.postgresql.org
    
    Backpatch to 9.3, where the problem appeared.
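
For reference, a quick way to check whether a server already contains one of
the fixed minors listed above is to compare server_version_num against them
(a psycopg2 sketch; the connection string is a placeholder):

    import psycopg2

    conn = psycopg2.connect("dbname=odoo")          # placeholder DSN
    with conn.cursor() as cur:
        cur.execute("SHOW server_version_num")
        ver = int(cur.fetchone()[0])

    # First minor releases containing the fix, per the branches above:
    # 9.3.14 -> 90314, 9.4.9 -> 90409, 9.5.4 -> 90504, and everything >= 9.6.
    if ver >= 90600:
        has_fix = True
    elif ver >= 90500:
        has_fix = ver >= 90504
    elif ver >= 90400:
        has_fix = ver >= 90409
    elif ver >= 90300:
        has_fix = ver >= 90314
    else:
        has_fix = False
    print("server_version_num = %d, fix present: %s" % (ver, has_fix))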

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Too many serialization errors in production

From: pavan95

Hi Alvaro,

Do you mean this is a bug?

My question is: in my error log I get the query text for the corresponding
"could not serialize access due to concurrent update" failures, but can I
also get the other transaction, the one that committed and is responsible
for this serialization failure?

Looking forward to hearing from you!

Regards,
Pavan,
9841380956






Re: Too many serialization errors in production

From: Alvaro Herrera

Hello

On 2019-Apr-22, pavan95 wrote:

> Do you mean this is a bug?

A bug fixed two years ago, yeah.

> My question is: in my error log I get the query text for the corresponding
> "could not serialize access due to concurrent update" failures, but can I
> also get the other transaction, the one that committed and is responsible
> for this serialization failure?

You probably *can*, by ensuring that all transactions have sufficient
details in the log; you may be able to link the entries by PID or XID.
Not sure how easy that is.
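
For example, with superuser access, settings along these lines make every
statement's log entry carry the backend PID and transaction ID, so the two
sessions involved can be matched up afterwards (a sketch; log_statement =
'all' is very verbose and may not be acceptable on a busy production server):

    import psycopg2

    conn = psycopg2.connect("dbname=postgres user=postgres")  # superuser DSN (placeholder)
    conn.autocommit = True     # ALTER SYSTEM cannot run inside a transaction block
    with conn.cursor() as cur:
        # %m = timestamp, %p = backend PID, %x = transaction ID, %a = application name
        cur.execute("ALTER SYSTEM SET log_line_prefix = '%m [%p] xid=%x app=%a '")
        cur.execute("ALTER SYSTEM SET log_statement = 'all'")
        cur.execute("SELECT pg_reload_conf()")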

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Too many serialization errors in production

From: Alvaro Herrera

On 2019-Apr-25, Alvaro Herrera wrote:

> Hello
> 
> On 2019-Apr-22, pavan95 wrote:
> 
> > Do you mean this is a bug?
> 
> A bug fixed two years ago, yeah.

Let me correct myself.  What I fixed two years ago was some spurious
occurrences of the reported message.  However, the same error message
can be returned in other circumstances, and I suspect many of them would
not be spurious.  Therefore, my suggestion is:

1. if you're on a release older than the bugfix, then upgrade and try
again.

2. if you're already on a release with that problem fixed, then
investigate to determine whether (some of?) the occasions on which
the error is raised are spurious.

3. if you find some occasions on which they are spurious, see if you
can do something to fix that (say, increase the number of predicate locks in
postgresql.conf, as sketched below; maybe fix some other bug).
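
Sketch for point 3, assuming the setting meant is max_pred_locks_per_transaction
(that GUC name is my assumption; check the current value first, and note that
raising it requires editing postgresql.conf and restarting the server):

    import psycopg2

    conn = psycopg2.connect("dbname=odoo")          # placeholder DSN
    with conn.cursor() as cur:
        cur.execute("SHOW max_pred_locks_per_transaction")
        print(cur.fetchone()[0])                    # default is 64
    # To raise it: set max_pred_locks_per_transaction in postgresql.conf
    # (or via ALTER SYSTEM) and restart, since it is fixed at server start.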

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services