Re: Implementing Incremental View Maintenance - Mailing list pgsql-hackers
From | Konstantin Knizhnik |
---|---|
Subject | Re: Implementing Incremental View Maintenance |
Date | |
Msg-id | 5663f5f0-48af-686c-bf3c-62d279567e2a@postgrespro.ru Whole thread Raw |
In response to | Re: Implementing Incremental View Maintenance (Yugo NAGATA <nagata@sraoss.co.jp>) |
Responses |
Re: Implementing Incremental View Maintenance
Re: Implementing Incremental View Maintenance |
List | pgsql-hackers |
On 24.11.2020 12:21, Yugo NAGATA wrote: > >> I replaced it with RowExlusiveLock and ... got 1437 TPS with 10 connections. >> It is still about 7 times slower than performance without incremental view. >> But now the gap is not so dramatic. And it seems to be clear that this >> exclusive lock on matview is real show stopper for concurrent updates. >> I do not know which race conditions and anomalies we can get if replace >> table-level lock with row-level lock here. > I explained it here: > https://www.postgresql.org/message-id/20200909092752.c91758a1bec3479668e82643%40sraoss.co.jp > > For example, suppose there is a view V = R*S that joins tables R and S, > and there are two concurrent transactions T1 which changes table R to R' > and T2 which changes S to S'. Without any lock, in READ COMMITTED mode, > V would be updated to R'*S in T1, and R*S' in T2, so it would cause > inconsistency. By locking the view V, transactions T1, T2 are processed > serially and this inconsistency can be avoided. > > Especially, suppose that tuple dR is inserted into R in T1, and dS is > inserted into S in T2, where dR and dS will be joined in according to > the view definition. In this situation, without any lock, the change of V is > computed as dV=dR*S in T1, dV=R*dS in T2, respectively, and dR*dS would not > be included in the results. This inconsistency could not be resolved by > row-level lock. > >> But I think that this problem should be addressed in any case: single >> client update mode is very rare scenario. > This behavior is explained in rules.sgml like this: > > +<sect2> > +<title>Concurrent Transactions</title> > +<para> > + Suppose an <acronym>IMMV</acronym> is defined on two base tables and each > + table was modified in different a concurrent transaction simultaneously. > + In the transaction which was committed first, <acronym>IMMV</acronym> can > + be updated considering only the change which happened in this transaction. > + On the other hand, in order to update the view correctly in the transaction > + which was committed later, we need to know the changes occurred in > + both transactions. For this reason, <literal>ExclusiveLock</literal> > + is held on an <acronym>IMMV</acronym> immediately after a base table is > + modified in <literal>READ COMMITTED</literal> mode to make sure that > + the <acronym>IMMV</acronym> is updated in the latter transaction after > + the former transaction is committed. In <literal>REPEATABLE READ</literal> > + or <literal>SERIALIZABLE</literal> mode, an error is raised immediately > + if lock acquisition fails because any changes which occurred in > + other transactions are not be visible in these modes and > + <acronym>IMMV</acronym> cannot be updated correctly in such situations. > +</para> > +</sect2> > > Hoever, should we describe explicitly its impact on performance here? > Sorry, I didn't think much about this problem. But I think that it is very important to try to find some solution of the problem. The most obvious optimization is not to use exclusive table lock if view depends just on one table (contains no joins). Looks like there are no any anomalies in this case, are there? Yes, most analytic queries contain joins (just two queries among 22 TPC-H have no joins). So may be this optimization will not help much. I wonder if it is possible to somehow use predicate locking mechanism of Postgres to avoid this anomalies without global lock? -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
pgsql-hackers by date: