Thread: Doubt in mvcc
Hi sir,
I m preparing for interview one of the recruiter asked me mvcc drawbacks as i told due to mvcc it use more space and need to perform maintenance activity.
Another one is the same data causes an update conflict because two different transactions can update the same version of the row.
he told its wrong, kindly tell me will you please tell me its correct or wrong?
Thanks
RK
Rama: On Mon, Jul 13, 2020 at 9:52 AM Rama Krishnan <raghuldrag@gmail.com> wrote: > I m preparing for interview one of the recruiter asked me mvcc drawbacks as i told due to mvcc it use more space and needto perform maintenance activity. > Another one is the same data causes an update conflict because two different transactions can update the same version ofthe row. > he told its wrong, kindly tell me will you please tell me its correct or wrong? I'm not sure I understand your question too well, you may want to refresh/expand. One interpretation is, on a pure MVCC contest, two transactions, say 5 and 6, could try to update a tuple valid for [1,) and end up generating two new tuples, [5,), [6,) and closing the original at either [1,5) or [1,6) . That's why MVCC is just a piece, locking is other. On a MVCC the tuples are locked while a transaction manipulates them. Other transactions may read them, which is why readers do not block writers, but two updates on the same tuple serialize. Francisco Olarte.
Hi Francisco,
Still, I have a doubt as per your example both are trying to update the same tuple so it may produce two different copies right?
I read some blocks they mentioned drawback above two things
It may lead to lost update also
For example, two transactions are going to increase the amount on the same account by $100 . The first transaction reads the current value ($1000) and then the second transaction reads the same value. The first transaction increases the amount (this gives $1100) and writes this value. The second transaction acts the same way: it gets the same $1100 and writes this value. As a result, the customer lost $100.
Will u please provide more details
Serialize is the solution to this issue.
On Mon, 13 Jul, 2020, 14:12 Francisco Olarte, <folarte@peoplecall.com> wrote:
Rama:
On Mon, Jul 13, 2020 at 9:52 AM Rama Krishnan <raghuldrag@gmail.com> wrote:
> I m preparing for interview one of the recruiter asked me mvcc drawbacks as i told due to mvcc it use more space and need to perform maintenance activity.
> Another one is the same data causes an update conflict because two different transactions can update the same version of the row.
> he told its wrong, kindly tell me will you please tell me its correct or wrong?
I'm not sure I understand your question too well, you may want to
refresh/expand.
One interpretation is, on a pure MVCC contest, two transactions, say 5
and 6, could try to update a tuple valid for [1,) and end up
generating two new tuples, [5,), [6,) and closing the original at
either [1,5) or [1,6) .
That's why MVCC is just a piece, locking is other. On a MVCC the
tuples are locked while a transaction manipulates them. Other
transactions may read them, which is why readers do not block writers,
but two updates on the same tuple serialize.
Francisco Olarte.
On 7/13/2020 4:52 AM, Rama Krishnan wrote: > For example, two transactions are going to increase the amount on the > same account by $100 . The first transaction reads the current value > ($1000) and then the second transaction reads the same value. The first > transaction increases the amount (this gives $1100) and writes this > value. The second transaction acts the same way: it gets the same $1100 > and writes this value. As a result, the customer lost $100. > > > Will u please provide more details > > > Serialize is the solution to this issue. MVCC applies to read and write situation. Readers do not block writers and writers do not block read. The example you mentioned above is writer vs writer and MVCC (both Oracle and PG) is not applicable.
Rama: 1st of all, please do not top post, specially if you want to discuss examples, I'm not goint to hunt up and down for the relevant sections. On Mon, Jul 13, 2020 at 10:52 AM Rama Krishnan <raghuldrag@gmail.com> wrote: > Still, I have a doubt as per your example both are trying to update the same tuple so it may produce two different copiesright? First, a loud advise. MY EXAMPLE IS NOT A REAL EXAMPLE OF HOW A REAL MVCC DATABASE WORKS. It just tried to show that if you just use the min-max transactions on storage without using more techniques, which many people will consider part of MVCC, it will not work. > I read some blocks they mentioned drawback above two things > It may lead to lost update also > For example, two transactions are going to increase the amount on the same account by $100 . The first transaction readsthe current value ($1000) and then the second transaction reads the same value. The first transaction increases theamount (this gives $1100) and writes this value. The second transaction acts the same way: it gets the same $1100 andwrites this value. As a result, the customer lost $100. PROPER mvcc, like postgres does, will not lead to this if used properly. If both transactions use UPDATE first will read AND lock the row, update value, write it, commit and unlock. Second will try to read and WAIT for lock, read 1100, write 1200. Some things happens if both use select for update and/or use the adequate isolation levels to force the engine to use appropiate locks. BUT if both transactions do a select, wait for a bit, then do an update set ( NOTE: for a banking application the CORRECT way to do a deposit is "update accounts set balance=balance+100", not "select balance from accounts into $B; update accounts set balance=$B+100 ). You may end up which what look like a lost update, but is really not a DB problem. If you do it in two ops, the DB does not know they correlate. For what it knows your bank might be a room with money in tin boxes, you read it to chek a box, counted the box, noticed the discrepancy and sent an update to fix it, and did it twice to be sure. Update correlates more with "I opened the box and put 100$ in without looking at what was there". Update returning would be "and I counted the box afterwards", and select for update would be "I took the box to my table, counted, added 100, counted the new pile, and returned the box to the safe". > Will u please provide more details No, I will not. You need to read a lot more than a mail can hold, MVCC is a complex topic which I just do not fully understand and I'm not going there. But there are tons of info floating around, more or less easy to find. > Serialize is the solution to this issue. That's what locks and isolation levels do. Francisco Olarte.
On 7/13/20 12:31 AM, Rama Krishnan wrote: > > > Hi sir, > > I m preparing for interview one of the recruiter asked me mvcc drawbacks > as i told due to mvcc it use more space and need to perform maintenance > activity. > > Another one is the same data causes an update conflict because two > different transactions can update the same version of the row. > he told its wrong, kindly tell me will you please tell me its correct > or wrong? It depends on what you are calling a conflict and what isolation level you are in. Take a look at this section of the docs: https://www.postgresql.org/docs/12/transaction-iso.html > > > Thanks > RK -- Adrian Klaver adrian.klaver@aklaver.com
On Mon, 2020-07-13 at 13:01 +0530, Rama Krishnan wrote: > I m preparing for interview one of the recruiter asked me mvcc drawbacks as i told due > to mvcc it use more space and need to perform maintenance activity. Yes. Generally speaking, you have to pay a price for keeping old versions of the data around, no matter how you implement it. > Another one is the same data causes an update conflict because two different transactions > can update the same version of the row. > he told its wrong, kindly tell me will you please tell me its correct or wrong? There is always a certain version (the latest) that can be updated, so this is the same no matter if you have MVCC or not: if two sessions want to update the same row, one has to wait until the other is done. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Mon, Jul 13, 2020 at 10:41:28AM +0200, Francisco Olarte wrote: > Rama: > > On Mon, Jul 13, 2020 at 9:52 AM Rama Krishnan <raghuldrag@gmail.com> wrote: > > I m preparing for interview one of the recruiter asked me mvcc drawbacks as i told due to mvcc it use more space andneed to perform maintenance activity. > > Another one is the same data causes an update conflict because two different transactions can update the same versionof the row. > > he told its wrong, kindly tell me will you please tell me its correct or wrong? > > I'm not sure I understand your question too well, you may want to > refresh/expand. > > One interpretation is, on a pure MVCC contest, two transactions, say 5 > and 6, could try to update a tuple valid for [1,) and end up > generating two new tuples, [5,), [6,) and closing the original at > either [1,5) or [1,6) . > > That's why MVCC is just a piece, locking is other. On a MVCC the > tuples are locked while a transaction manipulates them. Other > transactions may read them, which is why readers do not block writers, > but two updates on the same tuple serialize. You might want to look at this: https://momjian.us/main/presentations/internals.html#mvcc -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Dear Ramakrishna,
"two different transactions can update the same version of the row"
This answer itself is wrong.
In my point of view, the drawback of MVCC is just holding multiple versions of tuple in a table which leads to slowness in application access. the more your table is bloated the more it takes to retrieve data.it has to scan so much of _VM, so many pages which is time consuming.
The other drawback is anyway space.
There are a couple of workarounds to address the issue is what you should tell your recruiter.
Any RDBMS has its own mechanism to address Isolation property and each mechanism has it own flaws.
On Thu, Jul 23, 2020 at 12:16 AM Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Jul 13, 2020 at 10:41:28AM +0200, Francisco Olarte wrote:
> Rama:
>
> On Mon, Jul 13, 2020 at 9:52 AM Rama Krishnan <raghuldrag@gmail.com> wrote:
> > I m preparing for interview one of the recruiter asked me mvcc drawbacks as i told due to mvcc it use more space and need to perform maintenance activity.
> > Another one is the same data causes an update conflict because two different transactions can update the same version of the row.
> > he told its wrong, kindly tell me will you please tell me its correct or wrong?
>
> I'm not sure I understand your question too well, you may want to
> refresh/expand.
>
> One interpretation is, on a pure MVCC contest, two transactions, say 5
> and 6, could try to update a tuple valid for [1,) and end up
> generating two new tuples, [5,), [6,) and closing the original at
> either [1,5) or [1,6) .
>
> That's why MVCC is just a piece, locking is other. On a MVCC the
> tuples are locked while a transaction manipulates them. Other
> transactions may read them, which is why readers do not block writers,
> but two updates on the same tuple serialize.
You might want to look at this:
https://momjian.us/main/presentations/internals.html#mvcc
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee