Re: Why to index a "Recently DEAD" tuple when creating index - Mailing list pgsql-hackers

From Kuntal Ghosh
Subject Re: Why to index a "Recently DEAD" tuple when creating index
Date
Msg-id CAGz5QCJwi7hBg4BLr1y6K3GySbgfWT7Q2_9jX9gspETv=4+K0Q@mail.gmail.com
Whole thread Raw
In response to Re: Why to index a "Recently DEAD" tuple when creating index  (Alex <zhihui.fan1213@gmail.com>)
Responses Re: Why to index a "Recently DEAD" tuple when creating index
Re: Why to index a "Recently DEAD" tuple when creating index
List pgsql-hackers
On Mon, Jun 10, 2019 at 2:12 PM Alex <zhihui.fan1213@gmail.com> wrote:
> On Mon, Jun 10, 2019 at 4:10 PM Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:
>> I think what I'm trying to say is different.
>>
>> For my case, the sequence is as following:
>> 1. Transaction A has deleted a tuple, say t1 and got committed.
>> 2. Index A has been created successfully.
>> 3. Now, transaction B starts and use the index A to fetch the tuple
>> t1. While doing visibility check, transaction B gets to know that t1
>> has been deleted by a committed transaction A, so it can't see the
>> tuple. But, it creates a dependency edge that transaction A precedes
>> transaction B. This edge is required to detect a serializable conflict
>> failure.
>>
>> If you don't create the index entry, it'll not be able to create that edge.
>
>
> Thanks,  I got the difference now, but still not get the necessity of it.
> 1.   Assume we don't index it,  in which situation we can get a wrong result?

Consider the following sequence of three different transactions X,A and B:

1. Transaction X reads a tuple t2.
2. Transaction A updates the tuple t2, deletes a tuple t1 and gets
committed. So, there transaction X precedes transaction A, i.e., X <-
A.
3. Index A is created successfully.
4. Transaction B starts and use the index A to fetch tuple t1. But,
it's already deleted by the committed transaction A. So, transaction A
precedes transaction B, i.e., A<-B.
5. At this point you've a dangerous structure X<-A<-B (definition of
dangerous structure src/backend/storage/lmgr/README-SSI) in the graph
which can produce an anomaly. For example now, if X tries to update
another tuple previously read by B, you'll have a dependency B<-X.
But, you already have X<-B which leads to serializable conflict.
Postgres tries to resolve this anomaly by rolling back one of the
transaction.

In your case, it'll be difficult to detect.

> 2.   If we only support "Read Committed" isolation level,  is there a safe way to not index such data?
>
I can't think of a case where the RECENTLY_DELETED tuple needs to be
indexed in "Read Committed" case. So, your suggestion likely to work
logically in "Read committed" isolation level. But, I'm not sure
whether you'll encounter any assertion failures in vacuum path or
concurrent index paths.


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Should we warn against using too many partitions?
Next
From: Guillaume Lelarge
Date:
Subject: Re: Quick doc typo fix