Re: [HACKERS] proposals for LLL, part 1 - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] proposals for LLL, part 1 |
Date | |
Msg-id | 199807170453.AAA14260@candle.pha.pa.us |
In response to | Re: [HACKERS] proposals for LLL, part 1 (Vadim Mikheev <vadim@krs.ru>) |
List | pgsql-hackers |
> Bruce Momjian wrote:
> >
> > I am retaining your entire message here for reference.
> >
> > I have a good solution for this.  It will require only 4k of shared
> > memory, and will have no restrictions on the age or number of
> > transactions.
> >
> > First, I think we only want to implement "read committed isolation
> > level", not serialized.  Not sure why someone would want serialized.
>
> Serialized is DEFAULT isolation level in standards.
> It must be implemented. Would you like inconsistent results
> from pg_dump, etc?

OK, I didn't know that.

> >
> > OK, when a backend is looking at a row that has been committed, it must
> > decide if the row was committed before or after my transaction started.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > If the transaction commit id(xmin) is greater than our current xid, we
> > know we should not look at it because it is for a transaction that
> > started after our own transaction.
>
> It's right for serialized, not for read committed.
> In read committed backend must decide if the row was committed
> before or after STATEMENT started...

OK.

> >
> > The problem is for transactions started before our own (have xmin's less
> > than our own), and may have committed before or after our transaction.
> >
> > Here is my idea.  We add a field to the shared memory Proc structure
> > that can contain up to 32 transaction ids.  When a transaction starts,
> > we spin though all other open Proc structures, and record all
> > currently-running transaction ids in our own Proc field used to store up
> > to 32 transaction ids.  While we do this, we remember the lowest of
> > these open transaction ids.
> >
> > This is our snapshot of current transactions at the time our transaction
> > starts.  While analyzing a row, if it is greater than our transaction
> > id, then the transaction was not even started before our transaction.
> > If the xmin is lower than the min transaction id that we remembered from
> > the Proc structures, it was committed before our transaction started.
> > If it is greater than or equal to the min remembered transaction id, we
> > must spin through our stored transaction ids.  If it is in the stored
> > list, we don't look at the row, because that transaction was not
> > committed when we started our transaction.  If it is not in the list, it
> > must have been committed before our transaction started.  We know this
> > because if any backend starting a transaction after ours would get a
> > transaction id higher than ours.
>
> Yes, this is way.
> But, first, why should we store running transaction xids in shmem ?
> Who is interested in these xids?
> We have to store in shmem only min of these xids: vacuum must
> not delete rows deleted by transactions with xid greater
> (or equal) than this min xid or we risk to get inconsistent
> results...
> Also, as you see, we have to lock Proc structures in shmem
> to get list of xids for each statement in read committed
> mode...

You are correct.  We need to lock the Proc structures during our scan,
but we don't need to keep the list in shared memory.  No reason to do
it.  Do we have to keep the Proc structures locked while we get our
table data locks?  I sure hope not.  Not sure how we are going to
prevent someone from committing their transaction between our Proc scan
and when we start our transaction.  Not even sure if I should be worried
about that.

> I don't know what way is better but using list of xids
> is much easy to implement...

Sure, do a list.  Getting the min allows you to reduce the number of
times it has to be scanned.  I had not thought about vacuum, but keeping
the min in shared memory will certainly fix that issue.

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
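
The visibility test discussed in this message (record the running xids and their minimum at snapshot time, then compare a row's xmin against them) can be sketched roughly as follows. This is only an illustration in Python, not backend code: `Snapshot`, `take_snapshot`, and `committed_before_snapshot` are hypothetical names, the list of open xids stands in for the scan of the Proc structures, and locking is ignored entirely. The sketch also assumes the row's xmin belongs to some other transaction that is already known to have committed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot record; the names are not from the backend."""
    my_xid: int          # xid of the transaction taking the snapshot
    min_xid: int         # lowest xid still running at snapshot time
    running: frozenset   # xids of other transactions running at snapshot time


def take_snapshot(my_xid, open_xids):
    """Stand-in for the Proc scan: remember running xids and their minimum."""
    running = frozenset(x for x in open_xids if x != my_xid)
    # With no other open transactions, fall back to our own xid as the minimum.
    lowest = min(running, default=my_xid)
    return Snapshot(my_xid, lowest, running)


def committed_before_snapshot(row_xmin, snap):
    """True if a committed row's xmin had committed before the snapshot.

    Assumes row_xmin is another transaction's xid and that the transaction
    is already known to be committed.
    """
    if row_xmin > snap.my_xid:
        return False     # that transaction started after ours
    if row_xmin < snap.min_xid:
        return True      # older than every transaction that was still open
    # In between: visible only if it was not running when we looked.
    return row_xmin not in snap.running


snap = take_snapshot(100, [90, 95, 100])
print(committed_before_snapshot(93, snap))   # True: 93 was not running
print(committed_before_snapshot(95, snap))   # False: 95 was still running
```

Following Vadim's correction, a serialized transaction would call something like `take_snapshot` once when it starts, while a read committed transaction would call it again for each statement. The remembered minimum is also the value vacuum would need to consult.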