Thread: Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling as

Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling as

From
Christopher Kings-Lynne
Date:
Just random speculation, but could we use a pg_subtrans-like setup to do 
row share locks?

ie. Store them in a sort of table to avoid the problems with limited 
shared memory size?

Chris


Tom Lane wrote:

> Log Message:
> -----------
> Rearrange pg_subtrans handling as per recent discussion.  pg_subtrans
> updates are no longer WAL-logged nor even fsync'd; we do not need to,
> since after a crash no old pg_subtrans data is needed again.  We truncate
> pg_subtrans to RecentGlobalXmin at each checkpoint.  slru.c's API is
> refactored a little bit to separate out the necessary decisions.



Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling as

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> Just random speculation, but could we use a pg_subtrans-like setup to do 
> row share locks?

pg_subtrans is trivial to index --- it's a linear array subscripted by
TransactionId.  I'm not sure what we'd do to handle row locks, which
would need a key like (DBoid, RELoid, BlockNum, LineNum) and would be
extremely non-dense in that space.

'Tis something to think about though...
        regards, tom lane


Re: pgsql-server: Rearrange pg_subtrans handling as

From
Alvaro Herrera
Date:
On Tue, Aug 24, 2004 at 09:17:59AM +0800, Christopher Kings-Lynne wrote:
> Just random speculation, but could we use a pg_subtrans-like setup to do 
> row share locks?
> 
> ie. Store them in a sort of table to avoid the problems with limited 
> shared memory size?

Hmm ... how would you map the row number to an array index?  I think it
could work if you answer the above question.  The problem is doing it
for any possible table/row combination, and make it so that in a given
moment only a small space in the array is used.  (If you don't do that,
there will be a lot of thrashing.)

What's needed is a four-dimension integer array:

pg_database    oid
pg_class    oid    BlockNumber    OffsetNumber

The value would keep how many share-lockers there are.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Cómo ponemos nuestros dedos en la arcilla del otro. Eso es la amistad; jugar
al alfarero y ver qué formas se pueden sacar del otro" (C. Halloway en
La Feria de las Tinieblas, R. Bradbury)



Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Christopher Kings-Lynne
Date:
> pg_subtrans is trivial to index --- it's a linear array subscripted by
> TransactionId.  I'm not sure what we'd do to handle row locks, which
> would need a key like (DBoid, RELoid, BlockNum, LineNum) and would be
> extremely non-dense in that space.
> 
> 'Tis something to think about though...

I've been thinking about it and I am wondering what the reason is that 
we need to record every transaction that has a row share lock?

ie. why can't we just record the number of locks each backend has, sort 
of, and use a reference counting sort of method.  Per-backend in case 
the backend dies and we need to discount those locks..?

Chris



Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling as

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> ie. why can't we just record the number of locks each backend has, sort 
> of, and use a reference counting sort of method.  Per-backend in case 
> the backend dies and we need to discount those locks..?

Uh ... the interesting question is usually not "does this backend hold
any row locks", it is "is this row locked by any backends".  If the
latter question is not *exceedingly* cheap to answer, at least in the
normal case where the answer is no, you don't have a workable solution,
because you'll be adding nontrivial overhead to every row update.
        regards, tom lane


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Christopher Kings-Lynne
Date:
> Uh ... the interesting question is usually not "does this backend hold
> any row locks", it is "is this row locked by any backends".  If the
> latter question is not *exceedingly* cheap to answer, at least in the
> normal case where the answer is no, you don't have a workable solution,
> because you'll be adding nontrivial overhead to every row update.

OK, what I mean is to know if a row is locked by any backend, why can't 
we just put a reference count of the number of locks on that row, 
instead of recording each backend separately?  Wouldn't that require a 
fixed amount of shared mem?

Chris



Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans

From
Stephan Szabo
Date:
On Tue, 24 Aug 2004, Christopher Kings-Lynne wrote:

> > Uh ... the interesting question is usually not "does this backend hold
> > any row locks", it is "is this row locked by any backends".  If the
> > latter question is not *exceedingly* cheap to answer, at least in the
> > normal case where the answer is no, you don't have a workable solution,
> > because you'll be adding nontrivial overhead to every row update.
>
> OK, what I mean is to know if a row is locked by any backend, why can't
> we just put a reference count of the number of locks on that row,
> instead of recording each backend separately?  Wouldn't that require a
> fixed amount of shared mem?

AFAICT you have to do something on top of that to allow deadlock
detection.  If transaction X has a shared row lock on A and is waiting on
a lock for me and I want to get an exclusive row lock on A, how do I
detect that it's a deadlock?


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Tue, 24 Aug 2004, Christopher Kings-Lynne wrote:
>> OK, what I mean is to know if a row is locked by any backend, why can't
>> we just put a reference count of the number of locks on that row,
>> instead of recording each backend separately?  Wouldn't that require a
>> fixed amount of shared mem?

> AFAICT you have to do something on top of that to allow deadlock
> detection.  If transaction X has a shared row lock on A and is waiting on
> a lock for me and I want to get an exclusive row lock on A, how do I
> detect that it's a deadlock?

I think the speed complaint I was just raising could possibly be
answered by setting an infomask bit indicating that the row might
be present in a separate table of active row locks.  (I'm not sure
how the bit would get cleared without race conditions, but let's
suppose that can be done.)  A little hashing, a little spill-to-disk
logic, and it might be done.  But that's just handwaving... anyone
want to try to fill in the details?

[ But to answer Chris' question: no, I don't see any way that this
could be a fixed-size table.  You will need that spill-to-disk bit. ]
        regards, tom lane


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Christopher Kings-Lynne
Date:
> I think the speed complaint I was just raising could possibly be
> answered by setting an infomask bit indicating that the row might
> be present in a separate table of active row locks.  (I'm not sure
> how the bit would get cleared without race conditions, but let's
> suppose that can be done.)  A little hashing, a little spill-to-disk
> logic, and it might be done.  But that's just handwaving... anyone
> want to try to fill in the details?

I vote Alvaro :)  This stuff is way out of my league - I'm just the 
ideas man :D

Either way - Bruce, did you want to add a summary of these ideas to the 
TODO?

Chris



Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Richard Huxton
Date:
Christopher Kings-Lynne wrote:
>> Uh ... the interesting question is usually not "does this backend hold
>> any row locks", it is "is this row locked by any backends".  If the
>> latter question is not *exceedingly* cheap to answer, at least in the
>> normal case where the answer is no, you don't have a workable solution,
>> because you'll be adding nontrivial overhead to every row update.
> 
> 
> OK, what I mean is to know if a row is locked by any backend, why can't 
> we just put a reference count of the number of locks on that row, 
> instead of recording each backend separately?  Wouldn't that require a 
> fixed amount of shared mem?

Don't forget having to deal with a backend dying without being able to 
decrement the count (not my idea, Bruce (iirc) mentioned it last time 
this was discussed). I think at the least you'd need a 
max-trans-id-with-lock number stored next to the count so that in the 
event of backend crashes the lock will eventually be released.

--  Richard Huxton  Archonet Ltd


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Bruce Momjian
Date:
Richard Huxton wrote:
> Christopher Kings-Lynne wrote:
> >> Uh ... the interesting question is usually not "does this backend hold
> >> any row locks", it is "is this row locked by any backends".  If the
> >> latter question is not *exceedingly* cheap to answer, at least in the
> >> normal case where the answer is no, you don't have a workable solution,
> >> because you'll be adding nontrivial overhead to every row update.
> > 
> > 
> > OK, what I mean is to know if a row is locked by any backend, why can't 
> > we just put a reference count of the number of locks on that row, 
> > instead of recording each backend separately?  Wouldn't that require a 
> > fixed amount of shared mem?
> 
> Don't forget having to deal with a backend dying without being able to 
> decrement the count (not my idea, Bruce (iirc) mentioned it last time 
> this was discussed). I think at the least you'd need a 
> max-trans-id-with-lock number stored next to the count so that in the 
> event of backend crashes the lock will eventually be released.

Even more significantly, how does the backend know where to go to
decrement its row counts on commit?  I think the best we could do would
be to store the xids on the row for each backend that has a shared lock.
The list could be cleared out automatically by looking in the PROC
structure and removing completed xids.  However, that is a lot of
storage for each row to have and a centralized table is probably best
where backends can clean up on their own on commit.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > I think the speed complaint I was just raising could possibly be
> > answered by setting an infomask bit indicating that the row might
> > be present in a separate table of active row locks.  (I'm not sure
> > how the bit would get cleared without race conditions, but let's
> > suppose that can be done.)  A little hashing, a little spill-to-disk
> > logic, and it might be done.  But that's just handwaving... anyone
> > want to try to fill in the details?
> 
> I vote Alvaro :)  This stuff is way out of my league - I'm just the 
> ideas man :D
> 
> Either way - Bruce, did you want to add a summary of these ideas to the 
> TODO?

OK, TODO updated:

* Implement dirty reads or shared row locks and use them in RI triggers
 Adding shared locks requires recording the table/rows numbers in a shared area, and this could potentially be a large
amountof data. One idea is to store the table/row numbers in a separate table and set a bit on the row indicating
lookingin this new table is required to find any shared row locks.
 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Tom Lane
Date:
Richard Huxton <dev@archonet.com> writes:
> ... Don't forget having to deal with a backend dying without being able to 
> decrement the count (not my idea, Bruce (iirc) mentioned it last time 
> this was discussed). I think at the least you'd need a 
> max-trans-id-with-lock number stored next to the count so that in the 
> event of backend crashes the lock will eventually be released.

No, because the whole table would simply be flushed upon backend crash.
        regards, tom lane


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Christopher Kings-Lynne
Date:
> OK, TODO updated:
> 
> * Implement dirty reads or shared row locks and use them in RI triggers

Can someone explain to me what a dirty read is and how it relates to RI 
triggers?

Chris


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Greg Stark
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:

> > OK, TODO updated:
> > * Implement dirty reads or shared row locks and use them in RI triggers
> 
> Can someone explain to me what a dirty read is and how it relates to RI
> triggers?

A dirty read is a read that includes data that hasn't been committed yet. Or
as the SQL 92 standard puts it:
        1) P1 ("Dirty read"): SQL-transaction T1 modifies a row. SQL-           transaction T2 then reads that row
beforeT1 performs a COMMIT.           If T1 then performs a ROLLBACK, T2 will have read a row that was           never
committedand that may thus be considered to have never           existed.
 

It's only allowed when the transaction is in READ UNCOMMITTED isolation level.
Something Postgres doesn't currently support. In fact I'm not aware of any SQL
database that supports it, though I'm sure there's one somewhere.

You wouldn't normally want to use such a thing, but it could be useful for,
for example, seeing what progress a transaction has made for a UI progress
meter.

It could also be useful for referential integrity checks since, for example,
it would let you see if someone has deleted the referenced record but not
committed the delete yet.

But that alone wouldn't let you avoid locking the record, TODO items are
mostly just pointers to old threads on the mailing lists. They don't contain
the complete story. You could maybe find more information searching the
pgsql-hackers archive on the web site.

-- 
greg



Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Mark Kirkwood
Date:

Greg Stark wrote:

> It's only allowed when the transaction is in READ UNCOMMITTED 
> isolation level.
>
>Something Postgres doesn't currently support. In fact I'm not aware of any SQL
>database that supports it, though I'm sure there's one somewhere.
>
>  
>
FYI - DB2 supports this isolation level, I don't know of any others (tho 
Informix is a possibility).

regards

Mark


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Philip Yarra
Date:
On Wed, 25 Aug 2004 03:54 pm, Mark Kirkwood wrote:
> Greg Stark wrote:
> > It's only allowed when the transaction is in READ UNCOMMITTED
> > isolation level.
> >
> >Something Postgres doesn't currently support. In fact I'm not aware of any
> > SQL database that supports it, though I'm sure there's one somewhere.
>
> FYI - DB2 supports this isolation level, I don't know of any others (tho
> Informix is a possibility).

Sybase ASE (and by derivation MS SQL Server) also have dirty reads. 

Regards, Philip.



Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans

From
Gavin Sherry
Date:
On Wed, 25 Aug 2004, Mark Kirkwood wrote:

>
>
> Greg Stark wrote:
>
> > It's only allowed when the transaction is in READ UNCOMMITTED
> > isolation level.
> >
> >Something Postgres doesn't currently support. In fact I'm not aware of any SQL
> >database that supports it, though I'm sure there's one somewhere.
> >
> >
> >
> FYI - DB2 supports this isolation level, I don't know of any others (tho
> Informix is a possibility).

I wasn't aware pf DB2 supporting it, but SQL Server and Informix do.
Oracle only supports COMMITTED and SERIALIZABLE according to my 10g
manual.

Gavin


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Jan Wieck
Date:
On 8/25/2004 1:32 AM, Greg Stark wrote:

> A dirty read is a read that includes data that hasn't been committed yet. Or
> as the SQL 92 standard puts it:

[...]

> It could also be useful for referential integrity checks since, for example,
> it would let you see if someone has deleted the referenced record but not
> committed the delete yet.
> 
> But that alone wouldn't let you avoid locking the record, TODO items are
> mostly just pointers to old threads on the mailing lists. They don't contain
> the complete story. You could maybe find more information searching the
> pgsql-hackers archive on the web site.

Plus ... wouldn't doing the "on delete" lookup as dirty reads let 
referencing rows that have been deleted but still could come back 
through a rollback disappear? What you want to see are new tuples of 
uncommitted insert/update as well as old tuples of uncommitted 
delete/update. I don't think there is any term in the standard for that 
read mode, so we should call it dusty-reads because they see everything 
vacuum is interested in.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans

From
Dennis Bjorklund
Date:
On 25 Aug 2004, Greg Stark wrote:

> It's only allowed when the transaction is in READ UNCOMMITTED isolation level.
> Something Postgres doesn't currently support. In fact I'm not aware of any SQL
> database that supports it, though I'm sure there's one somewhere.

Looks like mysql also supports it:
 http://dev.mysql.com/doc/mysql/en/InnoDB_transaction_isolation.html

Together with the other replies we now have a whole bunch of databases 
that implements it.

-- 
/Dennis Björklund



Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > OK, TODO updated:
> > 
> > * Implement dirty reads or shared row locks and use them in RI triggers
> 
> Can someone explain to me what a dirty read is and how it relates to RI 
> triggers?

Dirty read allows you to see uncommited rows.  I think RI triggers need
it so they can know if someone has a current transaction that is going
to conflict with the RI trigger action, or something like that.  Right
now I think we hang waiting for the transaction to complete.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans handling

From
Alvaro Herrera
Date:
On Wed, Aug 25, 2004 at 05:10:30PM -0400, Bruce Momjian wrote:
> Christopher Kings-Lynne wrote:
> > > OK, TODO updated:
> > > 
> > > * Implement dirty reads or shared row locks and use them in RI triggers
> > 
> > Can someone explain to me what a dirty read is and how it relates to RI 
> > triggers?
> 
> Dirty read allows you to see uncommited rows.  I think RI triggers need
> it so they can know if someone has a current transaction that is going
> to conflict with the RI trigger action, or something like that.  Right
> now I think we hang waiting for the transaction to complete.

Yes, we hang, but we behave differently depending on the commit status
of the transaction we are waiting for.  We don't know that in advance,
which means we would have to "do something" at our own transaction end
to check that status.

Personally I think this is a bad mechanism to hide our lack of shared
row locks.  We should be solving that problem instead, which I would
like to attack next.

FYI, I leave tomorrow morning to give a talk on Postgres at the
'Encuentro de Linux Norte' here in Chile, and will be back on monday.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La fuerza no está en los medios físicos
sino que reside en una voluntad indomable" (Gandhi)



Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans

From
Greg Stark
Date:
Dennis Bjorklund <db@zigo.dhs.org> writes:

> On 25 Aug 2004, Greg Stark wrote:
> 
> > It's only allowed when the transaction is in READ UNCOMMITTED isolation level.
> > Something Postgres doesn't currently support. In fact I'm not aware of any SQL
> > database that supports it, though I'm sure there's one somewhere.
> 
> Looks like mysql also supports it:
> 
>   http://dev.mysql.com/doc/mysql/en/InnoDB_transaction_isolation.html
> 
> Together with the other replies we now have a whole bunch of databases 
> that implements it.

Well it would be a pretty handy feature. 

Several times I've seen people on the list trying to calculate how far some
big batch update or load had proceeded by looking at the sizes of files in the
data directory, estimating row sizes, and dividing.

That's a pretty kludgy method for doing what could be done cleanly and with
more flexibility by just running switching to read uncommitted mode and
selecting to see how many records had been inserted.

I don't know the details, but with postgres's model wouldn't it be a simply
matter of treating every tuple found as having been inserting or deleted
without checking to see if the transaction id in the tuple is committed? It
should be even easier than the normal read committed mode.

One danger would be for such "dirty" data leaking out into the rest of the
database. But I would be pretty happy with such a feature even if it meant no
inserts/deletes/updates could be performed while in that mode.

-- 
greg



Re: [COMMITTERS] pgsql-server: Rearrange pg_subtrans

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> I don't know the details, but with postgres's model wouldn't it be a simply
> matter of treating every tuple found as having been inserting or deleted
> without checking to see if the transaction id in the tuple is committed?

No.  At least not if you want a view of the data that's even marginally
sane (not multiple versions of the same row, for instance, or versions
that are definitely dead by anyone's viewpoint).
        regards, tom lane