Alvaro,
I suppose there must be reasons not to do this, but have you considered
using the "slack" space (empty space) in an ordinary table "heap" page
to store share-locks on the tuples in that page? (If not enough space
is available then you would still need to use the spilled-to-disk btree
structure.)
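To make that concrete, an entry kept in a page's slack space might need only
a transaction id and a line-pointer offset; something like the following
(purely a sketch -- the name PageShareLock is my invention, not anything in
the PostgreSQL source):

    #include <stdint.h>

    /* Hypothetical per-tuple entry stored in a heap page's free space.
     * In PostgreSQL terms xid would be a TransactionId and offnum an
     * OffsetNumber; plain integer types are used here just to keep the
     * sketch self-contained. */
    typedef struct PageShareLock
    {
        uint32_t xid;       /* transaction holding the share lock */
        uint16_t offnum;    /* line pointer of the locked tuple */
    } PageShareLock;

At roughly 6 to 8 bytes per locked tuple (depending on alignment), even a
modest amount of free space on a page would seem to be enough to record
share locks on most or all of its own tuples.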
Maybe there could also be a marker that you put in the disk page meaning
"transaction X has all the tuples in this page share-locked," since I
imagine that when very many of them are locked, usually all of them are.
With this method, checking for a lock would be slightly more complicated,
since you would need to check the disk page in which the tuple resides
first, and the spilled-to-disk structure afterwards.
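In rough terms (building on the PageShareLock sketch above, and with
SpillLockExists() as a stand-in for a probe of the spilled-to-disk
structure -- all of these names are hypothetical), the check might look
like:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical page-local lock area: a whole-page marker plus a small
     * array of per-tuple entries, capped by the slack space available. */
    typedef struct PageLocks
    {
        bool          all_locked;      /* "xid share-locks every tuple here" */
        uint32_t      all_locked_xid;  /* the xid holding that whole-page lock */
        int           nentries;
        PageShareLock entries[32];
    } PageLocks;

    /* Stand-in for a lookup in the spilled-to-disk btree structure. */
    extern bool SpillLockExists(uint32_t blkno, uint16_t offnum);

    static bool
    TupleIsShareLocked(const PageLocks *pl, uint32_t blkno, uint16_t offnum)
    {
        if (pl->all_locked)
            return true;                        /* whole-page marker */
        for (int i = 0; i < pl->nentries; i++)
            if (pl->entries[i].offnum == offnum)
                return true;                    /* entry in the page's slack space */
        return SpillLockExists(blkno, offnum);  /* fall back to the spill structure */
    }

The point is only the lookup order: the common case (lock recorded in the
tuple's own page) costs no extra I/O, and the spill structure is consulted
only when the page had no room.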
It seems that if this worked, you would be able to share-lock every tuple
in a table (or a large fraction of them) without much extra I/O, provided
that the pages were not 100% full (which is almost never the case in
PostgreSQL anyway).
Paul Tillotson
Alvaro Herrera wrote:
>On Thu, Mar 31, 2005 at 06:54:31PM -0600, Guy Rouillier wrote:
>
>
>>Alvaro Herrera wrote:
>>
>>
>>>Now this can't be applied right away because it's easy to run "out of
>>>memory" (shared memory for the lock table). Say, a delete or update
>>>that touches 10000 tuples does not work. I'm currently working on a
>>>proposal to allow the lock table to spill to disk ...
>>>
>>>
>>While not always true, in many cases the cardinality of the referenced
>>(parent) table is small compared to that of the referencing (child)
>>table. Does locking require a separate lock record for each tuple in
>>the child table, or just one for each tuple in the parent table with a
>>reference count?
>>
>>
>
>Just one. (LOCALLOCK, which is private to each backend, stores how many
>times we hold a lock.)
>
>I just realized we not only need to be able to spill LOCK struct to
>disk, but also PROCLOCK ... am I right?
>