Thread: Re: [GENERAL] Debugging deadlocks

Re: [GENERAL] Debugging deadlocks

From: Alvaro Herrera
On Thu, Mar 31, 2005 at 06:54:31PM -0600, Guy Rouillier wrote:
> Alvaro Herrera wrote:
> >
> > Now this can't be applied right away because it's easy to run "out of
> > memory" (shared memory for the lock table).  Say, a delete or update
> > that touches 10000 tuples does not work.  I'm currently working on a
> > proposal to allow the lock table to spill to disk ...
>
> While not always true, in many cases the cardinality of the referenced
> (parent) table is small compared to that of the referencing (child)
> table.  Does locking require a separate lock record for each tuple in
> the child table, or just one for each tuple in the parent table with a
> reference count?

Just one.  (LOCALLOCK, which is private to each backend, stores how many
times we hold a lock.)
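
To make the counting concrete, here is a minimal sketch of the idea, not the actual lock manager code. ToyLocalLock, toy_lock_acquire and the other toy_* names are hypothetical stand-ins for illustration; the real LOCALLOCK carries much more state. The point is only that re-acquiring a lock we already hold bumps a backend-private counter, and the shared table is touched only on the first acquire and the last release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT PostgreSQL's real definitions. */
#define TOY_MAX_LOCAL_LOCKS 64

typedef struct ToyLocalLock
{
    uint32_t tag;     /* identifies the locked (parent) tuple */
    int      nholds;  /* how many times this backend holds the lock */
} ToyLocalLock;

static ToyLocalLock toy_local_locks[TOY_MAX_LOCAL_LOCKS];
static int toy_n_local_locks = 0;
static int toy_shared_count = 0;  /* entries used in the (imaginary) shared table */

static ToyLocalLock *toy_find_local(uint32_t tag)
{
    for (int i = 0; i < toy_n_local_locks; i++)
        if (toy_local_locks[i].tag == tag)
            return &toy_local_locks[i];
    return NULL;
}

/* Returns 0 on success, -1 if the backend-private table is full. */
int toy_lock_acquire(uint32_t tag)
{
    ToyLocalLock *l = toy_find_local(tag);

    if (l != NULL)
    {
        l->nholds++;            /* already held: no shared memory needed */
        return 0;
    }
    if (toy_n_local_locks == TOY_MAX_LOCAL_LOCKS)
        return -1;
    toy_local_locks[toy_n_local_locks].tag = tag;
    toy_local_locks[toy_n_local_locks].nholds = 1;
    toy_n_local_locks++;
    toy_shared_count++;         /* first acquire: one shared entry */
    return 0;
}

void toy_lock_release(uint32_t tag)
{
    ToyLocalLock *l = toy_find_local(tag);

    assert(l != NULL && l->nholds > 0);
    if (--l->nholds == 0)
    {
        /* last release: drop the local slot and the shared entry */
        *l = toy_local_locks[--toy_n_local_locks];
        toy_shared_count--;
    }
}

int toy_shared_entries(void)
{
    return toy_shared_count;
}
```

With this model, a backend that takes the same parent-tuple lock 10000 times (once per referencing child row) still accounts for a single shared entry.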

I just realized we need to be able to spill not only the LOCK struct to
disk, but also PROCLOCK ... am I right?

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
The web brings people together because no matter what kind of sexual mutant
you are, you have millions of possible partners. Enter "find people who have
sex with deer that are on fire", and the computer will say "specify the type
of deer" (Jason Alexander)

Re: [GENERAL] Debugging deadlocks

From: Alvaro Herrera
On Fri, Apr 01, 2005 at 11:02:36PM -0500, Tom Lane wrote:

[Cc: to -hackers]

> We currently store tuple locks on the same page as the tuples (ie, in
> the tuple headers) and need no extra locks to do so.  Certainly it
> still has to have a spill mechanism, but the thought that is attractive
> to me is that until you are forced to spill, you do not have to take any
> system-wide lock, only a page-level lock.  So it could have very good
> average performance.

If we go this way, maybe we should abandon the idea of using the standard
lock manager to lock tuples, which is what the patch I posted does.  Or
maybe not, and just have the lock manager itself store the locks on the
page -- but then it will have to know about the buffer, so in some sense
it breaks the opacity of the API between the two.

One possible way to do it would be to store an OffsetNumber in the
page header; if it's not InvalidOffsetNumber, it points to a
"tuple" that holds

struct
{
    OffsetNumber nextLock;  /* next lock item on this page, or InvalidOffsetNumber */
    LOCK         lock;      /* the lock itself */
};

So a locker would check the chain of locks and stop when it sees
InvalidOffsetNumber.
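
A rough sketch of that chain walk, under loud assumptions: ToyPage, ToyLock, the firstLock header field and the toy_* functions are all made up for illustration and do not exist in the tree. Only OffsetNumber and InvalidOffsetNumber mirror the real definitions as I understand them (a 16-bit offset, with 0 meaning invalid). A real version would work on a buffer page under the page-level lock.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t OffsetNumber;
#define InvalidOffsetNumber ((OffsetNumber) 0)

/* Hypothetical stand-in for LOCK: which tuple is locked, and by whom. */
typedef struct ToyLock
{
    uint32_t tupleOffset;
    uint32_t holderXid;
} ToyLock;

typedef struct ToyLockItem
{
    OffsetNumber nextLock;  /* next lock item, or InvalidOffsetNumber */
    ToyLock      lock;
} ToyLockItem;

#define TOY_MAX_ITEMS 8

typedef struct ToyPage
{
    OffsetNumber firstLock;                 /* hypothetical page-header field */
    OffsetNumber nused;                     /* lock items in use */
    ToyLockItem  items[TOY_MAX_ITEMS + 1];  /* 1-based; slot 0 is unused */
} ToyPage;

void toy_page_init(ToyPage *page)
{
    page->firstLock = InvalidOffsetNumber;
    page->nused = 0;
}

/* Walk the chain; stop when we see InvalidOffsetNumber. */
ToyLock *toy_find_lock(ToyPage *page, uint32_t tupleOffset)
{
    OffsetNumber off = page->firstLock;

    while (off != InvalidOffsetNumber)
    {
        ToyLockItem *item = &page->items[off];

        if (item->lock.tupleOffset == tupleOffset)
            return &item->lock;
        off = item->nextLock;
    }
    return NULL;
}

/* Prepend a lock item. Returns 1 on success, 0 if the page has no room. */
int toy_add_lock(ToyPage *page, uint32_t tupleOffset, uint32_t holderXid)
{
    OffsetNumber off;

    if (page->nused == TOY_MAX_ITEMS)
        return 0;               /* caller has to put the lock somewhere else */
    off = ++page->nused;
    page->items[off].lock.tupleOffset = tupleOffset;
    page->items[off].lock.holderXid = holderXid;
    page->items[off].nextLock = page->firstLock;
    page->firstLock = off;
    return 1;
}
```

Note that toy_add_lock just reports failure when the page is full; what the caller should do in that case is exactly the open question.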

If there is no free space on the page, what should we do?  Store the
lock into the main hash table?

Another problem would be the XLog.  On heap operations, do we register
exactly where (position in the page) a tuple was stored, or just the
fact that it was stored?  If the latter, then there's no problem.  If
the former, then on the next REDO the records wouldn't match (==> PANIC)
-- unless we logged the lockings as well.

Reading the code I see we do log the offset numbers, so that's a problem
:-( ... maybe we could work around that by moving pd_lower without
using line pointers; I'm not sure if that's doable.
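
To illustrate the mismatch I'm worried about, here is a toy model. It is nothing like the real redo code; ToyPage and the toy_* functions are hypothetical, and a "page" is reduced to a count of line pointers in use. It assumes lock items consume line pointers and that heap inserts log the offset they landed on.

```c
#include <stdint.h>

typedef uint16_t OffsetNumber;

/* Toy page: we only track how many line pointers are in use. */
typedef struct ToyPage
{
    OffsetNumber nitems;
} ToyPage;

/* Allocate the next line pointer and return its (1-based) offset. */
static OffsetNumber toy_page_add_item(ToyPage *page)
{
    page->nitems++;
    return page->nitems;
}

/*
 * Run one lock-item insertion followed by one heap insertion, then
 * replay the log against an empty page.  Returns 1 if every replayed
 * item lands on its logged offset, 0 on a mismatch (where the real
 * system would presumably PANIC).
 */
int toy_replay_matches(int log_lock_items)
{
    ToyPage      live = {0};
    ToyPage      replay = {0};
    OffsetNumber wal[2];        /* logged offsets, in order */
    int          nwal = 0;
    OffsetNumber lock_off;

    /* Normal operation: the lock item takes a line pointer first ... */
    lock_off = toy_page_add_item(&live);
    if (log_lock_items)
        wal[nwal++] = lock_off;
    /* ... then the heap tuple, whose offset is always logged. */
    wal[nwal++] = toy_page_add_item(&live);

    /* REDO: the page starts out empty and we replay what was logged. */
    for (int i = 0; i < nwal; i++)
    {
        if (toy_page_add_item(&replay) != wal[i])
            return 0;
    }
    return 1;
}
```

In this model toy_replay_matches(0) fails, because the tuple was logged at offset 2 but replay puts it at offset 1, while toy_replay_matches(1) succeeds because the lock item is replayed too.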

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Because Kim did nothing but, mind you,
with extraordinary success" ("Kim", Kipling)