Re: group locking: incomplete patch, just for discussion - Mailing list pgsql-hackers

From	Greg Stark
Subject	Re: group locking: incomplete patch, just for discussion
Date	November 3, 2014 18:19:15
Msg-id	CAM-w4HOGY9SpAJS5v0PpKw3En7U-DGa=zUPCuGLbEFVy1PPtKw@mail.gmail.com
In response to	Re: group locking: incomplete patch, just for discussion (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: group locking: incomplete patch, just for discussion (Tom Lane <tgl@sss.pgh.pa.us>) Re: group locking: incomplete patch, just for discussion (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

On Sat, Nov 1, 2014 at 9:09 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> 1. Any non-trivial piece of PostgreSQL code is likely to contain
> syscache lookups.
> 2. Syscache lookups had better work in parallel workers, or they'll be
> all but useless.

I've been using parallel sorts and index builds in my mental model of
how this will be used. I note that sorts go out of their way to look
up all the syscache entries in advance precisely so that tuplesort
doesn't start doing catalog lookups in the middle of the sort. In
general I think what people are imagining is that the parallel workers
will be running low-level code like tuplesort that has all the
databasey stuff like catalog lookups done in advance and just operates
on C data structures like function pointers. And I think that's a
valuable coding discipline to enforce: it avoids having low-level
infrastructure call up into higher-level abstractions, which quickly
becomes hard to reason about.
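
To make that discipline concrete, here is a minimal sketch in plain C. It
does not use any real PostgreSQL API: lookup_comparator(), SortPlan,
prepare_sort() and worker_sort() are all hypothetical names invented for
illustration, and the "catalog" is just a string comparison. The point is
only the shape: every lookup happens once, up front, and the worker code
touches nothing but the resolved function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef int (*compare_fn)(const void *, const void *);

/* Counts calls to the (hypothetical) catalog lookup so a caller can
 * verify that none happen while the sort itself is running. */
static int catalog_lookups = 0;

static int compare_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

static int compare_text(const void *a, const void *b)
{
    return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/* "High-level" step: stand-in for a syscache lookup that resolves a
 * type name to its comparison function. Returns NULL if unknown. */
static compare_fn lookup_comparator(const char *type_name)
{
    catalog_lookups++;
    if (strcmp(type_name, "int4") == 0)
        return compare_int;
    if (strcmp(type_name, "text") == 0)
        return compare_text;
    return NULL;
}

/* Everything the worker needs, resolved before it starts. */
typedef struct SortPlan
{
    compare_fn cmp;
    size_t     elem_size;
} SortPlan;

/* Done once by the leader; returns nonzero on success. */
static int prepare_sort(SortPlan *plan, const char *type_name, size_t elem_size)
{
    plan->cmp = lookup_comparator(type_name);
    plan->elem_size = elem_size;
    return plan->cmp != NULL;
}

/* "Low-level" step: operates only on the plan and the data, and never
 * calls back into lookup_comparator(). */
static void worker_sort(const SortPlan *plan, void *data, size_t n)
{
    assert(plan->cmp != NULL);
    qsort(data, n, plan->elem_size, plan->cmp);
}
```

A caller would run prepare_sort() once and then hand the SortPlan to
worker_sort(); catalog_lookups stays at 1 no matter how many elements are
sorted, which is the property the discipline is meant to guarantee.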

However, in practice I think you're actually right -- but not for the
reasons you've been saying. I think the parallel workers *should* be
written as low-level infrastructure and not be directly doing syscache
lookups, tuple locking, etc. However, there are a million ways in
which Postgres is extensible, and they cause loops in the call graph
that aren't apparent in the direct code structure. For instance, what
happens if the index you're building is an expression index or partial
index? Worse, what happens if those expressions have a plpython
function that does queries using SPI....

But those kinds of user code exploiting extensibility are exactly the
situations where we need a deadlock detector and where you might need
this infrastructure. We wouldn't and shouldn't need a deadlock
detector for our own core server code. Ideally there would be some sort
of compromise that enforces careful locking rules in the core code,
where all locks are acquired in advance and parallel workers are
prohibited from obtaining locks, while still allowing users a
free-for-all and detecting deadlocks at runtime for them. But I'm
not sure there's any real middle ground here.

-- 
greg
