Re: [BUGS] BUG #3245: PANIC: failed to re-find shared lock object - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [BUGS] BUG #3245: PANIC: failed to re-find shared lock object
Date
Msg-id 462E62B3.6080808@enterprisedb.com
In response to Re: [BUGS] BUG #3245: PANIC: failed to re-find shared lock object  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [BUGS] BUG #3245: PANIC: failed to re-find shared lock object
List pgsql-hackers
Tom Lane wrote:
> The pending-fsync stuff in md.c is also expecting to be able to add
> entries during a scan.

No, mdsync starts the scan from scratch after calling AbsorbFsyncRequests.

> I don't think we can go in the direction of forbidding insertions during
> a scan --- as the case at hand shows, it's just not always obvious that
> that could happen, and finding/fixing such a problem is nigh impossible.
> (We were darn fortunate to be able to reproduce this one.)  Plus we have
> a couple of places where it's really necessary to be able to do it,
> anyway.
> 
> The only answer I can see that seems reasonably robust is to change
> dynahash.c so that it tracks whether any seq_search scans are open on a
> hashtable, and doesn't carry out any splits while one is.  This wouldn't
> cost anything noticeable in performance, assuming that not very many
> splits are postponed.  The PITA aspect of it is that we'd need to add
> bookkeeping mechanisms to ensure that the count of active scans gets
> cleaned up on error exit.  It's not like we've not got lots of those,
> though.

We could have two kinds of seq scans, with and without support for 
concurrent inserts. If you open a scan without that support, it acts 
just like today, and no extra bookkeeping or cleanup by the caller is 
required. If you need concurrent inserts, we inhibit bucket splits, but 
it's up to the caller to explicitly close the scan, possibly with 
PG_TRY/PG_CATCH. I'm not sure if that's simpler in the end, but we could 
get away without adding a generic bookkeeping mechanism.
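
To make the idea concrete, here is a rough standalone sketch. It is not 
dynahash.c code: ToyHash, ToyScan and the toy_* functions are made-up 
names, and the "split" is just a stand-in that doubles a counter. It 
only shows the bookkeeping: an insert-safe scan bumps a per-table 
counter, and inserts skip the split while that counter is nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy table: only the fields needed to show split deferral. */
typedef struct ToyHash
{
    size_t nbuckets;           /* current number of buckets */
    size_t nentries;           /* number of stored entries */
    int    insert_safe_scans;  /* open scans that must not see a split */
} ToyHash;

/* Hypothetical scan state; remembers which kind of scan was opened. */
typedef struct ToyScan
{
    ToyHash *table;
    bool     allow_inserts;
    size_t   cur_bucket;
} ToyScan;

/* Open a scan. Only insert-safe scans touch the per-table counter. */
static void
toy_scan_begin(ToyScan *scan, ToyHash *table, bool allow_inserts)
{
    scan->table = table;
    scan->allow_inserts = allow_inserts;
    scan->cur_bucket = 0;
    if (allow_inserts)
        table->insert_safe_scans++;
}

/* Close a scan. The caller must call this for insert-safe scans. */
static void
toy_scan_end(ToyScan *scan)
{
    if (scan->allow_inserts)
    {
        assert(scan->table->insert_safe_scans > 0);
        scan->table->insert_safe_scans--;
    }
    scan->table = NULL;
}

/* Insert one entry; returns true if the insert triggered a split. */
static bool
toy_insert(ToyHash *table)
{
    table->nentries++;
    if (table->insert_safe_scans > 0)
        return false;           /* split postponed while a scan is open */
    if (table->nentries > table->nbuckets)
    {
        table->nbuckets *= 2;   /* stands in for a real bucket split */
        return true;
    }
    return false;
}
```

In the backend, the caller of an insert-safe scan would presumably wrap 
the loop in PG_TRY/PG_CATCH and call the close function in the catch 
block before re-throwing, so the counter doesn't leak on error. A plain 
scan (allow_inserts = false) needs no such care.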

> Possibly we could simplify matters a bit by not worrying about cleaning
> up leaked counts at subtransaction abort, ie, the list of open scans
> would only get forced to empty at top transaction end.  This carries a
> slightly higher risk of meaningful performance degradation, but in
> practice I doubt it's a big problem.  If we agreed that then we'd not
> need ResourceOwner support --- it could be handled like LWLock counts.

Hmm. Unlike lwlocks, hash tables can live in different memory contexts, 
so we can't just have a list of open scans similar to the held_lwlocks array.

Do we need to support multiple simultaneous seq scans of a hash table? I 
suppose we do.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

