Re: O(1) DSM handle operations - Mailing list pgsql-hackers

From Robert Haas
Subject Re: O(1) DSM handle operations
Msg-id CA+TgmoYZOyMMm32FY0zW8_kYiYhWS8zE+OogJyCZPamTRh+Cng@mail.gmail.com
Responses Re: O(1) DSM handle operations  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On Mon, Mar 27, 2017 at 5:13 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> This is just a thought for discussion, no patch attached...
>
> DSM operations dsm_create(), dsm_attach(), dsm_unpin_segment() perform
> linear searches of the dsm_control->item array for either a free slot
> or a slot matching a given handle.  Maybe no one thinks this is a
> problem, because in practice the number of DSM slots you need to scan
> should be something like number of backends * some small factor at
> peak.

One thing I thought about when designing the format of the DSM control
segment was that we need to (attempt to) reread the old segment after
recovering from a crash, even if it's borked.  With the current
design, I think that nothing too bad can happen even if some or all of
the old control segment has been overwritten with gibberish.  I mean,
if we get particularly unlucky, we might manage to remove a DSM
segment that some other cluster is using, but we'd have to be very
unlucky for things to even get that bad, and we shouldn't crash
outright.
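
To illustrate why a flat array is forgiving, here is a minimal sketch. It
is not the actual dsm.c code: the word layout, CTL_MAGIC, and ctl_scan()
are all made up for the example, and I'm assuming a zero refcnt marks a
free slot. The point is that a handful of range checks on the header is
enough to keep the scan inside the segment, whatever garbage the slots
contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CTL_MAGIC      0x5eed0001u  /* hypothetical magic value */
#define CTL_HDR_WORDS  3            /* magic, nitems, maxitems */
#define CTL_ITEM_WORDS 2            /* handle, refcnt */

/*
 * Scan a possibly corrupted control segment, given as an array of 32-bit
 * words, and copy the handles of in-use slots into out[].
 * Returns the number of handles copied; 0 if the header is not sane.
 */
size_t
ctl_scan(const uint32_t *seg, size_t nwords, uint32_t *out, size_t out_cap)
{
    assert(seg != NULL && out != NULL);

    /* Too short to hold a header, or wrong magic: treat as unusable. */
    if (nwords < CTL_HDR_WORDS || seg[0] != CTL_MAGIC)
        return 0;

    uint32_t nitems = seg[1];
    uint32_t maxitems = seg[2];
    size_t   fit = (nwords - CTL_HDR_WORDS) / CTL_ITEM_WORDS;

    /* Reject counts that the segment cannot physically hold. */
    if (maxitems > fit || nitems > maxitems)
        return 0;

    size_t found = 0;

    for (uint32_t i = 0; i < nitems && found < out_cap; i++)
    {
        const uint32_t *item = seg + CTL_HDR_WORDS + (size_t) i * CTL_ITEM_WORDS;

        if (item[1] == 0)           /* refcnt 0: slot is free (assumed) */
            continue;
        out[found++] = item[0];     /* may be garbage; caller must cope */
    }
    return found;
}
```

A caller would then try to clean up each returned handle and simply ignore
failures, since a scrambled slot may name a segment that doesn't exist.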

If we replace the array with some more complicated data structure,
we'd have to be sure that reading it is robust against it having been
scrambled by a previous crash.  Otherwise, it won't be possible to
restart the cluster without manual intervention.
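
For example, if slots were chained through index links (a free list,
say), the reader would have to validate every link rather than just the
header. A rough sketch of what that might look like, with a hypothetical
next[] array, LIST_END marker, and walk_chain() that are not taken from
any existing code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIST_END UINT32_MAX     /* hypothetical end-of-list marker */

/*
 * Walk a chain of slot indexes stored in next[], starting at head.
 * Returns the chain length, or -1 if the chain leaves the array or
 * takes more than nslots steps (which implies a cycle).
 */
long
walk_chain(const uint32_t *next, uint32_t nslots, uint32_t head)
{
    assert(next != NULL);

    uint32_t cur = head;
    long     steps = 0;

    while (cur != LIST_END)
    {
        if (cur >= nslots)              /* link points outside the array */
            return -1;
        if (steps >= (long) nslots)     /* more links than slots: a cycle */
            return -1;
        cur = next[cur];
        steps++;
    }
    return steps;
}
```

Without the two checks inside the loop, a scrambled link could send the
startup code off the end of the segment or into an infinite loop.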

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


