Contention on LWLock buffer_content, due to SHARED lock(?) - Mailing list pgsql-hackers

From Jens-Wolfhard Schicke-Uffmann
Subject Contention on LWLock buffer_content, due to SHARED lock(?)
Msg-id 20191209221036.GA17421@eta-carinae
List pgsql-hackers
Hi,

today I observed (on a r5.24xlarge AWS RDS instance, i.e. 96 logical
cores) lock contention on a buffer content lock due to the acquisition
of a SHARE lock (I think):

Three tables were involved, simplified case:

CREATE TABLE global_config (id BIGINT PRIMARY KEY);

CREATE TABLE b (
  id BIGINT PRIMARY KEY,
  config_id BIGINT REFERENCES global_config (id)
);

CREATE TABLE c (
  id BIGINT PRIMARY KEY,
  config_id BIGINT REFERENCES global_config (id)
);

(I suppose having both b and c doesn't make a difference, but
in case it is relevant, I'm including both.)

Heavy INSERT + UPDATE traffic on b + c (all trivial via id),
SELECTs on global_config (again by id).
As the name suggests, there were only very few rows in
global_config, specifically only one was referenced by all
INSERT + UPDATEs on b + c.
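
A minimal sketch of the workload (ids and values are illustrative, not
the actual production statements):

```sql
-- One essentially static config row, referenced by all traffic below:
INSERT INTO global_config (id) VALUES (1);

-- Hot path, executed concurrently from many connections:
INSERT INTO b (id, config_id) VALUES (12345, 1);
UPDATE b SET config_id = 1 WHERE id = 12345;

-- Plus plain lookups on the config row itself:
SELECT * FROM global_config WHERE id = 1;
```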

Under lighter load, all three types of queries took <1 ms (as
expected); as load grew, average execution time for all three rose to
~50 ms. The AWS RDS console showed waits on LWLock:buffer_content as the
main contributor to that time.
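
On stock PostgreSQL (9.6+) the same signal can be read directly from
pg_stat_activity, which is presumably what the RDS console aggregates:

```sql
-- Snapshot of what active backends are currently waiting on;
-- under this workload, LWLock / buffer_content should dominate:
SELECT wait_event_type, wait_event, count(*)
FROM pg_stat_activity
WHERE state = 'active'
GROUP BY 1, 2
ORDER BY count(*) DESC;
```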

Checking the code, I concluded that I observed lock contention
on the lock taken at the beginning of heap_lock_tuple, where
an exclusive buffer content lock is held while the SHARE lock is
recorded in the tuple header, the WAL, and the multixact. I don't know
the actual number, but potentially up to 7000 active
transactions were holding a SHARE lock on that row, which could also
have performance implications when scanning for multixact membership.
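
Those row locks come from the foreign-key checks: each INSERT/UPDATE on
b or c fires an RI trigger that effectively runs something like the
following (paraphrased from what ri_triggers.c generates; the real query
schema-qualifies the names):

```sql
-- Row-level lock taken on the referenced config row for every
-- INSERT/UPDATE of b or c, serializing on that one buffer:
SELECT 1 FROM ONLY global_config x
WHERE id = $1
FOR KEY SHARE OF x;
```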


Semantically, all that lock traffic was superfluous, as the
global_config row's key was in no danger of being changed.

As this situation (some global, essentially static entity referenced
by a heavily written table) seems not uncommon, I wonder:

1. Does the above analysis sound about right?

2. If so, would it be worthwhile to develop a solution?
   I was thinking along the lines of introducing a multixact
   representation of "everyone": instead of meticulously recording every
   locking and completing transaction in a multixact, once a certain
   number of transactions has accumulated in a single multixact, it is
   approximated as "everyone". If a later transaction finds that a SHARE
   lock is held by "everyone", the tuple would need no further modification
   (though I'm not sure this could even be checked without taking an
   exclusive buffer lock). The hard part would probably be ensuring that
   an attempt to obtain an EXCLUSIVE lock eventually succeeds against a
   SHARE lock held by "everyone".


Regards,
  Drahflow
