Re: margay fails assertion in stats/dsa/dsm code - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: margay fails assertion in stats/dsa/dsm code
Date
Msg-id CA+hUKGJ6+TRWex2FmgsL3LtznkwtcXrGrF23GhMK4ddqZGF9ww@mail.gmail.com
In response to Re: margay fails assertion in stats/dsa/dsm code  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Wed, Jun 29, 2022 at 4:00 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> I suppose this could indicate that the machine and/or RAM disk is
> overloaded/swapping and one of those open() or unlink() calls is
> taking a really long time, and that could be fixed with some system
> tuning.

Hmm, I take that bit back.  Every backend that starts up is trying to
attach to the same segment, the one with the new pgstats stuff in it
(once the small space in the main shmem segment is used up and we
create a DSM segment).  There's no fairness/queue, random back-off or
guarantee of progress in that librt lock code, so you can get into
lock-step with other backends retrying, and although some waiter
always gets to make progress, any given backend can lose every round
and run out of retries.  Even when you're lucky and don't fail with an
undocumented, incomprehensible error, it's very slow, and I'd
consider filing a bug report about that.  A work-around on the
PostgreSQL side would be to set dynamic_shared_memory_type to mmap
(= we just open our own files and map them directly), and to make
pg_dynshmem a symlink to something under /tmp (or some other RAM disk)
to avoid touching regular disk file systems.
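
To illustrate what "random back-off" means in the paragraph above, here
is a minimal sketch of a retry loop with randomized, exponentially
growing delays.  It is not the librt lock code and not a proposed patch;
the names acquire_with_backoff, backoff_delays and try_acquire are
hypothetical, and the delay constants are arbitrary assumptions.

```python
import random
import time


def backoff_delays(max_retries, base=0.001, cap=0.1, rng=random):
    """Yield one randomized sleep duration (in seconds) per retry.

    Each delay is drawn uniformly from [0, min(cap, base * 2**attempt)],
    so waiters that collide once are unlikely to keep retrying in step.
    """
    for attempt in range(max_retries):
        yield rng.uniform(0.0, min(cap, base * (2 ** attempt)))


def acquire_with_backoff(try_acquire, max_retries=10, sleep=time.sleep, **kwargs):
    """Call try_acquire() until it returns True or the retries run out.

    try_acquire is a hypothetical non-blocking lock attempt supplied by
    the caller.  Returns True on success, False if every attempt failed.
    Extra keyword arguments (base, cap, rng) go to backoff_delays().
    """
    if try_acquire():
        return True
    for delay in backoff_delays(max_retries, **kwargs):
        sleep(delay)  # randomized wait breaks lock-step with other waiters
        if try_acquire():
            return True
    return False
```

Randomizing the delay only makes it less likely that one waiter loses
every round; the retry count is still bounded, so this does not by
itself provide fairness or a guarantee of progress.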


