Fix possible overflow of pg_stat DSA's refcnt - Mailing list pgsql-hackers

From Anthonin Bonnefoy
Subject Fix possible overflow of pg_stat DSA's refcnt
Date
Msg-id CAO6_XqqJbJBL=M7Ym13TcB4Xnq58vRa2jcC+gwEPBgbAda6B1Q@mail.gmail.com
List pgsql-hackers
Hi,

During backend initialisation, the pgStat DSA is attached using dsa_attach_in_place with a NULL segment. Because the segment is NULL, no callback is registered to release the DSA when the process exits. pgstat_detach_shmem only calls dsa_detach which, as the function's comment notes, does not release the area and therefore does not decrement the pgStat DSA's reference count.

Thus, every time a backend is created, the pgStat DSA's refcnt is incremented, but it is never decremented when the backend shuts down. It will eventually overflow and wrap around to 0, triggering the "could not attach to dynamic shared area" error on all newly created backends. Once this state is reached, the only way to recover is to restart the database, which resets the counter.

The issue can be observed by calling dsa_dump in pgstat_detach_shmem and checking that refcnt keeps increasing as new backends are created. It is also possible to reach the state where all connections are refused by editing refcnt manually with lldb/gdb (the alternative, creating enough backends for the counter to reach 0, works but can take some time). Setting it to -10 and then opening 10 connections will eventually generate the "could not attach" error.

This patch fixes the issue by releasing the pgStat DSA with dsa_release_in_place during pgStat shutdown, so that the refcnt is correctly decremented.
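
To illustrate the arithmetic, here is a small standalone model of the counter. It is only a sketch and not PostgreSQL code: ToyArea, toy_attach, toy_release and toy_backends_until_refused are hypothetical names, and the counter is stored as an unsigned 32-bit value purely so that the wraparound is well defined in C. The model assumes that an attach is refused when the counter reads 0 and increments it otherwise, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the DSA control data; only the counter is modelled. */
typedef struct ToyArea
{
    /* Assumption: unsigned storage so that wraparound is well defined. */
    uint32_t    refcnt;
} ToyArea;

/* Models an attach: refused once the counter reads 0, otherwise bumped. */
bool
toy_attach(ToyArea *area)
{
    assert(area != NULL);
    if (area->refcnt == 0)
        return false;           /* "could not attach to dynamic shared area" */
    area->refcnt++;
    return true;
}

/* Models a release at backend exit: gives back the reference taken above. */
void
toy_release(ToyArea *area)
{
    assert(area != NULL && area->refcnt > 0);
    area->refcnt--;
}

/*
 * Starts and stops backends one after another, up to "limit" of them, and
 * returns how many managed to attach before one was refused.
 */
unsigned
toy_backends_until_refused(ToyArea *area, bool release_on_exit, unsigned limit)
{
    unsigned    started = 0;

    while (started < limit && toy_attach(area))
    {
        if (release_on_exit)
            toy_release(area);
        started++;
    }
    return started;
}
```

With the counter preset to the bit pattern of -10 and no release on exit, the model accepts 10 backends and refuses the next one, which matches the debugger experiment above. With release on exit, the counter returns to its starting value after every backend, so it never drifts towards 0.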

Regards,
Anthonin
Attachment
