Thread: Crash: invalid DSA memory alloc request

Crash: invalid DSA memory alloc request

From
Andreas 'ads' Scherbaum
Date:
Hello,

I'm running a couple of large tests, and in this particular test I have 
a few million tables more.

At some point it fails, and I gathered the following trace:


2024-12-12 22:22:55.307 CET [1496210] ERROR: invalid DSA memory alloc request size 1073741824
2024-12-12 22:22:55.307 CET [1496210] BACKTRACE:
         postgres: ads tabletest [local] CREATE TABLE(+0x15e570) [0x6309c379c570]
         postgres: ads tabletest [local] CREATE TABLE(dshash_find_or_insert+0x1a4) [0x6309c39882d4]
         postgres: ads tabletest [local] CREATE TABLE(pgstat_get_entry_ref+0x440) [0x6309c3b0a530]
         postgres: ads tabletest [local] CREATE TABLE(pgstat_prep_pending_entry+0x3a) [0x6309c3b0676a]
         postgres: ads tabletest [local] CREATE TABLE(pgstat_assoc_relation+0x32) [0x6309c3b086c2]
         postgres: ads tabletest [local] CREATE TABLE(StartReadBuffer+0x3c0) [0x6309c3ab9870]
         postgres: ads tabletest [local] CREATE TABLE(ReadBufferExtended+0xa1) [0x6309c3abb271]
         postgres: ads tabletest [local] CREATE TABLE(+0x2c6caa) [0x6309c3904caa]
         postgres: ads tabletest [local] CREATE TABLE(AlterSequence+0xc0) [0x6309c3905860]
         postgres: ads tabletest [local] CREATE TABLE(+0x4b6336) [0x6309c3af4336]
         postgres: ads tabletest [local] CREATE TABLE(standard_ProcessUtility+0x259) [0x6309c3af33f9]
         postgres: ads tabletest [local] CREATE TABLE(+0x4b6e64) [0x6309c3af4e64]
         postgres: ads tabletest [local] CREATE TABLE(standard_ProcessUtility+0x259) [0x6309c3af33f9]
         postgres: ads tabletest [local] CREATE TABLE(+0x4b3d2f) [0x6309c3af1d2f]
         postgres: ads tabletest [local] CREATE TABLE(+0x4b3e4b) [0x6309c3af1e4b]
         postgres: ads tabletest [local] CREATE TABLE(PortalRun+0x16f) [0x6309c3af226f]
         postgres: ads tabletest [local] CREATE TABLE(+0x4b06cc) [0x6309c3aee6cc]
         postgres: ads tabletest [local] CREATE TABLE(PostgresMain+0xf67) [0x6309c3aefa87]
         postgres: ads tabletest [local] CREATE TABLE(+0x4accc5) [0x6309c3aeacc5]
         postgres: ads tabletest [local] CREATE TABLE(postmaster_child_launch+0x8f) [0x6309c3a5b95f]
         postgres: ads tabletest [local] CREATE TABLE(+0x421479) [0x6309c3a5f479]
         postgres: ads tabletest [local] CREATE TABLE(PostmasterMain+0xd71) [0x6309c3a61251]
         postgres: ads tabletest [local] CREATE TABLE(main+0x207) [0x6309c379efc7]
         /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca) [0x710c33a2a1ca]
         /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b) [0x710c33a2a28b]
         postgres: ads tabletest [local] CREATE TABLE(_start+0x25) [0x6309c379f595]
2024-12-12 22:22:55.307 CET [1496210] STATEMENT:  CREATE TABLE IF NOT EXISTS test_16718629 (id SERIAL PRIMARY KEY, d VARCHAR(200), e VARCHAR(200), f VARCHAR(200), i INTEGER, j INTEGER);


PostgreSQL Version is 17.2, compiled with debug symbols.

tabletest=# select version();
                                             version
--------------------------------------------------------------------------------------------------
 PostgreSQL 17.2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0, 64-bit
(1 row)


I'm not able to reproduce this for every DDL statement, but when I group 
about 50 of them together, it fails at some point.
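
For anyone who wants to try something similar, the batching described here could be sketched roughly as below. This is only an illustration, not the actual test harness: the column list is copied from the logged statement, while the helper names, the starting number, and the totals are made up. Note that each such statement creates more than one relation (the backtrace passes through AlterSequence for the SERIAL column), so the statistics hash table grows faster than the table count alone suggests.

```python
from typing import Iterator, List

# Column list copied from the failing statement in the log above.
COLUMNS = ("id SERIAL PRIMARY KEY, d VARCHAR(200), e VARCHAR(200), "
           "f VARCHAR(200), i INTEGER, j INTEGER")


def create_table_sql(n: int) -> str:
    """Return one CREATE TABLE statement named like the one in the log."""
    return f"CREATE TABLE IF NOT EXISTS test_{n} ({COLUMNS});"


def ddl_batches(start: int, total: int, batch_size: int = 50) -> Iterator[List[str]]:
    """Yield lists of up to batch_size statements (hypothetical helper)."""
    end = start + total
    for first in range(start, end, batch_size):
        last = min(first + batch_size, end)
        yield [create_table_sql(n) for n in range(first, last)]


if __name__ == "__main__":
    # Print only the first batch; piping the output into psql is left out here.
    for batch in ddl_batches(start=1, total=120):
        print("\n".join(batch))
        break
```

Each yielded list can be sent to the server as one group; the batch size of 50 is taken from the description above, everything else is an assumption.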


Regards,

-- 
                Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project


Re: Crash: invalid DSA memory alloc request

From
Cédric Villemain
Date:
On 12/12/2024 22:49, Matthias van de Meent wrote:
> On Thu, 12 Dec 2024 at 22:28, Andreas 'ads' Scherbaum <ads@pgug.de> wrote:
>>
>> Hello,
>>
>> I'm running a couple of large tests, and in this particular test I have
>> a few million tables more.
>>
>> At some point it fails, and I gathered the following trace:
>>
>>
>> 2024-12-12 22:22:55.307 CET [1496210] ERROR: invalid DSA memory alloc
>> request size 1073741824
>> 2024-12-12 22:22:55.307 CET [1496210] BACKTRACE:
>>           postgres: ads tabletest [local] CREATE TABLE(+0x15e570)
>> [0x6309c379c570]
>>           postgres: ads tabletest [local] CREATE
>> TABLE(dshash_find_or_insert+0x1a4) [0x6309c39882d4]
>>           postgres: ads tabletest [local] CREATE
>> TABLE(pgstat_get_entry_ref+0x440) [0x6309c3b0a530]
> It looks like the dshash table used in the pgstats system uses
> resize(), which only specifies DSA_ALLOC_ZERO, not DSA_ALLOC_HUGE,
> causing issues when the table grows larger than 1 GB.
>
> I expect that error to disappear when you replace the
> dsa_allocate0(...) call in dshash.c's resize function with
> dsa_allocate_extended(..., DSA_ALLOC_HUGE | DSA_ALLOC_ZERO) as
> attached, but haven't tested it due to a lack of database with
> millions of relations.

IIUC the table is doubled in size when it is more than 75% full, so we went from 
500MB to 1GB here, doubling the number of available buckets.
It's probably good up to a point, but the size limit is exceeded here by only 
1 byte, and 1GB-1 is hopefully more than enough room for pointers.
Would it be worth revisiting the logic so that the table grows less quickly 
once it is over 500MB (if that is at all possible given how buckets and 
partitions are managed)?
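
To make the arithmetic concrete, here is a small sketch. It assumes 8-byte bucket pointers (dsa_pointer on a 64-bit build) and a non-huge allocation limit of 0x3fffffff bytes (MaxAllocSize); the function names are invented for illustration and are not dshash.c code.

```python
MAX_ALLOC_SIZE = 0x3FFFFFFF   # assumed non-huge limit: 1 GB minus 1 byte
POINTER_SIZE = 8              # assumed sizeof(dsa_pointer) on a 64-bit build


def bucket_array_bytes(size_log2: int) -> int:
    """Bytes needed for an array of 2**size_log2 bucket pointers."""
    return POINTER_SIZE * (1 << size_log2)


def first_failing_size_log2() -> int:
    """Smallest size_log2 whose bucket array no longer fits under the limit."""
    size_log2 = 0
    while bucket_array_bytes(size_log2) <= MAX_ALLOC_SIZE:
        size_log2 += 1
    return size_log2


if __name__ == "__main__":
    k = first_failing_size_log2()
    # Show the last size that fits and the first one that does not.
    print(k, bucket_array_bytes(k - 1), bucket_array_bytes(k))
```

Under these assumptions the doubling from 2^26 to 2^27 buckets asks for exactly 1073741824 bytes, which is the size in the error message and one byte over the limit.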

There is this comment in 8c0d7bafad3, which introduced this "dshash":

There is a wide range of potential users for such a hash table, though 
it's very likely the interface will need to evolve as we come to 
understand the needs of different kinds of users. E.g support for 
iterators and incremental resizing is planned for later commits and the 
details of the callback signatures are likely to change.

I'm unsure whether iterators and incremental resizing ever made it in?

---
Cédric Villemain +33 6 20 30 22 52
https://www.Data-Bene.io
PostgreSQL Support, Expertise, Training, R&D




Re: Crash: invalid DSA memory alloc request

From
"Andreas 'ads' Scherbaum"
Date:

Hi Matthias,

On Thu, Dec 12, 2024 at 10:49 PM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
> On Thu, 12 Dec 2024 at 22:28, Andreas 'ads' Scherbaum <ads@pgug.de> wrote:
>>
>> Hello,
>>
>> I'm running a couple of large tests, and in this particular test I have
>> a few million tables more.
>>
>> At some point it fails, and I gathered the following trace:
>>
>>
>> 2024-12-12 22:22:55.307 CET [1496210] ERROR: invalid DSA memory alloc
>> request size 1073741824
>> 2024-12-12 22:22:55.307 CET [1496210] BACKTRACE:
>>          postgres: ads tabletest [local] CREATE TABLE(+0x15e570)
>> [0x6309c379c570]
>>          postgres: ads tabletest [local] CREATE
>> TABLE(dshash_find_or_insert+0x1a4) [0x6309c39882d4]
>>          postgres: ads tabletest [local] CREATE
>> TABLE(pgstat_get_entry_ref+0x440) [0x6309c3b0a530]
>
> It looks like the dshash table used in the pgstats system uses
> resize(), which only specifies DSA_ALLOC_ZERO, not DSA_ALLOC_HUGE,
> causing issues when the table grows larger than 1 GB.
>
> I expect that error to disappear when you replace the
> dsa_allocate0(...) call in dshash.c's resize function with
> dsa_allocate_extended(..., DSA_ALLOC_HUGE | DSA_ALLOC_ZERO) as
> attached, but haven't tested it due to a lack of database with
> millions of relations.

Can confirm that the crash no longer happens when applying your patch.

Was able to both continue the old and crashed test, as well as run a new test:

tabletest=# select count(*) from information_schema.tables;
  count  
----------
 20000211
(1 row)



Thanks,

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

Re: Crash: invalid DSA memory alloc request

From
Nathan Bossart
Date:
On Mon, Dec 16, 2024 at 08:00:00AM +0100, Andreas 'ads' Scherbaum wrote:
> Can confirm that the crash no longer happens when applying your patch.

The patch looks reasonable to me.  I'll commit it soon unless someone
objects.  I was surprised to learn that the DSA_ALLOC_HUGE flag is only
intended to catch faulty allocation requests [0].
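
As a rough model of what that sanity check amounts to, consider the sketch below. The flag values and limits are assumptions meant to mirror utils/dsa.h and utils/memutils.h on a 64-bit build, and the function names are hypothetical; this is not the PostgreSQL implementation, only the shape of the size test that produces the error in this thread.

```python
DSA_ALLOC_HUGE = 0x01   # assumed flag value
DSA_ALLOC_ZERO = 0x04   # assumed flag value

MAX_ALLOC_SIZE = 0x3FFFFFFF               # assumed MaxAllocSize: 1 GB minus 1 byte
MAX_ALLOC_HUGE_SIZE = (2**64 - 1) // 2    # assumed MaxAllocHugeSize (SIZE_MAX / 2)


def request_size_is_valid(size: int, flags: int) -> bool:
    """Apply the huge limit only when the caller asked for it explicitly."""
    limit = MAX_ALLOC_HUGE_SIZE if flags & DSA_ALLOC_HUGE else MAX_ALLOC_SIZE
    return 0 <= size <= limit


def check_request(size: int, flags: int) -> None:
    """Raise with the same wording as the server log when the size is rejected."""
    if not request_size_is_valid(size, flags):
        raise ValueError(f"invalid DSA memory alloc request size {size}")


if __name__ == "__main__":
    size = 1073741824  # the request size from the error message
    print(request_size_is_valid(size, DSA_ALLOC_ZERO))
    print(request_size_is_valid(size, DSA_ALLOC_HUGE | DSA_ALLOC_ZERO))
```

In this model the flag does not change how memory is obtained; it only tells the allocator that a request above 1 GB is intentional rather than the result of a bug.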

> Was able to both continue the old and crashed test, as well as run a new
> test:
> 
> tabletest=# select count(*) from information_schema.tables;
>   count
> ----------
>  20000211
> (1 row)

That's a lot of tables...

[0] https://postgr.es/m/28062.1487456862%40sss.pgh.pa.us

-- 
nathan



Re: Crash: invalid DSA memory alloc request

From
"Andreas 'ads' Scherbaum"
Date:

Hello,

On Mon, Dec 16, 2024 at 11:18 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> On Mon, Dec 16, 2024 at 08:00:00AM +0100, Andreas 'ads' Scherbaum wrote:
>> Can confirm that the crash no longer happens when applying your patch.
>
> The patch looks reasonable to me.  I'll commit it soon unless someone
> objects.  I was surprised to learn that the DSA_ALLOC_HUGE flag is only
> intended to catch faulty allocation requests [0].

Is there a way to test it, except by creating so many tables?
There might be more such problems.
I did run a few basic queries in the database, but that's far from a full test.
 
 
>> Was able to both continue the old and crashed test, as well as run a new
>> test:
>>
>> tabletest=# select count(*) from information_schema.tables;
>>   count
>> ----------
>>  20000211
>> (1 row)
>
> That's a lot of tables...

It started as a discussion and got me curious, and it's only about an order of
magnitude off from what I've seen in production.

So it's not unreasonable to find out when and where it breaks.


Thanks,

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

Re: Crash: invalid DSA memory alloc request

From
Andres Freund
Date:
Hi,

On 2024-12-17 16:50:45 +0900, Michael Paquier wrote:
> On Mon, Dec 16, 2024 at 04:18:26PM -0600, Nathan Bossart wrote:
> > On Mon, Dec 16, 2024 at 08:00:00AM +0100, Andreas 'ads' Scherbaum wrote:
> >> Can confirm that the crash no longer happens when applying your patch.
> > 
> > The patch looks reasonable to me.  I'll commit it soon unless someone
> > objects.  I was surprised to learn that the DSA_ALLOC_HUGE flag is only
> > intended to catch faulty allocation requests [0].
> 
> No objections.
> 
> Most likely this issue gets by a large degree easier to reach now that
> we can plug into the backend custom pgstats kinds.  If pgstats or an
> equivalent implementation uses pgstats, I don't think that we'll be
> able to live without lifting this limit (500k query entries are
> common, at 2kB each it would be enough to blow things), so using
> DSA_ALLOC_HUGE sounds good to me.  I don't see a huge point in
> backpatching, FWIW.

I don't see why we wouldn't want to backpatch? The number of objects here
isn't entirely unrealistic to reach with relations alone, and if you enable
e.g. function execution stats it can reasonably reach higher numbers more
quickly. And using DSA_ALLOC_HUGE in that place feels like a rather low risk
change?

Greetings,

Andres Freund



Re: Crash: invalid DSA memory alloc request

From
Nathan Bossart
Date:
On Tue, Dec 17, 2024 at 10:53:07AM -0500, Andres Freund wrote:
> On 2024-12-17 16:50:45 +0900, Michael Paquier wrote:
>> I don't see a huge point in backpatching, FWIW.
> 
> I don't see why we wouldn't want to backpatch? The number of objects here
> isn't entirely unrealistic to reach with relations alone, and if you enable
> e.g. function execution stats it can reasonably reach higher numbers more
> quickly. And using DSA_ALLOC_HUGE in that place feels like a rather low risk
> change?

Agreed, this feels low-risk enough to back-patch to at least v15, where
statistics were moved to shared memory.  But I don't see a strong reason to
avoid back-patching it to all supported versions, either.

-- 
nathan



Re: Crash: invalid DSA memory alloc request

From
Nathan Bossart
Date:
Committed.

-- 
nathan



Re: Crash: invalid DSA memory alloc request

From
Andres Freund
Date:
On 2024-12-17 15:32:06 -0600, Nathan Bossart wrote:
> Committed.

Thanks!



Re: Crash: invalid DSA memory alloc request

From
Andreas 'ads' Scherbaum
Date:
On 17/12/2024 22:32, Nathan Bossart wrote:
> Committed.
>

Thanks, I see you back-patched it all the way to 13.
I'll see how far back I can test this; it will take a while.


Regards,

-- 
                Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project