Re: "unexpected duplicate for tablespace" problem in logical replication - Mailing list pgsql-bugs

From Andres Freund
Subject Re: "unexpected duplicate for tablespace" problem in logical replication
Date
Msg-id ot6apoeofiko6ienwx5b3aavsbru7yf76bjk5hdqj2cj53n5lm@hdwcrgek5j5y
Whole thread Raw
In response to Re: "unexpected duplicate for tablespace" problem in logical replication  (Andres Freund <andres@anarazel.de>)
Responses Re: "unexpected duplicate for tablespace" problem in logical replication
List pgsql-bugs
Hi,

On 2025-09-18 08:17:49 -0400, Andres Freund wrote:
> On 2025-09-18 17:37:10 +0530, Ashutosh Bapat wrote:
> > From 6a3562b4ac8917c8b577797e5468416a90cc04f5 Mon Sep 17 00:00:00 2001
> > From: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
> > Date: Thu, 18 Sep 2025 17:24:09 +0530
> > Subject: [PATCH] Negative RelfilenumberMap cache entries from
> >  pg_filenode_relation()
> >
> > RelidByRelfilenumber() adds negative entries to the cache. It has three
> > users, logical replication, autoprewarm and pg_filenode_relation(). The
> > first two need negative entries in the cache in case they happen to
> > lookup non-existent mapping again and again.  However such mappings will
> > be smaller in number and usually come from some database object e.g. WAL
> > or autoprewarm metadata.
> >
> > But pg_filenode_relation(), which is SQL callable, may be invoked many
> > times with invalid tablespace and relfilenode pairs, causing the cache
> > to be bloated with negative cache entries. This can be used as a denial
> > of service attack since any user can execute it. This commit avoids such
> > a bloat.
>
> I don't really understand why this is worth fixing for the relfilenode stuff
> specifically - isn't this true for just about *all* of our caches? Many, if
> not most, can be reached via SQL?

Example:

postgres[315631][1]=# SELECT count(*), sum(total_bytes) total_bytes, sum(total_nblocks) total_nblocks, sum(free_bytes)
free_bytes,sum(free_chunks) free_chunks, sum(used_bytes) used_bytes FROM pg_backend_memory_contexts WHERE path @>
(SELECTpath FROM pg_backend_memory_contexts WHERE name = 'CacheMemoryContext');
 
┌───────┬─────────────┬───────────────┬────────────┬─────────────┬────────────┐
│ count │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
├───────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
│    89 │      747200 │           187 │     130336 │         216 │     616864 │
└───────┴─────────────┴───────────────┴────────────┴─────────────┴────────────┘
(1 row)

Time: 1.540 ms
postgres[315631][1]=# SELECT to_regclass(g.i::text||'.'||g.i::text) is NULL, count(*) FROM generate_series(1, 10000000)
g(i)GROUP BY 1;
 
┌──────────┬──────────┐
│ ?column? │  count   │
├──────────┼──────────┤
│ t        │ 10000000 │
└──────────┴──────────┘
(1 row)

Time: 23238.662 ms (00:23.239)
postgres[315631][1]=# SELECT count(*), sum(total_bytes) total_bytes, sum(total_nblocks) total_nblocks, sum(free_bytes)
free_bytes,sum(free_chunks) free_chunks, sum(used_bytes) used_bytes FROM pg_backend_memory_contexts WHERE path @>
(SELECTpath FROM pg_backend_memory_contexts WHERE name = 'CacheMemoryContext');
 
┌───────┬─────────────┬───────────────┬────────────┬─────────────┬────────────┐
│ count │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
├───────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
│    89 │  2223205104 │           441 │    8331448 │         472 │ 2214873656 │
└───────┴─────────────┴───────────────┴────────────┴─────────────┴────────────┘
(1 row)


There are so so so many ways to cause arbitrarily large memory usage,
e.g. just by definining a lot of prepared statements, that I just don't see it
worth adding any complexity to the relfilenode case specifically.

Greetings,

Andres Freund



pgsql-bugs by date:

Previous
From: Andres Freund
Date:
Subject: Re: "unexpected duplicate for tablespace" problem in logical replication
Next
From: Ashutosh Bapat
Date:
Subject: Re: "unexpected duplicate for tablespace" problem in logical replication