Re: [PATCH] Infinite loop while acquiring new TOAST Oid - Mailing list pgsql-hackers

From Nikita Malakhov
Subject Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Date
Msg-id CAN-LCVMaJVmS4a7QkobRWk0ja5mmpjV6_oMX=CC86QXLGJvJjA@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Infinite loop while acquiring new TOAST Oid  (Andres Freund <andres@anarazel.de>)
Responses Re: [PATCH] Infinite loop while acquiring new TOAST Oid
List pgsql-hackers
Hi!

We've already encountered this issue on large production databases, and
4 billion rows is not so much for modern bases, so this issue already arises
from time to time and would arise more and more often. I agree that global
oid counter is the main issue, and better solution would be local counters
with larger datatype for value id. This is the right way to solve this issue,
although it would take some time. As I understand, global counter was taken
because it looked the fastest way of getting unique ID.
Ok, I'll prepare a patch with it.

>Due to that we end up assigning oids that conflict with existing
>toast oids much sooner than 4 billion toasted datums.

Just a note: global oid is checked for related TOAST table only, so equal oids
in different TOAST tables would not collide.

>Eventually we should do the obvious thing and make toast ids 64bit wide - to
>combat the space usage we likely should switch to representing the ids as
>variable width integers or such, otherwise the space increase would likely be
>prohibitive.

I'm already working on it, but I thought that 64-bit value ID won't be easily
accepted by community. I'd be very thankful for any advice on this.

On Mon, Nov 28, 2022 at 11:36 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2022-11-28 18:34:20 +0300, Nikita Malakhov wrote:
> While working on Pluggable TOAST we've detected a defective behavior
> on tables with large amounts of TOASTed data - queries freeze and DB stalls.
> Further investigation led us to the loop with GetNewOidWithIndex function
> call - when all available Oid already exist in the related TOAST table this
> loop continues infinitely. Data type used for value ID is the UINT32, which
> is
> unsigned int and has a maximum value of *4294967295* which allows
> maximum 4294967295 records in the TOAST table. It is not a very big amount
> for modern databases and is the major problem for productive systems.

I don't think the absolute number is the main issue - by default external
toasting will happen only for bigger datums. 4 billion external datums
typically use a lot of space.

If you hit this easily with your patch, then you likely broke the conditions
under which external toasting happens.

IMO the big issue is the global oid counter making it much easier to hit oid
wraparound. Due to that we end up assigning oids that conflict with existing
toast oids much sooner than 4 billion toasted datums.

I think the first step to improve the situation is to not use a global oid
counter for toasted values. One way to do that would be to use the sequence
code to do oid assignment, but we likely can find a more efficient
representation.

Eventually we should do the obvious thing and make toast ids 64bit wide - to
combat the space usage we likely should switch to representing the ids as
variable width integers or such, otherwise the space increase would likely be
prohibitive.


> Quick fix for this problem is limiting GetNewOidWithIndex loops to some
> reasonable amount defined by related macro and returning error if there is
> still no available Oid. Please check attached patch, any feedback is
> appreciated.

This feels like the wrong spot to tackle the issue. For one, most of the
looping will be in GetNewOidWithIndex(), so limiting looping in
toast_save_datum() won't help much. For another, if the limiting were in the
right place, it'd break currently working cases. Due to oid wraparound it's
pretty easy to hit "ranges" of allocated oids, without even getting close to
2^32 toasted datums.

Greetings,

Andres Freund


--
Regards,
Nikita Malakhov
Postgres Professional 

pgsql-hackers by date:

Previous
From: Mark Dilger
Date:
Subject: Re: fixing CREATEROLE
Next
From: Bruce Momjian
Date:
Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication