Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table. - Mailing list pgsql-bugs
From | Kasahara Tatsuhito |
---|---|
Subject | Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table. |
Date | |
Msg-id | CAP0=ZVLue4DBqV5cSD63JbXryV7+UKXdukfoLyEOpw+2EHRFCg@mail.gmail.com Whole thread Raw |
In response to | Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table. (tomohiro hiramitsu <hiramit.tm@gmail.com>) |
Responses |
Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.
Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table. |
List | pgsql-bugs |
Hi, On Mon, Dec 28, 2020 at 4:31 PM tomohiro hiramitsu <hiramit.tm@gmail.com> wrote: > > Hi > > > Oh meh. Yeah, I didn't think about the actual implementation of doing it :/ > > > > That said, I'd really prefer to see *some* kind of counter that could > > get people noticing this a bit earlier, rather than just bugging them > > in the logfile once it's gone over a threshold. A statistics counter > > maybe, but I guess that'd have to be tracked at a per-table level to > > be really useful, which would make it potentially fairly expensive to > > keep around... > > I think the statistics on OID assignments per-table is a good idea, but how about simply logging the event to notify usersas the first step? > I think it's easy to implement and helpful for users. +1 It is a simple and quick way for users to know if the OID assignment is taking a long time or not. At least it will be better than now. > First of all, I would like to provide a way for the user to know that it is taking a long time to generate the OID, andprovide an opportunity for the user to deal with this problem with statement_timeout etc. > This is an interim measure, but I think it's better than it is. > > Since it is complicated to handle WaitEvent and statistics for each table, I first considered a patch that simply outputslogs, giving priority to avoiding a hang state without outputting logs. > > * v01-0001-OID_log_output.patch > In this patch, GetNewOidWithIndex outputs a log when it loops too many times to assign a new OID. > Also, if the log is output even once, it will be output to the log when the OID is assigned. > > * This patch does not cause an error even if it loops too many times. > How much time can be tolerated in the OID search loop depends on the user, so I think that the user should decidethe time to make an error with statement_timeout. > > * The log output interval threshold increases by "* = 2"(exponential backoff) and the maximum number of log outputsis set to 5. > The reason for setting this threshold is to reduce the number of log outputs and output the log before stopping atstatement_timeout etc. I think the appropriate value for this threshold is controversial. > Since it is not necessary to keep logging, the upper limit is set to 5 times. > > * The log output interval threshold is set to 1 million for the first log output. > In my environment, it takes about 4 seconds to loop 1 million times, and the 5th log is output after about 2 minutes.Ithink this threshold is just right considering statements_timeout, but it's controversial. > > Log output does not need to continue output during OID assignment. I think the first few and the last log are enough. > > I will add this patch to the commitfest. I look forward to your feedback about the patch. > > > By the way > How about distributing OIDs so that the GetNewOidWithIndex function doesn't take long to find an available OID? > For example, skip one and assign an OID. > > As a side effect of this method, the OID progresses faster. > (Maybe the btree index will be fragmented faster) I think the idea of assigning OIDs to sparse is interesting. However, I think it needs to find out how much it will affect the performance. It could be a non-negligible overhead, especially for large amounts of data insertion/updating. Best regards, > It may be better to include skips only for TOAST chunk_id, as this method may have other side effects. > > regards. > -- > Tomohiro Hiramitsu > NTT Open Source Software Center > > 2020年12月21日(月) 16:44 Magnus Hagander <magnus@hagander.net>: >> >> On Wed, Nov 18, 2020 at 8:27 PM Andres Freund <andres@anarazel.de> wrote: >> > >> > Hi, >> > >> > On 2020-11-18 11:57:05 +0100, Magnus Hagander wrote: >> > > This definitely sounds like something that's worth putting out as a >> > > wait event. Even before you start traversing millions of OIDs it might >> > > gradually start to show up, and being able to monitor that would >> > > definitely be useful. >> > >> > I don't think this is likely to work well as a wait event. All the index >> > traversals etc will do IO, acquire locks, etc, which will overwrite the >> > wait event and reset it to nothing once done. >> >> Oh meh. Yeah, I didn't think about the actual implementation of doing it :/ >> >> That said, I'd really prefer to see *some* kind of counter that could >> get people noticing this a bit earlier, rather than just bugging them >> in the logfile once it's gone over a threshold. A statistics counter >> maybe, but I guess that'd have to be tracked at a per-table level to >> be really useful, which would make it potentially fairly expensive to >> keep around... >> >> -- >> Magnus Hagander >> Me: https://www.hagander.net/ >> Work: https://www.redpill-linpro.com/ >> >> >> >> -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
pgsql-bugs by date: