Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table. - Mailing list pgsql-bugs
From: Kasahara Tatsuhito
Subject: Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.
Msg-id: CAP0=ZVKdHgLGvFJHE_6x2PTrw50_FLqPUSdRNzVTKSb_y8naCQ@mail.gmail.com
In response to: Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table. (Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com>)
List: pgsql-bugs
Hi,

On Wed, Jan 6, 2021 at 11:13 AM Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote:
>
> Hi,
>
> On Mon, Dec 28, 2020 at 4:31 PM tomohiro hiramitsu <hiramit.tm@gmail.com> wrote:
> >
> > Hi
> >
> > > Oh meh. Yeah, I didn't think about the actual implementation of doing it :/
> > >
> > > That said, I'd really prefer to see *some* kind of counter that could
> > > get people noticing this a bit earlier, rather than just bugging them
> > > in the logfile once it's gone over a threshold. A statistics counter
> > > maybe, but I guess that'd have to be tracked at a per-table level to
> > > be really useful, which would make it potentially fairly expensive to
> > > keep around...
> >
> > I think the statistics on OID assignments per-table is a good idea, but how about
> > simply logging the event to notify users as the first step?
> > I think it's easy to implement and helpful for users.
> +1
> It is a simple and quick way for users to know if the OID assignment
> is taking a long time or not.
> At least it will be better than now.
>
> > First of all, I would like to provide a way for the user to know that it is taking
> > a long time to generate the OID, and provide an opportunity for the user to deal
> > with this problem with statement_timeout etc.
> > This is an interim measure, but I think it's better than the current state.
> >
> > Since it is complicated to handle WaitEvent and statistics for each table, I first
> > considered a patch that simply outputs logs, giving priority to avoiding a silent
> > hang state with no log output at all.
> >
> > * v01-0001-OID_log_output.patch
> > In this patch, GetNewOidWithIndex outputs a log when it loops too many times to
> > assign a new OID.
> > Also, if that log is output even once, another log line is output once the OID is
> > finally assigned.
> >
> > * This patch does not cause an error even if it loops too many times.
> > How much time can be tolerated in the OID search loop depends on the user, so I
> > think that the user should decide the time at which to raise an error with
> > statement_timeout.
> >
> > * The log output interval threshold increases by "*= 2" (exponential backoff) and
> > the maximum number of log outputs is set to 5.
> > The reason for this scheme is to reduce the number of log outputs and still output
> > the log before the statement is stopped by statement_timeout etc. I think the
> > appropriate value for this threshold is controversial.
> > Since it is not necessary to keep logging, the upper limit is set to 5 times.
> >
> > * The log output interval threshold is set to 1 million for the first log output.
> > In my environment, it takes about 4 seconds to loop 1 million times, and the 5th
> > log is output after about 2 minutes. I think this threshold is just right
> > considering statement_timeout, but it's controversial.
> >
> > Log output does not need to continue throughout OID assignment. I think the
> > first few and the last log are enough.
> >
> > I will add this patch to the commitfest. I look forward to your feedback about the patch.

I tested the patch.
The patch can be applied to the current HEAD (eb42110d952f8d1ad4049b8f2491e9dfba75ffed) and the regression test passed.

I actually tried to reproduce OID assignment conflicts using the following procedure.
================================================================================================
DROP TABLE IF EXISTS t1;

-- Create a table. Set TOAST_TUPLE_TARGET low to generate many TOAST records.
SELECT 'CREATE UNLOGGED TABLE t1 ( c1 int,' || string_agg(str, ',') || ') WITH (TOAST_TUPLE_TARGET = 128);'
FROM (SELECT ' c' || i || ' text' as str FROM generate_series(2,21)i )t;\gexec

-- Generate data and insert. This query generates 20 TOAST records per row.
-- So this results in 16 million TOAST rows (same as GETNEWOID_NOTIFICATION_LIMIT).
WITH r AS (SELECT string_agg(str, '') as d
           FROM (SELECT md5(clock_timestamp()::text) AS str FROM generate_series(1,4)t)s)
INSERT INTO t1 SELECT generate_series(1,800000),d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d FROM r;

-- Check min(chunk_id), to rewind the OID counter in the following test
SELECT 'SELECT min(chunk_id) FROM pg_toast.' || relname FROM pg_class
WHERE oid = (SELECT reltoastrelid FROM pg_class WHERE relname = 't1');\gexec

-- Check a normal (non-OID-conflicting) insert's duration
\timing
WITH r AS (SELECT string_agg(str, '') as d
           FROM (SELECT md5(clock_timestamp()::text) AS str FROM generate_series(1,4)t)s)
INSERT INTO t1 SELECT 1,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d FROM r;
INSERT 0 1
Time: 0.899 ms

-- Reset the OID counter to the value of min(chunk_id) by using pg_resetwal (-o)

-- Check the problematic (OID-conflicting) insert's duration
\timing
WITH r AS (SELECT string_agg(str, '') as d
           FROM (SELECT md5(clock_timestamp()::text) AS str FROM generate_series(1,4)t)s)
INSERT INTO t1 SELECT 1,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d FROM r;
INSERT 0 1
Time: 36399.478 ms (00:36.399)
================================================================================================

And I got the following log messages. (The STATEMENT section is long and has been cut.)
===================================================================================================
2021-02-18 19:04:59.055 JST [32153] LOG:  failed to assign new OID in relation "pg_toast_32018451" after 1000000 retries
2021-02-18 19:04:59.055 JST [32153] STATEMENT:  WITH r ...
2021-02-18 19:05:01.299 JST [32153] LOG:  failed to assign new OID in relation "pg_toast_32018451" after 2000000 retries
2021-02-18 19:05:01.299 JST [32153] STATEMENT:  WITH r ...
2021-02-18 19:05:05.802 JST [32153] LOG:  failed to assign new OID in relation "pg_toast_32018451" after 4000000 retries
2021-02-18 19:05:05.802 JST [32153] STATEMENT:  WITH r ...
2021-02-18 19:05:14.853 JST [32153] LOG:  failed to assign new OID in relation "pg_toast_32018451" after 8000000 retries
2021-02-18 19:05:14.853 JST [32153] STATEMENT:  WITH r ...
2021-02-18 19:05:33.185 JST [32153] LOG:  failed to assign new OID in relation "pg_toast_32018451" after 16000000 retries
2021-02-18 19:05:33.185 JST [32153] STATEMENT:  WITH r ...
2021-02-18 19:05:33.185 JST [32153] LOG:  the new OID has been assigned in relation "pg_toast_32018451" after 16000021 retries
2021-02-18 19:05:33.185 JST [32153] STATEMENT:  WITH r ...
===================================================================================================

It works as expected, and I think the message is useful.

Here are some comments about the code.

+#define GETNEWOID_NOTIFICATION_INTERVAL_FACTOR 1000000

It seems to be more of a minimum value for conflict detection rather than an interval factor.
So, it would be better to use another name, such as GETNEWOID_NOTIFICATION_MINVAL or others.

+        /* retry count and notification limit check */
+        if (retry_count == next_notify && next_notify <= GETNEWOID_NOTIFICATION_LIMIT)
+        {
+            ereport(LOG,
+                    (errmsg("failed to assign new OID in relation \"%s\" after " UINT64_FORMAT " retries",
+                            RelationGetRelationName(relation), retry_count)));
+            next_notify *= 2;    /* double it for the next notification */
+        }
+        retry_count++;

I think that we should add more comments about why we output this LOG message.

Best regards,

> >
> > By the way
> > How about distributing OIDs so that the GetNewOidWithIndex function doesn't take
> > long to find an available OID?
> > For example, skip one and assign an OID.
> >
> > As a side effect of this method, the OID progresses faster.
> > (Maybe the btree index will be fragmented faster)
> I think the idea of assigning OIDs sparsely is interesting.
> However, I think we need to find out how much it would affect performance.
> It could be a non-negligible overhead, especially for large amounts of
> data insertion/updating.
>
> Best regards,
>
> > It may be better to include skips only for TOAST chunk_id, as this method may
> > have other side effects.
> >
> > regards.
> > --
> > Tomohiro Hiramitsu
> > NTT Open Source Software Center
> >
> > On Mon, Dec 21, 2020 at 16:44 Magnus Hagander <magnus@hagander.net> wrote:
> >>
> >> On Wed, Nov 18, 2020 at 8:27 PM Andres Freund <andres@anarazel.de> wrote:
> >> >
> >> > Hi,
> >> >
> >> > On 2020-11-18 11:57:05 +0100, Magnus Hagander wrote:
> >> > > This definitely sounds like something that's worth putting out as a
> >> > > wait event. Even before you start traversing millions of OIDs it might
> >> > > gradually start to show up, and being able to monitor that would
> >> > > definitely be useful.
> >> >
> >> > I don't think this is likely to work well as a wait event. All the index
> >> > traversals etc will do IO, acquire locks, etc, which will overwrite the
> >> > wait event and reset it to nothing once done.
> >>
> >> Oh meh. Yeah, I didn't think about the actual implementation of doing it :/
> >>
> >> That said, I'd really prefer to see *some* kind of counter that could
> >> get people noticing this a bit earlier, rather than just bugging them
> >> in the logfile once it's gone over a threshold. A statistics counter
> >> maybe, but I guess that'd have to be tracked at a per-table level to
> >> be really useful, which would make it potentially fairly expensive to
> >> keep around...
> >>
> >> --
> >> Magnus Hagander
> >> Me: https://www.hagander.net/
> >> Work: https://www.redpill-linpro.com/
>
> --
> Tatsuhito Kasahara
> kasahara.tatsuhito _at_ gmail.com

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com