Thread: [HACKERS] [PATCH] Make sure all statistics is sent after a few DML areperformed
[HACKERS] [PATCH] Make sure all statistics is sent after a few DML areperformed
From
Yugo Nagata
Date:
Hi, After a DML is perfomed, the statistics is sent to pgstat by pgsent_report_sent(). However, the mininum time between stas file update is PGSTAT_STAT_INTERVAL, so if a few DMLs are performed with short interval, some statistics could not be sent until the backend is shutdown. This is not a problem in usual cases, but in case that a session is remained in idle for a long time, for example when using connection pooling, statistics of a huge change of a table is not sent for a long time, and as a result, starting autovacuum might be delayed. An idea to resolve this is call pgsent_report_sent() again with a delay after the last DML to make sure to send the statistics. The attached patch implements this. Any comments would be appreciated. Regards, -- Yugo Nagata <nagata@sraoss.co.jp> -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] [PATCH] Make sure all statistics is sent after a fewDML are performed
From
Andres Freund
Date:
Hi, On 2017-07-18 21:49:27 +0900, Yugo Nagata wrote: > After a DML is perfomed, the statistics is sent to pgstat by pgsent_report_sent(). > However, the mininum time between stas file update is PGSTAT_STAT_INTERVAL, so if > a few DMLs are performed with short interval, some statistics could not be sent > until the backend is shutdown. > > This is not a problem in usual cases, but in case that a session is remained in > idle for a long time, for example when using connection pooling, statistics of > a huge change of a table is not sent for a long time, and as a result, starting > autovacuum might be delayed. Hm. > An idea to resolve this is call pgsent_report_sent() again with a delay > after the last DML to make sure to send the statistics. The attached patch > implements this. That seems like it'd add a good number of new wakeups, or at least scheduling of wakeups. Now a timer would be armed whenever a query is run outside of a transaction - something happening quite regularly. And it'd do so even if a) there are no outstanding stat reports b) there's already a timer running. Additionally it'd, unless I mis-skimmed, trigger pgstat_report_stat() from within CHECK_FOR_INTERRUPTS/ProcessInterrupts even when inside a transaction, if that transaction is started after the enable_timeout_after(). I'm not entirely sure what a good solution would be. One option would be to have a regular wakeup setting a flag (considerably longer than 500ms, that's way too many additional wakeups), rearm the timer in the handler, but only process the flag when outside of a transaction. Or we could do nothing - I actually think that's a viable option. - Andres
Re: [HACKERS] [PATCH] Make sure all statistics is sent after a few DML are performed
From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes: > That seems like it'd add a good number of new wakeups, or at least > scheduling of wakeups. Yes, as it stands this will result in a huge increase in alarm-scheduling kernel call traffic. I understand the issue but I do not think this is an acceptable path to a fix. > Or we could do nothing - I actually think that's a viable option. I tend to agree. I'm not really sure that the presented problem is a big deal: for it to be an issue, you have to assume that a DML operation that takes less than PGSTAT_STAT_INTERVAL is capable of causing enough table churn that it's a problem if autovacuum doesn't hear about that churn promptly. I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL. I don't think that value has been reconsidered since the code was written, circa turn of the century. Maybe even make it configurable, though that could be overkill. regards, tom lane
Re: [HACKERS] [PATCH] Make sure all statistics is sent after a fewDML are performed
From
Andres Freund
Date:
Hi, On 2017-07-18 09:42:31 -0400, Tom Lane wrote: > I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL. > I don't think that value has been reconsidered since the code was written, > circa turn of the century. Maybe even make it configurable, though that > could be overkill. Not sure if that really does that much to solve the concern. Another, pretty half-baked, approach would be to add a procsignal triggering idle backends to send stats, and send that to all idle backends when querying stats. We could even publish the number of outstanding stats updates in PGXACT or such, without any locking, and send it only to those that have outstanding ones. Greetings, Andres Freund
Re: [HACKERS] [PATCH] Make sure all statistics is sent after a few DML are performed
From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes: > On 2017-07-18 09:42:31 -0400, Tom Lane wrote: >> I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL. > Not sure if that really does that much to solve the concern. Well, it reduces the amount of data churn that a statement shorter than PGSTAT_STAT_INTERVAL could cause. > Another, > pretty half-baked, approach would be to add a procsignal triggering idle > backends to send stats, and send that to all idle backends when querying > stats. We could even publish the number of outstanding stats updates in > PGXACT or such, without any locking, and send it only to those that have > outstanding ones. If somebody wanted to do the work, that'd be a viable answer IMO. You'd really want to not wake backends that have nothing more to send, but I agree that it'd be possible to advertise that in shared memory. regards, tom lane
Re: [HACKERS] [PATCH] Make sure all statistics is sent after a fewDML are performed
From
Yugo Nagata
Date:
On Tue, 18 Jul 2017 10:10:49 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote: Thank you for your comments. I understand the problem of my proposal patch. > Andres Freund <andres@anarazel.de> writes: > > On 2017-07-18 09:42:31 -0400, Tom Lane wrote: > >> I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL. > > > Not sure if that really does that much to solve the concern. > > Well, it reduces the amount of data churn that a statement shorter than > PGSTAT_STAT_INTERVAL could cause. > > > Another, > > pretty half-baked, approach would be to add a procsignal triggering idle > > backends to send stats, and send that to all idle backends when querying > > stats. We could even publish the number of outstanding stats updates in > > PGXACT or such, without any locking, and send it only to those that have > > outstanding ones. > > If somebody wanted to do the work, that'd be a viable answer IMO. You'd > really want to not wake backends that have nothing more to send, but > I agree that it'd be possible to advertise that in shared memory. > > regards, tom lane -- Yugo Nagata <nagata@sraoss.co.jp>
Re: [HACKERS] [PATCH] Make sure all statistics is sent after a fewDML are performed
From
Andres Freund
Date:
Hi, (please don't top-reply on this list) On 2017-07-19 14:04:39 +0900, Yugo Nagata wrote: > On Tue, 18 Jul 2017 10:10:49 -0400 > Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Thank you for your comments. I understand the problem of my proposal > patch. Does that mean you're trying to rewrite it in the way that was suggested: > > > Another, > > > pretty half-baked, approach would be to add a procsignal triggering idle > > > backends to send stats, and send that to all idle backends when querying > > > stats. We could even publish the number of outstanding stats updates in > > > PGXACT or such, without any locking, and send it only to those that have > > > outstanding ones. > > > > If somebody wanted to do the work, that'd be a viable answer IMO. You'd > > really want to not wake backends that have nothing more to send, but > > I agree that it'd be possible to advertise that in shared memory. or are you planning to just let the issue leave be? - Andres
Re: [HACKERS] [PATCH] Make sure all statistics is sent after a fewDML are performed
From
Yugo Nagata
Date:
On Fri, 21 Jul 2017 04:58:47 -0700 Andres Freund <andres@anarazel.de> wrote: > Hi, > > (please don't top-reply on this list) > > On 2017-07-19 14:04:39 +0900, Yugo Nagata wrote: > > On Tue, 18 Jul 2017 10:10:49 -0400 > > Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Thank you for your comments. I understand the problem of my proposal > > patch. > > Does that mean you're trying to rewrite it in the way that was > suggested: Not yet, but I'll try to do it. > > > > > Another, > > > > pretty half-baked, approach would be to add a procsignal triggering idle > > > > backends to send stats, and send that to all idle backends when querying > > > > stats. We could even publish the number of outstanding stats updates in > > > > PGXACT or such, without any locking, and send it only to those that have > > > > outstanding ones. > > > > > > If somebody wanted to do the work, that'd be a viable answer IMO. You'd > > > really want to not wake backends that have nothing more to send, but > > > I agree that it'd be possible to advertise that in shared memory. > > or are you planning to just let the issue leave be? > > - Andres -- Yugo Nagata <nagata@sraoss.co.jp>