Thread: Statistics updates is delayed when using `commit and chain`
With a customer, we found out that when using `commit and chain`, statistics on the table were not updated. Here are the steps to reproduce (My customer saw this on Postgres 13, I confirmed it under current main version):
laetitia=# laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
n_tup_ins
-----------
17
(1 row)
laetitia=# begin;
BEGIN
laetitia=*# insert into test (value) values ('bla');
INSERT 0 1
laetitia=*# commit and chain;
COMMIT
laetitia=*# select n_tup_ins from pg_stat_all_tables where relname = 'test';
n_tup_ins
-----------
17
(1 row)
laetitia=*# commit;
COMMIT
laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
n_tup_ins
-----------
18
(1 row)
laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
n_tup_ins
-----------
17
(1 row)
Have a nice day,
Lætitia
On Fri, 09 Jul 2021 at 17:05, Lætitia Avrot <laetitia.avrot@gmail.com> wrote: > Hello, > > With a customer, we found out that when using `commit and chain`, statistics on the table were not updated. Here are thesteps to reproduce (My customer saw this on Postgres 13, I confirmed it under current main version): > > laetitia=# laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test'; > > n_tup_ins > > ----------- > > 17 > > (1 row) > > laetitia=# begin; > > BEGIN > > laetitia=*# insert into test (value) values ('bla'); > > INSERT 0 1 > > laetitia=*# commit and chain; > > COMMIT > > laetitia=*# select n_tup_ins from pg_stat_all_tables where relname = 'test'; > > n_tup_ins > > ----------- > > 17 > > (1 row) > > laetitia=*# commit; > > COMMIT > > laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test'; > > n_tup_ins > > ----------- > > 18 > > (1 row) > > Before issuing the last `commit`, I used another connection to check the value of the statistics from another transactionand it was not updated: > > laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test'; > > n_tup_ins > > ----------- > > 17 > > (1 row) > > Maybe it's not a bug and it's on purpose but I can't understand what would prevent the statistics collector from beingtriggered after a `commti and chain`. > After some analyze, I find the table statistics updated only when not within a transaction. If you use COMMIT AND CHAIN, we still in a transaction, so the statistics do not updated. See src/backend/tcop/postgres.c: else if (IsTransactionOrTransactionBlock()) <-------- After call COMMIT AND CHAIN, we come here. { set_ps_display("idle in transaction"); pgstat_report_activity(STATE_IDLEINTRANSACTION, NULL); /* Start the idle-in-transaction timer */ if (IdleInTransactionSessionTimeout > 0) { idle_in_transaction_timeout_enabled = true; enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT, IdleInTransactionSessionTimeout); } } else <-------- After call COMMIT, we come here. { /* Send out notify signals and transmit self-notifies */ ProcessCompletedNotifies(); /* * Also process incoming notifies, if any. This is mostly to * ensure stable behavior in tests: if any notifies were * received during the just-finished transaction, they'll be * seen by the client before ReadyForQuery is. */ if (notifyInterruptPending) ProcessNotifyInterrupt(); pgstat_report_stat(false); <-------- Update statistics. set_ps_display("idle"); pgstat_report_activity(STATE_IDLE, NULL); /* Start the idle-session timer */ if (IdleSessionTimeout > 0) { idle_session_timeout_enabled = true; enable_timeout_after(IDLE_SESSION_TIMEOUT, IdleSessionTimeout); } } -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
On Fri, 09 Jul 2021 at 19:02, Japin Li <japinli@hotmail.com> wrote: > On Fri, 09 Jul 2021 at 17:05, Lætitia Avrot <laetitia.avrot@gmail.com> wrote: >> Hello, >> >> With a customer, we found out that when using `commit and chain`, statistics on the table were not updated. Here are thesteps to reproduce (My customer saw this on Postgres 13, I confirmed it under current main version): >> >> laetitia=# laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test'; >> >> n_tup_ins >> >> ----------- >> >> 17 >> >> (1 row) >> >> laetitia=# begin; >> >> BEGIN >> >> laetitia=*# insert into test (value) values ('bla'); >> >> INSERT 0 1 >> >> laetitia=*# commit and chain; >> >> COMMIT >> >> laetitia=*# select n_tup_ins from pg_stat_all_tables where relname = 'test'; >> >> n_tup_ins >> >> ----------- >> >> 17 >> >> (1 row) >> >> laetitia=*# commit; >> >> COMMIT >> >> laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test'; >> >> n_tup_ins >> >> ----------- >> >> 18 >> >> (1 row) >> >> Before issuing the last `commit`, I used another connection to check the value of the statistics from another transactionand it was not updated: >> >> laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test'; >> >> n_tup_ins >> >> ----------- >> >> 17 >> >> (1 row) >> >> Maybe it's not a bug and it's on purpose but I can't understand what would prevent the statistics collector from beingtriggered after a `commti and chain`. >> > > After some analyze, I find the table statistics updated only when not within > a transaction. If you use COMMIT AND CHAIN, we still in a transaction, so the > statistics do not updated. > > See src/backend/tcop/postgres.c: > > else if (IsTransactionOrTransactionBlock()) <-------- After call COMMIT AND CHAIN, we come here. > { > set_ps_display("idle in transaction"); > pgstat_report_activity(STATE_IDLEINTRANSACTION, NULL); > > /* Start the idle-in-transaction timer */ > if (IdleInTransactionSessionTimeout > 0) > { > idle_in_transaction_timeout_enabled = true; > enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT, > IdleInTransactionSessionTimeout); > } > } > else <-------- After call COMMIT, we come here. > { > /* Send out notify signals and transmit self-notifies */ > ProcessCompletedNotifies(); > > /* > * Also process incoming notifies, if any. This is mostly to > * ensure stable behavior in tests: if any notifies were > * received during the just-finished transaction, they'll be > * seen by the client before ReadyForQuery is. > */ > if (notifyInterruptPending) > ProcessNotifyInterrupt(); > > pgstat_report_stat(false); <-------- Update statistics. > > set_ps_display("idle"); > pgstat_report_activity(STATE_IDLE, NULL); > > /* Start the idle-session timer */ > if (IdleSessionTimeout > 0) > { > idle_session_timeout_enabled = true; > enable_timeout_after(IDLE_SESSION_TIMEOUT, > IdleSessionTimeout); > } > } Attached fixes it by call pgstat_report_stat() when we a in COMMIT AND CHAIN mode. Any thoughts? -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Attachment
On Fri, 09 Jul 2021 at 20:25, John Naylor <john.naylor@enterprisedb.com> wrote: > On Fri, Jul 9, 2021 at 7:15 AM Japin Li <japinli@hotmail.com> wrote: >> > /* Send out notify signals and transmit self-notifies */ >> > ProcessCompletedNotifies(); >> > >> > /* >> > * Also process incoming notifies, if any. This is > mostly to >> > * ensure stable behavior in tests: if any notifies were >> > * received during the just-finished transaction, > they'll be >> > * seen by the client before ReadyForQuery is. >> > */ >> > if (notifyInterruptPending) >> > ProcessNotifyInterrupt(); > > It seems the above would also be skipped in chained transactions -- do we > need to handle notifies as well? > Thanks for your review! Modified. >> Attached fixes it by call pgstat_report_stat() when we a in COMMIT AND > CHAIN mode. >> Any thoughts? > > Do we need equivalent logic within the TBLOCK_SUBCOMMIT case also? Either > way, a comment is probably in order. Add a new function ProcessNotifiesAndStat() to process notifies and report statistics, then call this function in TBLOCK_SUBCOMMIT, TBLOCK_END, TBLOCK_ABORT_END, TBLOCK_ABORT_PENDING and TBLOCK_SUBCOMMIT cases. Please consider v2 patch to review. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Attachment
On Fri, 09 Jul 2021 at 20:25, John Naylor <john.naylor@enterprisedb.com> wrote:
> On Fri, Jul 9, 2021 at 7:15 AM Japin Li <japinli@hotmail.com> wrote:
>> > /* Send out notify signals and transmit self-notifies */
>> > ProcessCompletedNotifies();
>> >
>> > /*
>> > * Also process incoming notifies, if any. This is
> mostly to
>> > * ensure stable behavior in tests: if any notifies were
>> > * received during the just-finished transaction,
> they'll be
>> > * seen by the client before ReadyForQuery is.
>> > */
>> > if (notifyInterruptPending)
>> > ProcessNotifyInterrupt();
>
> It seems the above would also be skipped in chained transactions -- do we
> need to handle notifies as well?
>
Thanks for your review! Modified.
>> Attached fixes it by call pgstat_report_stat() when we a in COMMIT AND
> CHAIN mode.
>> Any thoughts?
>
> Do we need equivalent logic within the TBLOCK_SUBCOMMIT case also? Either
> way, a comment is probably in order.
Add a new function ProcessNotifiesAndStat() to process notifies and report
statistics, then call this function in TBLOCK_SUBCOMMIT, TBLOCK_END,
TBLOCK_ABORT_END, TBLOCK_ABORT_PENDING and TBLOCK_SUBCOMMIT cases.
Please consider v2 patch to review.
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
Thank you for the patch. I read it and I find myself with one question: why do we update statistics even though there was a rollback? I know that that was the behaviour before, but is it still worth it?
I tested it against Postgres 14 Beta 2 and Postgres 15 (commit hash 4c64b51dc51f8ff7e3e51eebfe8d10469a577df1) and it worked like a charm for my use case:
laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
n_tup_ins
-----------
1
(1 row)
laetitia=# begin;
BEGIN
laetitia=*# insert into test (value) values ('bla');
INSERT 0 1
laetitia=*# select n_tup_ins from pg_stat_all_tables where relname = 'test';
n_tup_ins
-----------
1
(1 row)
laetitia=*# commit and chain;
COMMIT
laetitia=*# select n_tup_ins from pg_stat_all_tables where relname = 'test';
n_tup_ins
-----------
2
(1 row)
laetitia=*# commit;
COMMIT
laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
n_tup_ins
-----------
2
(1 row)
Have a nice day,
Lætitia
On Mon, 12 Jul 2021 at 20:56, Lætitia Avrot <laetitia.avrot@gmail.com> wrote: > Hello Japin, > > Thank you for the patch. I read it and I find myself with one question: why do we update statistics even though there wasa rollback? I know that that was the behaviour before, but is it still worth it? > I test the following case, and found it also update the statistics even if we rollback. So I update the statistics here. On the other hands, the insert tuple rollbacked still in data block, so IMO we should update the statistics. I'm not sure it is right. postgres=# create table test (id serial, value text); CREATE TABLE postgres=# SELECT n_tup_ins FROM pg_stat_all_tables WHERE relname = 'test'; n_tup_ins ----------- 0 (1 row) postgres=# BEGIN; BEGIN postgres=*# INSERT INTO test (value) VALUES ('bla'); INSERT 0 1 postgres=*# SELECT n_tup_ins FROM pg_stat_all_tables WHERE relname = 'test'; n_tup_ins ----------- 0 (1 row) postgres=*# ABORT ; ROLLBACK postgres=# SELECT n_tup_ins FROM pg_stat_all_tables WHERE relname = 'test'; n_tup_ins ----------- 1 (1 row) -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
John Naylor <john.naylor@enterprisedb.com> writes: > I understand this functionality to be tracking what happens at the physical > level, in which case it is correct. In any case, the bug reported is clear > and changing that behavior is the focus here. About the patch: I do not think this is a bug at all. The behavior is, and always has been, that we report stats when we are about to wait for client input and are not inside a transaction. Likewise for NOTIFY. The proposed patch randomly changes that in a way that is very likely to break clients. Maybe you can persuade me that there's a reason to move the responsibility for stats reporting to some other place, but please keep your hands OFF of NOTIFY. You clearly haven't the faintest idea what the client contract for that is. regards, tom lane
On Mon, 12 Jul 2021 at 23:30, Tom Lane <tgl@sss.pgh.pa.us> wrote: > John Naylor <john.naylor@enterprisedb.com> writes: >> I understand this functionality to be tracking what happens at the physical >> level, in which case it is correct. In any case, the bug reported is clear >> and changing that behavior is the focus here. About the patch: > > I do not think this is a bug at all. The behavior is, and always has > been, that we report stats when we are about to wait for client input > and are not inside a transaction. Likewise for NOTIFY. The proposed > patch randomly changes that in a way that is very likely to break > clients. > Sorry, I'm not very clearly why we can not process NOTIFY when a transaction is end. For example: ------------------------------------------------------------------------------ Step 1. Session 1: postgres# LISTEN tn; LISTEN Step 2. Session 2: postgers# CREATE TABLE tbl (id int) CREATE TABLE postgres=# BEGIN; BEGIN postgres=*# INSERT INTO tbl values (1); INSERT 0 1 postgres=*# NOTIFY tn, 'hello'; NOTIFY Step 3. Session 1: postgres=# SELECT * FROM tbl; id ---- (0 rows) Step 4. Session 2: postgres=*# COMMIT AND CHAIN; COMMIT Step 5. Session 1: postgres=# SELECT * FROM tbl; id ---- 1 (1 row) Step 6. Session 2: postgres=*# COMMIT; COMMIT Step 7. Session 1: postgres=# SELECT * FROM tbl; id ---- 1 (1 row) Asynchronous notification "tn" with payload "hello" received from server process with PID 16348. ------------------------------------------------------------------------------ Since we commit the data in step 4, however the notify doesn't handle, I think this is a bit confused. Why we must wait session is in idle to handle the notify? > Maybe you can persuade me that there's a reason to move the responsibility > for stats reporting to some other place, but please keep your hands > OFF of NOTIFY. You clearly haven't the faintest idea what the client > contract for that is. > > regards, tom lane -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Japin Li <japinli@hotmail.com> writes: > On Mon, 12 Jul 2021 at 23:30, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I do not think this is a bug at all. The behavior is, and always has >> been, that we report stats when we are about to wait for client input >> and are not inside a transaction. Likewise for NOTIFY. The proposed >> patch randomly changes that in a way that is very likely to break >> clients. > Sorry, I'm not very clearly why we can not process NOTIFY when a transaction is > end. The protocol contract for that is that NOTIFY is delivered *between* transactions. With the proposed patch, the notification can be delivered in the middle of the new transaction created by COMMIT AND CHAIN. This creates semantic issues --- in particular, if that second transaction is later rolled back, does that mean we should re-send the notification? The NOTIFY man page describes this explicitly: ... Secondly, if a listening session receives a notification signal while it is within a transaction, the notification event will not be delivered to its connected client until just after the transaction is completed (either committed or aborted). Again, the reasoning is that if a notification were delivered within a transaction that was later aborted, one would want the notification to be undone somehow — but the server cannot “take back” a notification once it has sent it to the client. So notification events are only delivered between transactions. (protocol.sgml is a bit wishy-washy about this point, but I think we should lock the wording there down harder. We have never delivered notifies intra-transaction, and now is no time to start.) regards, tom lane
Japin Li <japinli@hotmail.com> writes:
> On Mon, 12 Jul 2021 at 23:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I do not think this is a bug at all. The behavior is, and always has
>> been, that we report stats when we are about to wait for client input
>> and are not inside a transaction. Likewise for NOTIFY. The proposed
>> patch randomly changes that in a way that is very likely to break
>> clients.
> Sorry, I'm not very clearly why we can not process NOTIFY when a transaction is
> end.
The protocol contract for that is that NOTIFY is delivered *between*
transactions.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Tue, Jul 13, 2021 at 8:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The protocol contract for that is that NOTIFY is delivered *between* >> transactions. > This is saying "We must not RECEIVE a notification unless the client is > prepared to handle it - idle and not within a transaction." Right. > So, the choice to delay releasing the notification upon committing the > first transaction of a chain is a design choice. Sure, you could argue that. But look at it from the client's side. How would it detangle its response to the notify event from whatever it's doing as part of the chained transaction? You don't want to risk the notify-response action failing because of a problem in the chained transaction, nor vice versa, because they are independent things. (Whatever the notify is about *did* commit, independently of what you are doing.) It's very hard to see how you could do that sanely, other than by delaying handling the notify event till after you're out of the chain. Another angle is that (I think) the proposed patch would allow notifies to be delivered during a commit that happens inside a procedure. At that point, the client doesn't even have control of the connection with which to do something in response to the notify. There might be value in the aspect of the patch that sends statistics upon commit/rollback. But NOTIFY is a *totally* different animal. The fact that the code happens to be physically adjacent in the current implementation doesn't mean that the semantic constraints are similar. regards, tom lane
But look at it from the client's side.
How would it detangle its response to the notify event from whatever
it's doing as part of the chained transaction?
We are already doing that now, though I argue we are in fact documenting the "other clients receive the notification as soon as committed - regardless of chaining" situation (the additional restrictions on receiving are then what cause the chain authoring client to wait).
On Wed, 14 Jul 2021 at 12:06, David G. Johnston <david.g.johnston@gmail.com> wrote: > On Tue, Jul 13, 2021 at 9:02 PM David G. Johnston < > david.g.johnston@gmail.com> wrote: > >> We are already doing that now, though I argue we are in fact documenting >> the "other clients receive the notification as soon as committed - >> regardless of chaining" situation (the additional restrictions on receiving >> are then what cause the chain authoring client to wait). >> >> > Specifically: > > Firstly, if a NOTIFY is executed inside a transaction, the notify events > are not delivered until and unless the transaction is committed. > > vs. > > So notification events are only delivered between transactions. > > I'll accept that "and-chain" means there is no "between transactions" > period. But there is no qualification to "until the transaction is > committed". > There are two cases: 1. Should we update the statistics when "and-chain" ends? 2. Should we deliver the notification when "and-chain" ends? For the first one, I think we should update the statistics when "and-chain" ends because it makes statistics more accurate. For the second one, maybe I misunderstand the "and-chain", tested it aggin and find the notifications will not deliver if the last transaction is aborted. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
On Wed, 14 Jul 2021 at 12:06, David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Tue, Jul 13, 2021 at 9:02 PM David G. Johnston <
> david.g.johnston@gmail.com> wrote:
>
>> We are already doing that now, though I argue we are in fact documenting
>> the "other clients receive the notification as soon as committed -
>> regardless of chaining" situation (the additional restrictions on receiving
>> are then what cause the chain authoring client to wait).
>>
>>
> Specifically:
>
> Firstly, if a NOTIFY is executed inside a transaction, the notify events
> are not delivered until and unless the transaction is committed.
>
> vs.
>
> So notification events are only delivered between transactions.
>
> I'll accept that "and-chain" means there is no "between transactions"
> period. But there is no qualification to "until the transaction is
> committed".
>
There are two cases:
1. Should we update the statistics when "and-chain" ends?
2. Should we deliver the notification when "and-chain" ends?
For the first one, I think we should update the statistics when "and-chain" ends
because it makes statistics more accurate. For the second one, maybe I misunderstand
the "and-chain", tested it aggin and find the notifications will not deliver if
the last transaction is aborted.
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
If the answer is "yes", then we need to update the statistics and notify before creating a new transaction because we need to completely end the previous transaction before starting a new one with the same characteristics.
If the answer is "no", then we contradict the SQL standard which states there's nothing special with a transaction created by `and chain`:
>If <commit statement> contains AND CHAIN, then an SQL-transaction is initiated. Any branch transactions of
> condition area limit as the corresponding branch of the SQL-transaction just termi- nated.
Lætitia
On Mon, 26 Jul 2021 at 23:53, Lætitia Avrot <laetitia.avrot@gmail.com> wrote: > Dear all, > > Le mer. 14 juil. 2021 à 08:54, Japin Li <japinli@hotmail.com> a écrit : > >> >> On Wed, 14 Jul 2021 at 12:06, David G. Johnston < >> david.g.johnston@gmail.com> wrote: >> > On Tue, Jul 13, 2021 at 9:02 PM David G. Johnston < >> > david.g.johnston@gmail.com> wrote: >> > >> >> We are already doing that now, though I argue we are in fact documenting >> >> the "other clients receive the notification as soon as committed - >> >> regardless of chaining" situation (the additional restrictions on >> receiving >> >> are then what cause the chain authoring client to wait). >> >> >> >> >> > Specifically: >> > >> > Firstly, if a NOTIFY is executed inside a transaction, the notify events >> > are not delivered until and unless the transaction is committed. >> > >> > vs. >> > >> > So notification events are only delivered between transactions. >> > >> > I'll accept that "and-chain" means there is no "between transactions" >> > period. But there is no qualification to "until the transaction is >> > committed". >> > >> >> There are two cases: >> >> 1. Should we update the statistics when "and-chain" ends? >> 2. Should we deliver the notification when "and-chain" ends? >> >> For the first one, I think we should update the statistics when >> "and-chain" ends >> because it makes statistics more accurate. For the second one, maybe I >> misunderstand >> the "and-chain", tested it aggin and find the notifications will not >> deliver if >> the last transaction is aborted. >> >> -- >> Regrads, >> Japin Li. >> ChengDu WenWu Information Technology Co.,Ltd. >> >> > It seems to me that the important question is : are chained transactions > normal transactions? > If the answer is "yes", then we need to update the statistics and notify > before creating a new transaction because we need to completely end the > previous transaction before starting a new one with the same > characteristics. > If the answer is "no", then we contradict the SQL standard which states > there's nothing special with a transaction created by `and chain`: > I didn't find a SQL standard about `and chain`. IMO, whether the answer is "yes" or "no", we should update the statistics. For notify, however, I'm not sure. >>If <commit statement> contains AND CHAIN, then an SQL-transaction is > initiated. Any branch transactions of >> the SQL-transaction are initiated with the same transaction access mode, > transaction isolation level, and >> condition area limit as the corresponding branch of the SQL-transaction > just termi- nated. > > and > >> If <rollback statement> contains AND CHAIN, then an SQL-transaction is > initiated. Any branch transactions of >> the SQL-transaction are initiated with the same transaction access mode, > transaction isolation level, and >> condition area limit as the corresponding branch of the SQL-transaction > just terminated. > > Have a nice day, > > Lætitia -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
I didn't find a SQL standard about `and chain`. IMO, whether the answer is
"yes" or "no", we should update the statistics. For notify, however, I'm not
sure.
Attachment
Laetitia Avrot <laetitia.avrot@gmail.com> writes: > [ v3-0001-update-table-stats-after-commit-and-chain.patch ] I took a quick look at this. If we're going to move the responsibility for pushing out stats to transaction commit, then I think what we want to do is put the call somewhere near the end of CommitTransaction. Maybe right after AtCommit_Notify, because that has the same spirit of sending out info about our having committed to other backends. (The pgstat_report_xact_timestamp call would need to be moved up to before that, I think, but that sure looks like it was inserted with the aid of a dartboard anyway. It certainly doesn't square with the immediately preceding comment about "purely internal" cleanup.) AbortTransaction and probably PrepareTransaction the same. Having done that, rather than adding more pgstat_report_stat calls we could get rid of most of them. I see a bunch of random calls right after various CommitTransactionCommand calls, which we'd not need anymore. We probably still want the one in PostgresMain, but it needs some rethinking, because we only want that one to do something when the timer fires saying that we've slept too long without sending stats. As for the NOTIFY business, I'm not impressed with putting duplicate boilerplate into half a dozen different places. I think the discussion in ref/notify.sgml covers this already, or if not, that's the one place to fix it. The place I actually thought could use more attention is the wire protocol spec in protocol.sgml. In any case, I'd rather see that as a separate patch, because it's unrelated. regards, tom lane
Laetitia Avrot <laetitia.avrot@gmail.com> writes:
> [ v3-0001-update-table-stats-after-commit-and-chain.patch ]
I took a quick look at this. If we're going to move the responsibility
for pushing out stats to transaction commit, then I think what we want
to do is put the call somewhere near the end of CommitTransaction.
Maybe right after AtCommit_Notify, because that has the same spirit
of sending out info about our having committed to other backends.
(The pgstat_report_xact_timestamp call would need to be moved up to
before that, I think, but that sure looks like it was inserted with
the aid of a dartboard anyway. It certainly doesn't square with the
immediately preceding comment about "purely internal" cleanup.)
AbortTransaction and probably PrepareTransaction the same.
Having done that, rather than adding more pgstat_report_stat calls
we could get rid of most of them. I see a bunch of random calls right
after various CommitTransactionCommand calls, which we'd not need anymore.
We probably still want the one in PostgresMain, but it needs some
rethinking, because we only want that one to do something when the timer
fires saying that we've slept too long without sending stats.
As for the NOTIFY business, I'm not impressed with putting duplicate
boilerplate into half a dozen different places. I think the discussion
in ref/notify.sgml covers this already, or if not, that's the one place
to fix it. The place I actually thought could use more attention is the
wire protocol spec in protocol.sgml. In any case, I'd rather see that
as a separate patch, because it's unrelated.
regards, tom lane
Hi, On 2022-05-14 17:11:33 +0200, Laetitia Avrot wrote: > This patch includes the update of the statistics in case of commit and > chain. Hm. > @@ -3006,6 +3006,12 @@ CommitTransactionCommand(void) > s->blockState = TBLOCK_DEFAULT; > if (s->chain) > { > + /* > + * Before starting the new transaction, we'll update the > + * statistics so that autovacuum can be triggered without > + * waiting for a `commit` or `rollback` without `and chain`. > + */ > + pgstat_report_stat(false); > StartTransaction(); > s->blockState = TBLOCK_INPROGRESS; > s->chain = false; > @@ -3032,6 +3038,12 @@ CommitTransactionCommand(void) > s->blockState = TBLOCK_DEFAULT; > if (s->chain) > { > + /* > + * Before starting the new transaction, we'll update the > + * statistics so that autovacuum can be triggered without > + * waiting for a `commit` or `rollback` without `and chain`. > + */ > + pgstat_report_stat(false); > StartTransaction(); > s->blockState = TBLOCK_INPROGRESS; > s->chain = false; > @@ -3050,6 +3062,12 @@ CommitTransactionCommand(void) > s->blockState = TBLOCK_DEFAULT; > if (s->chain) > { > + /* > + * Before starting the new transaction, we'll update the > + * statistics so that autovacuum can be triggered without > + * waiting for a `commit` or `rollback` without `and chain`. > + */ > + pgstat_report_stat(false); > StartTransaction(); > s->blockState = TBLOCK_INPROGRESS; > s->chain = false; > @@ -3117,6 +3135,12 @@ CommitTransactionCommand(void) > s->blockState = TBLOCK_DEFAULT; > if (s->chain) > { > + /* > + * Before starting the new transaction, we'll update the > + * statistics so that autovacuum can be triggered without > + * waiting for a `commit` or `rollback` without `and chain`. > + */ > + pgstat_report_stat(false); > StartTransaction(); > s->blockState = TBLOCK_INPROGRESS; > s->chain = false; I don't like copying the same comment etc into multiple places. Given the number of identical blocks for if (s->chain) blocks, perhaps we should just move it into a new helper? I'm a bit worried about triggering pgstat_report_stat() in new places where it previously couldn't be called. There's a few places in the relation stats code (predating the shared memory stats patch) that don't cope well with being called in the wrong state (see e.g. 0cf16cb8ca4), and I'm not sure how careful the procedures patch was with ensuring that there aren't any cases of relations being kept open across transactions or something like that. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > I don't like copying the same comment etc into multiple places. Given the > number of identical blocks for if (s->chain) blocks, perhaps we should just > move it into a new helper? What I suggested a few days ago is that it should be in CommitTransaction. regards, tom lane
Hi, On 2022-08-01 12:25:58 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > I don't like copying the same comment etc into multiple places. Given the > > number of identical blocks for if (s->chain) blocks, perhaps we should just > > move it into a new helper? > > What I suggested a few days ago is that it should be in CommitTransaction. I wonder how we'd best deal with the idle timer if we go for that. That only makes sense in some contexts (normal backends), but not others (everything else). Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2022-08-01 12:25:58 -0400, Tom Lane wrote: >> What I suggested a few days ago is that it should be in CommitTransaction. > I wonder how we'd best deal with the idle timer if we go for that. That only > makes sense in some contexts (normal backends), but not others (everything > else). Yeah. I think we'd need to get rid of the "bool force" argument of pgstat_report_stat, and instead have it manage things internally based on understanding whether the current process uses a reporting timeout timer or not (if not, always send the report right away). That seems like it'd be more robust than the current mechanism anyway, as we're currently relying on every call site to get that right. regards, tom lane
Hi, On 2022-08-01 12:43:23 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2022-08-01 12:25:58 -0400, Tom Lane wrote: > >> What I suggested a few days ago is that it should be in CommitTransaction. > > > I wonder how we'd best deal with the idle timer if we go for that. That only > > makes sense in some contexts (normal backends), but not others (everything > > else). > > Yeah. I think we'd need to get rid of the "bool force" argument > of pgstat_report_stat, and instead have it manage things internally > based on understanding whether the current process uses a reporting > timeout timer or not (if not, always send the report right away). > That seems like it'd be more robust than the current mechanism anyway, > as we're currently relying on every call site to get that right. It's not as simple as looking at the backend type, I think. We'd not want to enable a timer in the commit-and-chain context, even in a normal backend - there's no chance we'll go idle. I wonder if it was the wrong call to use timers for IdleSessionTimeout, IdleInTransactionSessionTimeout and pgstats. We always use nonblocking socket IO these days, so perhaps we should instead just compute a relevant timeout for the WaitEventSetWait() call? Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2022-08-01 12:43:23 -0400, Tom Lane wrote: >> Yeah. I think we'd need to get rid of the "bool force" argument >> of pgstat_report_stat, and instead have it manage things internally >> based on understanding whether the current process uses a reporting >> timeout timer or not (if not, always send the report right away). > It's not as simple as looking at the backend type, I think. We'd not want to > enable a timer in the commit-and-chain context, even in a normal backend - > there's no chance we'll go idle. No, but that's not the point. During CommitTransaction we should check to see if it's more than X amount of time since our last stats report, and if so send a new one. No timer interaction involved there. When we go idle, if not inside a transaction and a report needs to be sent, then compute the wait time till it should be sent and set a timer for that. > I wonder if it was the wrong call to use timers for IdleSessionTimeout, > IdleInTransactionSessionTimeout and pgstats. We always use nonblocking socket > IO these days, so perhaps we should instead just compute a relevant timeout > for the WaitEventSetWait() call? Meh. I think that that would end up with duplicative logic and duplicative gettimeofday calls, unless your idea is to get rid of the timeout.c facilities altogether. regards, tom lane
Hi, On 2022-08-01 13:22:35 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2022-08-01 12:43:23 -0400, Tom Lane wrote: > >> Yeah. I think we'd need to get rid of the "bool force" argument > >> of pgstat_report_stat, and instead have it manage things internally > >> based on understanding whether the current process uses a reporting > >> timeout timer or not (if not, always send the report right away). > > > It's not as simple as looking at the backend type, I think. We'd not want to > > enable a timer in the commit-and-chain context, even in a normal backend - > > there's no chance we'll go idle. > > No, but that's not the point. During CommitTransaction we should check > to see if it's more than X amount of time since our last stats report, > and if so send a new one. No timer interaction involved there. When > we go idle, if not inside a transaction and a report needs to be sent, > then compute the wait time till it should be sent and set a timer for > that. I don't quite see how that'd help us to get rid of the force argument - somehow pgstat_report_stat() needs to know that it it should "block" reporting stats. Which we e.g. want when the idle timer has elapsed. > > I wonder if it was the wrong call to use timers for IdleSessionTimeout, > > IdleInTransactionSessionTimeout and pgstats. We always use nonblocking socket > > IO these days, so perhaps we should instead just compute a relevant timeout > > for the WaitEventSetWait() call? > > Meh. I think that that would end up with duplicative logic and > duplicative gettimeofday calls, unless your idea is to get rid of > the timeout.c facilities altogether. I suspect it'd reduce the number of gettimeofday calls, but yes, the duplication wouldn't be nice. I guess there's a chance we could get rid of most of the timeout.c stuff, but it'd need changes in a fair number of places... Don't think it's worth doing atm. Greetings, Andres Freund