Thread: Transaction commits VS Transaction commits (with parallel) VS querymean time
Transaction commits VS Transaction commits (with parallel) VS querymean time
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
Hi Hackers,Does increase in Transaction commits per second means good query performance?Why I asked this question is, many monitoring tools display that number of transactionsper second in the dashboard (including pgadmin).During the testing of bunch of queries with different set of configurations, I observed thatTPS of some particular configuration has increased compared to default server configuration, but the overall query execution performance is decreased after comparing all queries run time.This is because of larger xact_commit value than default configuration. With the changed server configuration, that leads to generate more parallel workers and every parallel worker operation is treated as an extra commit, because of this reason, the total number of commits increased, but the overall query performance is decreased.Is there any relation of transaction commits to performance?Is there any specific reason to consider the parallel worker activity also as a transaction commit? Especially in my observation, if we didn't consider the parallel worker activity as separate commits, the test doesn't show an increase in transaction commits.
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Fri, Feb 8, 2019 at 6:55 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > > On Thu, Feb 7, 2019 at 9:31 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote: >> >> >> This is because of larger xact_commit value than default configuration. With the changed server configuration, that leadsto generate more parallel workers and every parallel worker operation is treated as an extra commit, because of thisreason, the total number of commits increased, but the overall query performance is decreased. >> >> Is there any relation of transaction commits to performance? >> >> Is there any specific reason to consider the parallel worker activity also as a transaction commit? Especially in my observation,if we didn't consider the parallel worker activity as separate commits, the test doesn't show an increase intransaction commits. > > > The following statements shows the increase in the xact_commit value with > parallel workers. I can understand that workers updating the seq_scan stats > as they performed the seq scan. > Yeah, that seems okay, however, one can say that for the scan they want to consider it as a single scan even if part of the scan is accomplished by workers or may be a separate counter for parallel workers scan. > Is the same applied to parallel worker transaction > commits also? > I don't think so. It seems to me that we should consider it as a single transaction. Do you want to do the leg work for this and try to come up with a patch? On a quick look, I think we might want to change AtEOXact_PgStat so that the commits for parallel workers are not considered. I think the caller has that information. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Fri, Feb 8, 2019 at 6:55 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Thu, Feb 7, 2019 at 9:31 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>>
>>
>> This is because of larger xact_commit value than default configuration. With the changed server configuration, that leads to generate more parallel workers and every parallel worker operation is treated as an extra commit, because of this reason, the total number of commits increased, but the overall query performance is decreased.
>>
>> Is there any relation of transaction commits to performance?
>>
>> Is there any specific reason to consider the parallel worker activity also as a transaction commit? Especially in my observation, if we didn't consider the parallel worker activity as separate commits, the test doesn't show an increase in transaction commits.
>
>
> The following statements shows the increase in the xact_commit value with
> parallel workers. I can understand that workers updating the seq_scan stats
> as they performed the seq scan.
>
Yeah, that seems okay, however, one can say that for the scan they
want to consider it as a single scan even if part of the scan is
accomplished by workers or may be a separate counter for parallel
workers scan.
> Is the same applied to parallel worker transaction
> commits also?
>
I don't think so. It seems to me that we should consider it as a
single transaction. Do you want to do the leg work for this and try
to come up with a patch? On a quick look, I think we might want to
change AtEOXact_PgStat so that the commits for parallel workers are
not considered. I think the caller has that information.
Attachment
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
The patch looks good to me. However, I wonder what are the other scenarios where xact_commit is incremented because even if I commit a single transaction with your patch applied the increment in xact_commit is > 1. As you mentioned upthread, need to check what internal operation resulted in the increase. But the increase in xact_commit is surely lesser with the patch.
postgres=# BEGIN;
BEGIN
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
xact_commit
-------------
158
(1 row)
postgres=# explain analyze select * from foo where i = 1000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..136417.85 rows=1 width=37) (actual time=4.596..3792.710 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on foo (cost=0.00..135417.75 rows=1 width=37) (actual time=2448.038..3706.009 rows=0 loops=3)
Filter: (i = 1000)
Rows Removed by Filter: 3333333
Planning Time: 0.353 ms
Execution Time: 3793.572 ms
(8 rows)
postgres=# commit;
COMMIT
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
xact_commit
-------------
161
(1 row)
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
RE: Transaction commits VS Transaction commits (with parallel) VSquery mean time
Hi Hari-san,
On Sunday, February 10, 2019 2:25 PM (GMT+9), Haribabu Kommi wrote:
> I try to fix it by adding a check for parallel worker or not and based on it
> count them into stats. Patch attached.
>
> With this patch, currently it doesn't count parallel worker transactions, and
> rest of the stats like seqscan and etc are still get counted. IMO they still
> may need to be counted as those stats represent the number of tuples
> returned and etc.
>
> Comments?
I took a look at your patch, and it’s pretty straightforward.
However, currently the patch does not apply, so I reattached an updated one
to keep the CFbot happy.
The previous patch also had a missing header to detect parallel workers
so I added that. --> #include "access/parallel.h"
Regards,
Kirk Jamison
Attachment
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
Hi Haribabu,The latest patch fails while applying header files part. Kindly rebase.
The patch looks good to me. However, I wonder what are the other scenarios where xact_commit is incremented because even if I commit a single transaction with your patch applied the increment in xact_commit is > 1. As you mentioned upthread, need to check what internal operation resulted in the increase. But the increase in xact_commit is surely lesser with the patch.
postgres=# BEGIN;
BEGIN
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
xact_commit
-------------
158
(1 row)
postgres=# explain analyze select * from foo where i = 1000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..136417.85 rows=1 width=37) (actual time=4.596..3792.710 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on foo (cost=0.00..135417.75 rows=1 width=37) (actual time=2448.038..3706.009 rows=0 loops=3)
Filter: (i = 1000)
Rows Removed by Filter: 3333333
Planning Time: 0.353 ms
Execution Time: 3793.572 ms
(8 rows)
postgres=# commit;
COMMIT
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
xact_commit
-------------
161
(1 row)
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
Hi Hari-san,
On Sunday, February 10, 2019 2:25 PM (GMT+9), Haribabu Kommi wrote:
> I try to fix it by adding a check for parallel worker or not and based on it
> count them into stats. Patch attached.
>
> With this patch, currently it doesn't count parallel worker transactions, and
> rest of the stats like seqscan and etc are still get counted. IMO they still
> may need to be counted as those stats represent the number of tuples
> returned and etc.
>
> Comments?
I took a look at your patch, and it’s pretty straightforward.
However, currently the patch does not apply, so I reattached an updated one
to keep the CFbot happy.
The previous patch also had a missing header to detect parallel workers
so I added that. --> #include "access/parallel.h"
RE: Transaction commits VS Transaction commits (with parallel) VSquery mean time
I tried to confirm the patch with the following configuration:
max_parallel_workers_per_gather = 2
autovacuum = off
----
postgres=# BEGIN;
BEGIN
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
xact_commit
-------------
118
(1 row)
postgres=# explain analyze select * from tab where b = 6;
QUERY PLAN
---------------------------------------------------------------------------------------
------------------------------------
Gather (cost=1000.00..102331.58 rows=50000 width=8) (actual time=0.984..1666.932 rows
=99815 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on tab (cost=0.00..96331.58 rows=20833 width=8) (actual time=
0.130..1587.463 rows=33272 loops=3)
Filter: (b = 6)
Rows Removed by Filter: 3300062
Planning Time: 0.146 ms
Execution Time: 1682.155 ms
(8 rows)
postgres=# COMMIT;
COMMIT
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
xact_commit
-------------
119
(1 row)
---------
[Conclusion]
I have confirmed that with the patch (with autovacuum=off,
max_parallel_workers_per_gather = 2), the increment is only 1.
Without the patch, I have also confirmed that
the increment in xact_commit > 1.
It seems Amit also confirmed that this should be the case,
so the patch works as intended.
>> Is the same applied to parallel worker transaction
>> commits also?
>
> I don't think so. It seems to me that we should consider it as a
> single transaction.
However, I cannot answer if the consideration of parallel worker activity
as a single xact_commit relates to good performance.
But ISTM, with this improvement we have a more accurate stats for xact_commit.
Regards,
Kirk Jamison
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Sun, Feb 10, 2019 at 10:54 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > On Sat, Feb 9, 2019 at 4:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> I don't think so. It seems to me that we should consider it as a >> single transaction. Do you want to do the leg work for this and try >> to come up with a patch? On a quick look, I think we might want to >> change AtEOXact_PgStat so that the commits for parallel workers are >> not considered. I think the caller has that information. > > > I try to fix it by adding a check for parallel worker or not and based on it > count them into stats. Patch attached. > @@ -2057,14 +2058,18 @@ AtEOXact_PgStat(bool isCommit) { .. + /* Don't count parallel worker transactions into stats */ + if (!IsParallelWorker()) + { + /* + * Count transaction commit or abort. (We use counters, not just bools, + * in case the reporting message isn't sent right away.) + */ + if (isCommit) + pgStatXactCommit++; + else + pgStatXactRollback++; + } .. } I wonder why you haven't used the 'is_parallel_worker' flag from the caller, see CommitTransaction/AbortTransaction? The difference is that if we use that then it will just avoid counting transaction for the parallel work (StartParallelWorkerTransaction), otherwise, it might miss the count of any other transaction we started in the parallel worker for some intermediate work (for example, the transaction we started to restore library and guc state). I think it boils down to whether we want to avoid any transaction that is started by a parallel worker or just the transaction which is shared among leader and worker. It seems to me that currently, we do count all the internal transactions (started with StartTransactionCommand and CommitTransactionCommand) in this counter, so we should just try to avoid the transaction for which state is shared between leader and workers. Anyone else has any opinion on this matter? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Sun, Feb 10, 2019 at 10:54 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> On Sat, Feb 9, 2019 at 4:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> I don't think so. It seems to me that we should consider it as a
>> single transaction. Do you want to do the leg work for this and try
>> to come up with a patch? On a quick look, I think we might want to
>> change AtEOXact_PgStat so that the commits for parallel workers are
>> not considered. I think the caller has that information.
>
>
> I try to fix it by adding a check for parallel worker or not and based on it
> count them into stats. Patch attached.
>
@@ -2057,14 +2058,18 @@ AtEOXact_PgStat(bool isCommit)
{
..
+ /* Don't count parallel worker transactions into stats */
+ if (!IsParallelWorker())
+ {
+ /*
+ * Count transaction commit or abort. (We use counters, not just bools,
+ * in case the reporting message isn't sent right away.)
+ */
+ if (isCommit)
+ pgStatXactCommit++;
+ else
+ pgStatXactRollback++;
+ }
..
}
I wonder why you haven't used the 'is_parallel_worker' flag from the
caller, see CommitTransaction/AbortTransaction? The difference is
that if we use that then it will just avoid counting transaction for
the parallel work (StartParallelWorkerTransaction), otherwise, it
might miss the count of any other transaction we started in the
parallel worker for some intermediate work (for example, the
transaction we started to restore library and guc state).
I think it boils down to whether we want to avoid any transaction that
is started by a parallel worker or just the transaction which is
shared among leader and worker. It seems to me that currently, we do
count all the internal transactions (started with
StartTransactionCommand and CommitTransactionCommand) in this counter,
so we should just try to avoid the transaction for which state is
shared between leader and workers.
Anyone else has any opinion on this matter?
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Thu, Mar 21, 2019 at 2:18 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > With Inclusion of parallel worker transactions, the TPS will be a higher number, > thus user may find it as better throughput. This is the case with one of our tool. > The tool changes the configuration randomly to find out the best configuration > for the server based on a set of workload, during our test, with some configurations, > the TPS is so good, but the overall performance is slow as the system is having > less resources to keep up with that configuration. > > Opinions? Well, I think that might be a sign that the data isn't being used correctly. I don't have a strong position on what the "right" thing to do here is, but I think if you want to know how many client transactions are being executed, you should count them on the client side, as pgbench does. I agree that it's a little funny to count the parallel worker commit as a separate transaction, since in a certain sense it is part of the same transaction. But if you do that, then as already noted you have to next decide what to do about other transactions that parallel workers use internally. And then you have to decide what to do about other background transactions. And there's not really one "right" answer to any of these questions, I don't think. You might want to count different things depending on how the information is going to be used. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On 2019-Mar-21, Robert Haas wrote: > I don't have a strong position on what the "right" thing > to do here is, but I think if you want to know how many client > transactions are being executed, you should count them on the client > side, as pgbench does. I think counting on the client side is untenable in practical terms. They may not even know what clients there are, or may not have the source code to add such a feature even if they know how. Plus they would have to aggregate data coming from dozens of different systems? I don't think this argument has any wings. OTOH the reason the server offers stats is so that the user can know what the server activity is, not to display useless internal state. If a user disables an internal option (such as parallel query) and their monitoring system suddenly starts showing half as many transactions as before, they're legitimaly going to think that the server is broken. Such stats are pointless. The use case for those stats seems fairly clear to me: display numbers in a monitoring system. You seem to be saying that we're just not going to help developers of monitoring systems, and that users have to feed them on their own. > I agree that it's a little funny to count the parallel worker commit > as a separate transaction, since in a certain sense it is part of the > same transaction. Right. So you don't count it. This seems crystal clear. Doing the other thing is outright wrong, there's no doubt about that. > But if you do that, then as already noted you have to next decide what > to do about other transactions that parallel workers use internally. You don't count those ones either. > And then you have to decide what to do about other background > transactions. Not count them if they're implementation details; otherwise count them. For example, IMO autovacuum transactions should definitely be counted (as one transaction, even if they run parallel vacuum). -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Fri, Mar 22, 2019 at 12:34 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > And then you have to decide what to do about other background > > transactions. > > Not count them if they're implementation details; otherwise count them. > For example, IMO autovacuum transactions should definitely be counted > (as one transaction, even if they run parallel vacuum). Hmm, interesting. autovacuum isn't an implementation detail? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Fri, Mar 22, 2019 at 10:04 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > On 2019-Mar-21, Robert Haas wrote: > > > I agree that it's a little funny to count the parallel worker commit > > as a separate transaction, since in a certain sense it is part of the > > same transaction. > > Right. So you don't count it. This seems crystal clear. Doing the > other thing is outright wrong, there's no doubt about that. > Agreed, this is a clear problem and we can just fix this and move ahead, but it is better if we spend some more time to see if we can come up with something better. We might want to just fix this and then deal with the second part of the problem separately along with some other similar cases. > > But if you do that, then as already noted you have to next decide what > > to do about other transactions that parallel workers use internally. > > You don't count those ones either. > Yeah, we can do that but it is not as clear as the previous one. The reason is that we do similarly start/commit transaction for various operations like autovacuum, cluster, drop index concurrently, etc. So, it doesn't sound good to me to just change this for parallel query and leave others as it is. > > And then you have to decide what to do about other background > > transactions. > > Not count them if they're implementation details; otherwise count them. > For example, IMO autovacuum transactions should definitely be counted > (as one transaction, even if they run parallel vacuum). > It appears to me that the definition of what we want to display in xact_commit/xact_rollback (for pg_stat_database view) is slightly vague. For ex. does it mean that we will show only transactions started by the user or does it also includes the other transactions started internally (which you call implementation detail) to perform the various operations? I think users would be more interested in the transactions initiated by them. I think some users might also be interested in the write transactions happened in the system, basically, those have consumed xid. I think we should decide what we want to do for all other internal transactions that are started before fixing the second part of this problem ("other transactions that parallel workers use internally"). Thanks, Robert and Alvaro to chime in as this needs some discussion to decide what behavior we want to provide to users. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On 2019-Mar-23, Amit Kapila wrote: > On Fri, Mar 22, 2019 at 10:04 AM Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > Not count them if they're implementation details; otherwise count them. > > For example, IMO autovacuum transactions should definitely be counted > > (as one transaction, even if they run parallel vacuum). > > It appears to me that the definition of what we want to display in > xact_commit/xact_rollback (for pg_stat_database view) is slightly > vague. For ex. does it mean that we will show only transactions > started by the user or does it also includes the other transactions > started internally (which you call implementation detail) to perform > the various operations? I think users would be more interested in the > transactions initiated by them. Yes, you're probably right. > I think some users might also be interested in the write transactions > happened in the system, basically, those have consumed xid. Well, do they really want to *count* these transactions, or is it enough to keep an eye on the "age" of some XID column? Other than for XID freezing purposes, I don't see such internal transactions as very interesting. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Sat, Mar 23, 2019 at 9:50 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > On 2019-Mar-23, Amit Kapila wrote: > > > I think some users might also be interested in the write transactions > > happened in the system, basically, those have consumed xid. > > Well, do they really want to *count* these transactions, or is it enough > to keep an eye on the "age" of some XID column? Other than for XID > freezing purposes, I don't see such internal transactions as very > interesting. > That's what I also had in mind. I think doing anything more than just fixing the count for the parallel cooperating transaction by parallel workers doesn't seem intuitive to me. I mean if we want we can commit the fix such that all supporting transactions by parallel worker shouldn't be counted, but I am not able to convince myself that that is the good fix. Instead, I think rather than fixing that one case we should think more broadly about all the supportive transactions happening in the various operations. Also, as that is a kind of behavior change, we should discuss that as a separate topic. I know what I am proposing here won't completely fix the problem Hari is facing, but I am not sure what else we can do here which doesn't create some form of inconsistency with other parts of the system. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Sat, Mar 23, 2019 at 9:50 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2019-Mar-23, Amit Kapila wrote:
>
> > I think some users might also be interested in the write transactions
> > happened in the system, basically, those have consumed xid.
>
> Well, do they really want to *count* these transactions, or is it enough
> to keep an eye on the "age" of some XID column? Other than for XID
> freezing purposes, I don't see such internal transactions as very
> interesting.
>
That's what I also had in mind. I think doing anything more than just
fixing the count for the parallel cooperating transaction by parallel
workers doesn't seem intuitive to me. I mean if we want we can commit
the fix such that all supporting transactions by parallel worker
shouldn't be counted, but I am not able to convince myself that that
is the good fix. Instead, I think rather than fixing that one case we
should think more broadly about all the supportive transactions
happening in the various operations. Also, as that is a kind of
behavior change, we should discuss that as a separate topic.
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Mon, Mar 25, 2019 at 6:55 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > > On Sat, Mar 23, 2019 at 11:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> On Sat, Mar 23, 2019 at 9:50 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> > >> > On 2019-Mar-23, Amit Kapila wrote: >> > >> > > I think some users might also be interested in the write transactions >> > > happened in the system, basically, those have consumed xid. >> > >> > Well, do they really want to *count* these transactions, or is it enough >> > to keep an eye on the "age" of some XID column? Other than for XID >> > freezing purposes, I don't see such internal transactions as very >> > interesting. >> > >> >> That's what I also had in mind. I think doing anything more than just >> fixing the count for the parallel cooperating transaction by parallel >> workers doesn't seem intuitive to me. I mean if we want we can commit >> the fix such that all supporting transactions by parallel worker >> shouldn't be counted, but I am not able to convince myself that that >> is the good fix. Instead, I think rather than fixing that one case we >> should think more broadly about all the supportive transactions >> happening in the various operations. Also, as that is a kind of >> behavior change, we should discuss that as a separate topic. > > > Thanks to everyone for their opinions and suggestions to improve. > > Without parallel workers, there aren't many internal implementation > logic code that causes the stats to bloat. Parallel worker stats are not > just the transactions, it update other stats also, for eg; seqscan stats > of a relation. I also eagree that just fixing it for transactions may not > be a proper solution. > > Using of XID data may not give proper TPS of the instance as it doesn't > involve the read only transactions information. > > How about adding additional two columns that provides all the internal > and background worker transactions into that column? > I can see the case where the users will be interested in application initiated transactions, so if we want to add new columns, it could be to display that information and the current columns can keep displaying the overall transactions in the database. Here, I think we need to find a way to distinguish between internal and user-initiated transactions. OTOH, I am not sure adding new columns is better than changing the meaning of existing columns. We can go either way based on what others feel good, but I think we can do that as a separate head-only feature. As part of this thread, maybe we can just fix the case of the parallel cooperating transaction. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Wed, Mar 27, 2019 at 6:53 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > On Tue, Mar 26, 2019 at 9:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> As part of this thread, maybe we can >> just fix the case of the parallel cooperating transaction. > > > With the current patch, all the parallel implementation transaction are getting > skipped, in my tests parallel workers are the major factor in the transaction > stats counter. Even before parallelism, the stats of the autovacuum and etc > are still present but their contribution is not greatly influencing the stats. > > I agree with you in fixing the stats with parallel workers and improve stats. > I was proposing to fix only the transaction started with StartParallelWorkerTransaction by using is_parallel_worker flag as discussed above. I understand that it won't completely fix the problem reported by you, but it will be a step in that direction. My main worry is that if we fix it the way you are proposing and we also invent a new way to deal with all other internal transactions, then the fix done by us now needs to be changed/reverted. Note, that this fix needs to be backpatched as well, so we should avoid doing any fix which needs to be changed or reverted. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Wed, Mar 27, 2019 at 6:53 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> On Tue, Mar 26, 2019 at 9:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> As part of this thread, maybe we can
>> just fix the case of the parallel cooperating transaction.
>
>
> With the current patch, all the parallel implementation transaction are getting
> skipped, in my tests parallel workers are the major factor in the transaction
> stats counter. Even before parallelism, the stats of the autovacuum and etc
> are still present but their contribution is not greatly influencing the stats.
>
> I agree with you in fixing the stats with parallel workers and improve stats.
>
I was proposing to fix only the transaction started with
StartParallelWorkerTransaction by using is_parallel_worker flag as
discussed above. I understand that it won't completely fix the
problem reported by you, but it will be a step in that direction. My
main worry is that if we fix it the way you are proposing and we also
invent a new way to deal with all other internal transactions, then
the fix done by us now needs to be changed/reverted. Note, that this
fix needs to be backpatched as well, so we should avoid doing any fix
which needs to be changed or reverted.
Attachment
RE: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Thursday, March 28, 2019 3:13 PM (GMT+9), Haribabu Kommi wrote:
> I tried the approach as your suggested as by not counting the actual parallel work
> transactions by just releasing the resources without touching the counters,
> the counts are not reduced much.
>
> HEAD - With 4 parallel workers running query generates 13 stats ( 4 * 3 + 1)
> Patch - With 4 parallel workers running query generates 9 stats ( 4 * 2 + 1)
> Old approach patch - With 4 parallel workers running query generates 1 stat (1)
>
> Currently the parallel worker start transaction 3 times in the following places.
> 1. InitPostgres
> 2. ParallelWorkerMain (2)
>
> with the attached patch, we reduce one count from ParallelWorkerMain.
I'm sorry for the late review of the patch.
Patch applies, compiles cleanly, make check passes too.
I tested the recent patch again using the same method above
and confirmed the increase of generated stats by 9 w/ 4 parallel workers.
postgres=# BEGIN;
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
xact_commit
-------------
60
(1 row)
postgres=# explain analyze select * from tab where b = 6;
[snipped]
postgres=# COMMIT;
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
xact_commit
-------------
69
The intention of the latest patch is to fix the stat of
(IOW, do not count) parallel cooperating transactions,
or the transactions started by StartParallelWorkerTransaction,
based from the advice of Amit.
After testing, the current patch works as intended.
However, also currently being discussed in the latter mails
is the behavior of how parallel xact stats should be shown.
So the goal eventually is to improve how we present the
stats for parallel workers by differentiating between the
internal-initiated (bgworker xact, autovacuum, etc) and
user/application-initiated transactions, apart from keeping
the overall xact stats.
..So fixing the latter one will change this current thread's fix?
I personally have no problems with committing this fix first
before fixing the latter problem discussed above.
But I think there should be a consensus regarding that one.
Regards,
Kirk Jamison
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Thu, Mar 28, 2019 at 11:43 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > > On Wed, Mar 27, 2019 at 11:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> On Wed, Mar 27, 2019 at 6:53 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote: >> > On Tue, Mar 26, 2019 at 9:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> >> >> As part of this thread, maybe we can >> >> just fix the case of the parallel cooperating transaction. >> > >> > >> > With the current patch, all the parallel implementation transaction are getting >> > skipped, in my tests parallel workers are the major factor in the transaction >> > stats counter. Even before parallelism, the stats of the autovacuum and etc >> > are still present but their contribution is not greatly influencing the stats. >> > >> > I agree with you in fixing the stats with parallel workers and improve stats. >> > >> >> I was proposing to fix only the transaction started with >> StartParallelWorkerTransaction by using is_parallel_worker flag as >> discussed above. I understand that it won't completely fix the >> problem reported by you, but it will be a step in that direction. My >> main worry is that if we fix it the way you are proposing and we also >> invent a new way to deal with all other internal transactions, then >> the fix done by us now needs to be changed/reverted. Note, that this >> fix needs to be backpatched as well, so we should avoid doing any fix >> which needs to be changed or reverted. > > > I tried the approach as your suggested as by not counting the actual parallel work > transactions by just releasing the resources without touching the counters, > the counts are not reduced much. > > HEAD - With 4 parallel workers running query generates 13 stats ( 4 * 3 + 1) > Patch - With 4 parallel workers running query generates 9 stats ( 4 * 2 + 1) > Old approach patch - With 4 parallel workers running query generates 1 stat (1) > > Currently the parallel worker start transaction 3 times in the following places. > 1. InitPostgres > 2. ParallelWorkerMain (2) > > with the attached patch, we reduce one count from ParallelWorkerMain. > Right, that is expected from this fix. BTW, why you have changed the approach in this patch to not count anything by the parallel worker as compared to the earlier version where you were just ignoring the stats for transactions. As of now, either way is fine, but in future (after parallel inserts/updates), we want other stats to be updated. I think your previous idea was better, just you need to start using is_parallel_worker flag. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Thu, Mar 28, 2019 at 11:43 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
> On Wed, Mar 27, 2019 at 11:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Wed, Mar 27, 2019 at 6:53 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>> > On Tue, Mar 26, 2019 at 9:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >>
>> >> As part of this thread, maybe we can
>> >> just fix the case of the parallel cooperating transaction.
>> >
>> >
>> > With the current patch, all the parallel implementation transaction are getting
>> > skipped, in my tests parallel workers are the major factor in the transaction
>> > stats counter. Even before parallelism, the stats of the autovacuum and etc
>> > are still present but their contribution is not greatly influencing the stats.
>> >
>> > I agree with you in fixing the stats with parallel workers and improve stats.
>> >
>>
>> I was proposing to fix only the transaction started with
>> StartParallelWorkerTransaction by using is_parallel_worker flag as
>> discussed above. I understand that it won't completely fix the
>> problem reported by you, but it will be a step in that direction. My
>> main worry is that if we fix it the way you are proposing and we also
>> invent a new way to deal with all other internal transactions, then
>> the fix done by us now needs to be changed/reverted. Note, that this
>> fix needs to be backpatched as well, so we should avoid doing any fix
>> which needs to be changed or reverted.
>
>
> I tried the approach as your suggested as by not counting the actual parallel work
> transactions by just releasing the resources without touching the counters,
> the counts are not reduced much.
>
> HEAD - With 4 parallel workers running query generates 13 stats ( 4 * 3 + 1)
> Patch - With 4 parallel workers running query generates 9 stats ( 4 * 2 + 1)
> Old approach patch - With 4 parallel workers running query generates 1 stat (1)
>
> Currently the parallel worker start transaction 3 times in the following places.
> 1. InitPostgres
> 2. ParallelWorkerMain (2)
>
> with the attached patch, we reduce one count from ParallelWorkerMain.
>
Right, that is expected from this fix. BTW, why you have changed the
approach in this patch to not count anything by the parallel worker as
compared to the earlier version where you were just ignoring the stats
for transactions. As of now, either way is fine, but in future (after
parallel inserts/updates), we want other stats to be updated. I think
your previous idea was better, just you need to start using
is_parallel_worker flag.
Attachment
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Wed, Apr 3, 2019 at 10:45 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > > Thanks for the review. > > While changing the approach to use the is_parallel_worker_flag, I thought > that the rest of the stats are also not required to be updated and also those > are any way write operations and those values are zero anyway for parallel > workers. > > Instead of expanding the patch scope, I just changed to avoid the commit > or rollback stats as discussed, and later we can target the handling of all the > internal transactions and their corresponding stats. > The patch looks good to me. I have changed the commit message and ran the pgindent in the attached patch. Can you once see if that looks fine to you? Also, we should backpatch this till 9.6. So, can you once verify if the change is fine in all bank branches? Also, test with a force_parallel_mode option. I have already tested it with force_parallel_mode = 'regress' in HEAD, please test it in back branches as well. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Wed, Apr 3, 2019 at 10:45 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> Thanks for the review.
>
> While changing the approach to use the is_parallel_worker_flag, I thought
> that the rest of the stats are also not required to be updated and also those
> are any way write operations and those values are zero anyway for parallel
> workers.
>
> Instead of expanding the patch scope, I just changed to avoid the commit
> or rollback stats as discussed, and later we can target the handling of all the
> internal transactions and their corresponding stats.
>
The patch looks good to me. I have changed the commit message and ran
the pgindent in the attached patch. Can you once see if that looks
fine to you? Also, we should backpatch this till 9.6. So, can you
once verify if the change is fine in all bank branches? Also, test
with a force_parallel_mode option. I have already tested it with
force_parallel_mode = 'regress' in HEAD, please test it in back
branches as well.
Attachment
RE: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Monday, April 8, 2019 9:04 AM (GMT+9), Haribabu Kommi wrote:
>On Thu, Apr 4, 2019 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>On Wed, Apr 3, 2019 at 10:45 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>>
>> Thanks for the review.
>>
>> While changing the approach to use the is_parallel_worker_flag, I thought
>> that the rest of the stats are also not required to be updated and also those
>> are any way write operations and those values are zero anyway for parallel
>> workers.
>>
>> Instead of expanding the patch scope, I just changed to avoid the commit
>> or rollback stats as discussed, and later we can target the handling of all the
>> internal transactions and their corresponding stats.
>>
> The patch looks good to me. I have changed the commit message and ran
> the pgindent in the attached patch. Can you once see if that looks
> fine to you? Also, we should backpatch this till 9.6. So, can you
> once verify if the change is fine in all bank branches? Also, test
> with a force_parallel_mode option. I have already tested it with
> force_parallel_mode = 'regress' in HEAD, please test it in back
> branches as well.
>
>
> Thanks for the updated patch.
> I tested in back branches even with force_parallelmode and it is working
> as expected. But the patches apply is failing in back branches, so attached
> the patches for their branches. For v11 it applies with hunks.
There are 3 patches for this thread:
_v5: for PG v11 to current head
_10: for PG10 branch
_96: for PG9.6
I have also tried applying these latest patches, .
The patch set works correctly from patch application, build to compilation.
I also tested with force_parallel_mode, and it works as intended.
So I am marking this thread as “Ready for Committer”.
I hope this makes it on v12 before the feature freeze.
Regards,
Kirk Jamison
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Mon, Apr 8, 2019 at 7:54 AM Jamison, Kirk <k.jamison@jp.fujitsu.com> wrote: > > On Monday, April 8, 2019 9:04 AM (GMT+9), Haribabu Kommi wrote: > > >On Thu, Apr 4, 2019 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > The patch looks good to me. I have changed the commit message and ran > > the pgindent in the attached patch. Can you once see if that looks > > fine to you? Also, we should backpatch this till 9.6. So, can you > > once verify if the change is fine in all bank branches? Also, test > > with a force_parallel_mode option. I have already tested it with > > force_parallel_mode = 'regress' in HEAD, please test it in back > > branches as well. > > > > Thanks for the updated patch. > > I tested in back branches even with force_parallelmode and it is working > > as expected. But the patches apply is failing in back branches, so attached > > the patches for their branches. For v11 it applies with hunks. > > There are 3 patches for this thread: > _v5: for PG v11 to current head > _10: for PG10 branch > _96: for PG9.6 > > I have also tried applying these latest patches, . > The patch set works correctly from patch application, build to compilation. > I also tested with force_parallel_mode, and it works as intended. > > So I am marking this thread as “Ready for Committer”. > Thanks, Hari and Jamison for verification. The patches for back-branches looks good to me. I will once again verify them before commit. I will commit this patch tomorrow unless someone has objections. Robert/Alvaro, do let me know if you see any problem with this fix? > I hope this makes it on v12 before the feature freeze. > Yes, I think fixing bugs should be fine unless we delay too much. I see one typo in the commit message (transactions as we that is what we do in ../transactions as that is what we do in ..), will fix it before commit. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Mon, Apr 8, 2019 at 8:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Apr 8, 2019 at 7:54 AM Jamison, Kirk <k.jamison@jp.fujitsu.com> wrote: > > So I am marking this thread as “Ready for Committer”. > > > > Thanks, Hari and Jamison for verification. The patches for > back-branches looks good to me. I will once again verify them before > commit. I will commit this patch tomorrow unless someone has > objections. Robert/Alvaro, do let me know if you see any problem with > this fix? > Pushed. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
On Mon, Apr 8, 2019 at 8:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 8, 2019 at 7:54 AM Jamison, Kirk <k.jamison@jp.fujitsu.com> wrote:
> > So I am marking this thread as “Ready for Committer”.
> >
>
> Thanks, Hari and Jamison for verification. The patches for
> back-branches looks good to me. I will once again verify them before
> commit. I will commit this patch tomorrow unless someone has
> objections. Robert/Alvaro, do let me know if you see any problem with
> this fix?
>
Pushed.