Thread: xmin and very high number of concurrent transactions

xmin and very high number of concurrent transactions

From
Vijaykumar Jain
Date:
I was asked this question in one of my demos, and it was an interesting one.

We update xmin for new inserts with the current txid.
Now, in a highly concurrent scenario where more than 2000
concurrent users are trying to insert new data,
will updating the xmin value be a bottleneck?

I know we should use pooling solutions to reduce concurrent
connections, but assume we have enough resources to handle
spawning a new process for each new connection.
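
For concreteness, here is a minimal psql sketch of what I mean (demo_t
is just a made-up scratch table):

  CREATE TABLE demo_t (id int);

  BEGIN;
  SELECT txid_current();          -- the txid of this transaction
  INSERT INTO demo_t VALUES (1);
  COMMIT;

  -- the new row's xmin matches the txid above (numerically, at least
  -- on a fresh cluster; txid_current() also carries an epoch)
  SELECT xmin, id FROM demo_t;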

Regards,
Vijay


Re: xmin and very high number of concurrent transactions

From
Adrian Klaver
Date:
On 3/12/19 12:19 PM, Vijaykumar Jain wrote:
> I was asked this question in one of my demos, and it was an interesting one.
> 
> We update xmin for new inserts with the current txid.

Why?

> Now, in a highly concurrent scenario where more than 2000
> concurrent users are trying to insert new data,
> will updating the xmin value be a bottleneck?
> 
> I know we should use pooling solutions to reduce concurrent
> connections, but assume we have enough resources to handle
> spawning a new process for each new connection.
> 
> Regards,
> Vijay
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: [External] Re: xmin and very high number of concurrent transactions

From
Vijaykumar Jain
Date:
No, I mean it is not us end users; Postgres does it (?) via the xmin
and xmax fields from inherited tables :) if that is what you meant with
the "why". Or are you asking whether Postgres even updates those rows,
and I am wrong in assuming it works that way?

Since the values need to be atomic, consider the following analogy.
Assume I (Postgres) am a person giving out tokens to people
(connections/transactions) in a queue. If there is a single line
(sequential), it is easy for me to simply hand each person a token,
incrementing the value as I go. But if there are thousands of people in
parallel lines, I am still only one person handing out tokens, so I
operate sequentially, and each person is "blocked" for some time before
getting a token with the required value. So with thousands of users,
that "delay" may impact my performance, because I need to maintain the
token value to know which token to give to the next person.

I do not know if I am explaining it correctly; pardon my analogy.
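
In Postgres terms the "token" would be the transaction ID. A tiny
sketch of the counter ticking (the actual values will differ, and other
sessions may grab ids in between):

  SELECT txid_current();   -- e.g. 612
  SELECT txid_current();   -- e.g. 613: in autocommit mode each
                           -- statement is its own transaction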


Regards,
Vijay

On Wed, Mar 13, 2019 at 1:10 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> On 3/12/19 12:19 PM, Vijaykumar Jain wrote:
> > I was asked this question in one of my demos, and it was an interesting one.
> >
> > We update xmin for new inserts with the current txid.
>
> Why?
>
> > Now, in a highly concurrent scenario where more than 2000
> > concurrent users are trying to insert new data,
> > will updating the xmin value be a bottleneck?
> >
> > I know we should use pooling solutions to reduce concurrent
> > connections, but assume we have enough resources to handle
> > spawning a new process for each new connection.
> >
> > Regards,
> > Vijay
> >
> >
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com


Re: [External] Re: xmin and very high number of concurrent transactions

From
Adrian Klaver
Date:
On 3/12/19 1:02 PM, Vijaykumar Jain wrote:
> No, I mean it is not us end users; Postgres does it (?) via the xmin
> and xmax fields from inherited tables :) if that is what you meant with
> the "why". Or are you asking whether Postgres even updates those rows,
> and I am wrong in assuming it works that way?

Not sure where the inherited tables come in?

See below for more info:
https://www.postgresql.org/docs/11/storage-page-layout.html

AFAIK xmin and xmax are just set as part of the insert or delete 
operations, so there is no updating involved.

I would say the impact on performance would come from the overhead of 
each connection rather than from maintaining xmin/xmax.
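
To illustrate with a sketch (t is a made-up table, and the two
sessions are only indicated by comments):

  -- setup
  CREATE TABLE t (id int);
  INSERT INTO t VALUES (1);

  -- session 1: delete, but do not commit yet
  BEGIN;
  DELETE FROM t WHERE id = 1;    -- stamps xmax on the existing row version

  -- session 2, meanwhile: the row version is still visible here,
  -- now carrying session 1's txid in its xmax
  SELECT xmin, xmax, id FROM t;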

> 
> Since the values need to be atomic, consider the following analogy.
> Assume I (Postgres) am a person giving out tokens to people
> (connections/transactions) in a queue. If there is a single line
> (sequential), it is easy for me to simply hand each person a token,
> incrementing the value as I go. But if there are thousands of people in
> parallel lines, I am still only one person handing out tokens, so I
> operate sequentially, and each person is "blocked" for some time before
> getting a token with the required value. So with thousands of users,
> that "delay" may impact my performance, because I need to maintain the
> token value to know which token to give to the next person.
> 
> I do not know if I am explaining it correctly; pardon my analogy.
> 
> 
> Regards,
> Vijay
> 
> On Wed, Mar 13, 2019 at 1:10 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> On 3/12/19 12:19 PM, Vijaykumar Jain wrote:
>>> I was asked this question in one of my demos, and it was an interesting one.
>>>
>>> We update xmin for new inserts with the current txid.
>>
>> Why?
>>
>>> Now, in a highly concurrent scenario where more than 2000
>>> concurrent users are trying to insert new data,
>>> will updating the xmin value be a bottleneck?
>>>
>>> I know we should use pooling solutions to reduce concurrent
>>> connections, but assume we have enough resources to handle
>>> spawning a new process for each new connection.
>>>
>>> Regards,
>>> Vijay
>>>
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: xmin and very high number of concurrent transactions

From
reg_pg_stefanz@perfexpert.ch
Date:
I may have misunderstood the documentation or your question, but my 
understanding is that xmin is not updated; it is only set on insert
(and yes, also on update, but for Postgres an update is effectively an 
insert, since updates are executed as a delete plus an insert).

from https://www.postgresql.org/docs/10/ddl-system-columns.html
 > xmin
 > The identity (transaction ID) of the inserting transaction for this
 > row version. (A row version is an individual state of a row; each
 > update of a row creates a new row version for the same logical row.)

Therefore I assume there are no actual updates of xmin values.
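
A quick sketch of that delete/insert behaviour (t is a made-up
single-row table):

  CREATE TABLE t (id int, val text);
  INSERT INTO t VALUES (1, 'a');

  SELECT xmin, ctid, * FROM t;   -- note the row's xmin and location

  UPDATE t SET val = 'b' WHERE id = 1;

  SELECT xmin, ctid, * FROM t;   -- a new row version: new xmin (the
                                 -- updating txid) and a new ctid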

Stefan

On 12.03.2019 20:19, Vijaykumar Jain wrote:
> I was asked this question in one of my demos, and it was an interesting one.
>
> We update xmin for new inserts with the current txid.
> Now, in a highly concurrent scenario where more than 2000
> concurrent users are trying to insert new data,
> will updating the xmin value be a bottleneck?
>
> I know we should use pooling solutions to reduce concurrent
> connections, but assume we have enough resources to handle
> spawning a new process for each new connection.
>
> Regards,
> Vijay
>



Re: xmin and very high number of concurrent transactions

From
Laurenz Albe
Date:
Vijaykumar Jain wrote:
> I was asked this question in one of my demos, and it was an interesting one.
> 
> We update xmin for new inserts with the current txid.
> Now, in a highly concurrent scenario where more than 2000
> concurrent users are trying to insert new data,
> will updating the xmin value be a bottleneck?
> 
> I know we should use pooling solutions to reduce concurrent
> connections, but assume we have enough resources to handle
> spawning a new process for each new connection.

You can read the function GetNewTransactionId in
src/backend/access/transam/varsup.c for details.

Transaction ID creation is serialized with a "light-weight lock",
so it could potentially be a bottleneck.

Often that is dwarfed by the I/O requirements from many concurrent
commits, but if most of your transactions are rolled back or you
use "synchronous_commit = off", I can imagine that it could matter.

It is not a matter of how many clients there are, but of how
often a new writing transaction is started.
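
You can see that last point from SQL (txid_current_if_assigned()
exists since 9.6; t is a throwaway table):

  CREATE TABLE t (id int);

  BEGIN;
  SELECT 1;                            -- read-only work
  SELECT txid_current_if_assigned();   -- NULL: no xid consumed yet
  INSERT INTO t VALUES (1);            -- the first write forces xid assignment
  SELECT txid_current_if_assigned();   -- now returns the new xid
  ROLLBACK;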

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



Re: xmin and very high number of concurrent transactions

From
Julien Rouhaud
Date:
On Wed, Mar 13, 2019 at 9:50 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> Vijaykumar Jain wrote:
> > I was asked this question in one of my demos, and it was an interesting one.
> >
> > We update xmin for new inserts with the current txid.
> > Now, in a highly concurrent scenario where more than 2000
> > concurrent users are trying to insert new data,
> > will updating the xmin value be a bottleneck?
> >
> > I know we should use pooling solutions to reduce concurrent
> > connections, but assume we have enough resources to handle
> > spawning a new process for each new connection.
>
> You can read the function GetNewTransactionId in
> src/backend/access/transam/varsup.c for details.
>
> Transaction ID creation is serialized with a "light-weight lock",
> so it could potentially be a bottleneck.

Also, I think that GetSnapshotData() would become the major bottleneck
well before GetNewTransactionId() becomes problematic, especially with
such a high number of active backends.
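
If someone wants to check for this on a live system, the wait events
give a rough signal (a monitoring sketch; these event names exist
since 9.6):

  -- contention on snapshots and xid assignment shows up as LWLock
  -- waits on ProcArrayLock (GetSnapshotData) and XidGenLock
  -- (GetNewTransactionId)
  SELECT wait_event_type, wait_event, count(*)
  FROM pg_stat_activity
  WHERE wait_event IS NOT NULL
  GROUP BY 1, 2
  ORDER BY 3 DESC;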


Re: [External] Re: xmin and very high number of concurrent transactions

From
Vijaykumar Jain
Date:
Thank you everyone for responding.
Appreciate your help.

Looks like I need to understand the concepts in a little more detail to be able to ask the right questions, but at least now I can look at the relevant docs.


On Wed, 13 Mar 2019 at 2:44 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
On Wed, Mar 13, 2019 at 9:50 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> Vijaykumar Jain wrote:
> > I was asked this question in one of my demos, and it was an interesting one.
> >
> > We update xmin for new inserts with the current txid.
> > Now, in a highly concurrent scenario where more than 2000
> > concurrent users are trying to insert new data,
> > will updating the xmin value be a bottleneck?
> >
> > I know we should use pooling solutions to reduce concurrent
> > connections, but assume we have enough resources to handle
> > spawning a new process for each new connection.
>
> You can read the function GetNewTransactionId in
> src/backend/access/transam/varsup.c for details.
>
> Transaction ID creation is serialized with a "light-weight lock",
> so it could potentially be a bottleneck.

Also, I think that GetSnapshotData() would become the major bottleneck
well before GetNewTransactionId() becomes problematic, especially with
such a high number of active backends.
--

Regards,
Vijay