xmin and very high number of concurrent transactions
I was asked this question in one of my demos, and it was an interesting one.
We update xmin for new inserts with the current txid.
Now, in a highly concurrent scenario where more than 2000
concurrent users are trying to insert new data,
will updating the xmin value be a bottleneck?
I know we should use pooling solutions to reduce concurrent
connections, but the question assumes we have enough resources to take care of
spawning a new process for each new connection.
Regards,
Vijay
On 3/12/19 12:19 PM, Vijaykumar Jain wrote:
I was asked this question in one of my demos, and it was interesting one.
we update xmin for new inserts with the current txid.
Why?
--
Adrian Klaver
adrian.klaver@aklaver.com
No, I don't mean that we end users do it; Postgres does it (?) via the xmin and xmax
fields from inherited tables :) if that is what your "why" was about.
Or are you asking whether Postgres even updates those rows, and I am wrong
to assume it works that way?
Since the values need to be atomic,
consider the analogy below.
Assume I (Postgres) am a person handing out tokens to
people (connections/transactions) in a queue.
If there is a single line (sequential), then it is easy for me to
simply give each person one token, incrementing the value each time.
But if there are thousands of users in parallel lines and I am the only
person handing out tokens, I still operate sequentially, and everyone else
is "blocked" for some time before getting a token with the
required value.
So with thousands of users, could that delay hurt my
performance, because I need to maintain the token value in order
to know which value to give to the next person?
I do not know if I am explaining it correctly; pardon my analogy.
Regards,
Vijay
On 3/12/19 1:02 PM, Vijaykumar Jain wrote:
no i mean not we end users, postgres does it (?) via the xmin and xmax
fields from inherited tables :) if that is what you wanted in a why
or are you asking, does postgres even update those rows and i am wrong
assuming it that way?
Not sure where the inherited tables come in?
See below for more info:
https://www.postgresql.org/docs/11/storage-page-layout.html
AFAIK xmin and xmax are just done as part of the insert or delete
operations so there is no updating involved.
I would say the impact on performance would come from the overhead of
each connection rather than from maintaining xmin/xmax.
--
Adrian Klaver
adrian.klaver@aklaver.com
I may have misunderstood the documentation or your question, but my
understanding is that xmin is not updated; it is only set on insert
(and also on update, since Postgres executes an update as a
delete plus an insert).
From https://www.postgresql.org/docs/10/ddl-system-columns.html:
xmin
The identity (transaction ID) of the inserting transaction for this
row version. (A row version is an individual state of a row; each update
of a row creates a new row version for the same logical row.)
Therefore I assume there are no actual updates of xmin values.
Stefan
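To make the point about row versions concrete, here is a toy Python model. It is not PostgreSQL code: `RowVersion`, `ToyTable`, and the transaction IDs 100 and 101 are made up for illustration, and visibility rules, commits, and storage are ignored. It only shows the idea described above: xmin is written once when a row version is created, and an update adds a new version instead of changing the old version's xmin.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    """One version of a logical row (a toy stand-in for a stored tuple)."""
    value: str
    xmin: int                   # ID of the transaction that created this version
    xmax: Optional[int] = None  # ID of the transaction that superseded it, if any


class ToyTable:
    """Hypothetical append-only table that keeps every row version."""

    def __init__(self) -> None:
        self.versions: List[RowVersion] = []

    def insert(self, txid: int, value: str) -> RowVersion:
        # xmin is set once, as part of creating the row version.
        version = RowVersion(value=value, xmin=txid)
        self.versions.append(version)
        return version

    def update(self, txid: int, old: RowVersion, value: str) -> RowVersion:
        # The old version is marked as superseded; its xmin is left untouched.
        old.xmax = txid
        # The new value goes into a brand-new version with its own xmin.
        return self.insert(txid, value)


table = ToyTable()
v1 = table.insert(txid=100, value="a")
v2 = table.update(txid=101, old=v1, value="b")
print([(v.value, v.xmin, v.xmax) for v in table.versions])
# prints [('a', 100, 101), ('b', 101, None)]
```

On a real server, selecting the xmin and xmax system columns of a small test table before and after an UPDATE should show the same pattern.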
Vijaykumar Jain wrote:
I was asked this question in one of my demos, and it was interesting one.
we update xmin for new inserts with the current txid.
now in a very high concurrent scenario where there are more than 2000
concurrent users trying to insert new data,
will updating xmin value be a bottleneck?
i know we should use pooling solutions to reduce concurrent
connections but given we have enough resources to take care of
spawning a new process for a new connection,
You can read the function GetNewTransactionId in
src/backend/access/transam/varsup.c for details.
Transaction ID creation is serialized with a "light-weight lock",
so it could potentially be a bottleneck.
Often that is dwarfed by the I/O requirements from many concurrent
commits, but if most of your transactions are rolled back or you
use "synchronous_commit = off", I can imagine that it could matter.
It is not a matter of how many clients there are, but of how
often a new writing transaction is started.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
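The serialization described above can be sketched in a few lines of Python. This is only an analogy, not a translation of GetNewTransactionId: `ToyXidAllocator`, `run_writers`, the starting value 100, and the thread counts are assumptions chosen for the example, and a `threading.Lock` stands in for the light-weight lock. The point is that however many writers there are, IDs are handed out one at a time, so the cost depends on how often a new ID is requested.

```python
import threading
from typing import List


class ToyXidAllocator:
    """Hands out increasing transaction IDs, one caller at a time."""

    def __init__(self, first_xid: int = 100) -> None:
        self._next_xid = first_xid
        self._lock = threading.Lock()  # stand-in for the light-weight lock

    def get_new_transaction_id(self) -> int:
        # Only one thread at a time can be inside this block.
        with self._lock:
            xid = self._next_xid
            self._next_xid += 1
            return xid


def run_writers(allocator: ToyXidAllocator, n_threads: int,
                xids_per_thread: int) -> List[int]:
    """Start n_threads writers that each request xids_per_thread IDs."""
    results: List[List[int]] = [[] for _ in range(n_threads)]

    def worker(slot: int) -> None:
        for _ in range(xids_per_thread):
            results[slot].append(allocator.get_new_transaction_id())

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [xid for per_thread in results for xid in per_thread]


allocator = ToyXidAllocator()
xids = run_writers(allocator, n_threads=8, xids_per_thread=1000)
# Every ID is unique because allocation happened under the lock.
print(len(xids), len(set(xids)))
# prints 8000 8000
```

In this sketch a connection that never asks for an ID never touches the lock, which mirrors the remark that what matters is the rate of new writing transactions rather than the number of clients.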
On Wed, Mar 13, 2019 at 9:50 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
You can read the function GetNewTransactionId in
src/backend/access/transam/varsup.c for details.
Transaction ID creation is serialized with a "light-weight lock",
so it could potentially be a bottleneck.
Also I think that GetSnapshotData() would be the major bottleneck way
before GetNewTransactionId() becomes problematic. Especially with
such a high number of active backends.
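A rough way to picture why the number of backends matters for snapshots: assuming, as I understand it for the PostgreSQL versions current at the time of this thread, that building a snapshot involves looking at every backend's entry, the work grows with the number of connections even when most of them are idle. The Python below is a toy model under that assumption; `BackendSlot`, `take_snapshot`, and all the numbers are hypothetical, and the real GetSnapshotData does considerably more.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BackendSlot:
    """One connection's entry in a toy list of backends."""
    current_xid: Optional[int] = None  # None: no write transaction is open


def take_snapshot(slots: List[BackendSlot],
                  next_xid: int) -> Tuple[int, int, List[int]]:
    """Return (oldest running xid, next xid, running xids).

    Every slot is visited, so the loop length equals the number of
    connections, not the number of running transactions.
    """
    in_progress: List[int] = []
    for slot in slots:
        if slot.current_xid is not None:
            in_progress.append(slot.current_xid)
    # With nothing running, the oldest bound falls back to next_xid.
    oldest = min(in_progress, default=next_xid)
    return oldest, next_xid, sorted(in_progress)


# 2000 connections, only three of which have an open write transaction.
slots = [BackendSlot() for _ in range(2000)]
slots[10].current_xid = 501
slots[20].current_xid = 498
slots[30].current_xid = 505
print(take_snapshot(slots, next_xid=506))
# prints (498, 506, [498, 501, 505])
```

Note that the "oldest running xid" in a snapshot is a different thing from the xmin column stored on a row version, even though both are often called xmin.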
Thank you, everyone, for responding.
I appreciate your help.
It looks like I need to understand the concepts in a little more detail to
be able to ask the right questions, but at least now I can look at the
relevant docs.
--
Regards,
Vijay