IO related waits

Started by veem v · over 1 year ago · 26 messages · general
#1 veem v
veema0000@gmail.com

Hi,
One of our applications uses RDS PostgreSQL. In one of our streaming
applications (using Flink), which processes hundreds of millions of
transactions each day, we use row-by-row transaction processing to insert
data into the Postgres database, and a commit is performed for each row. We
are seeing heavy IO:XactSync wait events during the data load and also high
overall response time.

The architecture team is suggesting enabling async I/O if possible, so that
the streaming client will not wait for the commit confirmation from the
database. So I want to understand how async I/O can be enabled, and whether
there are any downsides to doing this.

Regards
Veem

#2 Christophe Pettus
xof@thebuild.com
In reply to: veem v (#1)
Re: IO related waits

On Sep 16, 2024, at 13:24, veem v <veema0000@gmail.com> wrote:
The architecture team is suggesting enabling async I/O if possible, so that the streaming client will not wait for the commit confirmation from the database. So I want to understand how async I/O can be enabled, and whether there are any downsides to doing this.

"Async I/O" has a specific meaning that's not currently applicable to PostgreSQL. What is available is "synchronous_commit". This setting is by default on. When it's on, each commit waits until the associated WAL information has been flushed to disk, and then returns. If it is turned off, the commit returns more or less immediately, and the WAL flush happens asynchronously from the commit.

The upside is that the session can proceed without waiting for the WAL flush. The downside is that on a server crash, some transactions may not have been fully committed to the database, and will be missing when the database restarts. The database won't be corrupted (as in, you try to use it and get errors), but it will be "earlier in time" than the application might expect. It's pretty common to turn it off for high-ingestion-rate situations, especially where the application can detect and replay missing transactions on a crash.
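
As a minimal sketch (table and values hypothetical), the setting can also be
scoped to a session or a single transaction, so only the bulk-load path
gives up the durability guarantee:

SET synchronous_commit = off;            -- this session only

-- or, narrower still:
BEGIN;
SET LOCAL synchronous_commit = off;      -- this transaction only
INSERT INTO ingest_tab VALUES (1, now());
COMMIT;  -- returns without waiting for the WAL flush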

#3 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: veem v (#1)
Re: IO related waits

On 9/16/24 13:24, veem v wrote:

Hi,
One of our applications uses RDS PostgreSQL. In one of our streaming
applications (using Flink), which processes hundreds of millions of
transactions each day, we use row-by-row transaction processing to
insert data into the Postgres database, and a commit is performed for
each row. We are seeing heavy IO:XactSync wait events during the data
load and also high overall response time.

The architecture team is suggesting enabling async I/O if possible, so
that the streaming client will not wait for the commit confirmation from
the database. So I want to understand how async I/O can be enabled, and
whether there are any downsides to doing this.

Are you referring to this?:

https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/datastream/operators/asyncio/

If not then you will need to be more specific.

Regards
Veem

--
Adrian Klaver
adrian.klaver@aklaver.com

#4 veem v
veema0000@gmail.com
In reply to: Adrian Klaver (#3)
Re: IO related waits

On Tue, 17 Sept 2024 at 03:41, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

Are you referring to this?:

https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/datastream/operators/asyncio/

If not then you will need to be more specific.

Yes, I was referring to this one. So what are the caveats of this
approach, considering these are financial transactions meant to be ACID
compliant? Additionally, I was not aware of the "synchronous_commit"
parameter on the DB side, which can make the commit asynchronous.

Would both of these mimic the same asynchronous behaviour and achieve the
same thing, meaning the client data-load throughput will increase because
the DB will not wait for the data to be written to the WAL before giving a
confirmation back to the client, and the client will not wait for the DB
to confirm whether the data has been persisted? Also, since the flushing
of the WAL to disk has to happen anyway in the backend (just delayed now),
can this method cause contention on the database storage side if data is
ingested from the client faster than it can be written to disk, and can it
in some way impact data consistency for read queries?

#5 Greg Sabino Mullane
greg@turnstep.com
In reply to: veem v (#4)
Re: IO related waits

On Mon, Sep 16, 2024 at 11:56 PM veem v <veema0000@gmail.com> wrote:

So what are the caveats of this approach, considering these are financial
transactions meant to be ACID compliant?

Financial transactions need to be handled with care. Only you know your
business requirements, but as Christophe pointed out, disabling
synchronous commit means your application may think a particular
transaction has completed when it has not. Usually that's a big red flag
for financial applications.

we use row-by-row transaction processing to insert data into the Postgres
database, and a commit is performed for each row.

This is a better place to optimize. Batch many rows per transaction. Remove
unused indexes.
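
As a minimal sketch of the batching idea (table and values hypothetical),
many rows per INSERT and many INSERTs per commit:

BEGIN;
INSERT INTO ingest_tab (id, part_date) VALUES
  (1, '2024-08-29'),
  (2, '2024-08-29'),
  (3, '2024-08-29');
-- ... more multi-row INSERTs ...
COMMIT;  -- one synchronous WAL flush for the whole batch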

since the flushing of the WAL to disk has to happen anyway in the backend
(just delayed now), can this method cause contention on the database
storage side if data is ingested from the client faster than it can be
written to disk, and can it in some way impact data consistency for read
queries?

Not quite clear what you are asking here re data consistency. The data will
always be consistent, even if synchronous_commit is disabled. The only
danger window is on a server crash.

(Keep in mind that RDS is not Postgres, so take tuning recommendations and
advice with a grain of salt.)

Cheers,
Greg

#6 veem v
veema0000@gmail.com
In reply to: Greg Sabino Mullane (#5)
Re: IO related waits

On Tue, 17 Sept 2024 at 18:43, Greg Sabino Mullane <htamfids@gmail.com>
wrote:

This is a better place to optimize. Batch many rows per transaction.
Remove unused indexes.

since the flushing of the WAL to disk has to happen anyway in the backend
(just delayed now), can this method cause contention on the database
storage side if data is ingested from the client faster than it can be
written to disk, and can it in some way impact data consistency for read
queries?

Not quite clear what you are asking here re data consistency. The data
will always be consistent, even if synchronous_commit is disabled. The only
danger window is on a server crash.

(Keep in mind that RDS is not Postgres, so take tuning recommendations and
advice with a grain of salt.)

Thank you Greg.

Yes, our Java application was doing row-by-row commits, and we saw that in
pg_stat_database the "xact_commit" column was close to the sum of the
tup_inserted, tup_updated, and tup_deleted columns. We also verified in
pg_stat_statements that the "calls" column matched the "rows" column for
the INSERT queries, so we are inserting exactly the same number of rows as
the number of DB calls, which also suggests we are doing row-by-row DML.
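
For reference, a minimal sketch of the checks described above (the
pg_stat_statements extension must be installed for the second query):

SELECT xact_commit, tup_inserted, tup_updated, tup_deleted
FROM pg_stat_database
WHERE datname = current_database();

SELECT calls, rows, query
FROM pg_stat_statements
WHERE query ILIKE 'insert%'
ORDER BY calls DESC;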

We then asked the application team to make the inserts in batches, but the
figures in the two views above are still not changing much: "xact_commit"
stays almost the same, and the "calls" and "rows" columns in
pg_stat_statements also stay almost the same. So does it mean that the
application code is somehow still doing row-by-row processing, or are we
doing something wrong in the above analysis?

And another thing we noticed: even after the data load finished,
"xact_commit" kept increasing along with "tup_fetched". Does that mean some
kind of implicit commit happens even for fetch-type queries, which must be
SELECT queries, I believe? Also, I am not sure if it's expected, but here
in Postgres I have seen that unless we put code within a begin/end block,
it gets committed by default even if we just run it on the console; it
doesn't ask for an explicit commit/rollback. I am not sure if that is
somehow playing a role here.

Regards
Veem

#7 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: veem v (#4)
Re: IO related waits

On 9/16/24 20:55, veem v wrote:

On Tue, 17 Sept 2024 at 03:41, Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

Are you referring to this?:

https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/datastream/operators/asyncio/ <https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/datastream/operators/asyncio/>

If not then you will need to be more specific.

Yes, I was referring to this one. So what are the caveats of this
approach, considering these are financial transactions meant to be ACID
compliant? Additionally, I was not aware of the "synchronous_commit"
parameter on the DB side, which can make the commit asynchronous.

Would both of these mimic the same asynchronous behaviour and achieve the
same thing, meaning the client data-load throughput will increase because
the DB will not wait for the data to be written to the WAL before giving
a confirmation back to the client, and the client will not wait for the
DB to confirm whether the data has been persisted? Also, since the
flushing of the WAL to disk has to happen anyway in the backend (just
delayed now), can this method cause contention on the database storage
side if data is ingested from the client faster than it can be written to
disk, and can it in some way impact data consistency for read queries?

This is not something that I am that familiar with. I suspect, though,
that this is more complicated than you think. From the link above:

" Prerequisites #

As illustrated in the section above, implementing proper asynchronous
I/O to a database (or key/value store) requires a client to that
database that supports asynchronous requests. Many popular databases
offer such a client.

In the absence of such a client, one can try and turn a synchronous
client into a limited concurrent client by creating multiple clients and
handling the synchronous calls with a thread pool. However, this
approach is usually less efficient than a proper asynchronous client.
"

Which means that on the Flink end you need to:

1) Use Flink async I/O.

2) Find a client that supports async or fake it by using multiple
synchronous clients.

On Postgres end there is this:

https://www.postgresql.org/docs/current/wal-async-commit.html

That will return a success signal to the client quicker if
synchronous_commit is set to off. Though the point of the Flink async
I/O is not to wait for the response before moving on, so I am not sure
how much synchronous_commit = off would help.

--
Adrian Klaver
adrian.klaver@aklaver.com

#8 veem v
veema0000@gmail.com
In reply to: Adrian Klaver (#7)
Re: IO related waits

On Tue, 17 Sept 2024 at 21:24, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

Which means that on the Flink end you need to:

1) Use Flink async I/O.

2) Find a client that supports async or fake it by using multiple
synchronous clients.

On Postgres end there is this:

https://www.postgresql.org/docs/current/wal-async-commit.html

That will return a success signal to the client quicker if
synchronous_commit is set to off. Though the point of the Flink async
I/O is not to wait for the response before moving on, so I am not sure
how much synchronous_commit = off would help.

Got it. So their suggestion was to set async I/O at the Flink level but
not at the DB level, so that the application will not wait for the commit
response from the database. But in that case, won't it overload the DB
with more and more requests, if the database keeps doing the commit (with
synchronous_commit=on) and waiting for the response back from its storage
for the WAL to be flushed to disk, while the application does not wait for
the response to those inserts and keeps flooding the database with more
and more incoming insert requests?

Additionally, as I mentioned before, we see from pg_stat_database that the
"xact_commit" column almost matches the sum of the "tup_inserted",
"tup_updated", and "tup_deleted" columns. We also verified in
pg_stat_statements that the "calls" column is the same as the "rows"
column for the INSERT queries, so we are inserting exactly the same number
of rows as the number of DB calls. Doesn't that suggest that we are doing
row-by-row DML?

Also, after seeing the above and asking the application team to do batch
commits, we are still seeing similar figures in pg_stat_database and
pg_stat_statements. So does it mean that we are looking at the wrong
stats, or that the application code change has not been done accurately?
And even when no inserts are running from the application side, we do see
"xact_commit" keep increasing along with "tup_fetched". Why so?

Finally, we see that Postgres here commits even a bare DML statement by
default, unless we explicitly put it in a "begin... end" block. Can that
be the difference in how a "commit" gets handled in Postgres versus other
databases?

#9 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: veem v (#8)
Re: IO related waits

On 9/17/24 12:34, veem v wrote:

On Tue, 17 Sept 2024 at 21:24, Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

Which means that on the Flink end you need to:

1) Use Flink async I/O.

2) Find a client that supports async or fake it by using multiple
synchronous clients.

On Postgres end there is this:

https://www.postgresql.org/docs/current/wal-async-commit.html
<https://www.postgresql.org/docs/current/wal-async-commit.html>

That will return a success signal to the client quicker if
synchronous_commit is set to off. Though the point of the Flink async
I/O is not to wait for the response before moving on, so I am not sure
how much synchronous_commit = off would help.

Got it. So their suggestion was to set async I/O at the Flink level but
not at the DB level, so that the application will not wait for the commit
response from the database. But in that case, won't it overload the DB
with more and more requests, if the database keeps doing the commit (with
synchronous_commit=on) and waiting for the response back from its storage
for the WAL to be flushed to disk, while the application does not wait
for the response to those inserts and keeps flooding the database with
more and more incoming insert requests?

My point is that this is a multi-layer cake with layers:

1) Flink async I/O

2) Database client async/sync

3) Postgres sync status.

That is a lot of moving parts and determining whether it is suitable is
going to require rigorous testing over a representative data load.

See more below.

Additionally, as I mentioned before, we see from pg_stat_database that
the "xact_commit" column almost matches the sum of the "tup_inserted",
"tup_updated", and "tup_deleted" columns. We also verified in
pg_stat_statements that the "calls" column is the same as the "rows"
column for the INSERT queries, so we are inserting exactly the same
number of rows as the number of DB calls. Doesn't that suggest that we
are doing row-by-row DML?

Also, after seeing the above and asking the application team to do batch
commits, we are still seeing similar figures in pg_stat_database and
pg_stat_statements. So does it mean that we are looking at the wrong
stats, or that the application code change has not been done accurately?
And even when no inserts are running from the application side, we do
see "xact_commit" keep increasing along with "tup_fetched". Why so?

Finally, we see that Postgres here commits even a bare DML statement by
default, unless we explicitly put it in a "begin... end" block. Can that
be the difference in how a "commit" gets handled in Postgres versus
other databases?

It does if autocommit is set in the client; that is common to other
databases also:

https://dev.mysql.com/doc/refman/8.4/en/commit.html

https://docs.oracle.com/en/database/oracle/developer-tools-for-vscode/getting-started/disabling-and-enabling-auto-commit.html

https://learn.microsoft.com/en-us/sql/t-sql/statements/set-implicit-transactions-transact-sql?view=sql-server-ver16
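
As a minimal illustration (table hypothetical; a psql-style client where
autocommit is the default):

INSERT INTO demo_tab VALUES (1);  -- autocommit: its own transaction, committed at once

BEGIN;                            -- explicit transaction block
INSERT INTO demo_tab VALUES (2);
INSERT INTO demo_tab VALUES (3);
COMMIT;                           -- nothing above is durable until here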

You probably need to take a closer look at the client/driver you are
using and the code that is interacting with it.

In fact I would say you need to review the entire data transfer process
to see if there are performance gains that can be obtained without
adding an entirely new async component.

--
Adrian Klaver
adrian.klaver@aklaver.com

#10 veem v
veema0000@gmail.com
In reply to: Adrian Klaver (#9)
Re: IO related waits

On Wed, 18 Sept 2024 at 05:07, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 9/17/24 12:34, veem v wrote:

It does if autocommit is set in the client; that is common to other
databases also:

https://dev.mysql.com/doc/refman/8.4/en/commit.html

https://docs.oracle.com/en/database/oracle/developer-tools-for-vscode/getting-started/disabling-and-enabling-auto-commit.html

https://learn.microsoft.com/en-us/sql/t-sql/statements/set-implicit-transactions-transact-sql?view=sql-server-ver16

You probably need to take a closer look at the client/driver you are
using and the code that is interacting with it.

In fact I would say you need to review the entire data transfer process
to see if there are performance gains that can be obtained without
adding an entirely new async component.

You were spot on. When we turned off autocommit, we started seeing a
number of commits matching the number of batches.

However, we also started seeing deadlock issues. We have foreign-key
relationships between the tables, and during each batch we insert into the
parent first and then into the child, but this does happen from multiple
sessions for different batches. So why do we see the error below, given
that in each batch we ensure we first insert into the parent and then into
the child tables?

caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected
Detail: Process 10443 waits for ShareLock on transaction 220972157;
blocked by process 10454.
Process 10454 waits for ShareLock on transaction 220972155; blocked by
process 10443.
Hint: See server log for query details.
Where: while inserting index tuple (88736,28) in relation
"TAB1_p2024_08_29"

#11 veem v
veema0000@gmail.com
In reply to: veem v (#10)
Re: IO related waits

On Thu, 19 Sept 2024 at 02:01, veem v <veema0000@gmail.com> wrote:

On Wed, 18 Sept 2024 at 05:07, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 9/17/24 12:34, veem v wrote:

It does if autocommit is set in the client; that is common to other
databases also:

https://dev.mysql.com/doc/refman/8.4/en/commit.html

https://docs.oracle.com/en/database/oracle/developer-tools-for-vscode/getting-started/disabling-and-enabling-auto-commit.html

https://learn.microsoft.com/en-us/sql/t-sql/statements/set-implicit-transactions-transact-sql?view=sql-server-ver16

You probably need to take a closer look at the client/driver you are
using and the code that is interacting with it.

In fact I would say you need to review the entire data transfer process
to see if there are performance gains that can be obtained without
adding an entirely new async component.

You were spot on. When we turned off autocommit, we started seeing a
number of commits matching the number of batches.

However, we also started seeing deadlock issues. We have foreign-key
relationships between the tables, and during each batch we insert into
the parent first and then into the child, but this does happen from
multiple sessions for different batches. So why do we see the error
below, given that in each batch we ensure we first insert into the parent
and then into the child tables?

caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected
Detail: Process 10443 waits for ShareLock on transaction 220972157;
blocked by process 10454.
Process 10454 waits for ShareLock on transaction 220972155; blocked by
process 10443.
Hint: See server log for query details.
Where: while inserting index tuple (88736,28) in relation
"TAB1_p2024_08_29"

As we were able to get hold of one session, we see that "insert into
<parent partition table>" was blocked by "insert into <child partition
table>", and the "insert into <child partition table>" was experiencing a
"client read" wait event. We are still unable to understand why this is
happening and how to fix it.

#12 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: veem v (#11)
Re: IO related waits

On 9/18/24 1:40 PM, veem v wrote:

You were spot on. When we turned off autocommit, we started seeing a
number of commits matching the number of batches.

However, we also started seeing deadlock issues. We have foreign-key
relationships between the tables, and during each batch we insert into
the parent first and then into the child, but this does happen from
multiple sessions for different batches. So why do we see the error
below, given that in each batch we ensure we first insert into the
parent and then into the child tables?

caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected
  Detail: Process 10443 waits for ShareLock on transaction
220972157; blocked by process 10454.
Process 10454 waits for ShareLock on transaction 220972155; blocked
by process 10443.
  Hint: See server log for query details.
  Where: while inserting index tuple (88736,28) in relation
"TAB1_p2024_08_29"

As we were able to get hold of one session, we see that "insert into
<parent partition table>" was blocked by "insert into <child partition
table>", and the "insert into <child partition table>" was experiencing
a "client read" wait event. We are still unable to understand why this is
happening and how to fix it.

This needs clarification.

1) To be clear when you refer to parent and child that is:
FK
parent_tbl.fld <--> child_tbl.fld_fk

not parent and child tables in partitioning scheme?

2) What are the table schemas?

3) What is the code that is generating the error?

Overall it looks like this process needs a top to bottom audit to map
out what is actually being done versus what needs to be done.

--
Adrian Klaver
adrian.klaver@aklaver.com

#13 veem v
veema0000@gmail.com
In reply to: Adrian Klaver (#12)
Re: IO related waits

On Thu, 19 Sept 2024 at 03:02, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

This needs clarification.

1) To be clear when you refer to parent and child that is:
FK
parent_tbl.fld <--> child_tbl.fld_fk

not parent and child tables in partitioning scheme?

2) What are the table schemas?

3) What is the code that is generating the error?

Overall it looks like this process needs a top to bottom audit to map
out what is actually being done versus what needs to be done.

Yes, the tables actually have a parent/child table relationship, not the
child/parent tables of a partitioning scheme. The PK and FK are on the
columns (ID, part_date). The table is daily range partitioned on the
column part_date.

The error we are seeing in the logs is as below:

deadlock detected
2024-09-18 17:05:56 UTC:100.72.10.66(54582):USER1@TRANDB:[14537]:DETAIL:
Process 14537 waits for ShareLock on transaction 220975629; blocked by
process 14548.
Process 14548 waits for ShareLock on transaction 220975630; blocked by
process 14537.
Process 14537: INSERT INTO TRANDB.PART_TAB (ID, part_date....) VALUES ($1,
$2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14) ON CONFLICT (ID,
part_date) DO NOTHING
Process 14548: INSERT INTO TRANDB.PART_TAB (ID, part_date, ...) VALUES
($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14) ON CONFLICT
(ID, part_date) DO NOTHING

2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:HINT: See
server log for query details.
2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:CONTEXT:
while inserting index tuple (88814,39) in relation "PART_TAB_p2024_08_29"

2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:STATEMENT:
INSERT INTO TRANDB.PART_TAB (ID, part_date, ...) VALUES ($1, $2, $3, $4,
$5, $6, $7, $8, $9, $10, $11, $12, $13, $14) ON CONFLICT (ID, part_date) DO
NOTHING

2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:ERROR:
current transaction is aborted, commands ignored until end of transaction
block
2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:STATEMENT:
INSERT INTO TRANDB.EXCEP_TAB (...)
2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:ERROR:
current transaction is aborted, commands ignored until end of transaction
block
2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:STATEMENT:

2024-09-18 17:05:56 UTC:100.72.22.33(36096):USER1@TRANDB:[14551]:ERROR:
current transaction is aborted, commands ignored until end of transaction
block

*********

2024-09-18 17:41:05 UTC:100.72.39.47(58778):USER1@TRANDB:[17456]:ERROR:
deadlock detected
2024-09-18 17:41:05 UTC:100.72.39.47(58778):USER1@TRANDB:[17456]:DETAIL:
Process 17456 waits for ShareLock on transaction 220978890; blocked by
process 17458.
Process 17458 waits for ShareLock on transaction 220978889; blocked by
process 17456.
Process 17456: INSERT INTO TRANDB.PART_TAB (ID, part_date, ...) VALUES
($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14) ON CONFLICT
(ID, part_date) DO NOTHING
Process 17458: INSERT INTO TRANDB.PART_TAB (ID, part_date, ..) VALUES ($1,
$2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14) ON CONFLICT (ID,
part_date) DO NOTHING
2024-09-18 17:41:05 UTC:100.72.39.47(58778):USER1@TRANDB:[17456]:HINT: See
server log for query details.
2024-09-18 17:41:05 UTC:100.72.39.47(58778):USER1@TRANDB:[17456]:CONTEXT:
while inserting index tuple (88875,13) in relation "PART_TAB_p2024_08_29"
2024-09-18 17:41:05 UTC:100.72.39.47(58778):USER1@TRANDB:[17456]:STATEMENT:
INSERT INTO TRANDB.PART_TAB (ID, part_date,..) VALUES ($1, $2, $3, $4,
$5, $6, $7, $8, $9, $10, $11, $12, $13, $14) ON CONFLICT (ID, part_date) DO
NOTHING
2024-09-18 17:41:05 UTC:100.72.39.47(58778):USER1@TRANDB:[17456]:LOG:
could not receive data from client: Connection reset by peer
2024-09-18 17:41:05 UTC:100.72.39.47(58778):USER1@TRANDB:[17456]:LOG:
disconnection: session time: 0:08:37.154 user=USER1 database=TRANDB
host=XXXXX port=58778

#14 Greg Sabino Mullane
greg@turnstep.com
In reply to: veem v (#13)
Re: IO related waits

On Thu, Sep 19, 2024 at 5:17 AM veem v <veema0000@gmail.com> wrote:

2024-09-18 17:05:56 UTC:100.72.10.66(54582):USER1@TRANDB:[14537]:DETAIL:
Process 14537 waits for ShareLock on transaction 220975629; blocked by
process 14548.

You need to find out exactly what commands, and in what order, all these
processes are doing. Deadlocks can be avoided by rearranging your
application logic.

2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:ERROR:
current transaction is aborted, commands ignored until end of transaction
block
2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:STATEMENT:
INSERT INTO TRANDB.EXCEP_TAB (...)
2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:ERROR:
current transaction is aborted, commands ignored until end of transaction
block
2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:STATEMENT:

2024-09-18 17:05:56 UTC:100.72.22.33(36096):USER1@TRANDB:[14551]:ERROR:
current transaction is aborted, commands ignored until end of transaction
block

Fix your application. It should be checking that each command completed and
not just blindly pushing on to the next statement while ignoring the error.

This is really difficult to diagnose from afar with only snippets of logs
and half-complete descriptions of your business logic. Pull everyone
involved into a room with a whiteboard, and produce a document describing
exactly what your application does, and how it is doing it. Switch from
reactive to proactive.

Cheers,
Greg

#15 veem v
veema0000@gmail.com
In reply to: Greg Sabino Mullane (#14)
Re: IO related waits

On Thu, 19 Sept 2024 at 17:54, Greg Sabino Mullane <htamfids@gmail.com>
wrote:

On Thu, Sep 19, 2024 at 5:17 AM veem v <veema0000@gmail.com> wrote:

2024-09-18 17:05:56 UTC:100.72.10.66(54582):USER1@TRANDB:[14537]:DETAIL:
Process 14537 waits for ShareLock on transaction 220975629; blocked by
process 14548.

You need to find out exactly what commands, and in what order, all these
processes are doing. Deadlocks can be avoided by rearranging your
application logic.

2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:ERROR:
current transaction is aborted, commands ignored until end of transaction
block
2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:STATEMENT:
INSERT INTO TRANDB.EXCEP_TAB (...)
2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:ERROR:
current transaction is aborted, commands ignored until end of transaction
block
2024-09-18 17:05:56 UTC:100.72.22.33(54582):USER1@TRANDB:[14537]:STATEMENT:

2024-09-18 17:05:56 UTC:100.72.22.33(36096):USER1@TRANDB:[14551]:ERROR:
current transaction is aborted, commands ignored until end of transaction
block

Fix your application. It should be checking that each command completed
and not just blindly pushing on to the next statement while ignoring the
error.

This is really difficult to diagnose from afar with only snippets of logs
and half-complete descriptions of your business logic. Pull everyone
involved into a room with a whiteboard, and produce a document describing
exactly what your application does, and how it is doing it. Switch from
reactive to proactive.

Thank you Greg.

I was thinking there might be some oddity or difference in behaviour here
in Postgres as compared to others, because I have seen deadlocks due to
UPDATEs but never deadlocks with INSERT queries in other databases (though
here we have "insert on conflict do nothing"). But I am now thinking: here
foreign keys and primary keys exist, and if the same PK gets inserted from
multiple sessions, one will wait if the other has not been committed, and
that might be creating a situation of locking first and subsequently a
deadlock.

But we are also doing batch inserts from multiple sessions, and each
session first inserts into the parent and then into the child table for
the related PK and FK, and those should not overlap across sessions. So I
will check if there is a loophole there.

Another thing we encountered here: if the session errors out with any
error (maybe a deadlock etc.), it does not execute any further statements
and errors out with "current transaction is aborted, commands ignored
until end of transaction block". It seems it needs an explicit "rollback";
there is no automatic rollback, which I was expecting.
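
A minimal sketch of that behaviour (table hypothetical, with a primary key
on id):

BEGIN;
INSERT INTO demo_tab VALUES (1);  -- ok
INSERT INTO demo_tab VALUES (1);  -- fails: duplicate key
INSERT INTO demo_tab VALUES (2);  -- ERROR: current transaction is aborted ...
ROLLBACK;                         -- required before the session accepts new work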

Regards
Veem

#16 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Greg Sabino Mullane (#14)
Re: IO related waits

On 9/19/24 05:24, Greg Sabino Mullane wrote:

On Thu, Sep 19, 2024 at 5:17 AM veem v <veema0000@gmail.com

This is really difficult to diagnose from afar with only snippets of
logs and half-complete descriptions of your business logic. Pull
everyone involved into a room with a whiteboard, and produce a document
describing exactly what your application does, and how it is doing it.
Switch from reactive to proactive.

+1


--
Adrian Klaver
adrian.klaver@aklaver.com

#17 veem v
veema0000@gmail.com
In reply to: Adrian Klaver (#16)
Re: IO related waits

On Thu, 19 Sept, 2024, 8:40 pm Adrian Klaver, <adrian.klaver@aklaver.com>
wrote:

On 9/19/24 05:24, Greg Sabino Mullane wrote:

On Thu, Sep 19, 2024 at 5:17 AM veem v <veema0000@gmail.com

This is really difficult to diagnose from afar with only snippets of
logs and half-complete descriptions of your business logic. Pull
everyone involved into a room with a whiteboard, and produce a document
describing exactly what your application does, and how it is doing it.
Switch from reactive to proactive.

I was able to reproduce this deadlock graph, as below. Now my question is:
this is a legitimate scenario in which the same ID can get inserted from
multiple sessions, and in such cases that row is expected to be skipped
(thus "on conflict do nothing" is used). But as we see, it breaks the code
with a deadlock error during race conditions where a lot of parallel
threads are operating. So how should we handle this scenario? Will setting
the "lock_timeout" parameter at the session level help us here in any way?

CREATE TABLE t1 (id numeric PRIMARY KEY);

Session 1:
BEGIN;
INSERT INTO t1 VALUES (1) ON CONFLICT (id) DO NOTHING;

Session 2:
BEGIN;
INSERT INTO t1 VALUES (2) ON CONFLICT (id) DO NOTHING;

Session 1:
INSERT INTO t1 VALUES (2) ON CONFLICT (id) DO NOTHING;

This hangs, as it waits for session 2 to commit or roll back its
transaction.

Session 2:
INSERT INTO t1 VALUES (1) ON CONFLICT (id) DO NOTHING;

Deadlock detected, and this session's transaction is terminated.

Regards
Veem

#18 Tom Lane
tgl@sss.pgh.pa.us
In reply to: veem v (#17)
Re: IO related waits

veem v <veema0000@gmail.com> writes:

I was able to reproduce this deadlock graph, as below. Now my question is:
this is a legitimate scenario in which the same ID can get inserted from
multiple sessions, and in such cases that row is expected to be skipped
(thus "on conflict do nothing" is used). But as we see, it breaks the code
with a deadlock error during race conditions where a lot of parallel
threads are operating. So how should we handle this scenario?

Do you have to batch multiple insertions into a single transaction?
If so, can you arrange to order them consistently across transactions
(eg, sort by primary key before inserting)?
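
A minimal sketch of that idea (staging table and column names
hypothetical); the insertion order follows the SELECT order here, so
concurrent batches take row locks in a consistent order:

INSERT INTO part_tab (id, part_date)
SELECT id, part_date
FROM staging_batch
ORDER BY id, part_date
ON CONFLICT (id, part_date) DO NOTHING;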

regards, tom lane

#19 Ron
ronljohnsonjr@gmail.com
In reply to: Tom Lane (#18)
Re: IO related waits

On Fri, Sep 20, 2024 at 4:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

veem v <veema0000@gmail.com> writes:

I was able to reproduce this deadlock graph, as below. Now my question is:
this is a legitimate scenario in which the same ID can get inserted from
multiple sessions, and in such cases that row is expected to be skipped
(thus "on conflict do nothing" is used). But as we see, it breaks the code
with a deadlock error during race conditions where a lot of parallel
threads are operating. So how should we handle this scenario?

Do you have to batch multiple insertions into a single transaction?
If so, can you arrange to order them consistently across transactions
(eg, sort by primary key before inserting)?

That's exactly what I did back in the day. Because of database buffering,
sorting the data file at the OS level made the job 3x as fast as when the
input data was random.

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> crustacean!

#20 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: veem v (#17)
Re: IO related waits

On 9/20/24 1:01 PM, veem v wrote:

On Thu, 19 Sept, 2024, 8:40 pm Adrian Klaver, <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 9/19/24 05:24, Greg Sabino Mullane wrote:

On Thu, Sep 19, 2024 at 5:17 AM veem v <veema0000@gmail.com
<mailto:veema0000@gmail.com>>

This is really difficult to diagnose from afar with only snippets of
logs and half-complete descriptions of your business logic. Pull
everyone involved into a room with a whiteboard, and produce a document
describing exactly what your application does, and how it is doing it.
Switch from reactive to proactive.

I was able to reproduce this deadlock graph, as below. Now my question is:
this is a legitimate scenario in which the same ID can get inserted from
multiple sessions, and in such cases that row is expected to be skipped
(thus "on conflict do nothing" is used). But as we see, it breaks the code

Yeah, as I see it that would not work with concurrent uncommitted
sessions as it would be unresolved whether a conflict actually exists
until at least one of the sessions completes.

with a deadlock error during race conditions where a lot of parallel
threads are operating. So how should we handle this scenario? Will
setting the "lock_timeout" parameter at the session level help us here in
any way?

Serializable transaction?:

https://www.postgresql.org/docs/current/transaction-iso.html#XACT-SERIALIZABLE

Or change the application code to not have this:

"... legitimate scenario in which the same ID can get inserted from
multiple sessions ..."
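
A minimal sketch of the serializable route, reusing the t1 reproduction
above; note the application must then catch serialization failures and
retry, much as it already has to handle deadlock errors:

BEGIN ISOLATION LEVEL SERIALIZABLE;
SET LOCAL lock_timeout = '2s';  -- optional: bound lock waits rather than prevent them
INSERT INTO t1 VALUES (1) ON CONFLICT (id) DO NOTHING;
COMMIT;
-- on SQLSTATE 40001 (serialization_failure) or a deadlock: ROLLBACK and retry
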
--
Adrian Klaver
adrian.klaver@aklaver.com

#21 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Adrian Klaver (#20)
#22 veem v
veema0000@gmail.com
In reply to: Peter J. Holzer (#21)
#23 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: veem v (#22)
#24 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: veem v (#22)
#25 Greg Sabino Mullane
greg@turnstep.com
In reply to: veem v (#17)
#26 rob stone
floriparob@tpg.com.au
In reply to: Adrian Klaver (#24)