Inserts and bad performance
Greetings
I am inserting a large number of rows, 5, 10, 15 million. The python code commits every 5000 inserts. The table has partitioned children.
At first, when there were a low number of rows inserted, the inserts would run at a good clip - 30-50K inserts per second. Now, after inserting, oh, say 1.5 billion rows, the insert rate has dropped to around 5000 inserts per second. I dropped the unique index, rebuilt the other indexes, and no change. The instance is 16 vCPU and 64GB RAM.
I'm perplexed, I can't seem to find any reason for the slowdown...
Thanks,
pg
Phil Godfrin | Database Administration
NOV
NOV US | Engineering Data
9720 Beechnut St | Houston, Texas 77036
M 281.825.2311
E Philippe.Godfrin@nov.com
On Wed, Nov 24, 2021 at 07:15:31PM +0000, Godfrin, Philippe E wrote:
Greetings
I am inserting a large number of rows, 5, 10, 15 million. The python code commits every 5000 inserts. The table has partitioned children. At first, when there were a low number of rows inserted, the inserts would run at a good clip - 30-50K inserts per second. Now, after inserting, oh, say 1.5 billion rows, the insert rate has dropped to around 5000 inserts per second. I dropped the unique index, rebuilt the other indexes, and no change. The instance is 16 vCPU and 64GB RAM.
I'm perplexed, I can't seem to find any reason for the slowdown...
Thanks,
pg
Hi,
With not much information, it may be I/O related. CPU and RAM cannot fix
that once items need to be written to disk. Are there any errors in the
logs, or are the CPUs maxed out?
Regards,
Ken
My apologies for the dearth of details. No on both the CPU and the errors. But I do believe it is I/O related, I just can't find it.
I thought maybe it was index page splitting, so I altered the unique index with fillfactor=40 and reindexed. No change.
I then dropped the unique index. No change.
I thought maybe it was checkpoint timeouts, but there was no correlation.
Oddly enough, other jobs running concurrently, which are also inserting (most likely into different partitions), are running about 2x faster than others.
I'm rather perplexed.
pg
-----Original Message-----
From: Kenneth Marshall <ktm@rice.edu>
Sent: Wednesday, November 24, 2021 1:20 PM
To: Godfrin, Philippe E <Philippe.Godfrin@nov.com>
Cc: pgsql-general@lists.postgresql.org
Subject: [EXTERNAL] Re: Inserts and bad performance
On Wed, Nov 24, 2021 at 07:15:31PM +0000, Godfrin, Philippe E wrote:
Greetings
I am inserting a large number of rows, 5, 10, 15 million. The python code commits every 5000 inserts. The table has partitioned children. At first, when there were a low number of rows inserted, the inserts would run at a good clip - 30-50K inserts per second. Now, after inserting, oh, say 1.5 billion rows, the insert rate has dropped to around 5000 inserts per second. I dropped the unique index, rebuilt the other indexes, and no change. The instance is 16 vCPU and 64GB RAM.
I'm perplexed, I can't seem to find any reason for the slowdown...
Thanks,
pg
Hi,
With not much information, it may be I/O related. CPU and RAM cannot fix that once items need to be written to disk. Are there any errors in the logs, or are the CPUs maxed out?
Regards,
Ken
"Godfrin, Philippe E" <Philippe.Godfrin@nov.com> writes:
I am inserting a large number of rows, 5, 10, 15 million. The python code commits every 5000 inserts. The table has partitioned children.
At first, when there were a low number of rows inserted, the inserts would run at a good clip - 30-50K inserts per second. Now, after inserting, oh, say 1.5 billion rows, the insert rate has dropped to around 5000 inserts per second. I dropped the unique index, rebuilt the other indexes, and no change. The instance is 16 vCPU and 64GB RAM.
Can you drop the indexes and not rebuild them till after the bulk load is
done? Once the indexes exceed available RAM, insert performance is going
to fall off a cliff, except maybe for indexes that are receiving purely
sequential inserts (so that only the right end of the index gets touched).
Also see
https://www.postgresql.org/docs/current/populate.html
regards, tom lane
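Tom's drop-then-rebuild approach can be sketched in Python. The table and index names below are hypothetical, and the helper only sequences the DDL statements so the flow can be reviewed before running anything against a real database:

```python
# Sketch of the drop-then-rebuild pattern Tom describes, assuming a
# psycopg2-style connection. Table and index names are hypothetical.

def bulk_load_ddl(index_defs):
    """Given index name -> CREATE INDEX statement, return the DDL to
    run before the bulk load (drops) and after it (rebuilds)."""
    drops = ["DROP INDEX IF EXISTS %s;" % name for name in index_defs]
    rebuilds = list(index_defs.values())
    return drops, rebuilds

drops, rebuilds = bulk_load_ddl({
    "measurements_ts_idx":
        "CREATE INDEX measurements_ts_idx ON measurements (ts);",
})
# Run `drops`, load the data, then run `rebuilds`, so each index is
# built once at the end instead of being maintained row by row.
```

The payoff is that a bulk index build is a sort, which stays fast even when the finished index would not fit in RAM.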
On Wed, Nov 24, 2021 at 2:15 PM Godfrin, Philippe E <
Philippe.Godfrin@nov.com> wrote:
Greetings
I am inserting a large number of rows, 5,10, 15 million. The python code
commits every 5000 inserts. The table has partitioned children.
On the Python client side, if you're using psycopg, you should consider
using COPY instead of INSERT if you're not:
https://www.psycopg.org/psycopg3/docs/basic/copy.html#copy
And if using psycopg2, execute_batch might be of value:
https://www.psycopg.org/docs/extras.html?highlight=insert#psycopg2.extras.execute_batch
Regards,
Gavin
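As a rough sketch of what Gavin suggests for psycopg2 (table and column names here are hypothetical; only the buffer-building helper runs without a live connection):

```python
import io

def rows_to_copy_buffer(rows):
    """Serialize rows into a tab-delimited buffer for COPY ... FROM STDIN."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(str(col) for col in row) + "\n")
    buf.seek(0)
    return buf

def load_batch(conn, rows):
    """Load one batch with copy_expert instead of row-by-row INSERTs."""
    with conn.cursor() as cur:
        cur.copy_expert(
            "COPY measurements (id, ts, value) FROM STDIN",
            rows_to_copy_buffer(rows),
        )
    conn.commit()
    # psycopg2.extras.execute_batch(cur, "INSERT ...", rows, page_size=5000)
    # is the lighter-weight alternative when COPY isn't an option.
```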
Hi Tom. Good point about the index paging out of the buffer. I did that and no change. I do have the shared buffers at 40GB, so there's a good bit there, but I also did all those things on the page you referred to, except for using COPY. At this point the data has not been scrubbed, so I'm trapping data errors and duplicates. I am curious though, as a sidebar, why COPY is considered faster than inserts. I was unable to get COPY faster than around 25K inserts a second (pretty fast anyway). Frankly, initially I was running 3 concurrent insert jobs and getting 90K ins/sec! But after a certain number of records, the speed just dropped off.
pg
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Wednesday, November 24, 2021 1:32 PM
To: Godfrin, Philippe E <Philippe.Godfrin@nov.com>
Cc: pgsql-general@lists.postgresql.org
Subject: [EXTERNAL] Re: Inserts and bad performance
"Godfrin, Philippe E" <Philippe.Godfrin@nov.com> writes:
I am inserting a large number of rows, 5, 10, 15 million. The python code commits every 5000 inserts. The table has partitioned children.
At first, when there were a low number of rows inserted, the inserts would run at a good clip - 30-50K inserts per second. Now, after inserting, oh, say 1.5 billion rows, the insert rate has dropped to around 5000 inserts per second. I dropped the unique index, rebuilt the other indexes, and no change. The instance is 16 vCPU and 64GB RAM.
Can you drop the indexes and not rebuild them till after the bulk load is
done? Once the indexes exceed available RAM, insert performance is going
to fall off a cliff, except maybe for indexes that are receiving purely
sequential inserts (so that only the right end of the index gets touched).
Also see
regards, tom lane
The notion of writing COPY blocks asynchronously is very interesting.
From: Gavin Roy <gavinr@aweber.com>
Sent: Wednesday, November 24, 2021 1:50 PM
To: Godfrin, Philippe E <Philippe.Godfrin@nov.com>
Cc: pgsql-general@lists.postgresql.org
Subject: [EXTERNAL] Re: Inserts and bad performance
On Wed, Nov 24, 2021 at 2:15 PM Godfrin, Philippe E <Philippe.Godfrin@nov.com> wrote:
Greetings
I am inserting a large number of rows, 5,10, 15 million. The python code commits every 5000 inserts. The table has partitioned children.
On the Python client side, if you're using psycopg, you should consider using COPY instead of INSERT if you're not:
And if using psycopg2, execute_batch might be of value:
Regards,
Gavin
How many partitions? How many rows do they have when performance is slowing
considerably? Does this table get many updates or is it insert only? What
version of PostgreSQL? Are the inserts randomly distributed among the
partitions or targeting one or a few partitions? Are you able to capture an
example and run it in a transaction with explain (analyze, buffers,
verbose) and then rollback?
*Michael Lewis | Database Engineer*
*Entrata*
On Thu, 25 Nov 2021 at 08:59, Godfrin, Philippe E
<Philippe.Godfrin@nov.com> wrote:
Hi Tom. Good point about the index paging out of the buffer. I did that and no change. I do have the shared buffers at 40GB, so there's a good bit there, but I also did all those things on the page you referred to, except for using COPY. At this point the data has not been scrubbed, so I'm trapping data errors and duplicates. I am curious though, as a sidebar, why COPY is considered faster than inserts. I was unable to get COPY faster than around 25K inserts a second (pretty fast anyway). Frankly, initially I was running 3 concurrent insert jobs and getting 90K ins/sec! But after a certain number of records, the speed just dropped off.
EXPLAIN (ANALYZE, BUFFERS) works with INSERTs. You just need to be
aware that using ANALYZE will perform the actual insert too. So you
might want to use BEGIN; and ROLLBACK; if it's not data that you want
to keep.
SET track_io_timing = on; might help you too.
David
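David's suggestion could look like the following sketch (a psycopg2-style connection is assumed; the insert really executes under ANALYZE, which is why the transaction is rolled back):

```python
def explain_wrap(sql):
    """Prefix a statement with the EXPLAIN options David suggests."""
    return "EXPLAIN (ANALYZE, BUFFERS, VERBOSE) " + sql

def profile_insert(conn, insert_sql, params):
    """Run one batch's INSERT under EXPLAIN ANALYZE, then discard it."""
    with conn.cursor() as cur:
        cur.execute("SET track_io_timing = on;")
        cur.execute(explain_wrap(insert_sql), params)
        plan = [line for (line,) in cur.fetchall()]
    conn.rollback()  # ANALYZE performed the actual insert; roll it back
    return plan
```

With track_io_timing on, the BUFFERS output includes read/write times, which should show directly whether the slowdown is I/O.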
On 11/24/21 1:15 PM, Godfrin, Philippe E wrote:
[snip]
I dropped the unique index, rebuilt the other indexes and no change.
IMNSHO, this is the worst possible approach. Drop everything *except* the
unique index, and then (if possible) sort the input file by the unique
key. That'll increase buffered IO; otherwise, you're bopping all around
the filesystem.
Using a bulk loader, if possible, would increase speeds.
--
Angular momentum makes the world go 'round.
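Ron's sorting point can be sketched in a couple of lines; the position of the unique key within each row is an assumption here:

```python
def sort_by_unique_key(rows, key_index=0):
    """Order input rows by the unique key so inserts walk the index
    sequentially (mostly right-edge page touches) instead of hopping
    randomly across index pages and data files."""
    return sorted(rows, key=lambda row: row[key_index])

batch = [(42, "c"), (7, "a"), (19, "b")]
ordered = sort_by_unique_key(batch)
```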
Excellent idea David, silly me, I didn't think of that. For the other questions:
How many partitions?
14
How many rows do they have when performance is slowing considerably?
Not sure, maybe in the low millions
Does this table get many updates or is it insert only?
insert
What version of PostgreSQL?
13
Are the inserts randomly distributed among the partitions or targeting one or a few partitions?
Sequentially, one partition at a time, so each set of runs inserts across each partition.
Are you able to capture an example and run it in a transaction with explain (analyze, buffers, verbose) and then rollback?
Yes, I'm looking into that
pg
-----Original Message-----
From: David Rowley <dgrowleyml@gmail.com>
Sent: Wednesday, November 24, 2021 7:13 PM
To: Godfrin, Philippe E <Philippe.Godfrin@nov.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; pgsql-general@lists.postgresql.org
Subject: Re: [EXTERNAL] Re: Inserts and bad performance
On Thu, 25 Nov 2021 at 08:59, Godfrin, Philippe E <Philippe.Godfrin@nov.com> wrote:
Hi Tom. Good point about the index paging out of the buffer. I did that and no change. I do have the shared buffers at 40GB, so there's a good bit there, but I also did all those things on the page you referred to, except for using COPY. At this point the data has not been scrubbed, so I'm trapping data errors and duplicates. I am curious though, as a sidebar, why COPY is considered faster than inserts. I was unable to get COPY faster than around 25K inserts a second (pretty fast anyway). Frankly, initially I was running 3 concurrent insert jobs and getting 90K ins/sec! But after a certain number of records, the speed just dropped off.
EXPLAIN (ANALYZE, BUFFERS) works with INSERTs. You just need to be aware that using ANALYZE will perform the actual insert too. So you might want to use BEGIN; and ROLLBACK; if it's not data that you want to keep.
SET track_io_timing = on; might help you too.
David
Hi Gavin – thanks, I hadn't realized that about psycopg. I'm on the earlier version, so I can't use what you recommended at this point. But I did use copy_expert.
Interestingly enough, the performance of the COPY statement is only slightly better than the inserts, as I was running inserts with 5000-row VALUES clauses. In the end, the current config couldn't keep up with the WAL creation, so I turned all that off. But still no perf gains. I also turned off fsync and set the kernel dirty-page settings to 10% and 98%…
I wonder if there's a better load product than COPY? But I'd still like to know what separates COPY from bulk inserts…
pf
From: Gavin Roy <gavinr@aweber.com>
Sent: Wednesday, November 24, 2021 1:50 PM
To: Godfrin, Philippe E <Philippe.Godfrin@nov.com>
Cc: pgsql-general@lists.postgresql.org
Subject: [EXTERNAL] Re: Inserts and bad performance
On Wed, Nov 24, 2021 at 2:15 PM Godfrin, Philippe E <Philippe.Godfrin@nov.com> wrote:
Greetings
I am inserting a large number of rows, 5,10, 15 million. The python code commits every 5000 inserts. The table has partitioned children.
On the Python client side, if you're using psycopg, you should consider using COPY instead of INSERT if you're not:
And if using psycopg2, execute_batch might be of value:
Regards,
Gavin
Right you are sir! I figured that out a few hours ago!
pg
From: Ron <ronljohnsonjr@gmail.com>
Sent: Wednesday, November 24, 2021 10:58 PM
To: pgsql-general@lists.postgresql.org
Subject: [EXTERNAL] Re: Inserts and bad performance
[snip]
I dropped the unique index, rebuilt the other indexes and no change.
IMNSHO, this is the worst possible approach. Drop everything except the unique index, and then (if possible) sort the input file by the unique key. That'll increase buffered IO; otherwise, you're bopping all around the filesystem.
Using a bulk loader if possible would increase speeds
--
Angular momentum makes the world go 'round.
On 28/11/21 17:17, Godfrin, Philippe E wrote:
Right you are sir! I figured that out a few hours ago!
pg
*From:* Ron <ronljohnsonjr@gmail.com>
*Sent:* Wednesday, November 24, 2021 10:58 PM
*To:* pgsql-general@lists.postgresql.org
*Subject:* [EXTERNAL] Re: Inserts and bad performance
On 11/24/21 1:15 PM, Godfrin, Philippe E wrote:
[snip]
I dropped the unique index, rebuilt the other indexes and no change.
IMNSHO, this is the worst possible approach. Drop everything *except*
the unique index, and then (if possible) sort the input file by the
unique key. That'll increase buffered IO; otherwise, you're bopping
all around the filesystem. Using a bulk loader if possible would increase speeds.
--
Angular momentum makes the world go 'round.
Please don't top post!
Cheers,
Gavin
Ok, thanks
--
Sent from Mail.ru app for Android
Wednesday, 24 November 2021, 11:28pm +03:00 from Godfrin, Philippe E <philippe.godfrin@nov.com>:
The notion of writing COPY blocks asynchronously is very interesting.
From: Gavin Roy <gavinr@aweber.com>
Sent: Wednesday, November 24, 2021 1:50 PM
To: Godfrin, Philippe E <Philippe.Godfrin@nov.com>
Cc: pgsql-general@lists.postgresql.org
Subject: [EXTERNAL] Re: Inserts and bad performance
On Wed, Nov 24, 2021 at 2:15 PM Godfrin, Philippe E <Philippe.Godfrin@nov.com> wrote:
Greetings
I am inserting a large number of rows, 5,10, 15 million. The python code commits every 5000 inserts. The table has partitioned children.
On the Python client side, if you're using psycopg, you should consider using COPY instead of INSERT if you're not:
https://www.psycopg.org/psycopg3/docs/basic/copy.html#copy
And if using psycopg2, execute_batch might be of value:
https://www.psycopg.org/docs/extras.html?highlight=insert#psycopg2.extras.execute_batch
Regards,
Gavin