Difference between Bulk Load (Multiple inserts or single inserts) and COPY

Started by PG Bug reporting formover 6 years ago2 messagesdocs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/10/sql-copy.html
Description:

Hello,

I experimented with Bulk load and COPY.
Loading in COPY was very fast.
However, after COPYing data from a CSV file to PostgreSQL Table. The

query

execution took lot of time for 1 of the first 4 queries.
Only this slow query was taking so much time, that even if I had used

normal

bulk load, it would have been faster in total.
Then all other Query executions took equal time as it took while querying

a

table after the Bulk data load method.

So, I want to know the exact reason what's the issue with COPY.
How exactly they differ? The only thing from the document I could

identify

was row security.
But it did not mention anything about indexing. Like, in Bulk load, do
indices(or constraint checks) are created with data loading?
& in COPY it's done after? so when indices are being created that query
slows down??

*Added details*

"Table & Query details"
I have 1 Table is there having 3 attributes:
TableName{ Column1 Varchar300, Column2 Varchar300, Column3 Varchar300};
I haven't created any primary keys or FKs. No other constraints.

Data set size: 150MB / 1M records

Queries:
Select count(*) from Table;
Select count(distinct( Column1, Column2 , Column3 )) from Table;
Select Column1, Column2, Column3 from Table as T1, Table as T2, Table as T3
where T1. Column1=T2.Column3 and T1. Column1="xyz";

Please let me know, how Bulk load vs. COPY different in both situations
1) Do the internal representation differs after data is loaded using Bulk
vs. COPY?
2) what if I have added Keys and Constraints, are they checked later? Means
loading is shown completed but in background it's creating indices/checking
constraints.
3) Can it be the reason that some other process(which?) is running in
background during query execution ? as I query the data as soon as the load
after COPY is complete.

#2Bruce Momjian
bruce@momjian.us
In reply to: PG Bug reporting form (#1)
Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY

This is not a documentation question. For assistance, please join the
appropriate mailing list and post your question:

http://www.postgresql.org/community

You can also try the #postgresql IRC channel on irc.freenode.net. See
the PostgreSQL FAQ for more information.

---------------------------------------------------------------------------

On Thu, Dec 5, 2019 at 03:39:24PM +0000, PG Doc comments form wrote:

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/10/sql-copy.html
Description:

Hello,

I experimented with Bulk load and COPY.
Loading in COPY was very fast.
However, after COPYing data from a CSV file to PostgreSQL Table. The

query

execution took lot of time for 1 of the first 4 queries.
Only this slow query was taking so much time, that even if I had used

normal

bulk load, it would have been faster in total.
Then all other Query executions took equal time as it took while querying

a

table after the Bulk data load method.

So, I want to know the exact reason what's the issue with COPY.
How exactly they differ? The only thing from the document I could

identify

was row security.
But it did not mention anything about indexing. Like, in Bulk load, do
indices(or constraint checks) are created with data loading?
& in COPY it's done after? so when indices are being created that query
slows down??

*Added details*

"Table & Query details"
I have 1 Table is there having 3 attributes:
TableName{ Column1 Varchar300, Column2 Varchar300, Column3 Varchar300};
I haven't created any primary keys or FKs. No other constraints.

Data set size: 150MB / 1M records

Queries:
Select count(*) from Table;
Select count(distinct( Column1, Column2 , Column3 )) from Table;
Select Column1, Column2, Column3 from Table as T1, Table as T2, Table as T3
where T1. Column1=T2.Column3 and T1. Column1="xyz";

Please let me know, how Bulk load vs. COPY different in both situations
1) Do the internal representation differs after data is loaded using Bulk
vs. COPY?
2) what if I have added Keys and Constraints, are they checked later? Means
loading is shown completed but in background it's creating indices/checking
constraints.
3) Can it be the reason that some other process(which?) is running in
background during query execution ? as I query the data as soon as the load
after COPY is complete.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +