Difference between Bulk Load (Multiple inserts or single inserts) and COPY
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/10/sql-copy.html
Description:
Hello,
I experimented with Bulk load and COPY.
Loading in COPY was very fast.
However, after COPYing data from a CSV file to PostgreSQL Table. The
query
execution took lot of time for 1 of the first 4 queries.
Only this slow query was taking so much time, that even if I had used
normal
bulk load, it would have been faster in total.
Then all other Query executions took equal time as it took while querying
a
table after the Bulk data load method.
So, I want to know the exact reason what's the issue with COPY.
How exactly they differ? The only thing from the document I could
identify
was row security.
But it did not mention anything about indexing. Like, in Bulk load, do
indices(or constraint checks) are created with data loading?
& in COPY it's done after? so when indices are being created that query
slows down??
*Added details*
"Table & Query details"
I have 1 Table is there having 3 attributes:
TableName{ Column1 Varchar300, Column2 Varchar300, Column3 Varchar300};
I haven't created any primary keys or FKs. No other constraints.
Data set size: 150MB / 1M records
Queries:
Select count(*) from Table;
Select count(distinct( Column1, Column2 , Column3 )) from Table;
Select Column1, Column2, Column3 from Table as T1, Table as T2, Table as T3
where T1. Column1=T2.Column3 and T1. Column1="xyz";
Please let me know, how Bulk load vs. COPY different in both situations
1) Do the internal representation differs after data is loaded using Bulk
vs. COPY?
2) what if I have added Keys and Constraints, are they checked later? Means
loading is shown completed but in background it's creating indices/checking
constraints.
3) Can it be the reason that some other process(which?) is running in
background during query execution ? as I query the data as soon as the load
after COPY is complete.