COPY ... FROM and index usage

Started by Reg Me Pleaseover 18 years ago9 messagesgeneral
Jump to latest
#1Reg Me Please
regmeplease@gmail.com

Hi all.

I'd like to know whether the indexes on a table are updated or not during
a "COPY ... FROM" request.

That is, should I drop all indexes during a "COPY ... FROM" in order to gain
the maximum speed to load data?

Thanks.

--
Reg me Please

#2Joshua Tolley
eggyknap@gmail.com
In reply to: Reg Me Please (#1)
Re: COPY ... FROM and index usage

On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote:

Hi all.

I'd like to know whether the indexes on a table are updated or not during
a "COPY ... FROM" request.

That is, should I drop all indexes during a "COPY ... FROM" in order to gain
the maximum speed to load data?

Thanks.

Although questions of "which is faster" often depend very heavily on
the data involved, the database schema, the hardware, etc., typically
people find it best to drop all indexes during a large import and
recreate them afterward.

- Josh/eggyknap

#3Reg Me Please
regmeplease@gmail.com
In reply to: Joshua Tolley (#2)
Re: COPY ... FROM and index usage

Il Sunday 04 November 2007 14:59:10 Josh Tolley ha scritto:

On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote:

Hi all.

I'd like to know whether the indexes on a table are updated or not during
a "COPY ... FROM" request.

That is, should I drop all indexes during a "COPY ... FROM" in order to
gain the maximum speed to load data?

Thanks.

Although questions of "which is faster" often depend very heavily on
the data involved, the database schema, the hardware, etc., typically
people find it best to drop all indexes during a large import and
recreate them afterward.

- Josh/eggyknap

This sounds very reasonable to me.
But the first question remains unanswered:

Are the indexes updated during the COPY ... FROM ?

Thanks again.

--
Reg me Please

#4Erik Jones
erik@myemma.com
In reply to: Reg Me Please (#3)
Re: COPY ... FROM and index usage

On Nov 4, 2007, at 9:15 AM, Reg Me Please wrote:

Il Sunday 04 November 2007 14:59:10 Josh Tolley ha scritto:

On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote:

Hi all.

I'd like to know whether the indexes on a table are updated or
not during
a "COPY ... FROM" request.

That is, should I drop all indexes during a "COPY ... FROM" in
order to
gain the maximum speed to load data?

Thanks.

Although questions of "which is faster" often depend very heavily on
the data involved, the database schema, the hardware, etc., typically
people find it best to drop all indexes during a large import and
recreate them afterward.

- Josh/eggyknap

This sounds very reasonable to me.
But the first question remains unanswered:

Are the indexes updated during the COPY ... FROM ?

Of course. Why would think that data could be inserted into a table
by any means without it updating the table's indexes? That would
make the index worthless.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com

#5Reg Me Please
regmeplease@gmail.com
In reply to: Erik Jones (#4)
Re: COPY ... FROM and index usage

Il Sunday 04 November 2007 16:21:41 Erik Jones ha scritto:

On Nov 4, 2007, at 9:15 AM, Reg Me Please wrote:

Il Sunday 04 November 2007 14:59:10 Josh Tolley ha scritto:

On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote:

Hi all.

I'd like to know whether the indexes on a table are updated or
not during
a "COPY ... FROM" request.

That is, should I drop all indexes during a "COPY ... FROM" in
order to
gain the maximum speed to load data?

Thanks.

Although questions of "which is faster" often depend very heavily on
the data involved, the database schema, the hardware, etc., typically
people find it best to drop all indexes during a large import and
recreate them afterward.

- Josh/eggyknap

This sounds very reasonable to me.
But the first question remains unanswered:

Are the indexes updated during the COPY ... FROM ?

Of course. Why would think that data could be inserted into a table
by any means without it updating the table's indexes? That would
make the index worthless.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com

I do understand your remarks and would agree.
But I was thinking about the COPY...FROM request not
as a normal INSERT INTO.
If this were the case I could have been running a REINDEX TABLE.

--
Reg me Please

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua Tolley (#2)
Re: COPY ... FROM and index usage

"Josh Tolley" <eggyknap@gmail.com> writes:

Although questions of "which is faster" often depend very heavily on
the data involved, the database schema, the hardware, etc., typically
people find it best to drop all indexes during a large import and
recreate them afterward.

See also the extensive discussion of this topic at

http://www.postgresql.org/docs/8.2/static/populate.html

regards, tom lane

#7Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Reg Me Please (#1)
Re: COPY ... FROM and index usage

Hi,

Le Sunday 04 November 2007 11:22:19 Reg Me Please, vous avez écrit :

That is, should I drop all indexes during a "COPY ... FROM" in order to
gain the maximum speed to load data?

When looking for a way to speed up data loading, you may want to consider
pgbulkload, a project which optimizes index creation while loading data, and
bypass constraints: it's useful when you know you trust input.
As I've never used it myself, I can only provides following links:

http://pgfoundry.org/projects/pgbulkload
http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf

Regards,
--
dim

#8Toru SHIMOGAKI
shimogaki.toru@oss.ntt.co.jp
In reply to: Dimitri Fontaine (#7)
Re: COPY ... FROM and index usage

Dimitri, thank you for your quoting. I'm a pg_bulkload author.

pg_bulkload is optimized especially for appending data to table with indexes.
If you use it, you don't need to drop index before loading data. But you have to
consider conditions carefully as Dimitri said below. See also pg_bulkload README:

http://pgfoundry.org/docman/view.php/1000261/473/README.pg_bulkload-2.2.0.txt

Best regards,

Dimitri Fontaine wrote:

Hi,

Le Sunday 04 November 2007 11:22:19 Reg Me Please, vous avez écrit :

That is, should I drop all indexes during a "COPY ... FROM" in order to
gain the maximum speed to load data?

When looking for a way to speed up data loading, you may want to consider
pgbulkload, a project which optimizes index creation while loading data, and
bypass constraints: it's useful when you know you trust input.
As I've never used it myself, I can only provides following links:

http://pgfoundry.org/projects/pgbulkload
http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf

Regards,

--
Toru SHIMOGAKI<shimogaki.toru@oss.ntt.co.jp>
NTT Open Source Software Center

#9Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: Toru SHIMOGAKI (#8)
Re: COPY ... FROM and index usage

Hi,

Anyone have comparisons/benchmarks to give some
idea of the potential performance gains?

Say compared to doing the stuff here:
http://www.postgresql.org/docs/8.2/static/populate.html

Regards,
Link.

At 09:35 AM 11/5/2007, Toru SHIMOGAKI wrote:

Show quoted text

Dimitri, thank you for your quoting. I'm a pg_bulkload author.

pg_bulkload is optimized especially for appending data to table with indexes.
If you use it, you don't need to drop index
before loading data. But you have to consider
conditions carefully as Dimitri said below. See also pg_bulkload README:

http://pgfoundry.org/docman/view.php/1000261/473/README.pg_bulkload-2.2.0.txt

Best regards,

Dimitri Fontaine wrote:

Hi,
Le Sunday 04 November 2007 11:22:19 Reg Me Please, vous avez écrit :

That is, should I drop all indexes during a "COPY ... FROM" in order to
gain the maximum speed to load data?

When looking for a way to speed up data
loading, you may want to consider pgbulkload, a
project which optimizes index creation while
loading data, and bypass constraints: it's
useful when you know you trust input.
As I've never used it myself, I can only provides following links:
http://pgfoundry.org/projects/pgbulkload
http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf
Regards,

--
Toru SHIMOGAKI<shimogaki.toru@oss.ntt.co.jp>
NTT Open Source Software Center

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend