Best way to import data in PostgreSQL (not "COPY")

Started by Denis BUCHER, over 16 years ago, 4 messages, general
#1 Denis BUCHER
dbucherml@hsolutions.ch

Hello,

I have a system that must import a lot of data from another system every day.
Our system runs on PostgreSQL and we connect to the other one via ODBC.

Currently we do something like:

SELECT ... FROM ODBC source
foreach row {
    INSERT INTO postgresql
}

The problem is that this method is very slow...

Does anyone have a better suggestion?

Thanks a lot in advance!

Denis

#2 Andy Colson
andy@squeakycode.net
In reply to: Denis BUCHER (#1)
Re: Best way to import data in PostgreSQL (not "COPY")

If you can prepare your statement, it should run a lot faster. I have no
idea whether ODBC supports that, though.

So:

select ... from odbc...;
begin;
$q = prepare('insert into pg...')
foreach row {
    $q.params[0] = ..
    $q.params[1] = ..
    $q.execute;
}
commit;

(* If possible, make sure you are not committing each insert statement
separately; run them all, then commit once at the end. *)
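As a minimal sketch of the pattern above (one parameterized statement reused for every row, one commit at the end), here is a self-contained Python version. It uses the standard-library sqlite3 module purely as a stand-in so that it runs without a server; the table name `items`, its columns, and the sample rows are hypothetical, and a PostgreSQL driver would use its own placeholder syntax instead of `?`.

```python
import sqlite3


def bulk_insert(conn, rows):
    """Insert all rows with one parameterized statement and a single commit."""
    cur = conn.cursor()
    # The statement text never changes; only the bound values differ per row,
    # so the values need no manual quote escaping.
    cur.executemany("INSERT INTO items (id, label) VALUES (?, ?)", rows)
    conn.commit()  # one commit for the whole load, not one per row
    return cur.rowcount


# Stand-in for the real destination connection (assumption: in-memory SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, label TEXT)")

# Stand-in for the rows fetched from the ODBC source (hypothetical data).
source_rows = [(1, "a"), (2, "b"), (3, "it's quoted")]
inserted = bulk_insert(conn, source_rows)
```

The third sample row contains a single quote on purpose: with bound parameters it is stored as-is, which is the same effect Denis later reports when he drops the escaping step.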

If you can't prepare, you should try to build multi-value insert statements:

insert into pgtable (col1, col2, col3)
values ('a', 'b', 'c'), ('d', 'e', 'f'), ('g', 'h', 'i'), ...;
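A rough sketch of how such multi-value statements could be generated in batches is shown here. The helper names `build_multirow_insert` and `batches` are hypothetical, the `%s` placeholder style is an assumption (it matches some PostgreSQL drivers but not all), and the table and column names are assumed to be trusted identifiers rather than user input.

```python
def build_multirow_insert(table, columns, rows):
    """Return (sql, params) for a single INSERT that carries several rows."""
    if not rows:
        raise ValueError("rows must not be empty")
    # One "(%s, %s, ...)" group per row; the values travel separately as params.
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table,
        ", ".join(columns),
        ", ".join([row_placeholder] * len(rows)),
    )
    params = [value for row in rows for value in row]
    return sql, params


def batches(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


rows = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]
sql, params = build_multirow_insert("pgtable", ["col1", "col2", "col3"], rows)
```

Each batch then costs one round trip instead of one per row; the batch size is a tuning choice, not something the thread specifies.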

Or you could look into dblink; I don't know whether it would be faster.

-Andy

#3 Sam Mason
sam@samason.me.uk
In reply to: Denis BUCHER (#1)
Re: Best way to import data in PostgreSQL (not "COPY")

On Wed, Jul 22, 2009 at 08:24:22PM +0200, Denis BUCHER wrote:

> SELECT ... FROM ODBC source
> foreach row {
>     INSERT INTO postgresql
> }
>
> The problem is that this method is very slow...
>
> Does anyone have a better suggestion?

Using COPY[1] is normally the preferred way to get data into PG
fast. Some languages make this easier than others; if you can generate
SQL that looks like:

COPY table (col1,col2) FROM STDIN WITH CSV;
13,hello
42,"text with,comma"
\.

then you should be in luck: just send this to the ODBC driver
as is and all should be good. If you need to copy more than will fit
in a string, arrange to put a few thousand rows in each batch, then
generate and send the batches one at a time inside a single transaction.
Using tab-delimited mode (drop the WITH CSV) is possible, but most
languages provide library code for generating CSV files, so CSV will
probably be easier to get correct.
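To illustrate the point about library CSV support, here is a small Python sketch that builds the same text as the example above with the standard csv module. The function name `copy_payload` is hypothetical, and whether a given driver accepts the statement and its data as one string is an assumption; many client libraries expose a dedicated COPY call instead, in which case only the CSV lines would be passed to it.

```python
import csv
import io


def copy_payload(table, columns, rows):
    """Build a COPY ... FROM STDIN WITH CSV statement followed by its data."""
    buf = io.StringIO()
    buf.write("COPY {} ({}) FROM STDIN WITH CSV;\n".format(table, ",".join(columns)))
    # csv.writer quotes only the fields that need it (e.g. embedded commas).
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    buf.write("\\.\n")  # end-of-data marker, as in the example above
    return buf.getvalue()


payload = copy_payload(
    "table",
    ["col1", "col2"],
    [(13, "hello"), (42, "text with,comma")],
)
```

For larger loads the function could be called once per batch of a few thousand rows, as suggested above.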

--
Sam http://samason.me.uk/

[1]: http://www.postgresql.org/docs/current/static/sql-copy.html

#4 Denis BUCHER
dbucherml@hsolutions.ch
In reply to: Denis BUCHER (#1)
Re: Best way to import data in PostgreSQL (not "COPY")

Hello everyone,

Thanks a lot to everyone for the help!

Here are the first results of my attempts; they are very interesting!

a) ON THE DESTINATION (PHP/PostgreSQL)

1. Preparing the INSERT statements (to Postgres) was already an improvement.
2. Wrapping the inserts in BEGIN WORK ... COMMIT improved things further.
3. At first I didn't realise that, thanks to prepare, I could drop the
quote escaping; removing it helped a little more.
4. Then I found something very interesting: pg_send_execute!
(asynchronous)

Inserted lines : 134297
Required time : 292 seconds ([0] without prepare)
Required time : 253 seconds ([1] with prepare) (13% better)
Required time : 224 seconds ([2] with prepare and BEGIN COMMIT) (12% better)
Required time : 221 seconds ([3] removed escaping)
Required time : 214 seconds ([4] pg_send_execute) (4% better)

b) ON THE SOURCE (PHP/ODBC)

5. Believe it or not, changing from PHP ODBC to PHP PDO ODBC...

From: http://us2.php.net/manual/en/ref.uodbc.php

To: http://fr.php.net/manual/en/class.pdostatement.php

...helped a LOT:

Inserted lines : 134297
Required time : 25 seconds ([1] [2] [3] [4] [5] + PDO)
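For readers who want to compare the steps, the arithmetic behind these figures can be reproduced with a short Python sketch. The timings and row count are the ones reported above; the function name `summarize` and the output layout are hypothetical, and the computed step gains may differ slightly from the rounded percentages quoted in the post.

```python
ROWS = 134297

# (label, seconds) pairs taken from the measurements reported above.
timings = [
    ("[0] without prepare", 292),
    ("[1] with prepare", 253),
    ("[2] with prepare and BEGIN ... COMMIT", 224),
    ("[3] removed escaping", 221),
    ("[4] pg_send_execute", 214),
    ("[5] PDO ODBC on the source", 25),
]


def summarize(rows, timings):
    """Compute throughput, gain over the previous step, and overall speedup."""
    baseline = timings[0][1]
    report = []
    previous = None
    for label, seconds in timings:
        step_gain = None
        if previous is not None:
            step_gain = 100.0 * (previous - seconds) / previous
        report.append({
            "label": label,
            "rows_per_second": rows / seconds,
            "step_gain_percent": step_gain,
            "speedup_vs_baseline": baseline / seconds,
        })
        previous = seconds
    return report


report = summarize(ROWS, timings)
for entry in report:
    gain = entry["step_gain_percent"]
    gain_text = "n/a" if gain is None else "{:.1f}%".format(gain)
    print("{:40s} {:8.0f} rows/s  step gain {:>6s}  x{:.2f} overall".format(
        entry["label"], entry["rows_per_second"], gain_text,
        entry["speedup_vs_baseline"]))
```

Seen this way, the destination-side changes add up to a modest gain, while the change on the source side accounts for most of the overall speedup.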

I hope this will help other people!

Thanks a lot again to everyone who helped me :-)

Denis