batch insertion

Started by Koriskover 12 years ago5 messagesgeneral
Jump to latest
#1Korisk
korisk@yandex.ru

Hi!
I want quick insert into db a lot of data (in form of triplets). Data is formed dynamical so "COPY" is not suitable.
I tried batch insert like this:

insert into triplets values (1,1,1);
insert into triplets values (1,1,1), (3,2,5), (4,5,5);
...
insert into triplets values (1,1,1), (3,2,5), (4,5,5) .... ;

The more triplets I use the quicker operation is.
With preparation it looks like this:

res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float);",3, NULL);
...
res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float), ($4::bigint, $5::bigint, $6::float), ($7::bigint, $8::bigint, $9::float), ($10::bigint, $11::bigint, $12::float);",12, NULL);
...

The question:
Is there any way to prepare query with any number of triplets without casting such a long string?

Thank you.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Allan Kamau
kamauallan@gmail.com
In reply to: Korisk (#1)
Re: batch insertion

On Sun, Aug 25, 2013 at 3:15 AM, Korisk <korisk@yandex.ru> wrote:

Hi!
I want quick insert into db a lot of data (in form of triplets). Data is
formed dynamical so "COPY" is not suitable.
I tried batch insert like this:

insert into triplets values (1,1,1);
insert into triplets values (1,1,1), (3,2,5), (4,5,5);
...
insert into triplets values (1,1,1), (3,2,5), (4,5,5) .... ;

The more triplets I use the quicker operation is.
With preparation it looks like this:

res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint,
$2::bigint, $3::float);",3, NULL);
...
res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint,
$2::bigint, $3::float), ($4::bigint, $5::bigint, $6::float), ($7::bigint,
$8::bigint, $9::float), ($10::bigint, $11::bigint, $12::float);",12, NULL);
...

The question:
Is there any way to prepare query with any number of triplets without
casting such a long string?

Thank you.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

You may still use COPY as follows:
Let every thread that is generating data create a file on disk (at some
predetermined directory) to which the tread would write data in a format
(such as CSV, TSV) that COPY can use.
Use some sort of counter that would be updated for each write of a record
to this file.
Then provide a counter threshold which when the value of your counter
reaches (surpasses) your thread will first generate a COPY command and send
out the contents of this file via the COPY command to your DB. Now delete
the contents of the file and write the record into it.
You may also write some code to do on demand writing of the contents of
this file to the DB when some event such as an indication to terminate the
application happens.
All the above steps are to be performed within each thread of you
application in isolation.

You may also write some clean up code that would look for the existence of
these files when you application starts and writes the the contents to the
DB (followed by the deletion of the files), this is done to cater for
situations where your application may not have gracefully shutdown.

Allan.

#3Steve Atkins
steve@blighty.com
In reply to: Korisk (#1)
Re: batch insertion

On Aug 24, 2013, at 5:15 PM, Korisk <Korisk@yandex.ru> wrote:

Hi!
I want quick insert into db a lot of data (in form of triplets). Data is formed dynamical so "COPY" is not suitable.

COPY works just fine for dynamically generated data, and it's probably the right thing to use if you're bulk loading data (it's about as fast as you can get for a single threaded load).

Take a look at the PQputCopyData() and PQputCopyEnd() functions.

Cheers,
Steve

I tried batch insert like this:

insert into triplets values (1,1,1);
insert into triplets values (1,1,1), (3,2,5), (4,5,5);
...
insert into triplets values (1,1,1), (3,2,5), (4,5,5) .... ;

The more triplets I use the quicker operation is.
With preparation it looks like this:

res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float);",3, NULL);
...
res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float), ($4::bigint, $5::bigint, $6::float), ($7::bigint, $8::bigint, $9::float), ($10::bigint, $11::bigint, $12::float);",12, NULL);
...

The question:
Is there any way to prepare query with any number of triplets without casting such a long string?

Thank you.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Korisk (#1)
Re: batch insertion

On 08/24/2013 05:15 PM, Korisk wrote:

Hi!
I want quick insert into db a lot of data (in form of triplets). Data is formed dynamical so "COPY" is not suitable.
I tried batch insert like this:

insert into triplets values (1,1,1);
insert into triplets values (1,1,1), (3,2,5), (4,5,5);
...
insert into triplets values (1,1,1), (3,2,5), (4,5,5) .... ;

The more triplets I use the quicker operation is.
With preparation it looks like this:

res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float);",3, NULL);
...
res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float), ($4::bigint, $5::bigint, $6::float), ($7::bigint, $8::bigint, $9::float), ($10::bigint, $11::bigint, $12::float);",12, NULL);
...

The question:
Is there any way to prepare query with any number of triplets without casting such a long string?,

Others have already posted about using COPY, so I will go another route.
The increased speed you see is probably a result of more data being
included in each transaction. From your example it is not clear if you
are batching your INSERTs. If not that is another way go, prepare your
statement then loop through your data in batches where a batch is
between a BEGIN and a COMMIT. This is one of the reasons COPY is so
fast, the data is dumped inside a single transaction. Unfortunately I do
not use libpq so I cannot provide an example.

Thank you.

--
Adrian Klaver
adrian.klaver@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Merlin Moncure
mmoncure@gmail.com
In reply to: Korisk (#1)
Re: batch insertion

On Sat, Aug 24, 2013 at 7:15 PM, Korisk <korisk@yandex.ru> wrote:

Hi!
I want quick insert into db a lot of data (in form of triplets). Data is formed dynamical so "COPY" is not suitable.
I tried batch insert like this:

insert into triplets values (1,1,1);
insert into triplets values (1,1,1), (3,2,5), (4,5,5);
...
insert into triplets values (1,1,1), (3,2,5), (4,5,5) .... ;

The more triplets I use the quicker operation is.
With preparation it looks like this:

res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float);",3, NULL);
...
res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float), ($4::bigint, $5::bigint, $6::float), ($7::bigint, $8::bigint, $9::float), ($10::bigint, $11::bigint, $12::float);",12, NULL);
...

The question:
Is there any way to prepare query with any number of triplets without casting such a long string?

yes. you can use COPY with some tricks, or use arrays.

CREATE TYPE triplet_t AS (a bigint, b bigint, c bigint);

WITH data AS (SELECT UNNEST($1::triplet_t[]) INSERT INTO triplets
SELECT * FROM data;

Also see libpqtypes: http://libpqtypes.esilo.com/man3/pqt-composites.html

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general