How to force PostgreSQL to use multiple cores within one connection?

Started by Sergey A. · over 17 years ago · 16 messages · general
#1 Sergey A.
n39052@gmail.com

Hello.

My application generates a large number of inserts (~2000 per second)
using one connection to PostgreSQL. All queries are buffered in memory,
and then the whole buffer is sent to the DB. But when I use two
connections to PostgreSQL instead of one on a dual-core CPU (i.e. I use
two PostgreSQL processes) to insert my buffers, I see that things
go 1.6 times faster.

Using several connections in my application is somewhat tricky, so I
want to move this problem to PostgreSQL's side. Is there any method
for PostgreSQL to process huge inserts coming from one connection on
different cores?

Thanks.

--
Sergey.

#2 Raymond O'Donnell
rod@iol.ie
In reply to: Sergey A. (#1)
Re: How to force PostgreSQL to use multiple cores within one connection?

On 01/10/2008 11:44, Sergey A. wrote:

Using several connections in my application is somewhat tricky, so I
want to move this problem to PostgreSQL's side. Is there any method
for PostgreSQL to process huge inserts coming from one connection on
different cores?

I don't think so. Postgres spawns a single process for each connection,
so each connection is going to be confined to a single core.

Ray.

------------------------------------------------------------------
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
rod@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
------------------------------------------------------------------

#3 Sergey A.
n39052@gmail.com
In reply to: Raymond O'Donnell (#2)
Re: How to force PostgreSQL to use multiple cores within one connection?

Hello.

I don't think so. Postgres spawns a single process for each connection,
so each connection is going to be confined to a single core.

Thanks for your answer.

I know that I can use a connection pooler to reuse previously created
connections. Can poolers balance queries coming from my connection
among a few physical connections to the DB?

--
Sergey.

#4 Richard Huxton
dev@archonet.com
In reply to: Sergey A. (#3)
Re: How to force PostgreSQL to use multiple cores within one connection?

Sergey A. wrote:

I know that I can use a connection pooler to reuse previously created
connections. Can poolers balance queries coming from my connection
among a few physical connections to the DB?

The pgloader project might be of use to you.

http://pgfoundry.org/projects/pgloader/

--
Richard Huxton
Archonet Ltd

#5 David Fetter
david@fetter.org
In reply to: Sergey A. (#1)
Re: How to force PostgreSQL to use multiple cores within one connection?

On Wed, Oct 01, 2008 at 03:44:40AM -0700, Sergey A. wrote:

Hello.

My application generates a large amount of inserts (~ 2000 per
second) using one connection to PostgreSQL. All queries are
buffered in memory and then the whole buffers are send to DB.

Are you using COPY? If not, start there :)

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#6 David Fetter
david@fetter.org
In reply to: Sergey A. (#1)
Re: How to force PostgreSQL to use multiple cores within one connection?

On Wed, Oct 01, 2008 at 05:13:59AM -0700, Sergey A. wrote:

Hello.

Are you using COPY? If not, start there :)

I'm new to PostgreSQL. No, I'm not using COPY =) Do you mean
http://www.postgresql.org/docs/8.3/interactive/sql-copy.html (COPY --
copy data between a file and a table)?

You can use it for whatever you're generating.

Sorry, I don't understand how COPY can help me to force PostgreSQL
to use multiple cores when processing my queries.

Multiple cores are not the solution to your problem here, but COPY
almost certainly is :)

Cheers,
David.

--
Sergey

P.S. My application and the DB may be placed on different hosts.

--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#7 Sergey A.
n39052@gmail.com
In reply to: David Fetter (#6)
Re: How to force PostgreSQL to use multiple cores within one connection?

Hello.

You can use it for whatever you're generating.

I've tested this technique, and I'm amazed! 120000 inserts in
~600ms! Thanks for your help.

Multiple cores are not the solution to your problem here, but COPY
almost certainly is :)

But as far as I can see, this approach doesn't work over the network:
I need to create a file with the data locally, and then ask PostgreSQL
to read it.

--
Sergey.

#8 Scott Marlowe
scott.marlowe@gmail.com
In reply to: Sergey A. (#7)
Re: How to force PostgreSQL to use multiple cores within one connection?

On Wed, Oct 1, 2008 at 6:58 AM, Sergey A. <n39052@gmail.com> wrote:

Hello.

You can use it for whatever you're generating.

I've tested this technique, and I'm amazed! 120000 inserts in
~600ms! Thanks for your help.

Multiple cores are not the solution to your problem here, but COPY
almost certainly is :)

But as far as I can see, this approach doesn't work over the network:
I need to create a file with the data locally, and then ask PostgreSQL
to read it.

Sure it does.

copy .... from STDIN
213 345 567
847 837 473
\.
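To make the mechanics concrete, here is a rough sketch (not from the thread) of building the text payload that COPY ... FROM STDIN expects, using the same numbers as the example above. A driver with a COPY API (psycopg2's copy_expert, for instance) would stream this payload to the server over the network; the table name in the comment is hypothetical.

```python
# Sketch: format rows as the COPY text format -- tab-separated fields,
# one row per line. A driver then streams this over the connection.

def rows_to_copy_text(rows):
    """Format an iterable of rows as COPY text-format input."""
    return "".join("\t".join(str(field) for field in row) + "\n"
                   for row in rows)

payload = rows_to_copy_text([(213, 345, 567), (847, 837, 473)])
# A driver would then send `payload` to the server, e.g. with psycopg2:
#   cur.copy_expert("COPY mytable FROM STDIN", io.StringIO(payload))
```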

#9 Bill Moran
wmoran@collaborativefusion.com
In reply to: Sergey A. (#7)
Re: How to force PostgreSQL to use multiple cores within one connection?

In response to "Sergey A." <n39052@gmail.com>:

Multiple cores are not the solution to your problem here, but COPY
almost certainly is :)

But as I can see this approach doesn't work over network: I need to
create file with data locally, and then ask PostgreSQL to read it.

There is a network API for COPY. Look up pg_put_line (or PQputLine
or whatever the convention is for whatever API you're using).
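As an aside for readers, when streaming rows line by line through such an API, each field has to be escaped according to COPY's text format. A minimal sketch of that escaping (not from the thread; the exact rules are in the COPY documentation):

```python
# Sketch: per-field escaping for COPY's text format. NULL becomes \N;
# backslash, tab, newline and carriage return must be backslash-escaped.

def escape_copy_field(value):
    if value is None:
        return r"\N"
    return (str(value)
            .replace("\\", "\\\\")   # escape backslashes first
            .replace("\t", "\\t")
            .replace("\n", "\\n")
            .replace("\r", "\\r"))

def copy_line(row):
    """One COPY input line: tab-separated escaped fields plus newline."""
    return "\t".join(escape_copy_field(f) for f in row) + "\n"
```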

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

#10 Merlin Moncure
mmoncure@gmail.com
In reply to: Sergey A. (#1)
Re: How to force PostgreSQL to use multiple cores within one connection?

On Wed, Oct 1, 2008 at 6:44 AM, Sergey A. <n39052@gmail.com> wrote:

Hello.

My application generates a large number of inserts (~2000 per second)
using one connection to PostgreSQL. All queries are buffered in memory,
and then the whole buffer is sent to the DB. But when I use two
connections to PostgreSQL instead of one on a dual-core CPU (i.e. I use
two PostgreSQL processes) to insert my buffers, I see that things
go 1.6 times faster.

Using several connections in my application is somewhat tricky, so I
want to move this problem to PostgreSQL's side. Is there any method
for PostgreSQL to process huge inserts coming from one connection on
different cores?

If you are buffering inserts, you can get an easy performance boost by
using COPY, as others have suggested. Another approach is to use a
multi-row insert statement:

insert into something values (1,2,3), (2,4,6), ...

Using multiple CPUs basically requires multiple connections. This can
be easy or difficult depending on how you are connecting to the
database.
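A multi-row insert like the one above can be built with placeholders so the driver handles quoting. A rough sketch (not from the thread; the table name "something" follows the example, and %s is the psycopg2-style placeholder):

```python
# Sketch: build one parameterized multi-row INSERT plus a flat parameter
# list. The table name is assumed trusted (not user input).

def build_multirow_insert(table, rows):
    ncols = len(rows[0])
    group = "(" + ", ".join(["%s"] * ncols) + ")"
    sql = "INSERT INTO {} VALUES {}".format(
        table, ", ".join([group] * len(rows)))
    params = [field for row in rows for field in row]
    return sql, params

sql, params = build_multirow_insert("something", [(1, 2, 3), (2, 4, 6)])
# sql    -> "INSERT INTO something VALUES (%s, %s, %s), (%s, %s, %s)"
# params -> [1, 2, 3, 2, 4, 6]
```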

merlin

#11 Nikolas Everett
nik9000@gmail.com
In reply to: Merlin Moncure (#10)
Re: How to force PostgreSQL to use multiple cores within one connection?

If you happen to be using JDBC you can also get copy to work:

http://kato.iki.fi/sw/db/postgresql/jdbc/copy/

On Wed, Oct 1, 2008 at 9:24 AM, Merlin Moncure <mmoncure@gmail.com> wrote:


On Wed, Oct 1, 2008 at 6:44 AM, Sergey A. <n39052@gmail.com> wrote:

Hello.

My application generates a large number of inserts (~2000 per second)
using one connection to PostgreSQL. All queries are buffered in memory,
and then the whole buffer is sent to the DB. But when I use two
connections to PostgreSQL instead of one on a dual-core CPU (i.e. I use
two PostgreSQL processes) to insert my buffers, I see that things
go 1.6 times faster.

Using several connections in my application is somewhat tricky, so I
want to move this problem to PostgreSQL's side. Is there any method
for PostgreSQL to process huge inserts coming from one connection on
different cores?

If you are buffering inserts, you can get an easy performance boost by
using COPY, as others have suggested. Another approach is to use a
multi-row insert statement:

insert into something values (1,2,3), (2,4,6), ...

Using multiple CPUs basically requires multiple connections. This can
be easy or difficult depending on how you are connecting to the
database.

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#12 Sergey A.
n39052@gmail.com
In reply to: Scott Marlowe (#8)
Re: How to force PostgreSQL to use multiple cores within one connection?

Sure it does.

copy .... from STDIN
213 345 567
847 837 473
\.

Thanks. Was this query entered in the psql shell?

#13 Sergey A.
n39052@gmail.com
In reply to: Bill Moran (#9)
Re: How to force PostgreSQL to use multiple cores within one connection?

There is a network API for COPY. Look up pg_put_line (or PQputLine
or whatever the convention is for whatever API you're using).

Thanks for your answer. I use Erlang (erlang.org) + pgsql2 (a
native Erlang driver maintained by the ejabberd developers). All I
have are the following functions:

connect(User,Password,Database,Options)
stop(Pid)
q(Pid,Query)
q(Pid,Query,Params)
q(Pid,Query,Params,Options)
execute_many(Pid,Query,Params) // I use this function to do buffered inserts
execute(Pid,Query,Params)
apply_in_tx(Pid,Fun,Args) // it's for transactions
get_parameters(Pid) // some information about connection

Unfortunately, there are no functions like pg_put_line.

#14 Scott Marlowe
scott.marlowe@gmail.com
In reply to: Sergey A. (#12)
Re: How to force PostgreSQL to use multiple cores within one connection?

On Wed, Oct 1, 2008 at 7:58 AM, Sergey A. <n39052@gmail.com> wrote:

Sure it does.

copy .... from STDIN
213 345 567
847 837 473
\.

Thanks. Was this query entered in the psql shell?

Yes, but if you're using something like PHP, you can enter it as a
single string and it will work.

#15 Scott Marlowe
scott.marlowe@gmail.com
In reply to: Sergey A. (#13)
Re: How to force PostgreSQL to use multiple cores within one connection?

On Wed, Oct 1, 2008 at 8:04 AM, Sergey A. <n39052@gmail.com> wrote:

There is a network API for COPY. Look up pg_put_line (or PQputLine
or whatever the convention is for whatever API you're using).

Thanks for your answer. I use Erlang (erlang.org) + pgsql2 (a
native Erlang driver maintained by the ejabberd developers). All I
have are the following functions:

connect(User,Password,Database,Options)
stop(Pid)
q(Pid,Query)
q(Pid,Query,Params)
q(Pid,Query,Params,Options)
execute_many(Pid,Query,Params) // I use this function to do buffered inserts
execute(Pid,Query,Params)
apply_in_tx(Pid,Fun,Args) // it's for transactions
get_parameters(Pid) // some information about connection

The execute_many should probably work. Just load it up with the same
kind of lines you'd see come out of

pg_dump -t tablename

That's the easiest way to see an example.

#16 Bill Moran
wmoran@collaborativefusion.com
In reply to: Scott Marlowe (#14)
Re: How to force PostgreSQL to use multiple cores within one connection?

In response to "Scott Marlowe" <scott.marlowe@gmail.com>:

On Wed, Oct 1, 2008 at 7:58 AM, Sergey A. <n39052@gmail.com> wrote:

Sure it does.

copy .... from STDIN
213 345 567
847 837 473
\.

Thanks. Was this query entered in the psql shell?

Yes, but if you're using something like PHP, you can enter it as a
single string and it will work.

PHP also has pg_put_line(), which reduces the memory overhead on both
the client and the server. Hopefully, whatever language you're using
has an API for this as well.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023