realtime data inserts

Started by Adam Siegel over 22 years ago, 13 messages
#1 Adam Siegel
adam@sycamorehq.com

I have realtime data flowing in at a rate of 500 512-byte packets per second.
I want to log the info in a database table with two other columns, one for a
timestamp and one for the name of the packet. The max rate I can achieve is
350 inserts per second on a Sun Blade 2000. The inserts are grouped in a
transaction and I commit every 1200 records. I am storing the binary data
in a bytea. I am using the libpq conversion function. Not sure if that is
slowing me down, but I think it is the insert, not the conversion.

Any thoughts on how to achieve this goal?

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Adam Siegel (#1)
Re: realtime data inserts

"Adam Siegel" <adam@sycamorehq.com> writes:

I have realtime data flowing in at a rate of 500 512-byte packets per second.
I want to log the info in a database table with two other columns, one for a
timestamp and one for the name of the packet. The max rate I can achieve is
350 inserts per second on a Sun Blade 2000. The inserts are grouped in a
transaction and I commit every 1200 records.

Have you thought about using COPY?

regards, tom lane

#3 Ericson Smith
eric@did-it.com
In reply to: Adam Siegel (#1)
Re: realtime data inserts

Had the same problem recently...

Format your data like a pg text dump into a file and then...

COPY <tablename> (a,b,c) FROM stdin;
1 2 3
4 5 6
\.

psql yourdatabase < dumpfile.sql

I've achieved thousands of rows per second with this method.

- Ericson Smith
eric@did-it.com
http://www.did-it.com

Adam Siegel wrote:


I have realtime data flowing in at a rate of 500 512-byte packets per second.
I want to log the info in a database table with two other columns, one for a
timestamp and one for the name of the packet. The max rate I can achieve is
350 inserts per second on a Sun Blade 2000. The inserts are grouped in a
transaction and I commit every 1200 records. I am storing the binary data
in a bytea. I am using the libpq conversion function. Not sure if that is
slowing me down, but I think it is the insert, not the conversion.

Any thoughts on how to achieve this goal?

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

#4 Jim C. Nasby
jim@nasby.net
In reply to: Adam Siegel (#1)
Re: realtime data inserts

Are you binding your insert? I.e.:

PREPARE s (...) AS INSERT INTO blah VALUES ($1, $2, $3);

EXECUTE s (a, b, c);

instead of just "INSERT INTO blah VALUES (a, b, c)"?

On Sat, May 10, 2003 at 11:25:16AM -0400, Adam Siegel wrote:

I have realtime data flowing in at a rate of 500 512-byte packets per second.
I want to log the info in a database table with two other columns, one for a
timestamp and one for the name of the packet. The max rate I can achieve is
350 inserts per second on a Sun Blade 2000. The inserts are grouped in a
transaction and I commit every 1200 records. I am storing the binary data
in a bytea. I am using the libpq conversion function. Not sure if that is
slowing me down, but I think it is the insert, not the conversion.

Any thoughts on how to achieve this goal?

--
Jim C. Nasby (aka Decibel!) jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

#5 Ron Johnson
ron.l.johnson@cox.net
In reply to: Tom Lane (#2)
Re: realtime data inserts

On Sat, 2003-05-10 at 11:00, Tom Lane wrote:

"Adam Siegel" <adam@sycamorehq.com> writes:

I have realtime data flowing in at a rate of 500 512-byte packets per second.
I want to log the info in a database table with two other columns, one for a
timestamp and one for the name of the packet. The max rate I can achieve is
350 inserts per second on a Sun Blade 2000. The inserts are grouped in a
transaction and I commit every 1200 records.

Have you thought about using COPY?

Generate a temporary file, and then system("COPY /tmp/foobar ...") ?

-- 
+---------------------------------------------------------------+
| Ron Johnson, Jr.        mailto:ron.l.johnson@cox.net          |
| Jefferson, LA  USA      http://members.cox.net/ron.l.johnson  |
|                                                               |
| The purpose of the military isn't to pay your college tuition |
| or give you a little extra income; it's to "kill people and   |
| break things".  Surprisingly, not everyone understands that.  |
+---------------------------------------------------------------+
#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ron Johnson (#5)
Re: realtime data inserts

Ron Johnson <ron.l.johnson@cox.net> writes:

On Sat, 2003-05-10 at 11:00, Tom Lane wrote:

Have you thought about using COPY?

Generate a temporary file, and then system("COPY /tmp/foobar ...") ?

No, copy from stdin. No need for a temp file.

regards, tom lane

#7 Ron Johnson
ron.l.johnson@cox.net
In reply to: Tom Lane (#6)
Re: realtime data inserts

On Sat, 2003-05-10 at 21:46, Tom Lane wrote:

Ron Johnson <ron.l.johnson@cox.net> writes:

On Sat, 2003-05-10 at 11:00, Tom Lane wrote:

Have you thought about using COPY?

Generate a temporary file, and then system("COPY /tmp/foobar ...") ?

No, copy from stdin. No need for a temp file.

But wouldn't that only work if the input stream is acceptable to
COPY?

#8 Doug McNaught
doug@mcnaught.org
In reply to: Adam Siegel (#1)
Re: realtime data inserts

Ron Johnson <ron.l.johnson@cox.net> writes:

On Sat, 2003-05-10 at 21:46, Tom Lane wrote:

Ron Johnson <ron.l.johnson@cox.net> writes:

On Sat, 2003-05-10 at 11:00, Tom Lane wrote:

Have you thought about using COPY?

Generate a temporary file, and then system("COPY /tmp/foobar ...") ?

No, copy from stdin. No need for a temp file.

But wouldn't that only work if the input stream is acceptable to
COPY?

Yes, but you could always pipe it through a script or C program to
make it so...

-Doug

#9 Adam Siegel
adam@sycamore.us
In reply to: Jim C. Nasby (#4)
Re: realtime data inserts

The COPY FROM method (PQputline) allows me to achieve around 1000 inserts
per second.
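For reference, the PQputline flow looks roughly like this. The table, columns, and conninfo are made up, and this targets the old pre-7.4 COPY calls the thread is using; it needs a live server, so treat it as a sketch rather than a tested program:

```c
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=rtdata");  /* hypothetical conninfo */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    PGresult *res = PQexec(conn, "COPY packets (ts, name, data) FROM STDIN");
    if (PQresultStatus(res) != PGRES_COPY_IN) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQclear(res);
        PQfinish(conn);
        return 1;
    }
    PQclear(res);

    /* one tab-separated line per row; the bytea field must already be
       escaped for COPY's text format */
    for (int i = 0; i < 1200; i++)
        PQputline(conn, "2003-05-10 11:25:16\tpkt\t\\\\000\\\\001\n");

    PQputline(conn, "\\.\n");  /* end-of-data marker */
    if (PQendcopy(conn) != 0)
        fprintf(stderr, "%s", PQerrorMessage(conn));
    PQfinish(conn);
    return 0;
}
```

Because the rows never leave the client until COPY, there is no per-row parse/plan overhead, which is where the speedup over individual INSERTs comes from.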

On Sat, 10 May 2003, Jim C. Nasby wrote:


Are you binding your insert? I.e.:

PREPARE s (...) AS INSERT INTO blah VALUES ($1, $2, $3);

EXECUTE s (a, b, c);

instead of just "INSERT INTO blah VALUES (a, b, c)"?

On Sat, May 10, 2003 at 11:25:16AM -0400, Adam Siegel wrote:

I have realtime data flowing in at a rate of 500 512-byte packets per second.
I want to log the info in a database table with two other columns, one for a
timestamp and one for the name of the packet. The max rate I can achieve is
350 inserts per second on a Sun Blade 2000. The inserts are grouped in a
transaction and I commit every 1200 records. I am storing the binary data
in a bytea. I am using the libpq conversion function. Not sure if that is
slowing me down, but I think it is the insert, not the conversion.

Any thoughts on how to achieve this goal?

#10 Steve Crawford
scrawford@pinpointresearch.com
In reply to: Adam Siegel (#1)
Re: realtime data inserts

Depends - we don't know enough about your needs. Some questions:

Is this constant data or just capturing a burst?

Are you feeding it through one connection or several in parallel?

Did you tune your memory configs in postgresql.conf, or are they still at the
minimal defaults?

How soon does the data need to be available for query? (Obviously there will
be up to a 1200 record delay just due to the transaction.)

What generates the timestamp? I.e., is it an INSERT INTO foo VALUES (now(),
packetname, data), or is the app providing the timestamp?

More info about the app will help.

Cheers,
Steve


On Saturday 10 May 2003 8:25 am, Adam Siegel wrote:

I have realtime data flowing in at a rate of 500 512-byte packets per second.
I want to log the info in a database table with two other columns, one for a
timestamp and one for the name of the packet. The max rate I can achieve is
350 inserts per second on a Sun Blade 2000. The inserts are grouped in a
transaction and I commit every 1200 records. I am storing the binary data
in a bytea. I am using the libpq conversion function. Not sure if that is
slowing me down, but I think it is the insert, not the conversion.

Any thoughts on how to achieve this goal?


#11 alex b.
mailinglists1@gmx.de
In reply to: Doug McNaught (#8)
Re: realtime data inserts

Doug McNaught wrote:

Ron Johnson <ron.l.johnson@cox.net> writes:

On Sat, 2003-05-10 at 21:46, Tom Lane wrote:

Ron Johnson <ron.l.johnson@cox.net> writes:

On Sat, 2003-05-10 at 11:00, Tom Lane wrote:

Have you thought about using COPY?

Generate a temporary file, and then system("COPY /tmp/foobar ...") ?

No, copy from stdin. No need for a temp file.

But wouldn't that only work if the input stream is acceptable to
COPY?

Yes, but you could always pipe it through a script or C program to
make it so...

Let's say I have a continuous datastream of about 1 kB/s coming in for many
hours, and I'd like to store this data in my db using COPY table FROM stdin.

At what point should I COMMIT or close the stream to feed the database
and COPY FROM again?

#12 Ericson Smith
eric@did-it.com
In reply to: alex b. (#11)
Re: realtime data inserts

You probably want to have a process that constantly stores the data in a
text file. Every "n" minutes, you will cause the logger to rotate the
text file, then process that batch.

Over here, we are able to dump around 5,000 records per second in one of
our tables using that methodology.

- Ericson Smith
eric@did-it.com

On Fri, 2003-05-16 at 16:27, alex b. wrote:

Doug McNaught wrote:

Ron Johnson <ron.l.johnson@cox.net> writes:

On Sat, 2003-05-10 at 21:46, Tom Lane wrote:

Ron Johnson <ron.l.johnson@cox.net> writes:

On Sat, 2003-05-10 at 11:00, Tom Lane wrote:

Have you thought about using COPY?

Generate a temporary file, and then system("COPY /tmp/foobar ...") ?

No, copy from stdin. No need for a temp file.

But wouldn't that only work if the input stream is acceptable to
COPY?

Yes, but you could always pipe it through a script or C program to
make it so...

Let's say I have a continuous datastream of about 1 kB/s coming in for many
hours, and I'd like to store this data in my db using COPY table FROM stdin.

At what point should I COMMIT or close the stream to feed the database
and COPY FROM again?


--
Ericson Smith <eric@did-it.com>

#13 Ron Johnson
ron.l.johnson@cox.net
In reply to: Ericson Smith (#12)
Re: realtime data inserts

On Fri, 2003-05-16 at 15:33, Ericson Smith wrote:

You probably want to have a process that constantly stores the data in a
text file. Every "n" minutes, you will cause the logger to rotate the
text file, then process that batch.

Does the logger spawn the DB writer?

Over here, we are able to dump around 5,000 records per second in one of
our tables using that methodology.

- Ericson Smith
eric@did-it.com

On Fri, 2003-05-16 at 16:27, alex b. wrote:

Doug McNaught wrote:

Ron Johnson <ron.l.johnson@cox.net> writes:

On Sat, 2003-05-10 at 21:46, Tom Lane wrote:

Ron Johnson <ron.l.johnson@cox.net> writes:

On Sat, 2003-05-10 at 11:00, Tom Lane wrote:

Have you thought about using COPY?

Generate a temporary file, and then system("COPY /tmp/foobar ...") ?

No, copy from stdin. No need for a temp file.

But wouldn't that only work if the input stream is acceptable to
COPY?

Yes, but you could always pipe it through a script or C program to
make it so...

Let's say I have a continuous datastream of about 1 kB/s coming in for many
hours, and I'd like to store this data in my db using COPY table FROM stdin.

At what point should I COMMIT or close the stream to feed the database
and COPY FROM again?

