PostgreSQL Write Performance
I am not sure whether I am benchmarking the correct way.
I have the following table:
CREATE TABLE measurement_type
(
measurement_type_id bigserial NOT NULL,
measurement_type_name text NOT NULL,
CONSTRAINT pk_measurement_type_id PRIMARY KEY (measurement_type_id),
CONSTRAINT measurement_type_measurement_type_name_key UNIQUE (measurement_type_name)
);
I make the following single write operation through pgAdmin:
INSERT INTO measurement_type ( measurement_type_name )
VALUES('Width');
It takes 16ms to write a single row according to the "Query Editor" (bottom right corner).
Am I benchmarking the correct way? I am not sure whether this is the expected performance. I was expecting the time measurement to be in nanoseconds :p
Thanks and Regards
Yan Cheng CHEOK
On Mon, Jan 4, 2010 at 8:36 PM, Yan Cheng Cheok <yccheok@yahoo.com> wrote:
> I make the following single write operation through pgAdmin:
> INSERT INTO measurement_type ( measurement_type_name )
> VALUES('Width');
> It takes 16ms to write a single row according to the "Query Editor". Am I benchmarking the correct way? I was expecting the time measurement to be in nanoseconds :p
It would be great if a hard drive could seek, write, acknowledge the
write, and the OS could tell the db about it in nanoseconds. However,
some part of that chain would have to be lying to do that. It takes
at LEAST a full rotation of a hard drive to commit a single change to
a database, usually more. Given that the fastest HDs are 15k RPM
right now, you're looking at 250 revolutions per second, or 1/250th of
a second minimum to commit a transaction.
Now, the good news is that if you make a bunch of inserts in the same
transaction a lot of them can get committed together to the disk at
the same time, and the aggregate speed will be, per insert, much
faster.
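For illustration, a minimal sketch using the table above: every INSERT inside the transaction shares the single flush to disk at COMMIT.

BEGIN;
INSERT INTO measurement_type (measurement_type_name) VALUES ('Width');
INSERT INTO measurement_type (measurement_type_name) VALUES ('Height');
-- ... hundreds more inserts ...
COMMIT;  -- one WAL flush covers the whole batch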
Instead of sending 1000+ INSERT statements in one shot, which would require my application to keep track of the INSERT statements, is it possible that I can tell PostgreSQL:
"OK. I am sending you INSERT statements. But do not perform any actual write operation. Only perform the actual write operation when the pending statements have reached 1000."
Thanks and Regards
Yan Cheng CHEOK
Yan Cheng Cheok wrote:
> Is it possible that I can tell PostgreSQL: "OK. I am sending you INSERT statements. But do not perform any actual write operation. Only perform the actual write operation when the pending statements have reached 1000."
You can turn off synchronous_commit to get something like that:
http://www.postgresql.org/docs/current/static/runtime-config-wal.html
This should make your single-record INSERT time drop dramatically. Note
that you'll be introducing a possibility of some data loss from the
latest insert(s) if the server crashes in this situation.
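As a rough sketch, the setting can be flipped per session (it can also be set per transaction or globally in postgresql.conf):

SET synchronous_commit = off;  -- commits return before the WAL reaches disk
INSERT INTO measurement_type (measurement_type_name) VALUES ('Depth');
-- a crash in the next fraction of a second could lose this row,
-- but it cannot corrupt the table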
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com
Yan Cheng Cheok wrote:
> Is it possible that I can tell PostgreSQL: "OK. I am sending you INSERT statements. But do not perform any actual write operation. Only perform the actual write operation when the pending statements have reached 1000."
You might use the COPY command instead of INSERT, which is far faster. If you want the fastest possible inserts, COPY is probably the way to go.
Here is the COPY command via the API:
http://www.postgresql.org/docs/current/static/libpq-copy.html
Here is the COPY command via SQL:
http://www.postgresql.org/docs/8.4/static/sql-copy.html
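For the table from the original post, a minimal COPY in psql might look like this (text format, one value per line, terminated by a backslash-period):

COPY measurement_type (measurement_type_name) FROM STDIN;
Length
Angle
\.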
You might (instead) use this sequence:
1. Begin transaction
2. Prepare
3. Insert 1000 times
4. Commit
If the commit fails, you will have to redo the entire set of 1000 or
otherwise handle the error.
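A sketch of that sequence in plain SQL (the statement name "ins" is illustrative):

BEGIN;
PREPARE ins (text) AS
    INSERT INTO measurement_type (measurement_type_name) VALUES ($1);
EXECUTE ins('Width');
EXECUTE ins('Height');
-- ... up to 1000 EXECUTEs ...
COMMIT;      -- one WAL flush for the whole batch
DEALLOCATE ins;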
Or something along those lines. Of course, when information is not
written to disk, what will happen on power failure? If you do something
cheesy like turning fsync off to speed up inserts, then you will have
trouble if you lose power.
What is the actual problem you are trying to solve?
Dann Corbit wrote:
> What is the actual problem you are trying to solve?

I am currently developing a database system for a high-speed measurement machine.
The time taken to perform the measurement of one unit is around 30 milliseconds. We need to record the measurement result for every single unit. Hence, the time taken to record the measurement result must be far less than that, so that it has nearly zero impact on the machine's speed (if not, the machine needs to wait for the database to finish writing before performing the measurement on the next unit).
Previously we were using a flat file. However, a flat file is quite a mess when it comes to generating reports for customers.
Thanks and Regards
Yan Cheng CHEOK
Yan Cheng Cheok wrote:
> Previously we were using a flat file. However, a flat file is quite a mess when it comes to generating reports for customers.
Does the data volume build continuously so that the file becomes
arbitrarily large, or can you archive data that is more than X days old?
What is the format of a record?
What is the format of the incoming data?
Do you need indexes on these records?
Is the stream of incoming data continuous around the clock, or are there
periods when there is no incoming data?
On 5 Jan 2010, at 8:30, Yan Cheng Cheok wrote:
> The time taken to perform the measurement of one unit is around 30 milliseconds. We need to record the measurement result for every single unit.
> Previously we were using a flat file. However, a flat file is quite a mess when it comes to generating reports for customers.
That flat file can help you with clustering INSERTs together, and it also means you'll have all your measurements ready for a single INSERT. The latter is useful if you're planning on using arrays to store your measurements, as updating array values requires the entire array to be rewritten to the database. I don't know how your measurements would arrive without the flat file, but I wouldn't be surprised if the measurements for a single unit came in at different points in time, which would be a bit painful with arrays (not quite as much with a measurements table, though).
A safe approach (with minimal risk of data loss) would be to split your flat file every n units (earlier in this thread n=1000 was mentioned) and store that data using COPY, in the format COPY expects. You will probably also want to keep a queue table (which is just a normal table, but used like a queue) with the names of the flat files that need processing.
I haven't done this kind of thing before, but I envision it something like this:
CREATE TABLE unit (
id bigserial NOT NULL,
date date NOT NULL DEFAULT CURRENT_DATE,
measured text[],
measurements numeric(4,3)[]
);
CREATE TABLE queue (
file text NOT NULL,
definitive boolean DEFAULT False
);
---file-2010-01-05-00000001---
/* Update in its own transaction so that we know we tried to process this file
 * even if the transaction rolls back.
 */
UPDATE queue SET definitive = True
WHERE file = 'file-' || to_char(CURRENT_DATE, 'YYYY-MM-DD') || '-00000001';
/* Start work */
BEGIN;
COPY unit FROM STDIN;
1 {Width,Height} {0.001,0.021}
2 {Width,Height} {0.002,0.019}
...
999 {Width,Height} {0.000,0.018}
\.
/* This file was processed and can be removed from the queue */
DELETE FROM queue WHERE file='file-2010-01-05-00000001';
COMMIT;
/* This will probably be the name of the next flat file, but we don't know that
* for sure yet. It needs to be outside the transaction as otherwise CURRENT_DATE
* will have the date of the start of the transaction and we need to know what the
* next batch will be regardless of whether this one succeeded.
*/
INSERT INTO queue (file, definitive)
VALUES ('file-' || to_char(CURRENT_DATE, 'YYYY-MM-DD') || '-00001000', False);
---end of file---
You'd need a little program (a script will probably work) to read that queue table and send the commands in those files to the database. Don't forget that at the start the queue table will be empty ;) I recall some of this list's members wrote up a webpage about how to implement queue tables reliably.
Alban Hertroys
--
Screwing up is the best way to attach something to the ceiling.
On 5/01/2010 3:30 PM, Yan Cheng Cheok wrote:
> The time taken to perform the measurement of one unit is around 30 milliseconds. The time taken to record the measurement result must be far less than that, so that it has nearly zero impact on the machine's speed.
The commit_delay and synchronous_commit parameters may help you if you
want to do each insert as a separate transaction. Note that with these
parameters there's some risk of very recently committed data being lost
if the server OS crashes or the server hardware is powered
off/power-cycled unexpectedly. PostgreSQL itself crashing shouldn't
cause loss of the committed data, though.
Alternately, you can accumulate small batches of measurements in your
app and do multi-valued INSERTs once you have a few (say 10) collected
up. You'd have to be prepared to lose those if the app crashed though.
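A multi-valued INSERT of that kind is a single statement and a single commit. A sketch, using the measurement_type table from earlier in the thread:

INSERT INTO measurement_type (measurement_type_name)
VALUES ('Pitch'), ('Yaw'), ('Roll');  -- one round trip, one WAL flush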
Another option is to continue using your flat file, and have a "reader"
process tailing the flat file and inserting new records into the
database as they become available in the flat file. The reader could
batch inserts intelligently, keep a record on disk of its progress,
rotate the flat file periodically, etc.
--
Craig Ringer
Dann Corbit wrote:
> You might use the COPY command instead of INSERT, which is far faster.
> Here is the COPY command via SQL:
> http://www.postgresql.org/docs/8.4/static/sql-copy.html
Is there a command like COPY that will insert the data but skip all triggers and, optionally, integrity checks?
Tim Uckun wrote:
> Is there a command like COPY that will insert the data but skip all triggers and, optionally, integrity checks?
Nope; skipping integrity checks is MySQL talk. When doing a bulk
loading job, it may make sense to drop constraints and triggers, though;
there are more notes on this and related techniques at
http://www.postgresql.org/docs/current/interactive/populate.html
Another thing you can do is defer your constraints:
http://www.postgresql.org/docs/current/static/sql-set-constraints.html
so that they execute in a more efficient block.
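A sketch of what that looks like; the measurement table and constraint name here are hypothetical, and the constraint must have been declared DEFERRABLE:

ALTER TABLE measurement
    ADD CONSTRAINT measurement_type_fk
    FOREIGN KEY (measurement_type_id) REFERENCES measurement_type
    DEFERRABLE INITIALLY IMMEDIATE;

BEGIN;
SET CONSTRAINTS ALL DEFERRED;
-- bulk INSERTs here; the FK checks are queued up and run at COMMIT
COMMIT;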
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com
Hi,
On Tuesday 05 January 2010 04:36:10 Yan Cheng Cheok wrote:
> I make the following single write operation through pgAdmin:
> ...
> It takes 16ms to write a single row according to "Query Editor" (bottom right corner)
In my experience the times presented by pgAdmin vary wildly and seldom
represent an accurate timing.
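For a sanity check, psql's \timing toggle prints the client-side elapsed time for each statement (a quick sketch):

\timing
INSERT INTO measurement_type (measurement_type_name) VALUES ('Width');
-- psql prints a "Time: ... ms" line after each statement while timing is on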
Andres
On 6/01/2010 6:21 AM, Tim Uckun wrote:
> Is there a command like COPY that will insert the data but skip all triggers and, optionally, integrity checks?
No. If you don't want triggers and integrity checks to fire, don't
define them in the first place.
Technically you *can* disable triggers, including RI checks, but it's
VERY unwise and almost completely defeats the purpose of having the
checks. In most such situations you're much better off dropping the
constraints, then adding them again at the end of the load.
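A sketch of that drop-and-re-add pattern (the measurement table, constraint name, and file path are hypothetical). Because ALTER TABLE ... ADD CONSTRAINT re-validates every existing row, the transaction fails if the loaded data violates RI:

BEGIN;
ALTER TABLE measurement DROP CONSTRAINT measurement_type_fk;
COPY measurement FROM '/path/to/batch.dat';  -- hypothetical server-side file
ALTER TABLE measurement
    ADD CONSTRAINT measurement_type_fk
    FOREIGN KEY (measurement_type_id) REFERENCES measurement_type;
-- ADD CONSTRAINT re-checks all rows; the whole transaction aborts on violation
COMMIT;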
--
Craig Ringer
> Technically you *can* disable triggers, including RI checks, but it's VERY unwise and almost completely defeats the purpose of having the checks. In most such situations you're much better off dropping the constraints, then adding them again at the end of the load.
I know that the SQL Server bulk loader defaults to not firing the
triggers, and I was hoping there would be an option on the COPY command
to accomplish the same thing.
pg_dump has a --disable-triggers option too.
It seems to me that COPY FROM should have an option that bypasses
the triggers, as a convenience.
Both the SQL Server team and the Postgres team have obviously
recognized that there will be situations where the DBA will want to
bulk load data without firing off a set of triggers for every insert.
It doesn't seem like an outrageous expectation that the COPY command
or something similar should have that option.
On Jan 5, 2010, at 3:46 PM, Tim Uckun wrote:
> pg_dump has a --disable-triggers option too.
> [...]
> It doesn't seem like an outrageous expectation that the COPY command or something similar should have that option.
Well, whether an expectation is "outrageous" or not is a matter of
viewpoint. The principle is that pg_dump and COPY have fundamentally
different use cases. pg_dump is intended to restore a backup of data
that was already in the database, and presumably was already validated
by the appropriate constraints and triggers. COPY is used to create
new records in a database, from arbitrary data, which may not be valid
based on the database's vision of data consistency.
--
-- Christophe Pettus
xof@thebuild.com
Tim Uckun wrote:
> I know that the SQL Server bulk loader defaults to not firing the triggers, and I was hoping there would be an option on the COPY command to accomplish the same thing. pg_dump has a --disable-triggers option too.
pg_restore? Yes, it does. However, it knows that the data you're loading
came from a PostgreSQL database where those triggers have already fired
when the data was originally inserted. It can trust that the data is OK.
I don't personally think that COPY should make it easy to disable
triggers. You can do it if you want to (see the manual for how), but in
almost all cases it's much wiser to drop triggers and constraints instead.
Rather than disabling RI constraints, it'd be desirable for COPY to have
an option that *deferred* RI constraint checking, then re-checked the
constraints for all rows at once and rolled back the COPY if any failed.
That'd be a lot faster for many uses, but (importantly) would preserve
referential integrity.
While COPY doesn't offer an easy way to do that, you can emulate the
behavior with the following steps (see the sketch after this list):
- Beginning a transaction
- Disabling RI constraints
- Running COPY
- Re-enabling RI constraints
- Re-checking the RI constraints and raising an exception to abort the
transaction if the checks fail.
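A minimal sketch of those steps (the measurement table and file path are hypothetical; DISABLE TRIGGER ALL needs superuser rights):

BEGIN;
ALTER TABLE measurement DISABLE TRIGGER ALL;  -- also disables the RI triggers
COPY measurement FROM '/path/to/batch.dat';   -- hypothetical server-side file
ALTER TABLE measurement ENABLE TRIGGER ALL;   -- does NOT re-check anything by itself
-- re-check RI by hand: any loaded row without a parent violates the FK
SELECT count(*)
FROM measurement m
LEFT JOIN measurement_type t USING (measurement_type_id)
WHERE t.measurement_type_id IS NULL;
-- if the count is non-zero, issue ROLLBACK instead of COMMIT
COMMIT;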
Unfortunately you can't (yet) do this with deferrable RI constraints,
because Pg isn't clever enough to notice when large numbers of
individual checks have built up and merge them into a single re-check
query that verifies the whole constraint in one pass. So you have to do
that yourself.
> It seems to me that COPY FROM should have an option that bypasses the triggers, as a convenience.
I, for one, would loudly and firmly resist the addition of such a
feature. Almost-as-fast options such as intelligent re-checking of
deferred constraints would solve the problem better and much more
safely. If you really want the dangerous way you can already get it,
it's just a bit more work to disable the triggers yourself, and by doing
so you're saying "I understand what I am doing and take responsibility
for the dangers inherent in doing so".
If you really want to do that, look at the manual for how to disable
triggers, but understand that you are throwing away the database's data
integrity protection by doing it.
--
Craig Ringer
> I, for one, would loudly and firmly resist the addition of such a feature. Almost-as-fast options such as intelligent re-checking of deferred constraints would solve the problem better and much more safely.
Even if it was not the default behavior?
> If you really want to do that, look at the manual for how to disable triggers, but understand that you are throwing away the database's data integrity protection by doing it.
I guess it's a matter of philosophy. I kind of think that, as the DBA,
I should be the final authority in determining what is right and wrong.
It's my data, after all. Yes, I would expect pg to perform every check I
specify and execute every trigger I write, but if I want, I should be
able to bypass those things "just this once".
As you point out, I can already do this by manually going through and
disabling every trigger, or even dropping the triggers. Many people
have said I could drop the constraints and set them up again. The fact
that the COPY command does not have a convenient way for me to do this
doesn't prevent me from "shooting myself in the foot" if I want to.
It would just be a flag. If you want, you can enable it; if you don't,
then no harm, no foul.
Anyway, this is getting off-topic. I got my question answered: COPY does
not do this. If I want to do it, I have to manually iterate through all
the triggers and disable them, or drop them, before running COPY.
Tim Uckun wrote:
> Even if it was not the default behavior?
Even if it was called
COPY (PLEASE BREAK MY DATABASE) FROM ...
... because there are *better* ways to do it that are safe for exposure
to normal users, and existing ways to do it the dangerous way if you
really want to.
> I guess it's a matter of philosophy. I kind of think that, as the DBA, I should be the final authority in determining what is right and wrong. It's my data, after all.
> As you point out, I can already do this by manually going through and disabling every trigger, or even dropping the triggers.
You could alternately query pg_catalog dynamically and use PL/PgSQL (or
SQL generated by your app) to issue the appropriate statements to
control the triggers.
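There is also a coarser shorthand if you only want to skip your own triggers (hypothetical table name; the RI triggers keep firing, so integrity is preserved):

ALTER TABLE measurement DISABLE TRIGGER USER;  -- user-defined triggers only
-- ... bulk load ...
ALTER TABLE measurement ENABLE TRIGGER USER;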
> Many people have said I could drop the constraints and set them up again. The fact that the COPY command does not have a convenient way for me to do this doesn't prevent me from "shooting myself in the foot" if I want to.
I think that it would be desirable for COPY to provide a convenient way
to drop and re-create constraints, or (preferably) just disable them
while it runs, then re-check them before returning success. Think:
COPY (BATCH_RI_CHECKS) FROM ...
The thing you might have missed is that dropping and re-creating
constraints is different to disabling them (as you're requesting that
COPY do). When the constraints are re-created, the creation will *fail*
if the constraint is violated, aborting the whole operation if you're
sensible enough to do it in a single transaction. At no point can you
end up with a constraint in place promising RI that is in fact violated
by the data.
By contrast, if you disable triggers and constraints, re-enabling them
does *not* re-check any constraints. So it's much, MUCH more dangerous,
since it can let bad data into the DB silently. So if you disable
constraints you MUST re-check them after re-enabling them and abort the
transaction if they're violated, or must be utterly certain that the
data you inserted/altered/deleted was really safe.
> It would just be a flag. If you want, you can enable it; if you don't, then no harm, no foul.
Unfortunately my experience has been that many users (a) often don't
read documentation and (b) just try different things until something
they do makes the error "go away". They then get five steps down the
track and post a question about a query that doesn't work correctly
(because they broke their data) and it takes forever to get to the
bottom of it.
You're clearly a fairly experienced and responsible DBA who'd look
something up before using it and would be careful to preserve RI
manually in these situations. Some people who use PG aren't (an amazing
number of them just installed it to run their poker card counting
software!), and I don't think it's wise to make dangerous things *too*
obvious. They need to be there and available, but not staring you in the
face.
I don't advocate the GNOME philosophy of "make it impossible if it's not
suitable for a user who's using a computer for the first time" ... but I
don't like the "nuke my data" button to be on the default desktop either ;-)
--
Craig Ringer
Thanks for the information. I wrote a plain C program to test the performance. Its time measurements are very different from pgAdmin's.
Thanks and Regards
Yan Cheng CHEOK
Thanks for the information. I performed benchmarking on a very simple table in a local database (1 table, 2 fields: one bigserial, the other text).
====================================================================
INSERT INTO measurement_type(measurement_type_name) VALUES ('Hello')
====================================================================
I turned synchronous_commit off.
To write a single row (local database), the time taken is between 0.1ms and 0.5ms.
I tried to compare this with a flat text file.
To write a single row (file), the time taken is between 0.005ms and 0.05ms.
The difference is big. Is this the expected result? Are you guys getting similar results?
I know there will be some overhead in writing to a database compared to a flat text file (network communication, interpretation of the SQL statement...). However, is there any way to further improve it so that PostgreSQL write performance is near to a file's?
If not, I need to plan another strategy to migrate my flat text file system into a PostgreSQL system smoothly.
Thanks and Regards
Yan Cheng CHEOK