Alternative to UPDATE (As COPY to INSERT)
I realize the speed of UPDATE operations in PostgreSQL doesn't meet my expectations.
Is there a fast alternative to UPDATE, in the way that fast COPY is an alternative to INSERT?
Thanks!
I am using UPDATE in the following case:
CREATE OR REPLACE FUNCTION update_or_insert_statistic(integer, text[], text[], double precision[])
RETURNS void AS
$BODY$DECLARE
_lotID ALIAS FOR $1;
_measurementTypes ALIAS FOR $2;
_statisticTypes ALIAS FOR $3;
_values ALIAS FOR $4;
_row_count int;
i int;
BEGIN
-- Parameters validation.
IF array_upper(_measurementTypes, 1) != array_upper(_statisticTypes, 1) OR array_upper(_measurementTypes, 1) != array_upper(_values, 1) THEN
RAISE EXCEPTION 'Inconsistency in array size';
END IF;
FOR i IN SELECT generate_subscripts(_measurementTypes, 1)
LOOP
EXECUTE 'UPDATE statistic SET value = $1 WHERE fk_lot_id = $2 AND measurement_type = $3 AND statistic_type = $4'
USING _values[i], _lotID, _measurementTypes[i], _statisticTypes[i];
GET DIAGNOSTICS _row_count = ROW_COUNT;
IF _row_count = 0 THEN
EXECUTE 'INSERT INTO statistic(fk_lot_id, "value", measurement_type, statistic_type) VALUES ($1, $2, $3, $4)'
USING _lotID, _values[i], _measurementTypes[i], _statisticTypes[i];
END IF;
END LOOP;
END;$BODY$
LANGUAGE plpgsql;
I use the following "minimal" version:
SELECT * FROM update_or_insert_statistic(1,
array['Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch'],
array['LSL','USL','Nominal','Average','StdDev','StdDev3','CPK','Min','Max','Total','Valid','Invalid','Variance','LSL','USL','Nominal','Average','StdDev','StdDev3','CPK','Min','Max','Total','Valid','Invalid','Variance'],
array[0,0,0,4.94422589800714,3.16063453753607,0,0,1.01620532853175,9.98406933805353,20,20,0,9.98961067986587,0,0,0,6.56297341837825,2.51273949943937,0,0,1.69188512833033,9.56794946134831,20,20,0,6.31385979204282])
It takes around 20ms :(
I am expecting < 1ms
Or shall I just go back to plain text in this case?
Thanks and Regards
Yan Cheng CHEOK
On 23 Feb 2010, at 10:26, Yan Cheng CHEOK wrote:
I realize the speed of UPDATE operations in PostgreSQL doesn't meet my expectations.
Is there a fast alternative to UPDATE, in the way that fast COPY is an alternative to INSERT?
Well, since an UPDATE is just a DELETE + INSERT and you're already doing this in one transaction, you could use DELETE + COPY instead. That's not as easy to do with your current approach though - converting those arrays to something COPY understands will probably take about as much time as your function is taking now.
It would probably be convenient to use a staging table (maybe temporary) to put the new values in before you act on them, so you don't need that hassle with arrays. I don't know where your data is coming from (I recall you work with a measurement machine in a capacitor plant?), but you could use COPY to fill the staging table (no constraints, yay!) and then:
BEGIN;
DELETE FROM statistic WHERE (fk_lot_id, measurement_type, statistic_type) IN (SELECT fk_lot_id, measurement_type, statistic_type FROM staging_table);
INSERT INTO statistic (value, fk_lot_id, measurement_type, statistic_type) SELECT value, fk_lot_id, measurement_type, statistic_type FROM staging_table;
TRUNCATE staging_table;
COMMIT;
This isn't concurrency-safe, so you need to make sure no values are added to the staging table while you're doing the above.
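For illustration, the staging table could be created to mirror the columns of the target table (a sketch only; the name staging_table and the column types are assumed from the function above, so adjust them to your actual schema):

CREATE TABLE staging_table (
    fk_lot_id        integer,
    measurement_type text,
    statistic_type   text,
    "value"          double precision
);
-- Deliberately no constraints or indexes, so COPY into it stays cheap:
COPY staging_table (fk_lot_id, measurement_type, statistic_type, "value") FROM STDIN;

COPY then expects one tab-separated row per line on standard input, which is usually easier to generate from the client than the parallel-array parameters.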
I'm not sure you'll get to <1 ms doing this, that's a pretty steep requirement, but considering you can do it the "slow" way in 20ms it might just work.
Of course, if you have a staging table you could choose to operate on it less frequently; that would give you more time to operate on it and reduces the amount of overhead a little. I'm guessing your data comes in 24/7, so finding the right batch-size is part of your problem.
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.
On 23/02/10 09:26, Yan Cheng CHEOK wrote:
I realize the speed of UPDATE operations in PostgreSQL doesn't meet my expectations.
Is there a fast alternative to UPDATE, in the way that fast COPY is an alternative to INSERT?
No. But you haven't said where the limit is on your operation.
EXECUTE 'UPDATE statistic SET value = $1 WHERE fk_lot_id = $2 AND measurement_type = $3 AND statistic_type = $4'
USING _values[i], _lotID, _measurementTypes[i], _statisticTypes[i];
There's no need to do EXECUTE ... USING here - just do the update
UPDATE statistic SET value = _values[i] WHERE fk_lot_id = _lotID ...
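Spelled out, the loop body without the dynamic SQL could look like this (same identifiers as in the original function; plain statements in PL/pgSQL are prepared once and their plans cached, whereas EXECUTE re-plans on every iteration):

UPDATE statistic
SET    value = _values[i]
WHERE  fk_lot_id = _lotID
  AND  measurement_type = _measurementTypes[i]
  AND  statistic_type = _statisticTypes[i];
GET DIAGNOSTICS _row_count = ROW_COUNT;
IF _row_count = 0 THEN
    INSERT INTO statistic(fk_lot_id, "value", measurement_type, statistic_type)
    VALUES (_lotID, _values[i], _measurementTypes[i], _statisticTypes[i]);
END IF;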
It takes around 20ms :(
I am expecting < 1ms
This might be impractical, depending on exactly what you hope to achieve.
If you wish to have individual transactions take no more than 1ms and be
safely on disk, then you will need a disk controller with battery-backed
write cache. Disks just don't spin fast enough.
You posted a few questions, but I don't see anything saying exactly what
you are trying to achieve and what sort of server you have to do it
with. Perhaps some background would be useful.
--
Richard Huxton
Archonet Ltd