Why Stored Procedure is Slower In The Following Case?

Started by Yan Cheng Cheokabout 16 years ago3 messagesgeneral
Jump to latest
#1Yan Cheng Cheok
yccheok@yahoo.com

I have a stored procedure, which perform single RETURN QUERY SELECT..., by taking in 2 function parameters.

It takes around 7 seconds to complete the operation.

When I perform outside stored procedure, with exact same SELECT statement, it only takes 0.5 seconds, with same result being returned.

Testing using Stored Procedure
==============================
SELECT * FROM get_measurements(1, 'Pad%');
6949.593 ms

EXPLAIN SELECT * FROM get_measurements(1, 'Pad%');
QUERY PLAN
---------------------------------------------------------------------------
Function Scan on get_measurements (cost=0.00..260.00 rows=1000 width=72)
(1 row)

Testing using SQL statement
===========================
SemiconductorInspection=# SELECT measurement_type.value, measurement.value, measurement_unit.value FROM unit, lot, measurement,
measurement_unit, measurement_type WHERE lot_id = fk_lot_id AND fk_unit_id = unit_id AND fk_measurement_unit_id = measurement_
unit_id AND fk_measurement_type_id = measurement_type_id AND lot_id = 1 AND measurement_type.value LIKE 'Pad%';
552.234 ms

SemiconductorInspection=# EXPLAIN SELECT measurement_type.value, measurement.value, measurement_unit.value FROM unit, lot, meas
urement, measurement_unit, measurement_type WHERE lot_id = fk_lot_id AND fk_unit_id = unit_id AND fk_measurement_unit_id = meas
urement_unit_id AND fk_measurement_type_id = measurement_type_id AND lot_id = 1 AND measurement_type.value LIKE 'Pad%';
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Nested Loop (cost=51.33..44328.65 rows=3629 width=53)
-> Index Scan using pk_lot_id on lot (cost=0.00..8.27 rows=1 width=8)
Index Cond: (lot_id = 1)
-> Hash Join (cost=51.33..44284.10 rows=3629 width=61)
Hash Cond: (measurement.fk_measurement_unit_id = measurement_unit.measurement_unit_id)
-> Hash Join (cost=13.65..44196.52 rows=3629 width=33)
Hash Cond: (measurement.fk_measurement_type_id = measurement_type.measurement_type_id)
-> Hash Join (cost=11.38..44138.71 rows=5134 width=24)
Hash Cond: (measurement.fk_unit_id = unit.unit_id)
-> Seq Scan on measurement (cost=0.00..36261.81 rows=2083781 width=24)
-> Hash (cost=10.08..10.08 rows=104 width=16)
-> Index Scan using idx_fk_lot_id on unit (cost=0.00..10.08 rows=104 width=16)
Index Cond: (fk_lot_id = 1)
-> Hash (cost=1.76..1.76 rows=41 width=17)
-> Seq Scan on measurement_type (cost=0.00..1.76 rows=41 width=17)
Filter: (value ~~ 'Pad%'::text)
-> Hash (cost=22.30..22.30 rows=1230 width=36)
-> Seq Scan on measurement_unit (cost=0.00..22.30 rows=1230 width=36)
(18 rows)

The content for stored procedure is as follow :

CREATE OR REPLACE FUNCTION get_measurements(IN bigint, IN text)
RETURNS TABLE(_type text, _value double precision, _unit text) AS
$BODY$DECLARE
_lotID ALIAS FOR $1;
_type ALIAS FOR $2;
BEGIN
RETURN QUERY SELECT measurement_type.value, measurement.value, measurement_unit.value
FROM unit, lot, measurement, measurement_unit, measurement_type
WHERE lot_id = fk_lot_id AND fk_unit_id = unit_id AND fk_measurement_unit_id = measurement_unit_id AND
fk_measurement_type_id = measurement_type_id AND
lot_id = _lotID AND measurement_type.value LIKE _type;
END;$BODY$
LANGUAGE 'plpgsql' VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION get_measurements(bigint, text) OWNER TO postgres;

As you can see, their select statement is the same. Except stored procedure is having additional 'QUERY'. Does that make the speed difference?

Thanks and Regards
Yan Cheng CHEOK

p/s I have index on measurement_type table using :

CREATE INDEX idx_measurement_type_value
ON measurement_type
USING btree
(value);

#2A. Kretschmer
andreas.kretschmer@schollglas.com
In reply to: Yan Cheng Cheok (#1)
Re: Why Stored Procedure is Slower In The Following Case?

In response to Yan Cheng Cheok :

As you can see, their select statement is the same. Except stored
procedure is having additional 'QUERY'. Does that make the speed
difference?

No. The problem is, the planner don't know the actual parameters.
Therefore the planner picked out a wrong plan (seq-scan instead of
index-scan). You can avoid this by rewrite your function: use a dynamic
query, use EXECUTE.

Read more:
http://www.postgresql.org/docs/8.4/interactive/plpgsql-control-structures.html
Chapter 38.6.1.2. RETURN NEXT and RETURN QUERY

Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99

#3Yan Cheng Cheok
yccheok@yahoo.com
In reply to: A. Kretschmer (#2)
Re: Why Stored Procedure is Slower In The Following Case?

Thanks. I am able to solve my problem using the following (EXECUTE) :

CREATE OR REPLACE FUNCTION get_measurements(IN bigint, IN text)
RETURNS TABLE(_type text, _value double precision, _unit text) AS
$BODY$DECLARE
_lotID ALIAS FOR $1;
_type ALIAS FOR $2;
BEGIN
RETURN QUERY EXECUTE 'SELECT measurement_type.value, measurement.value, measurement_unit.value
FROM unit, lot, measurement, measurement_unit, measurement_type
WHERE lot_id = fk_lot_id AND fk_unit_id = unit_id AND fk_measurement_unit_id = measurement_unit_id AND
fk_measurement_type_id = measurement_type_id AND
lot_id = $1 AND measurement_type.value LIKE $2'
USING _lotID, _type;
END;$BODY$
LANGUAGE 'plpgsql' VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION get_measurements(bigint) OWNER TO postgres;

Thanks and Regards
Yan Cheng CHEOK

--- On Wed, 1/20/10, Yan Cheng Cheok <yccheok@yahoo.com> wrote:
Show quoted text

From: Yan Cheng Cheok <yccheok@yahoo.com>
Subject: Why Stored Procedure is Slower In The Following Case?
To: pgsql-general@postgresql.org
Date: Wednesday, January 20, 2010, 3:10 PM
I have a stored procedure, which
perform single RETURN QUERY SELECT..., by taking in 2
function parameters.

It takes around 7 seconds to complete the operation.

When I perform outside stored procedure, with exact same
SELECT statement, it only takes 0.5 seconds, with same
result being returned.

Testing using Stored Procedure
==============================
SELECT * FROM get_measurements(1, 'Pad%');
6949.593 ms

EXPLAIN SELECT * FROM get_measurements(1, 'Pad%');
               
               
QUERY PLAN
---------------------------------------------------------------------------
Function Scan on get_measurements  (cost=0.00..260.00
rows=1000 width=72)
(1 row)

Testing using SQL statement
===========================
SemiconductorInspection=# SELECT measurement_type.value,
measurement.value, measurement_unit.value FROM unit, lot,
measurement,
measurement_unit, measurement_type WHERE lot_id =
fk_lot_id AND fk_unit_id = unit_id AND
fk_measurement_unit_id = measurement_
unit_id AND fk_measurement_type_id = measurement_type_id
AND lot_id = 1 AND measurement_type.value LIKE 'Pad%';
552.234 ms

SemiconductorInspection=# EXPLAIN SELECT
measurement_type.value, measurement.value,
measurement_unit.value FROM unit, lot, meas
urement, measurement_unit, measurement_type WHERE lot_id =
fk_lot_id AND fk_unit_id = unit_id AND
fk_measurement_unit_id = meas
urement_unit_id AND fk_measurement_type_id =
measurement_type_id AND lot_id = 1 AND
measurement_type.value LIKE 'Pad%';
               
               
             
   QUERY PLAN
------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=51.33..44328.65 rows=3629
width=53)
   ->  Index Scan using pk_lot_id on
lot  (cost=0.00..8.27 rows=1 width=8)
         Index Cond: (lot_id
= 1)
   ->  Hash Join 
(cost=51.33..44284.10 rows=3629 width=61)
         Hash Cond:
(measurement.fk_measurement_unit_id =
measurement_unit.measurement_unit_id)
         ->  Hash
Join  (cost=13.65..44196.52 rows=3629 width=33)
           
   Hash Cond:
(measurement.fk_measurement_type_id =
measurement_type.measurement_type_id)
           
   ->  Hash Join 
(cost=11.38..44138.71 rows=5134 width=24)
               
     Hash Cond: (measurement.fk_unit_id
= unit.unit_id)
               
     ->  Seq Scan on
measurement  (cost=0.00..36261.81 rows=2083781
width=24)
               
     ->  Hash 
(cost=10.08..10.08 rows=104 width=16)
               
           -> 
Index Scan using idx_fk_lot_id on unit 
(cost=0.00..10.08 rows=104 width=16)
               
             
   Index Cond: (fk_lot_id = 1)
           
   ->  Hash  (cost=1.76..1.76
rows=41 width=17)
               
     ->  Seq Scan on
measurement_type  (cost=0.00..1.76 rows=41 width=17)
               
           Filter: (value
~~ 'Pad%'::text)
         -> 
Hash  (cost=22.30..22.30 rows=1230 width=36)
           
   ->  Seq Scan on
measurement_unit  (cost=0.00..22.30 rows=1230
width=36)
(18 rows)

The content for stored procedure is as follow :

CREATE OR REPLACE FUNCTION get_measurements(IN bigint, IN
text)
  RETURNS TABLE(_type text, _value double precision,
_unit text) AS
$BODY$DECLARE
    _lotID ALIAS FOR $1;
    _type ALIAS FOR $2;
BEGIN
    RETURN QUERY SELECT measurement_type.value,
measurement.value, measurement_unit.value
    FROM unit, lot, measurement,
measurement_unit, measurement_type
    WHERE lot_id = fk_lot_id AND fk_unit_id =
unit_id AND fk_measurement_unit_id = measurement_unit_id AND

          fk_measurement_type_id =
measurement_type_id AND
          lot_id = _lotID AND
measurement_type.value LIKE _type;
END;$BODY$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100
  ROWS 1000;
ALTER FUNCTION get_measurements(bigint, text) OWNER TO
postgres;

As you can see, their select statement is the same. Except
stored procedure is having additional 'QUERY'. Does that
make the speed difference?

Thanks and Regards
Yan Cheng CHEOK

p/s I have index on measurement_type table using :

CREATE INDEX idx_measurement_type_value
  ON measurement_type
  USING btree
  (value);