Immutable function WAY slower than Stable function?

Started by Ken Tanzerover 7 years ago19 messagesgeneral
Jump to latest
#1Ken Tanzer
ken.tanzer@gmail.com

Hi. I was recently troubleshooting a function, and realized it had
incorrectly been declared as Immutable, when it should have been declared
Stable. When I changed it to Stable, the query I was running ran
dramatically faster. Digging into this a little more, this is what I found:

I've got a function (staff_inspector) that takes two arguments, an integer
and a date.

I've got a convenience function that takes just an integer, and fills in
the date. (With a Stable function target_date()).

There is no performance difference between the two functions if both
arguments are supplied.

If I use the convenience function however, the difference is dramatic. The
Stable version clocks in around 1.3 seconds, and the immutable version at
around 23 seconds.

So I'm wondering if this is expected behavior, and if someone could explain
it to me. Also, if it is expected, I'm wondering whether it's worth noting
in Section 36.6 ("Function Volatility Categories"), which led me to believe
I might be risking a stale value by marking something as Immutable, but
offers no hint about a ginormous performance penalty.

Here's some more detail and information.

I created two versions of the underlying function.
(staff_inspector_stable,staff_inspector_imm). There is no noticeable
performance difference between them. I created two versions of the
convenience function, si_stable and si_imm. They are identical, except for
being declared Stable or Immutable.

The first time I run any of these queries, they take about two seconds.
All subsequent calls take about 1.3 seconds. Except the Immutable version,
which clocks in at 20+ seconds. This is true whether it calls the Stable
or Immutable version of the underlying function:

spc=> EXPLAIN ANALYZE SELECT
client_id,staff_inspector_stable(client_id,target_date()) FROM
tbl_residence_own;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on tbl_residence_own (cost=0.00..3734.27 rows=6977 width=8)
(actual time=8.311..1990.601 rows=6983 loops=1)
Planning time: 1.976 ms
Execution time: 2001.247 ms
(3 rows)

spc=> EXPLAIN ANALYZE SELECT
client_id,staff_inspector_stable(client_id,target_date()) FROM
tbl_residence_own;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on tbl_residence_own (cost=0.00..3734.27 rows=6977 width=8)
(actual time=3.472..1298.537 rows=6983 loops=1)
Planning time: 0.279 ms
Execution time: 1310.831 ms
(3 rows)

spc=> EXPLAIN ANALYZE SELECT
client_id,staff_inspector_imm(client_id,target_date()) FROM
tbl_residence_own;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on tbl_residence_own (cost=0.00..3734.27 rows=6977 width=8)
(actual time=3.780..1299.082 rows=6983 loops=1)
Planning time: 0.308 ms
Execution time: 1311.379 ms
(3 rows)

spc=> EXPLAIN ANALYZE SELECT client_id,si_stable(client_id) FROM
tbl_residence_own;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on tbl_residence_own (cost=0.00..3734.27 rows=6977 width=8)
(actual time=3.145..1300.551 rows=6983 loops=1)
Planning time: 0.281 ms
Execution time: 1312.762 ms
(3 rows)
*spc=> EXPLAIN ANALYZE SELECT client_id,si_imm(client_id) FROM
tbl_residence_own;
* QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Seq Scan on tbl_residence_own (cost=0.00..1990.02 rows=6977 width=8)
(actual time=3.537..22892.481 rows=6983 loops=1)
Planning time: 0.079 ms* Execution time: 22903.504 ms
*(3 rows)

spc=> EXPLAIN ANALYZE SELECT client_id,si_stable_calls_imm(client_id)
FROM tbl_residence_own;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on tbl_residence_own (cost=0.00..3734.27 rows=6977 width=8)
(actual time=2.907..1291.235 rows=6983 loops=1)
Planning time: 0.223 ms
Execution time: 1303.488 ms
(3 rows)
*spc=> EXPLAIN ANALYZE SELECT client_id,si_imm_calls_imm(client_id)
FROM tbl_residence_own;
* QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Seq Scan on tbl_residence_own (cost=0.00..1990.02 rows=6977 width=8)
(actual time=3.664..22868.734 rows=6983 loops=1)
Planning time: 0.134 ms* Execution time: 22879.761 ms
*(3 rows)

And then here is the definitions of the functions:

CREATE OR REPLACE FUNCTION staff_inspector_stable( client INTEGER,
asof DATE ) RETURNS INTEGER AS $$
SELECT
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE asof BETWEEN reg_spc_date
AND COALESCE(reg_spc_date_end,asof) AND client_id=client LIMIT 1)
ILIKE 'SSP%' THEN
--- SSP answer
    COALESCE((SELECT staff_id FROM staff_employment_current WHERE
staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT
1),(SELECT staff_id FROM staff_employment_current WHERE
staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE
---SPC answer
    (SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN
l_housing_project USING (housing_project_code) WHERE client_id=client
AND asof BETWEEN residence_date AND COALESCE(residence_date_end,asof)
AND NOT ro.is_deleted LIMIT 1)

END
--LIMIT 1
$$ LANGUAGE SQL STABLE;

CREATE OR REPLACE FUNCTION staff_inspector_imm( client INTEGER, asof
DATE ) RETURNS INTEGER AS $$
SELECT
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE asof BETWEEN reg_spc_date
AND COALESCE(reg_spc_date_end,asof) AND client_id=client LIMIT 1)
ILIKE 'SSP%' THEN
--- SSP answer
    COALESCE((SELECT staff_id FROM staff_employment_current WHERE
staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT
1),(SELECT staff_id FROM staff_employment_current WHERE
staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE
---SPC answer
    (SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN
l_housing_project USING (housing_project_code) WHERE client_id=client
AND asof BETWEEN residence_date AND COALESCE(residence_date_end,asof)
AND NOT ro.is_deleted LIMIT 1)

END
--LIMIT 1
$$ LANGUAGE SQL IMMUTABLE;

CREATE OR REPLACE FUNCTION si_stable( client INTEGER ) RETURNS INTEGER AS $$
SELECT staff_inspector_stable(client,target_date());
$$ LANGUAGE SQL STABLE;

CREATE OR REPLACE FUNCTION si_imm( client INTEGER ) RETURNS INTEGER AS $$
SELECT staff_inspector_stable(client,target_date());
$$ LANGUAGE SQL IMMUTABLE;

CREATE OR REPLACE FUNCTION si_stable_calls_imm( client INTEGER )
RETURNS INTEGER AS $$
SELECT staff_inspector_imm(client,target_date());
$$ LANGUAGE SQL STABLE;

CREATE OR REPLACE FUNCTION si_imm_calls_imm( client INTEGER ) RETURNS
INTEGER AS $$
SELECT staff_inspector_imm(client,target_date());
$$ LANGUAGE SQL IMMUTABLE;

And the version info (Postgres on Centos 6.10)

SELECT version();

version
----------------------------------------------------------------------------------------------------------
PostgreSQL 9.6.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-18), 64-bit
(1 row)

Happy to provide any additional relevant info, or for someone to point out
what obvious thing I'm overlooking. Thanks in advance!

Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/&gt;*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client&gt;*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ken Tanzer (#1)
Re: Immutable function WAY slower than Stable function?

Ken Tanzer <ken.tanzer@gmail.com> writes:

Hi. I was recently troubleshooting a function, and realized it had
incorrectly been declared as Immutable, when it should have been declared
Stable. When I changed it to Stable, the query I was running ran
dramatically faster. Digging into this a little more, this is what I found:

EXPLAIN VERBOSE might shed a bit more light. I suspect you're falling
foul of the rather arcane rules about whether a SQL function can be
inlined. (Although it seems like only the wrapper function would be
inline-able anyway, so I'm not sure why the penalty is that high.)

regards, tom lane

#3Ken Tanzer
ken.tanzer@gmail.com
In reply to: Tom Lane (#2)
Re: Immutable function WAY slower than Stable function?

On Mon, Aug 6, 2018 at 4:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ken Tanzer <ken.tanzer@gmail.com> writes:

Hi. I was recently troubleshooting a function, and realized it had
incorrectly been declared as Immutable, when it should have been declared
Stable. When I changed it to Stable, the query I was running ran
dramatically faster. Digging into this a little more, this is what I

found:

EXPLAIN VERBOSE might shed a bit more light. I suspect you're falling
foul of the rather arcane rules about whether a SQL function can be
inlined. (Although it seems like only the wrapper function would be
inline-able anyway, so I'm not sure why the penalty is that high.)

regards, tom lane

Thanks Tom. This was as talkative as I could get it. I do see the
difference on the Output line, but I don't know how to interperet it.

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT
client_id,si_imm(client_id) FROM tbl_residence_own;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.tbl_residence_own (cost=0.00..1990.02 rows=6977
width=8) (actual time=3.771..22665.604 rows=6983 loops=1)
Output: client_id, si_imm(client_id)
Buffers: shared hit=199814
Planning time: 0.156 ms
Execution time: 22677.333 ms
(5 rows)

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT
client_id,si_stable(client_id) FROM tbl_residence_own;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.tbl_residence_own (cost=0.00..3734.27 rows=6977
width=8) (actual time=3.100..1302.888 rows=6983 loops=1)
Output: client_id, staff_inspector_stable(client_id, target_date())
Buffers: shared hit=60174
Planning time: 0.354 ms
Execution time: 1315.746 ms
(5 rows)

Cheers,
Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/&gt;*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client&gt;*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

#4Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ken Tanzer (#1)
Re: Immutable function WAY slower than Stable function?

On 08/06/2018 03:49 PM, Ken Tanzer wrote:

Hi.  I was recently troubleshooting a function, and realized it had
incorrectly been declared as Immutable, when it should have been
declared Stable.  When I changed it to Stable, the query I was running
ran dramatically faster. Digging into this a little more, this is what I
found:

Happy to provide any additional relevant info, or for someone to point
out what obvious thing I'm overlooking.  Thanks in advance!

What is the definition for target_date()?

Ken

--

--
Adrian Klaver
adrian.klaver@aklaver.com

#5Ken Tanzer
ken.tanzer@gmail.com
In reply to: Adrian Klaver (#4)
Re: Immutable function WAY slower than Stable function?

On Mon, Aug 6, 2018 at 4:36 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

What is the definition for target_date()?

Hi Adrian. Happy to provide this info. Though on a side note, I don't
understand why it should matter, if functions are black box optimization
fences. But here are the definitions:

CREATE OR REPLACE FUNCTION target_date() RETURNS date AS $$
SELECT target_date FROM target_date_current;
$$ LANGUAGE sql STABLE;

The target_date table and views:

CREATE TABLE tbl_target_date (
target_date_id SERIAL PRIMARY KEY,
target_date DATE NOT NULL,
effective_at TIMESTAMP NOT NULL DEFAULT current_timestamp,
comment TEXT,
--system fields
added_by INTEGER NOT NULL REFERENCES tbl_staff (staff_id),
added_at TIMESTAMP(0) NOT NULL DEFAULT CURRENT_TIMESTAMP,
changed_by INTEGER NOT NULL REFERENCES tbl_staff (staff_id),
changed_at TIMESTAMP(0) NOT NULL DEFAULT CURRENT_TIMESTAMP,
is_deleted BOOLEAN NOT NULL DEFAULT FALSE,
deleted_at TIMESTAMP(0),
deleted_by INTEGER REFERENCES tbl_staff(staff_id),
deleted_comment TEXT,
sys_log TEXT
);

CREATE VIEW target_date AS SELECT * FROM tbl_target_date WHERE NOT is_deleted;

CREATE VIEW target_date_current AS
SELECT * FROM target_date ORDER BY effective_at DESC LIMIT 1;

CREATE OR REPLACE FUNCTION target_date_no_edit_or_delete() RETURNS TRIGGER AS $$

BEGIN
IF (TG_OP <> 'INSERT')
THEN RAISE EXCEPTION 'Target records cannot be changed or deleted.
(Attempted operation: %)',TG_OP;
END IF;
IF (NEW.target_date <> date_trunc('month',NEW.target_date))
THEN RAISE EXCEPTION 'Target date must be the first of a month';
END IF;
IF (NEW.target_date <= target_date())
THEN RAISE EXCEPTION 'Target date can only be moved forward';
END IF;
RETURN NEW;
END;

$$ LANGUAGE plpgsql;

CREATE TRIGGER protect_target_date BEFORE INSERT OR UPDATE OR DELETE
ON tbl_target_date FOR EACH ROW EXECUTE PROCEDURE
target_date_no_edit_or_delete();
CREATE TRIGGER target_date_no_trunacte BEFORE TRUNCATE ON
tbl_target_date FOR STATEMENT EXECUTE PROCEDURE
target_date_no_edit_or_delete();

And the tbl_residence_own, which was referenced in my sample queries:

spc=> \d tbl_residence_own
Table
"public.tbl_residence_own"
Column | Type |
Modifiers
------------------------------+--------------------------------+------------------------------------------------------------------------------
residence_own_id | integer | not
null default nextval('tbl_residence_own_residence_own_id_seq'::regclass)
client_id | integer | not null
housing_project_code | character varying(10) | not null
housing_unit_code | character varying(10) | not null
residence_date | date | not null
residence_date_end | date |
unit_rent_manual | numeric(7,2) |
utility_allowance_manual | numeric(7,2) |
is_active_manual | boolean | not
null default true
was_received_hap | boolean |
was_received_compliance | boolean |
moved_from_code | character varying(10) |
chronic_homeless_status_code | character varying(10) |
lease_on_file | boolean |
moved_to_code | character varying(10) |
departure_type_code | character varying(10) |
departure_reason_code | character varying(10) |
move_out_was_code | character varying(10) |
returned_homeless | boolean |
was_deposit_returned | boolean |
comment_damage | text |
comment_deposit | text |
comment | text |
old_access_id | character varying |
old_utility_allowance | numeric(9,2) |
added_by | integer | not null
added_at | timestamp(0) without time zone | not
null default now()
changed_by | integer | not null
changed_at | timestamp(0) without time zone | not
null default now()
is_deleted | boolean | default false
deleted_at | timestamp(0) without time zone |
deleted_by | integer |
deleted_comment | text |
sys_log | text |
tenant_pays_deposit | boolean | not
null default false
is_coordinated_entry | boolean |
referral_source | text |
Indexes:
"tbl_residence_own_pkey" PRIMARY KEY, btree (residence_own_id)
"tbl_residence_own_client_id" btree (client_id)
"tbl_residence_own_housing_project_code" btree (housing_project_code)
"tbl_residence_own_housing_unit_code" btree (housing_unit_code)
"tbl_residence_own_is_deleted" btree (is_deleted)
"tbl_residence_own_residence_date" btree (residence_date)
"tbl_residence_own_residence_date_end" btree (residence_date_end)
Check constraints:
"coordinated_entry_or_other" CHECK (xor(is_coordinated_entry,
referral_source IS NOT NULL))
"date_sanity" CHECK (residence_date_end IS NULL OR residence_date
<= residence_date_end)
Foreign-key constraints:
"tbl_residence_own_added_by_fkey" FOREIGN KEY (added_by)
REFERENCES tbl_staff(staff_id)
"tbl_residence_own_changed_by_fkey" FOREIGN KEY (changed_by)
REFERENCES tbl_staff(staff_id)
"tbl_residence_own_chronic_homeless_status_code_fkey" FOREIGN KEY
(chronic_homeless_status_code) REFERENCES
tbl_l_chronic_homeless_status(chronic_homeless_status_code)
"tbl_residence_own_client_id_fkey" FOREIGN KEY (client_id)
REFERENCES tbl_client(client_id)
"tbl_residence_own_deleted_by_fkey" FOREIGN KEY (deleted_by)
REFERENCES tbl_staff(staff_id)
"tbl_residence_own_departure_reason_code_fkey" FOREIGN KEY
(departure_reason_code) REFERENCES
tbl_l_departure_reason(departure_reason_code)
"tbl_residence_own_departure_type_code_fkey" FOREIGN KEY
(departure_type_code) REFERENCES
tbl_l_departure_type(departure_type_code)
"tbl_residence_own_housing_project_code_fkey" FOREIGN KEY
(housing_project_code) REFERENCES
tbl_l_housing_project(housing_project_code)
"tbl_residence_own_housing_unit_code_fkey" FOREIGN KEY
(housing_unit_code) REFERENCES tbl_housing_unit(housing_unit_code)
"tbl_residence_own_move_out_was_code_fkey" FOREIGN KEY
(move_out_was_code) REFERENCES tbl_l_exit_status(exit_status_code)
"tbl_residence_own_moved_from_code_fkey" FOREIGN KEY
(moved_from_code) REFERENCES tbl_l_facility(facility_code)
"tbl_residence_own_moved_to_code_fkey" FOREIGN KEY (moved_to_code)
REFERENCES tbl_l_facility(facility_code)
Triggers:
check_max_occupant AFTER INSERT OR UPDATE ON tbl_residence_own FOR
EACH ROW EXECUTE PROCEDURE enforce_max_occupant()
tbl_residence_own_log_chg AFTER INSERT OR DELETE OR UPDATE ON
tbl_residence_own FOR EACH ROW EXECUTE PROCEDURE table_log()
tbl_residence_own_no_unit_or_project_change BEFORE UPDATE ON
tbl_residence_own FOR EACH ROW EXECUTE PROCEDURE
tbl_residence_own_validate_modify()

Let me know if I can provide more info. Thanks!

Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/&gt;*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client&gt;*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

#6Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ken Tanzer (#5)
Re: Immutable function WAY slower than Stable function?

On 08/06/2018 04:44 PM, Ken Tanzer wrote:

On Mon, Aug 6, 2018 at 4:36 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

What is the definition for target_date()?

Hi Adrian.  Happy to provide this info.  Though on a side note, I don't
understand why it should matter, if functions are black box optimization

Black boxes the bane of my existence, different field, control
circuits:) My interest was because you are chaining functions and a
chain is only as strong as it's weakest link. From the surface I am not
seeing anything, will need to mull it over a bit.

fences.  But here are the definitions:

--
Adrian Klaver
adrian.klaver@aklaver.com

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ken Tanzer (#5)
Re: Immutable function WAY slower than Stable function?

Ken Tanzer <ken.tanzer@gmail.com> writes:

Hi Adrian. Happy to provide this info. Though on a side note, I don't
understand why it should matter, if functions are black box optimization
fences.

They aren't, at least not when they are SQL-language functions that
meet the conditions for inlining. The reason that EXPLAIN VERBOSE
is helpful here is that you can see whether the function got inlined
or not: do you see a call to the function, or a representation of
its body?

regards, tom lane

#8Ken Tanzer
ken.tanzer@gmail.com
In reply to: Tom Lane (#7)
Re: Immutable function WAY slower than Stable function?

On Mon, Aug 6, 2018 at 7:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ken Tanzer <ken.tanzer@gmail.com> writes:

Hi Adrian. Happy to provide this info. Though on a side note, I don't
understand why it should matter, if functions are black box optimization
fences.

They aren't, at least not when they are SQL-language functions that
meet the conditions for inlining.

Yeah, I kinda realized after I sent this that I wasn't really making much
sense. :) I appreciate your explanation though.

The reason that EXPLAIN VERBOSE

is helpful here is that you can see whether the function got inlined
or not: do you see a call to the function, or a representation of
its body?

I wasn't able to get anything more VERBOSE than what I sent before. I

re-copied it below. (If there's another command or option I should be
using, please advise.)

There is the difference in the two output lines, which I guess suggests
that the stable one got inlined and the immutable one didn't?

* Output: client_id, si_imm(client_id)
*

* Output: client_id, staff_inspector_stable(client_id, target_date())
*

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_imm(client_id)
FROM tbl_residence_own;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.tbl_residence_own (cost=0.00..1990.02 rows=6977
width=8) (actual time=3.771..22665.604 rows=6983 loops=1)* Output:
client_id, si_imm(client_id)
* Buffers: shared hit=199814
Planning time: 0.156 ms
Execution time: 22677.333 ms
(5 rows)

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT
client_id,si_stable(client_id) FROM tbl_residence_own;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.tbl_residence_own (cost=0.00..3734.27 rows=6977
width=8) (actual time=3.100..1302.888 rows=6983 loops=1)* Output:
client_id, staff_inspector_stable(client_id, target_date())
* Buffers: shared hit=60174
Planning time: 0.354 ms
Execution time: 1315.746 ms
(5 rows)

Cheers,
Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/&gt;*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client&gt;*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

#9Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Ken Tanzer (#3)
Re: Immutable function WAY slower than Stable function?

Ken Tanzer wrote:

On Mon, Aug 6, 2018 at 4:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ken Tanzer <ken.tanzer@gmail.com> writes:

Hi. I was recently troubleshooting a function, and realized it had
incorrectly been declared as Immutable, when it should have been declared
Stable. When I changed it to Stable, the query I was running ran
dramatically faster. Digging into this a little more, this is what I found:

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_imm(client_id) FROM tbl_residence_own;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.tbl_residence_own (cost=0.00..1990.02 rows=6977 width=8) (actual time=3.771..22665.604 rows=6983 loops=1)
Output: client_id, si_imm(client_id)
Buffers: shared hit=199814
Planning time: 0.156 ms
Execution time: 22677.333 ms
(5 rows)

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT client_id,si_stable(client_id) FROM tbl_residence_own;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.tbl_residence_own (cost=0.00..3734.27 rows=6977 width=8) (actual time=3.100..1302.888 rows=6983 loops=1)
Output: client_id, staff_inspector_stable(client_id, target_date())
Buffers: shared hit=60174
Planning time: 0.354 ms
Execution time: 1315.746 ms
(5 rows)

That seems to say that the _stable function is running much faster.
Buth functions don't get inlined.

I'd dig into the functions and find out how long the queries in
them take. auto_explain is a good helper for that.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#10Ken Tanzer
ken.tanzer@gmail.com
In reply to: Laurenz Albe (#9)
Re: Immutable function WAY slower than Stable function?

On Mon, Aug 6, 2018 at 11:05 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

Ken Tanzer wrote:

On Mon, Aug 6, 2018 at 4:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ken Tanzer <ken.tanzer@gmail.com> writes:

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT

client_id,si_imm(client_id) FROM tbl_residence_own;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------

Seq Scan on public.tbl_residence_own (cost=0.00..1990.02 rows=6977

width=8) (actual time=3.771..22665.604 rows=6983 loops=1)

Output: client_id, si_imm(client_id)
Buffers: shared hit=199814
Planning time: 0.156 ms
Execution time: 22677.333 ms
(5 rows)

spc=> EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT

client_id,si_stable(client_id) FROM tbl_residence_own;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------

Seq Scan on public.tbl_residence_own (cost=0.00..3734.27 rows=6977

width=8) (actual time=3.100..1302.888 rows=6983 loops=1)

Output: client_id, staff_inspector_stable(client_id, target_date())
Buffers: shared hit=60174
Planning time: 0.354 ms
Execution time: 1315.746 ms
(5 rows)

That seems to say that the _stable function is running much faster.

Agreed!

Buth functions don't get inlined.

OK, I clearly don't understand the output lines. What does it mean then

that the stable output line shows the underlying (two-argument) function,
while the immutable one shows the convenience function?

Output: client_id, si_imm(client_id)
Output: client_id, staff_inspector_stable(client_id, target_date())

I'd dig into the functions and find out how long the queries in

them take. auto_explain is a good helper for that.

I'm definitely not understanding why or how auto-explain would help here.
(Also, don't overlook the fact that both si_stable and si_imm have the
exact same definition (except for stable v. immutable), and are calling the
same function (staff_inspector_stable)). I can load up the auto_explain
stuff if it really will help, but in the meantime what I did was pull the
query out of staff_inspector_stable, and inserted into the "SELECT
client_id,staff_inspector( ... ) FROM tbl_residence_own" query, replacing
the "client" (parameter 1) with the client_id from the tbl_residence_own,
and replacing "asof" (parameter 2) with target_date(), which seems like it
should closely mimic the original query.

I ended up with the query below, and have attached the explain output to
this email. The actual timing is:

Planning time: 2.737 ms
Execution time: 821.034 ms

Which is somewhat better than the function versions, though it seems
unsurprisingly and not dramatically so.

EXPLAIN (VERBOSE,ANALYZE,BUFFERS) SELECT client_id,
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE target_date() BETWEEN
reg_spc_date AND COALESCE(reg_spc_date_end,target_date()) AND
client_id=tro.client_id LIMIT 1)
ILIKE 'SSP%' THEN
COALESCE((SELECT staff_id FROM staff_employment_current WHERE
staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT
1),(SELECT staff_id FROM staff_employment_current WHERE
staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE
(SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN
l_housing_project USING (housing_project_code) WHERE client_id=5538
AND target_date() BETWEEN residence_date AND
COALESCE(residence_date_end,target_date()) AND NOT ro.is_deleted LIMIT
1)
END
FROM tbl_residence_own tro;

Let me know if that is not helpful. Or if you need something from
auto-explain, please help me with some more specifics. Thanks!

Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/&gt;*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client&gt;*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

Attachments:

explain.staff_inspector_query.sqlapplication/octet-stream; name=explain.staff_inspector_query.sqlDownload
#11Ken Tanzer
ken.tanzer@gmail.com
In reply to: Ken Tanzer (#10)
Re: Immutable function WAY slower than Stable function?

On Tue, Aug 7, 2018 at 12:05 AM Ken Tanzer <ken.tanzer@gmail.com> wrote:

I'm definitely not understanding why or how auto-explain would help here.
(Also, don't overlook the fact that both si_stable and si_imm have the
exact same definition (except for stable v. immutable), and are calling the
same function (staff_inspector_stable)). I can load up the auto_explain
stuff if it really will help, but in the meantime what I did was pull the
query out of staff_inspector_stable, and inserted into the "SELECT
client_id,staff_inspector( ... ) FROM tbl_residence_own" query, replacing
the "client" (parameter 1) with the client_id from the tbl_residence_own,
and replacing "asof" (parameter 2) with target_date(), which seems like it
should closely mimic the original query.

I ended up with the query below, and have attached the explain output to
this email. The actual timing is:

Planning time: 2.737 ms
Execution time: 821.034 ms

Which is somewhat better than the function versions, though it seems
unsurprisingly and not dramatically so.

EXPLAIN (VERBOSE,ANALYZE,BUFFERS) SELECT client_id,
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE target_date() BETWEEN reg_spc_date AND COALESCE(reg_spc_date_end,target_date()) AND client_id=tro.client_id LIMIT 1)
ILIKE 'SSP%' THEN
COALESCE((SELECT staff_id FROM staff_employment_current WHERE staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT 1),(SELECT staff_id FROM staff_employment_current WHERE staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE
(SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN l_housing_project USING (housing_project_code) WHERE client_id=5538 AND target_date() BETWEEN residence_date AND COALESCE(residence_date_end,target_date()) AND NOT ro.is_deleted LIMIT 1)
END
FROM tbl_residence_own tro;

Whoops, scratch that previous explain and query. I accidentally left in a
hard-coded client_id from earlier testing. The correct query is:

EXPLAIN (VERBOSE,ANALYZE,BUFFERS) SELECT client_id,
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE target_date() BETWEEN
reg_spc_date AND COALESCE(reg_spc_date_end,target_date()) AND
client_id=tro.client_id LIMIT 1)
ILIKE 'SSP%' THEN

COALESCE((SELECT staff_id FROM staff_employment_current WHERE
staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT
1),(SELECT staff_id FROM staff_employment_current WHERE
staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE

(SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN
l_housing_project USING (housing_project_code) WHERE
client_id=tro.client_id AND target_date() BETWEEN residence_date AND
COALESCE(residence_date_end,target_date()) AND NOT ro.is_deleted LIMIT
1)
END
FROM tbl_residence_own tro;

The corrected explain output is attached, and the actual timing was:

Planning time: 2.741 ms
Execution time: 2538.277 ms

Sorry for the confusion! It's been a long day, and filled with many 20+
second waits for queries to finish. :)

Cheers,
Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/&gt;*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client&gt;*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

Attachments:

explain.staff_inspector_query_corrected.sqlapplication/octet-stream; name=explain.staff_inspector_query_corrected.sqlDownload
#12Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Ken Tanzer (#10)
Re: Immutable function WAY slower than Stable function?

Ken Tanzer wrote:

Buth functions don't get inlined.

OK, I clearly don't understand the output lines. What does it mean then that the stable
output line shows the underlying (two-argument) function, while the immutable one shows
the convenience function?

Output: client_id, si_imm(client_id)
Output: client_id, staff_inspector_stable(client_id, target_date())

I was wrong, obviously "si_imm" gets inlined and replaced with the other
function call in the STABLE case.

I'd dig into the functions and find out how long the queries in
them take. auto_explain is a good helper for that.

I'm definitely not understanding why or how auto-explain would help here. (Also, don't
overlook the fact that both si_stable and si_imm have the exact same definition
(except for stable v. immutable), and are calling the same function (staff_inspector_stable)).

Let me know if that is not helpful. Or if you need something from auto-explain,
please help me with some more specifics. Thanks!

If you set

shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 0
auto_explain.log_analyze = on
auto_explain.log_buffers = on
auto_explain.log_verbose = on
auto_explain.log_nested_statements = on

and run both the slow and the fast query, the log will contain the execution plans and
time for all SQL statements that were called from the functions.

Then you can identify in which nested query the time is spent, which should give us
more material to determine the cause.

The most likely explanation for the difference is that the same query is running with
different execution plans in both cases.

Do you notice a difference if you start a new database session and run the queries
several times? Is there a difference in execution time from the sixth execution on?
If yes, generic plans may be part of the problem.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#13Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ken Tanzer (#11)
Re: Immutable function WAY slower than Stable function?

On 08/07/2018 12:38 AM, Ken Tanzer wrote:

On Tue, Aug 7, 2018 at 12:05 AM Ken Tanzer <ken.tanzer@gmail.com
<mailto:ken.tanzer@gmail.com>> wrote:

Whoops, scratch that previous explain and query.  I accidentally left in
a hard-coded client_id from earlier testing.  The correct query is:

EXPLAIN (VERBOSE,ANALYZE,BUFFERS) SELECT client_id,
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE target_date() BETWEEN reg_spc_date AND COALESCE(reg_spc_date_end,target_date()) AND client_id=tro.client_id LIMIT 1)
ILIKE 'SSP%' THEN

COALESCE((SELECT staff_id FROM staff_employment_current WHERE staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT 1),(SELECT staff_id FROM staff_employment_current WHERE staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE

(SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN l_housing_project USING (housing_project_code) WHERE client_id=tro.client_id AND target_date() BETWEEN residence_date AND COALESCE(residence_date_end,target_date()) AND NOT ro.is_deleted LIMIT 1)
END
FROM tbl_residence_own tro;

The corrected explain output is attached, and the actual timing was:

 Planning time: 2.741 ms
 Execution time: 2538.277 ms

Sorry for the confusion!  It's been a long day, and filled with many 20+
second waits for queries to finish. :)

Hmm, whatever it is looks to be tied to si_* being IMMUTABLE. Just a
thought but have you tried(NOTE: DEFAULT value):

CREATE OR REPLACE FUNCTION staff_inspector_stable( client INTEGER, asof 
DATE DEFAULT target_date()) RETURNS INTEGER AS $$
SELECT
CASE WHEN
(SELECT program_type_code FROM reg_spc WHERE asof BETWEEN reg_spc_date 
AND COALESCE(reg_spc_date_end,asof) AND client_id=client LIMIT 1)
ILIKE 'SSP%' THEN
--- SSP answer
     COALESCE((SELECT staff_id FROM staff_employment_current WHERE 
staff_position_code='COORD_PROP' AND agency_project_code='SSP' LIMIT 
1),(SELECT staff_id FROM staff_employment_current WHERE 
staff_position_code='MGRPROJ' AND agency_project_code='SSP' LIMIT 1))
ELSE
---SPC answer
     (SELECT staff_inspector FROM tbl_residence_own ro LEFT JOIN 
l_housing_project USING (housing_project_code) WHERE client_id=client 
AND asof BETWEEN residence_date AND COALESCE(residence_date_end,asof) 
AND NOT ro.is_deleted LIMIT 1)

END
--LIMIT 1
$$ LANGUAGE SQL STABLE;

Cheers,
Ken

--
Adrian Klaver
adrian.klaver@aklaver.com

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ken Tanzer (#8)
Re: Immutable function WAY slower than Stable function?

Ken Tanzer <ken.tanzer@gmail.com> writes:

On Mon, Aug 6, 2018 at 7:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The reason that EXPLAIN VERBOSE
is helpful here is that you can see whether the function got inlined
or not: do you see a call to the function, or a representation of
its body?

There is the difference in the two output lines, which I guess suggests
that the stable one got inlined and the immutable one didn't?

* Output: client_id, si_imm(client_id)

Yeah, exactly: here we have the original call to si_imm ...

* Output: client_id, staff_inspector_stable(client_id, target_date())

... but here, that's been replaced by si_imm's body. So we've
successfully eliminated one layer of SQL-function call overhead.

The reason this happens in one case and not the other is that one of
the arcane rules for function inlining is that it's not allowed to make
the expression look more volatile than it was before. So if si_imm()
is marked immutable, inlining it is disallowed if it would insert any
stable or volatile functions. I can give you a disquisition on why
that is if you like, but it's not really relevant right now, since
clearly none of these functions should have been marked immutable.

Still, though, we've only eliminated the SQL-function overhead for one
out of three SQL functions that are getting called at each row, so you'd
think the inlining could not save more than 1/3rd of the runtime at the
absolute most. How is it making more than an order-of-magnitude
difference?

I was confused about that too, so I set up a simple test case similar
to Ken's and poked into it a bit, and what I found out is that nested
SQL functions are just about completely broken performance-wise,
unless one or the other gets inlined :-(. The reason is that the SQL
function mechanism is designed to cache the results of parsing and
planning the function body ... but only for the duration of the
calling query. So if you issue a direct SQL command "SELECT myf(x)
FROM big_table", and myf() isn't inline-able, then myf's body gets
parsed/planned once and then re-used at each row of the table. That's
not too awful. But each execution of myf's body counts as a separate
"calling query" for this purpose, so if it's calling some other SQL
function that's also not inline-able, then we re-parse-and-plan that
other function for each row of big_table. *That* is where the time
is going.

I've had a to-do item to rewrite and improve the SQL function cache
mechanism for a long time, but I hadn't thought it was high priority.
Maybe it should be.

In the meantime, I think you could dodge the issue by converting either
level of function into plpgsql. Or just be careful about inline-ability
of convenience wrappers.

regards, tom lane

#15Guyren Howe
guyren@gmail.com
In reply to: Tom Lane (#14)
Re: Immutable function WAY slower than Stable function?

On Aug 7, 2018, at 9:10 , Tom Lane <tgl@sss.pgh.pa.us> wrote:

I've had a to-do item to rewrite and improve the SQL function cache
mechanism for a long time, but I hadn't thought it was high priority.
Maybe it should be.

Personally, I would love this. Far too few developers dig into doing things in the database. Conceptually, doing all the data things in the database should be faster, more versatile, more secure etc.

Modern SQL has tools that can make its awfulness tolerable. CTEs and SQL functions spring immediately to mind. But both suffer from optimization fences and corner cases like this that arguably make it not worth the effort.

#16Ken Tanzer
ken.tanzer@gmail.com
In reply to: Tom Lane (#14)
Re: Immutable function WAY slower than Stable function?

On Tue, Aug 7, 2018 at 9:10 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I was confused about that too, so I set up a simple test case similar
to Ken's and poked into it a bit, and what I found out is that nested
SQL functions are just about completely broken performance-wise,
unless one or the other gets inlined :-(. The reason is that the SQL
function mechanism is designed to cache the results of parsing and
planning the function body ... but only for the duration of the
calling query. So if you issue a direct SQL command "SELECT myf(x)
FROM big_table", and myf() isn't inline-able, then myf's body gets
parsed/planned once and then re-used at each row of the table. That's
not too awful. But each execution of myf's body counts as a separate
"calling query" for this purpose, so if it's calling some other SQL
function that's also not inline-able, then we re-parse-and-plan that
other function for each row of big_table. *That* is where the time
is going.

Glad you were able to dig into this and figure it out!

As a couple of follow-ups:

After I submitted the (corrected) explain for my attempt to inline-by-hand
the SQL from the function directly into the query, that the inlined version
took about twice as long as the call with the function. (2.5 vs 1.3
seconds.) That doesn't really make sense to me. I guess I could use that
auto_explain stuff to see what's going on, but if there's an obvious
answer...

I've wondered in the past about Immutable functions, and particularly the
scope & lifetime of "forever" in "guaranteed to return the same results
given the same arguments forever." I assume that's "for all users and all
sessions," but either in theory or in practice is there a limit to how long
a stale value might persist? And, if you were to drop and recreate a
function with the same name & parameters, would it start fresh at that
point? And is there a way to flush any caching? (It's surely best to just
declare Stable, but I'm wondering about cases that might have _very_
infrequently-changed values.)

Finally, I'm wondering if there's any legitimate reason for a immutable
function to call anything stable or volatile, and if not would it make any
sense to emit a warning when the function is created.

I've had a to-do item to rewrite and improve the SQL function cache
mechanism for a long time, but I hadn't thought it was high priority.
Maybe it should be.

That sounds great. My earlier misguided statement notwithstanding, we've

actually tried to write our functions in SQL wherever possible, assuming
that the performance will be better than with a procedural language.

In the meantime, I think you could dodge the issue by converting either
level of function into plpgsql. Or just be careful about inline-ability
of convenience wrappers.

Definitely yes on the second. And to be clear, my actual problem was
dodged before starting this thread. I was aiming more towards
understanding the behavior, which seemed either amiss or in need of some
documentation. And trying to be Postgres-helpful where I can in a
non-developer way. Kinda like "see something / say something" for software
users. :)

Thanks again.

Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/&gt;*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client&gt;*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

#17Christophe Pettus
xof@thebuild.com
In reply to: Ken Tanzer (#16)
Re: Immutable function WAY slower than Stable function?

On Aug 7, 2018, at 11:42, Ken Tanzer <ken.tanzer@gmail.com> wrote:
I assume that's "for all users and all sessions," but either in theory or in practice is there a limit to how long a stale value might persist? And, if you were to drop and recreate a function with the same name & parameters, would it start fresh at that point? And is there a way to flush any caching? (It's surely best to just declare Stable, but I'm wondering about cases that might have _very_ infrequently-changed values.)

Well, the extreme case is an IMMUTABLE function used to create an expression index; then, the value lasts as long as the index does. The best way to think of an IMMUTABLE is that it is a pure function, unchanged by system state. (This is one of the reasons that datetime-related functions are often STABLE rather than IMMUTABLE, due to time zone changes.)

--
-- Christophe Pettus
xof@thebuild.com

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ken Tanzer (#16)
Re: Immutable function WAY slower than Stable function?

Ken Tanzer <ken.tanzer@gmail.com> writes:

I've wondered in the past about Immutable functions, and particularly the
scope & lifetime of "forever" in "guaranteed to return the same results
given the same arguments forever." I assume that's "for all users and all
sessions," but either in theory or in practice is there a limit to how long
a stale value might persist?

The two issues I can think of offhand are (a) plans that depend on the
assumption of immutability might persist for the life of active sessions;
(b) if you created an index on the result of an immutable function,
changing the function's behavior would likely render the index
inconsistent. (REINDEX would be enough to fix that, though.)

And, if you were to drop and recreate a
function with the same name & parameters, would it start fresh at that
point?

DROP (not CREATE OR REPLACE) would surely get rid of any dependent
values.

Finally, I'm wondering if there's any legitimate reason for a immutable
function to call anything stable or volatile, and if not would it make any
sense to emit a warning when the function is created.

We've been around on that before, and concluded that it's not worth it.
People have in the past used phony-immutable functions to coerce the
optimizer to do what they want --- in fact, that use case is one of the
reasons why we have the inlining restriction. I'm not sure if such
practices are still popular, but somebody who was doing that would not
appreciate nannyism.

regards, tom lane

#19Michal
michal-ok@o2.pl
In reply to: Tom Lane (#14)
Re: Immutable function WAY slower than Stable function?

On 2018.08.07 18:10, Tom Lane wrote:

I've had a to-do item to rewrite and improve the SQL function cache
mechanism for a long time, but I hadn't thought it was high priority.
Maybe it should be.

In the meantime, I think you could dodge the issue by converting either
level of function into plpgsql. Or just be careful about inline-ability
of convenience wrappers.

I now remember I stumbled across this same problem last year when I
needed an immutable unaccent() function for the purpose of creating a
functional index. I simply wrapped unaccent(), which is stable, in an
immutable custom function (and accepting the need to recreate the index
in case when unaccent rules change). I can't remember the specifics but
I was also surprised to learn that query execution plans were not
optimized at all when the function was immutable and in some cases they
were even slower than when the function was stable, quite contrary to
all logic and to what the docs say. So I think it would be great if this
issue were resolved in Postgres.