SQL Functions and plan time
I define an SQL function:
CREATE FUNCTION foo(timestamptz) AS '
...' LANGUAGE 'SQL';
Does the plan for the body get built at creation or execution time? I could
have sworn it was execution time, but timings on one function suggest
creation?
For those new to functions, plpgsql always prepares its plan at "compile" time
so doesn't have actual values available to refine its plan.
--
Richard Huxton
Richard Huxton <dev@archonet.com> writes:
Does the plan for the body get built at creation or execution time? I could
have sworn it was execution time, but timings on one function suggest
creation?
It's planned once per query. There's been some talk of building a
cross-query plan cache for SQL functions, like plpgsql has, but
nothing done yet.
regards, tom lane
It's planned once per query. There's been some talk of building a
cross-query plan cache for SQL functions, like plpgsql has, but
nothing done yet.
Several weeks ago I posted a note about a plpgsql function that fails
if an index that existed when it was created is subsequently dropped.
(I think this should be considered a bug, the existence or nonexistence
of indexes should affect performance, not whether queries fail.)
Would a cross-query plan cache for SQL functions create similar problems?
--
Mike Nolan
nolan@celery.tssi.com writes:
Several weeks ago I posted a note about a plpgsql function that fails
if an index that existed when it was created is subsequently dropped.
No surprise...
Would a cross-query plan cache for SQL functions create similar problems?
Yeah. We need to think about a mechanism for invalidating plans before
we go too much further down the road of caching plans.
regards, tom lane
On Tuesday 08 Jul 2003 6:39 am, Tom Lane wrote:
Richard Huxton <dev@archonet.com> writes:
Does the plan for the body get built at creation or execution time? I
could have sworn it was execution time, but timings on one function
suggest creation?It's planned once per query. There's been some talk of building a
cross-query plan cache for SQL functions, like plpgsql has, but
nothing done yet.
So why am I getting substantially different times for identical queries
(except for parameter substitution)
[41]: LOG: query: SELECT zzz2('2003-07-07 17:00:00+01','2003-07-07 17:20:00+01'); [42-1] LOG: query: [42-2] DELETE FROM stats_telcon WHERE st_hour >= $1 AND st_hour < $2; ...
17:20:00+01');
[42-1] LOG: query:
[42-2] DELETE FROM stats_telcon WHERE st_hour >= $1 AND st_hour < $2;
...
[43]: LOG: duration: 7.524765 sec
[44]: LOG: query: DELETE FROM stats_telcon WHERE st_hour>='2003-07-07 17:00:00+01' AND st_hour<'2003-07-07 17:20:00+01';
17:00:00+01' AND st_hour<'2003-07-07 17:20:00+01';
[45]: LOG: duration: 0.032860 sec
[46-1] LOG: query: INSERT INTO stats_telcon
...
[47]: LOG: duration: 1.810267 sec
I don't think I'm getting caching issues here - these aren't the first run.
--
Richard Huxton
So why am I getting substantially different times for identical queries
(except for parameter substitution)
This sounds like the question I asked two weeks ago.
What I concluded was that though they may be identical queries, they are
not identical in terms of what happens when you execute them.
Why? Because the tuples have been updated, meaning that the affected
rows are in different physical locations than they used to be in both
the table and in any indexes.
Imagine that you have to pick up 10 items at the grocery store. You
have a list of which aisles they're in. Tomorrow you go back
to the store to pick up the same 10 items. However, overnight the
store has moved them all to different locations. Even though you have
an updated list, it will likely take you a different amount of time
to pick up those 10 items.
Running a vacuum analyze and rebuilding indexes between runs should
produce more consistent timings, if consistent timings are important.
Readonly queries running on an otherwise idle server should produce
more consistent timings, subject to caching issues.
--
Mike Nolan
On Tuesday 08 Jul 2003 9:34 am, nolan@celery.tssi.com wrote:
So why am I getting substantially different times for identical queries
(except for parameter substitution)This sounds like the question I asked two weeks ago.
What I concluded was that though they may be identical queries, they are
not identical in terms of what happens when you execute them.Why? Because the tuples have been updated, meaning that the affected
rows are in different physical locations than they used to be in both
the table and in any indexes.
The "SELECT" part, which is the determining factor in all of this is reading
unchanged data from tables unchanged since a vacuum full/analyse. Besides,
the timings are consistent.
Running a vacuum analyze and rebuilding indexes between runs should
produce more consistent timings, if consistent timings are important.Readonly queries running on an otherwise idle server should produce
more consistent timings, subject to caching issues.
Yep - which is why I'm puzzled. It's the readonly part of this that's taking
the extra time. The DELETE/INSERT are to another table - I'm summarising
activity logs into an hourly stats table.
--
Richard Huxton
Richard Huxton <dev@archonet.com> writes:
So why am I getting substantially different times for identical queries
(except for parameter substitution)
[41] LOG: query: SELECT zzz2('2003-07-07 17:00:00+01','2003-07-07
17:20:00+01');
[42-1] LOG: query:
[42-2] DELETE FROM stats_telcon WHERE st_hour >= $1 AND st_hour < $2;
...
[43] LOG: duration: 7.524765 sec
[44] LOG: query: DELETE FROM stats_telcon WHERE st_hour>='2003-07-07
17:00:00+01' AND st_hour<'2003-07-07 17:20:00+01';
[45] LOG: duration: 0.032860 sec
They're not the same query from the planner's viewpoint: one has
constants from which it can infer the number of rows to be fetched,
the other has only parameter symbols.
My guess is that the parameterized query is getting stuck with a seqscan
plan, but it's hard to be sure without more data.
regards, tom lane
On Tuesday 08 Jul 2003 4:33 pm, Tom Lane wrote:
Richard Huxton <dev@archonet.com> writes:
So why am I getting substantially different times for identical queries
(except for parameter substitution)[41] LOG: query: SELECT zzz2('2003-07-07 17:00:00+01','2003-07-07
17:20:00+01');
[42-1] LOG: query:
[42-2] DELETE FROM stats_telcon WHERE st_hour >= $1 AND st_hour < $2;
...
[43] LOG: duration: 7.524765 sec[44] LOG: query: DELETE FROM stats_telcon WHERE st_hour>='2003-07-07
17:00:00+01' AND st_hour<'2003-07-07 17:20:00+01';
[45] LOG: duration: 0.032860 secThey're not the same query from the planner's viewpoint: one has
constants from which it can infer the number of rows to be fetched,
the other has only parameter symbols.My guess is that the parameterized query is getting stuck with a seqscan
plan, but it's hard to be sure without more data.
That was my guess, but I couldn't think of a way to get an EXPLAIN out of the
function. I turned the plan debugging on for both but I'll need some free
time to format it up and figure out what's happening.
Can I ask why, since the plan is constructed at query-time the parameters
aren't substitued *before* planning?
--
Richard Huxton
Richard Huxton <dev@archonet.com> writes:
Can I ask why, since the plan is constructed at query-time the parameters
aren't substitued *before* planning?
Because then the plan couldn't be re-used. A SQL function may be
executed many times in a query, so the plan has to be reusable.
(Or, if you prefer, we have query-level caching of SQL function plans.)
regards, tom lane
On Tuesday 08 Jul 2003 7:48 pm, Tom Lane wrote:
Richard Huxton <dev@archonet.com> writes:
Can I ask why, since the plan is constructed at query-time the parameters
aren't substitued *before* planning?Because then the plan couldn't be re-used. A SQL function may be
executed many times in a query, so the plan has to be reusable.
(Or, if you prefer, we have query-level caching of SQL function plans.)
Ah - I see. Fair enough
Thanks Tom.
--
Richard Huxton