plpgsql functions organisation

Started by Yves Dorfsmanalmost 11 years ago19 messagesgeneral
Jump to latest
#1Yves Dorfsman
yves@zioup.com

I find my plpgsql functions becomes unreadable very quickly. I want to break
them up in smaller functions.

What is the best way to organised them?
Is there any way to define functions inside functions?
When I list functions in psql, I can see them all at the same level, is there
any way to organise them in packages like with python and other languages, so
the smaller functions are hidden away in a package/directory?

Thanks.

--
http://yves.zioup.com
gpg: 4096R/32B0F416

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Melvin Davidson
melvin6925@gmail.com
In reply to: Yves Dorfsman (#1)
Re: plpgsql functions organisation

AFAIK, you cannot "package" functions in PostgreSQL, but it is possible to
call a function from within a function.

That being said, I would seriously look at how and why you are writing your
functions
as functions that call other functions are not very efficient.

Also note that PostgreSQL allows you define functions using Python, so that
might be a possible compromise.

On Sat, May 2, 2015 at 12:53 PM, Yves Dorfsman <yves@zioup.com> wrote:

I find my plpgsql functions becomes unreadable very quickly. I want to
break
them up in smaller functions.

What is the best way to organised them?
Is there any way to define functions inside functions?
When I list functions in psql, I can see them all at the same level, is
there
any way to organise them in packages like with python and other languages,
so
the smaller functions are hidden away in a package/directory?

Thanks.

--
http://yves.zioup.com
gpg: 4096R/32B0F416

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#3Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Yves Dorfsman (#1)
Re: plpgsql functions organisation

On 05/02/2015 09:53 AM, Yves Dorfsman wrote:

I find my plpgsql functions becomes unreadable very quickly. I want to break
them up in smaller functions.

What is the best way to organised them?
Is there any way to define functions inside functions?
When I list functions in psql, I can see them all at the same level, is there
any way to organise them in packages like with python and other languages, so
the smaller functions are hidden away in a package/directory?

The only thing I can think of is to use SCHEMAs;

http://www.postgresql.org/docs/9.4/interactive/sql-createschema.html

Thanks.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Yves Dorfsman
yves@zioup.com
In reply to: Melvin Davidson (#2)
Re: plpgsql functions organisation

On 2015-05-02 11:12, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is possible to
call a function from within a function.

That being said, I would seriously look at how and why you are writing your
functions
as functions that call other functions are not very efficient.

Simply to make long procedures easier to read and follow the logic.

Also note that PostgreSQL allows you define functions using Python, so that
might be a possible compromise.

Thanks.

--
http://yves.zioup.com
gpg: 4096R/32B0F416

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Melvin Davidson (#2)
Re: plpgsql functions organisation

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is possible to
call a function from within a function.

That being said, I would seriously look at how and why you are writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out
'units of work' so they can be combined as needed. Part of that is using
existing functions in new functions/classes. In fact in the Postgres
source I see this in many places. Now it is entirely possible I missed a
memo, so I am open to a more detailed explanation of the inefficiencies
involved.

Also note that PostgreSQL allows you define functions using Python, so
that might be a possible compromise.

On Sat, May 2, 2015 at 12:53 PM, Yves Dorfsman <yves@zioup.com
<mailto:yves@zioup.com>> wrote:

I find my plpgsql functions becomes unreadable very quickly. I want
to break
them up in smaller functions.

What is the best way to organised them?
Is there any way to define functions inside functions?
When I list functions in psql, I can see them all at the same level,
is there
any way to organise them in packages like with python and other
languages, so
the smaller functions are hidden away in a package/directory?

Thanks.

--
http://yves.zioup.com
gpg: 4096R/32B0F416

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org
<mailto:pgsql-general@postgresql.org>)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6Jeff Janes
jeff.janes@gmail.com
In reply to: Adrian Klaver (#5)
Re: plpgsql functions organisation

On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is possible
to
call a function from within a function.

That being said, I would seriously look at how and why you are writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out 'units
of work' so they can be combined as needed. Part of that is using existing
functions in new functions/classes. In fact in the Postgres source I see
this in many places. Now it is entirely possible I missed a memo, so I am
open to a more detailed explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Cheers,

Jeff

#7Bill Moran
wmoran@potentialtech.com
In reply to: Jeff Janes (#6)
Re: plpgsql functions organisation

On Sat, 2 May 2015 14:07:31 -0700
Jeff Janes <jeff.janes@gmail.com> wrote:

On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is possible
to
call a function from within a function.

That being said, I would seriously look at how and why you are writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out 'units
of work' so they can be combined as needed. Part of that is using existing
functions in new functions/classes. In fact in the Postgres source I see
this in many places. Now it is entirely possible I missed a memo, so I am
open to a more detailed explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Maybe that's a roundabout way of saying that if your functions are
complex enough to require calling "sub-functions" they might be
justifying being writting in C?

--
Bill Moran

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Jeff Janes (#6)
Re: plpgsql functions organisation

On 05/02/2015 02:07 PM, Jeff Janes wrote:

On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is
possible to
call a function from within a function.

That being said, I would seriously look at how and why you are
writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out
'units of work' so they can be combined as needed. Part of that is
using existing functions in new functions/classes. In fact in the
Postgres source I see this in many places. Now it is entirely
possible I missed a memo, so I am open to a more detailed
explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Does this actually matter? I am a biologist that backed into computing,
so I realize I am weak on the fundamentals. Still the scientist in me
wants data backing assertions. As I understand it plpgsql works close to
the server and is optimized to do so. I know writing in C would be a
better solution. Still is calling plpgsql functions inside plpgsql
really a bad thing when just considering plpgsql?

Cheers,

Jeff

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Melvin Davidson
melvin6925@gmail.com
In reply to: Bill Moran (#7)
Re: plpgsql functions organisation

Further to the point of saying functions are ineffiencent, consider the
fact that as of the current version of PostgreSQL, plpgsql
functions cannot be pre-optimized. So when they are referenced in a SQL
statement, PostgreSQL (optimizer) has load the
function from the catalogs, which involves overhead. If the function calls
another function, then the process has to be repeated, which involves
additional overhead. Ergo, that is not the most efficient way of doing
things.

On Sat, May 2, 2015 at 5:17 PM, Bill Moran <wmoran@potentialtech.com> wrote:

On Sat, 2 May 2015 14:07:31 -0700
Jeff Janes <jeff.janes@gmail.com> wrote:

On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com

wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is

possible

to
call a function from within a function.

That being said, I would seriously look at how and why you are writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out

'units

of work' so they can be combined as needed. Part of that is using

existing

functions in new functions/classes. In fact in the Postgres source I

see

this in many places. Now it is entirely possible I missed a memo, so I

am

open to a more detailed explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Maybe that's a roundabout way of saying that if your functions are
complex enough to require calling "sub-functions" they might be
justifying being writting in C?

--
Bill Moran

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#10Bill Moran
wmoran@potentialtech.com
In reply to: Adrian Klaver (#8)
Re: plpgsql functions organisation

On Sat, 02 May 2015 15:06:24 -0700
Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 05/02/2015 02:07 PM, Jeff Janes wrote:

On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is
possible to
call a function from within a function.

That being said, I would seriously look at how and why you are
writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out
'units of work' so they can be combined as needed. Part of that is
using existing functions in new functions/classes. In fact in the
Postgres source I see this in many places. Now it is entirely
possible I missed a memo, so I am open to a more detailed
explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Does this actually matter? I am a biologist that backed into computing,
so I realize I am weak on the fundamentals. Still the scientist in me
wants data backing assertions. As I understand it plpgsql works close to
the server and is optimized to do so. I know writing in C would be a
better solution. Still is calling plpgsql functions inside plpgsql
really a bad thing when just considering plpgsql?

The answer to that is the same answer to so many other things: it depends.

plpgsql functions are slower than C. They also lack a lot of language
features that C has. That being said, if they're meeting your needs, then
don't worry about it. plpgsql is around because for most people, it works
well enough. There are certainly cases when you want to create very complex
logic in the database and plpgsql is liable to make that difficult. But
there are a lot of cases where having to manage pointers and a build
environment and all the things that go with C aren't justified, because
plpgsql has none of that complexity. There are advantages both ways.

The beauty of PostgreSQL is that you have both available and you
can choose whichever is best for your situation.

--
Bill Moran

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Melvin Davidson (#9)
Re: plpgsql functions organisation

On 05/02/2015 03:10 PM, Melvin Davidson wrote:

Further to the point of saying functions are ineffiencent, consider the
fact that as of the current version of PostgreSQL, plpgsql
functions cannot be pre-optimized. So when they are referenced in a SQL
statement, PostgreSQL (optimizer) has load the
function from the catalogs, which involves overhead. If the function
calls another function, then the process has to be repeated, which
involves additional overhead. Ergo, that is not the most efficient way
of doing things.

Yeah, I see the explanation here:

http://www.postgresql.org/docs/9.4/interactive/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING

Which has this:

"As each expression and SQL command is first executed in the function,
the PL/pgSQL interpreter parses and analyzes the command to create a
prepared statement, using the SPI manager's SPI_prepare function.
Subsequent visits to that expression or command reuse the prepared
statement. Thus, a function with conditional code paths that are seldom
visited will never incur the overhead of analyzing those commands that
are never executed within the current session"

So it still not clear to me whether a monolithic function is better or
worse than one that calls other functions as needed. Probably over
thinking this, but it would make a good experiment. Just have to figure
out a realistic scenario to test. Thanks for the input.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#12Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Bill Moran (#10)
Re: plpgsql functions organisation

On 05/02/2015 03:28 PM, Bill Moran wrote:

On Sat, 02 May 2015 15:06:24 -0700
Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 05/02/2015 02:07 PM, Jeff Janes wrote:

On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is
possible to
call a function from within a function.

That being said, I would seriously look at how and why you are
writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out
'units of work' so they can be combined as needed. Part of that is
using existing functions in new functions/classes. In fact in the
Postgres source I see this in many places. Now it is entirely
possible I missed a memo, so I am open to a more detailed
explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Does this actually matter? I am a biologist that backed into computing,
so I realize I am weak on the fundamentals. Still the scientist in me
wants data backing assertions. As I understand it plpgsql works close to
the server and is optimized to do so. I know writing in C would be a
better solution. Still is calling plpgsql functions inside plpgsql
really a bad thing when just considering plpgsql?

The answer to that is the same answer to so many other things: it depends.

plpgsql functions are slower than C. They also lack a lot of language
features that C has. That being said, if they're meeting your needs, then
don't worry about it. plpgsql is around because for most people, it works
well enough. There are certainly cases when you want to create very complex
logic in the database and plpgsql is liable to make that difficult. But
there are a lot of cases where having to manage pointers and a build
environment and all the things that go with C aren't justified, because
plpgsql has none of that complexity. There are advantages both ways.

The beauty of PostgreSQL is that you have both available and you
can choose whichever is best for your situation.

Agreed, though in my case I drop into plpythonu when I want more complex
solutions.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#13Melvin Davidson
melvin6925@gmail.com
In reply to: Adrian Klaver (#12)
Re: plpgsql functions organisation

OK, Here is a simple example that shows the difference between using a self
contained function and
one that calls sub functions.

After loading all the functions below, repeat each of the EXPLAIN
statements a few times and note that
callsubs takes almost TWICE as long to execute as nosub.

CREATE OR REPLACE FUNCTION nosub(text)
RETURNS void AS
$BODY$

DECLARE

p_in_str ALIAS FOR $1;

BEGIN
IF LENGTH(p_in_str) <= 6
THEN RAISE NOTICE 'Hi %', p_in_str;
ELSE
RAISE NOTICE 'Hello %', p_in_str;
END IF;

RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION nosub(text)
OWNER TO postgres;

CREATE OR REPLACE FUNCTION called1(text)
RETURNS void AS
$BODY$

DECLARE

p_in_str1 ALIAS FOR $1;

BEGIN
RAISE NOTICE 'Hi %', p_in_str1;

RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION called1(text)
OWNER TO postgres;

CREATE OR REPLACE FUNCTION called2(text)
RETURNS void AS
$BODY$

DECLARE

p_in_str2 ALIAS FOR $1;

BEGIN
RAISE NOTICE 'Hello %', p_in_str2;

RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION called2(text)
OWNER TO postgres;

CREATE OR REPLACE FUNCTION callsubs(text)
RETURNS void AS
$BODY$

DECLARE

p_in_str ALIAS FOR $1;

BEGIN
IF LENGTH(p_in_str) <= 6
THEN PERFORM CALLED1(p_in_str);
ELSE
PERFORM CALLED2(p_in_str);
END IF;

RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION callsubs(text)
OWNER TO postgres;

EXPLAIN ANALYZE SELECT nosub('melvin');

EXPLAIN ANALYZE SELECT callsubs('melvin');

On Sat, May 2, 2015 at 7:37 PM, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 05/02/2015 03:28 PM, Bill Moran wrote:

On Sat, 02 May 2015 15:06:24 -0700
Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 05/02/2015 02:07 PM, Jeff Janes wrote:

On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <
adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is
possible to
call a function from within a function.

That being said, I would seriously look at how and why you are
writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out
'units of work' so they can be combined as needed. Part of that is
using existing functions in new functions/classes. In fact in the
Postgres source I see this in many places. Now it is entirely
possible I missed a memo, so I am open to a more detailed
explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Does this actually matter? I am a biologist that backed into computing,
so I realize I am weak on the fundamentals. Still the scientist in me
wants data backing assertions. As I understand it plpgsql works close to
the server and is optimized to do so. I know writing in C would be a
better solution. Still is calling plpgsql functions inside plpgsql
really a bad thing when just considering plpgsql?

The answer to that is the same answer to so many other things: it depends.

plpgsql functions are slower than C. They also lack a lot of language
features that C has. That being said, if they're meeting your needs, then
don't worry about it. plpgsql is around because for most people, it works
well enough. There are certainly cases when you want to create very
complex
logic in the database and plpgsql is liable to make that difficult. But
there are a lot of cases where having to manage pointers and a build
environment and all the things that go with C aren't justified, because
plpgsql has none of that complexity. There are advantages both ways.

The beauty of PostgreSQL is that you have both available and you
can choose whichever is best for your situation.

Agreed, though in my case I drop into plpythonu when I want more complex
solutions.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#14Alban Hertroys
haramrae@gmail.com
In reply to: Melvin Davidson (#13)
Re: plpgsql functions organisation

On 03 May 2015, at 2:56, Melvin Davidson <melvin6925@gmail.com> wrote:

OK, Here is a simple example that shows the difference between using a self contained function and
one that calls sub functions.

After loading all the functions below, repeat each of the EXPLAIN statements a few times and note that
callsubs takes almost TWICE as long to execute as nosub.

CREATE OR REPLACE FUNCTION nosub(text)
RETURNS void AS
$BODY$

...

IF LENGTH(p_in_str) <= 6
THEN RAISE NOTICE 'Hi %', p_in_str;
ELSE
RAISE NOTICE 'Hello %', p_in_str;
END IF;

RETURN;
END;
$BODY$

CREATE OR REPLACE FUNCTION called1(text)
RETURNS void AS
$BODY$

...

RAISE NOTICE 'Hi %', p_in_str1;

RETURN;
END;
$BODY$

CREATE OR REPLACE FUNCTION called2(text)
RETURNS void AS
$BODY$

...

RAISE NOTICE 'Hello %', p_in_str2;

RETURN;
END;

...

That's a rather uninteresting experiment, as all it does is call a function and raise a notice. Relative to what the functions do, the function call itself takes a significant amount of time. No surprise there, you'll see something similar in any language, even C. All you're showing is that calling a function takes some amount of time > 0.

In C, a function call needs to look up an address to jump to, in plpgsql the database needs to look up the function body in a table. If the function is small and atomic it often gets called from multiple other functions and is probably cached anyway. The main difference between C and plpgsql here is that the latter is an interpreted language, so it does need to read in the entire function body after a call - which I'd expect to be quite a bit faster with a smaller (atomic) function body, especially when it hasn't been cached yet.

So far I haven't been convinced.

An actual use-case where the functions actually do something would be far more interesting. I doubt anybody writes functions just to raise a notice. I expect that in reality most plpgsql functions perform database queries and do something with the result. In such cases, function call overhead could be significant if the call is done for each record in a result set, for example. And even then it's worth considering whether that matters to your situation enough that it outweighs the usual benefits of code separation.

On Sat, May 2, 2015 at 7:37 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 05/02/2015 03:28 PM, Bill Moran wrote:
On Sat, 02 May 2015 15:06:24 -0700
Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 05/02/2015 02:07 PM, Jeff Janes wrote:
On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is
possible to
call a function from within a function.

That being said, I would seriously look at how and why you are
writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out
'units of work' so they can be combined as needed. Part of that is
using existing functions in new functions/classes. In fact in the
Postgres source I see this in many places. Now it is entirely
possible I missed a memo, so I am open to a more detailed
explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Does this actually matter? I am a biologist that backed into computing,
so I realize I am weak on the fundamentals. Still the scientist in me
wants data backing assertions. As I understand it plpgsql works close to
the server and is optimized to do so. I know writing in C would be a
better solution. Still is calling plpgsql functions inside plpgsql
really a bad thing when just considering plpgsql?

The answer to that is the same answer to so many other things: it depends.

plpgsql functions are slower than C. They also lack a lot of language
features that C has. That being said, if they're meeting your needs, then
don't worry about it. plpgsql is around because for most people, it works
well enough. There are certainly cases when you want to create very complex
logic in the database and plpgsql is liable to make that difficult. But
there are a lot of cases where having to manage pointers and a build
environment and all the things that go with C aren't justified, because
plpgsql has none of that complexity. There are advantages both ways.

The beauty of PostgreSQL is that you have both available and you
can choose whichever is best for your situation.

Agreed, though in my case I drop into plpythonu when I want more complex solutions.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#15Melvin Davidson
melvin6925@gmail.com
In reply to: Alban Hertroys (#14)
Re: plpgsql functions organisation

The point was to show that yes, function calls take time, and using sub
functions take even more time. I am not about to write an additional more
detailed example just to show the same results. If you are in doubt, I
respectfully suggest you do your own testing.

On Sun, May 3, 2015 at 5:26 AM, Alban Hertroys <haramrae@gmail.com> wrote:

On 03 May 2015, at 2:56, Melvin Davidson <melvin6925@gmail.com> wrote:

OK, Here is a simple example that shows the difference between using a

self contained function and

one that calls sub functions.

After loading all the functions below, repeat each of the EXPLAIN

statements a few times and note that

callsubs takes almost TWICE as long to execute as nosub.

CREATE OR REPLACE FUNCTION nosub(text)
RETURNS void AS
$BODY$

...

IF LENGTH(p_in_str) <= 6
THEN RAISE NOTICE 'Hi %', p_in_str;
ELSE
RAISE NOTICE 'Hello %', p_in_str;
END IF;

RETURN;
END;
$BODY$

CREATE OR REPLACE FUNCTION called1(text)
RETURNS void AS
$BODY$

...

RAISE NOTICE 'Hi %', p_in_str1;

RETURN;
END;
$BODY$

CREATE OR REPLACE FUNCTION called2(text)
RETURNS void AS
$BODY$

...

RAISE NOTICE 'Hello %', p_in_str2;

RETURN;
END;

...

That's a rather uninteresting experiment, as all it does is call a
function and raise a notice. Relative to what the functions do, the
function call itself takes a significant amount of time. No surprise there,
you'll see something similar in any language, even C. All you're showing is
that calling a function takes some amount of time > 0.

In C, a function call needs to look up an address to jump to, in plpgsql
the database needs to look up the function body in a table. If the function
is small and atomic it often gets called from multiple other functions and
is probably cached anyway. The main difference between C and plpgsql here
is that the latter is an interpreted language, so it does need to read in
the entire function body after a call - which I'd expect to be quite a bit
faster with a smaller (atomic) function body, especially when it hasn't
been cached yet.

So far I haven't been convinced.

An actual use-case where the functions actually do something would be far
more interesting. I doubt anybody writes functions just to raise a notice.
I expect that in reality most plpgsql functions perform database queries
and do something with the result. In such cases, function call overhead
could be significant if the call is done for each record in a result set,
for example. And even then it's worth considering whether that matters to
your situation enough that it outweighs the usual benefits of code
separation.

On Sat, May 2, 2015 at 7:37 PM, Adrian Klaver <adrian.klaver@aklaver.com>

wrote:

On 05/02/2015 03:28 PM, Bill Moran wrote:
On Sat, 02 May 2015 15:06:24 -0700
Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 05/02/2015 02:07 PM, Jeff Janes wrote:
On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is
possible to
call a function from within a function.

That being said, I would seriously look at how and why you are
writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out
'units of work' so they can be combined as needed. Part of that is
using existing functions in new functions/classes. In fact in the
Postgres source I see this in many places. Now it is entirely
possible I missed a memo, so I am open to a more detailed
explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Does this actually matter? I am a biologist that backed into computing,
so I realize I am weak on the fundamentals. Still the scientist in me
wants data backing assertions. As I understand it plpgsql works close to
the server and is optimized to do so. I know writing in C would be a
better solution. Still is calling plpgsql functions inside plpgsql
really a bad thing when just considering plpgsql?

The answer to that is the same answer to so many other things: it

depends.

plpgsql functions are slower than C. They also lack a lot of language
features that C has. That being said, if they're meeting your needs, then
don't worry about it. plpgsql is around because for most people, it works
well enough. There are certainly cases when you want to create very

complex

logic in the database and plpgsql is liable to make that difficult. But
there are a lot of cases where having to manage pointers and a build
environment and all the things that go with C aren't justified, because
plpgsql has none of that complexity. There are advantages both ways.

The beauty of PostgreSQL is that you have both available and you
can choose whichever is best for your situation.

Agreed, though in my case I drop into plpythonu when I want more complex

solutions.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#16Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Melvin Davidson (#15)
Re: plpgsql functions organisation

On 05/03/2015 07:14 AM, Melvin Davidson wrote:

The point was to show that yes, function calls take time, and using sub
functions take even more time. I am not about to write an additional
more detailed example just to show the same results. If you are in
doubt, I respectfully suggest you do your own testing.

Can't resist a challenge. I took an existing function that calculates an
aggregated attendance count for a student or all enrolled students over
a date period and modified it to call sub functions. There are two sub
functions, one that finds the students enrolled over a period(which by
the way calls another function) and dates range they where enrolled. The
other calculates the aggregate values for each student. The original
function is student_attendance, the modified student_attendance_sub. The
results are below, where the first argument is the student_id(where 0
equals all students). The all students version returns 600 rows, the
single student 16 rows.

hplc=> select * from student_attendance(0, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.204865s
hplc=> select * from student_attendance_sub(0, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.014101s
hplc=> select * from student_attendance(0, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.041182s
hplc=> select * from student_attendance_sub(0, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.011385s
hplc=> select * from student_attendance(0, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.040762s
hplc=> select * from student_attendance_sub(0, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.016506s

hplc=> select * from student_attendance(1, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.00291s
hplc=> select * from student_attendance_sub(1, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.004125s
hplc=> select * from student_attendance(1, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.001907s
hplc=> select * from student_attendance_sub(1, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.003476s
hplc=> select * from student_attendance(1, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.00597s
hplc=> select * from student_attendance_sub(1, '2005-01-17', '2008-01-31');
NOTICE: Time 00:00:00.003986s

Definite difference in the all students run, probably because one of the
called functions is used in a LOOP and caching applies.

On Sun, May 3, 2015 at 5:26 AM, Alban Hertroys <haramrae@gmail.com
<mailto:haramrae@gmail.com>> wrote:

On 03 May 2015, at 2:56, Melvin Davidson <melvin6925@gmail.com <mailto:melvin6925@gmail.com>> wrote:

OK, Here is a simple example that shows the difference between using a self contained function and
one that calls sub functions.

After loading all the functions below, repeat each of the EXPLAIN statements a few times and note that
callsubs takes almost TWICE as long to execute as nosub.

CREATE OR REPLACE FUNCTION nosub(text)
RETURNS void AS
$BODY$

...

IF LENGTH(p_in_str) <= 6
THEN RAISE NOTICE 'Hi %', p_in_str;
ELSE
RAISE NOTICE 'Hello %', p_in_str;
END IF;

RETURN;
END;
$BODY$

CREATE OR REPLACE FUNCTION called1(text)
RETURNS void AS
$BODY$

...

RAISE NOTICE 'Hi %', p_in_str1;

RETURN;
END;
$BODY$

CREATE OR REPLACE FUNCTION called2(text)
RETURNS void AS
$BODY$

...

RAISE NOTICE 'Hello %', p_in_str2;

RETURN;
END;

...

That's a rather uninteresting experiment, as all it does is call a
function and raise a notice. Relative to what the functions do, the
function call itself takes a significant amount of time. No surprise
there, you'll see something similar in any language, even C. All
you're showing is that calling a function takes some amount of time > 0.

In C, a function call needs to look up an address to jump to, in
plpgsql the database needs to look up the function body in a table.
If the function is small and atomic it often gets called from
multiple other functions and is probably cached anyway. The main
difference between C and plpgsql here is that the latter is an
interpreted language, so it does need to read in the entire function
body after a call - which I'd expect to be quite a bit faster with a
smaller (atomic) function body, especially when it hasn't been
cached yet.

So far I haven't been convinced.

An actual use-case where the functions actually do something would
be far more interesting. I doubt anybody writes functions just to
raise a notice. I expect that in reality most plpgsql functions
perform database queries and do something with the result. In such
cases, function call overhead could be significant if the call is
done for each record in a result set, for example. And even then
it's worth considering whether that matters to your situation enough
that it outweighs the usual benefits of code separation.

On Sat, May 2, 2015 at 7:37 PM, Adrian Klaver

<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

On 05/02/2015 03:28 PM, Bill Moran wrote:
On Sat, 02 May 2015 15:06:24 -0700
Adrian Klaver <adrian.klaver@aklaver.com

<mailto:adrian.klaver@aklaver.com>> wrote:

On 05/02/2015 02:07 PM, Jeff Janes wrote:
On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver

<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

<mailto:adrian.klaver@aklaver.com

<mailto:adrian.klaver@aklaver.com>>> wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL,

but it is

possible to
call a function from within a function.

That being said, I would seriously look at how and why

you are

writing
your functions
as functions that call other functions are not very

efficient.

I am not following. That is what packaging is about,

separating out

'units of work' so they can be combined as needed. Part of

that is

using existing functions in new functions/classes. In fact

in the

Postgres source I see this in many places. Now it is entirely
possible I missed a memo, so I am open to a more detailed
explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Does this actually matter? I am a biologist that backed into

computing,

so I realize I am weak on the fundamentals. Still the scientist in me
wants data backing assertions. As I understand it plpgsql works

close to

the server and is optimized to do so. I know writing in C would be a
better solution. Still is calling plpgsql functions inside plpgsql
really a bad thing when just considering plpgsql?

The answer to that is the same answer to so many other things: it

depends.

plpgsql functions are slower than C. They also lack a lot of language
features that C has. That being said, if they're meeting your

needs, then

don't worry about it. plpgsql is around because for most people,

it works

well enough. There are certainly cases when you want to create

very complex

logic in the database and plpgsql is liable to make that

difficult. But

there are a lot of cases where having to manage pointers and a build
environment and all the things that go with C aren't justified,

because

plpgsql has none of that complexity. There are advantages both ways.

The beauty of PostgreSQL is that you have both available and you
can choose whichever is best for your situation.

Agreed, though in my case I drop into plpythonu when I want more

complex solutions.

--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#17Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Adrian Klaver (#3)
Re: plpgsql functions organisation

On 5/2/15 2:32 PM, Adrian Klaver wrote:

On 05/02/2015 09:53 AM, Yves Dorfsman wrote:

I find my plpgsql functions becomes unreadable very quickly. I want to
break
them up in smaller functions.

What is the best way to organised them?
Is there any way to define functions inside functions?
When I list functions in psql, I can see them all at the same level,
is there
any way to organise them in packages like with python and other
languages, so
the smaller functions are hidden away in a package/directory?

The only thing I can think of is to use SCHEMAs;

http://www.postgresql.org/docs/9.4/interactive/sql-createschema.html

You can do a crude form of public vs private methods using schemas; I
frequently define schemas that start with a _ and don't grant USAGE to
general users for those schemas as a way to do that (not defining USAGE
means the schemas won't show up in things like \df). I do wish I could
control visibility separately from USAGE... maybe someday.

As for performance concerns, in 99% of cases code maintainability is
going to be way more important than performance microoptimization. If
you're *that* concerned about performance than plpgsql probably isn't
the right answer anyway.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#18Yves Dorfsman
yves@zioup.com
In reply to: Jim Nasby (#17)
Re: plpgsql functions organisation

As for performance concerns, in 99% of cases code maintainability is going to
be way more important than performance microoptimization. If you're *that*
concerned about performance than plpgsql probably isn't the right answer anyway.

Isn't one of the advantage of running on the server to avoid data round trip?

What would you recommend for better performance?

--
http://yves.zioup.com
gpg: 4096R/32B0F416

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#19Jan de Visser
jan@de-visser.net
In reply to: Yves Dorfsman (#18)
Re: plpgsql functions organisation

On May 4, 2015 02:32:14 PM Yves Dorfsman wrote:

As for performance concerns, in 99% of cases code maintainability is going
to be way more important than performance microoptimization. If you're
*that* concerned about performance than plpgsql probably isn't the right
answer anyway.

Isn't one of the advantage of running on the server to avoid data round
trip?

What would you recommend for better performance?

You can run python or perl on the server. That's what Jim meant with 'If
you're *that* concerned about performance than plpgsql probably isn't the
right answer anyway.'

Additionally: Many moons ago I did extensive and aggressive performance
analysis on a system that did many recursive queries. This was before CTEs
(WITH statements) and we settled on recursive plpgsql functions. The queries
in the functions were trivial, but nevertheless the bottleneck was in the
query and data processing, and never in the surrounding infrastructure.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general