CTE and function
Hello,
I'm trying to convert a select after a CTE into a function for generic
use. The CTE is normally a complex query but I want to capsulate then
the calculation of the Gini coefficient it into a function:
Based on:
http://www.heckler.com.br/blog/2010/06/15/gini-coeficient-having-fun-in-both-sql-and-python/
Details at: https://en.wikipedia.org/wiki/Gini_coefficient
================================================================================================================================================================
= OK
================================================================================================================================================================
WITH tab AS (
SELECT unnest(ARRAY[1,2,3,4]) AS col
)
SELECT
((N+1.0)/(N-1.0)) - ((PiXi_sum/(N*(N-1)*u))*2.0) AS gini_coefficient
FROM
( SELECT
SUM(PiXi) AS PiXi_sum,
COUNT(*) AS N,
(SELECT AVG(col) FROM tab) AS u
FROM
( SELECT
row_number() OVER() * col AS PiXi
FROM
(SELECT col FROM tab ORDER BY col DESC) t1
) t2
) t3
;
================================================================================================================================================================
= OK: Create function
================================================================================================================================================================
CREATE OR REPLACE FUNCTION gini_coefficient(IN table_name text, IN
column_name text, OUT gini_coefficient DOUBLE PRECISION) AS $$
BEGIN
EXECUTE format('
SELECT
((N+1.0)/(N-1.0)) - ((PiXi_sum/(N*(N-1)*u))*2.0) AS gini_coefficient
FROM
( SELECT
SUM(PiXi) AS PiXi_sum,
COUNT(*) AS N,
(SELECT AVG(%s) FROM %s) AS u
FROM
( SELECT
row_number() OVER() * col AS PiXi
FROM
(SELECT %s FROM %s ORDER BY %s DESC) t1
) t2
) t3
;
', column_name, table_name, column_name, table_name, column_name)
INTO gini_coefficient;
END
$$ LANGUAGE plpgsql;
================================================================================================================================================================
= NOT OK:
================================================================================================================================================================
WITH tab AS (
SELECT unnest(ARRAY[1,2,3,4]) AS col
)
SELECT gini_coefficient('tab', 'col');
ERROR: relation "tab" does not exist
LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1
================================================================================================================================================================
= NOT OK:
================================================================================================================================================================
WITH tab AS (
SELECT unnest(ARRAY[1,2,3,4]) AS col
)
SELECT * FROM gini_coefficient('tab', 'col');
ERROR: relation "tab" does not exist
LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1
So it looks like the table tab from the CTE is not available in the
function.
Any ideas how to solve it and an explaination would be fine?
Thank you.
Ciao,
Gerhard
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
This is my understanding, hopefully someone will chime in if I'm off. Using
EXECUTE, the SQL is executed in a separate context than the current
statement. So it's checking for a table with the name you pass, not aliases
within the current statement.
Giving the function another parameter to allow an expression lets this run.
That said, running user-given SQL is very insecure, and this should never be
run in a live database. I changed the formatting to make it easier (for me)
to read. Note the two table aliases ("as tab") where the table is passed
in.
CREATE OR REPLACE FUNCTION gini_coefficient(IN table_name text
,IN table_expression text
,IN column_name text
,OUT gini_coefficient DOUBLE
PRECISION
) AS
$$
BEGIN
EXECUTE format('SELECT ((N+1.0)/(N-1.0)) -
((PiXi_sum/(N*(N-1)*u))*2.0) AS gini_coefficient
FROM (SELECT SUM(PiXi) AS PiXi_sum
, COUNT(*) AS N
, (SELECT AVG(%s) FROM %s as tab) AS u
FROM (SELECT row_number() OVER() * col AS
PiXi
FROM (SELECT %s FROM %s as tab ORDER
BY %s DESC) t1
) t2
) t3;'
, column_name
, COALESCE (table_name
,'(' || table_expression || ')'
)
, column_name
, COALESCE (table_name
,'(' || table_expression || ')'
)
, column_name
)
INTO gini_coefficient;
END;
$$
LANGUAGE plpgsql;
SELECT gini_coefficient(null
,'SELECT unnest(ARRAY[1,2,3,4]) AS col'
,'col'
);
A better solution is (using your original definition for gini_coefficient):
create view tab as SELECT unnest(ARRAY[1,2,3,4]) AS col;
SELECT gini_coefficient('tab','col');
Ben
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Gerhard Wiesinger
Sent: Thursday, February 25, 2016 5:32 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] CTE and function
Hello,
I'm trying to convert a select after a CTE into a function for generic use.
The CTE is normally a complex query but I want to capsulate then the
calculation of the Gini coefficient it into a function:
Based on:
http://www.heckler.com.br/blog/2010/06/15/gini-coeficient-having-fun-in-both-sql-and-python/
Details at: https://en.wikipedia.org/wiki/Gini_coefficient
================================================================================================================================================================
= OK
================================================================================================================================================================
WITH tab AS (
SELECT unnest(ARRAY[1,2,3,4]) AS col
)
SELECT
((N+1.0)/(N-1.0)) - ((PiXi_sum/(N*(N-1)*u))*2.0) AS gini_coefficient
FROM
( SELECT
SUM(PiXi) AS PiXi_sum,
COUNT(*) AS N,
(SELECT AVG(col) FROM tab) AS u
FROM
( SELECT
row_number() OVER() * col AS PiXi
FROM
(SELECT col FROM tab ORDER BY col DESC) t1
) t2
) t3
;
================================================================================================================================================================
= OK: Create function
================================================================================================================================================================
CREATE OR REPLACE FUNCTION gini_coefficient(IN table_name text, IN
column_name text, OUT gini_coefficient DOUBLE PRECISION) AS $$ BEGIN EXECUTE
format('
SELECT
((N+1.0)/(N-1.0)) - ((PiXi_sum/(N*(N-1)*u))*2.0) AS gini_coefficient
FROM
( SELECT
SUM(PiXi) AS PiXi_sum,
COUNT(*) AS N,
(SELECT AVG(%s) FROM %s) AS u
FROM
( SELECT
row_number() OVER() * col AS PiXi
FROM
(SELECT %s FROM %s ORDER BY %s DESC) t1
) t2
) t3
;
', column_name, table_name, column_name, table_name, column_name) INTO
gini_coefficient; END $$ LANGUAGE plpgsql;
================================================================================================================================================================
= NOT OK:
================================================================================================================================================================
WITH tab AS (
SELECT unnest(ARRAY[1,2,3,4]) AS col
)
SELECT gini_coefficient('tab', 'col');
ERROR: relation "tab" does not exist
LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1
================================================================================================================================================================
= NOT OK:
================================================================================================================================================================
WITH tab AS (
SELECT unnest(ARRAY[1,2,3,4]) AS col
)
SELECT * FROM gini_coefficient('tab', 'col');
ERROR: relation "tab" does not exist
LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1
So it looks like the table tab from the CTE is not available in the
function.
Any ideas how to solve it and an explaination would be fine?
Thank you.
Ciao,
Gerhard
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make
changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Thu, Feb 25, 2016 at 3:31 AM, Gerhard Wiesinger <lists@wiesinger.com>
wrote:
================================================================================================================================================================
= NOT OK:================================================================================================================================================================
WITH tab AS (
SELECT unnest(ARRAY[1,2,3,4]) AS col
)
SELECT gini_coefficient('tab', 'col');ERROR: relation "tab" does not exist
LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1================================================================================================================================================================
= NOT OK:================================================================================================================================================================
WITH tab AS (
SELECT unnest(ARRAY[1,2,3,4]) AS col
)
SELECT * FROM gini_coefficient('tab', 'col');ERROR: relation "tab" does not exist
LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1So it looks like the table tab from the CTE is not available in the
function.Any ideas how to solve it and an explaination would be fine?
Not tested but:
CREATE TEMP TABLE tab AS SELECT ... AS col;
SELECT * FROM gini_ciefficient('tab','col');
A function is able to access (session) global objects and whatever data is
passed in to it via is parameters.
I don't know if there is any fundamental reason the contents of a CTE
cannot be seen by a function executing in the same context but that is not
how it works today.
So turn the CTE into its own standalone TABLE and you should be able to
then refer to it by name in subsequent queries. It works for actual
queries and so functions should be no different.
David J.