Calculating Minkowski distance between two rows
Greetings everyone,
I'm a novice plpgsql user.
For an application, I'm trying to write a user-defined function that takes
a row of some table (let's say with k fields) and takes another row from
another table (again with k fields); then calculate the Euclidean,
Manhattan or generally Minkowski distance (with some p) and then return an
integer.
I've written this:
CREATE FUNCTION euclidean_distance(row1 table_train, row2 table_test, OUT
distance DOUBLE PRECISION) AS $$
DECLARE
tmp DOUBLE PRECISION;
BEGIN
FOR col IN SELECT column_name FROM information_schema.columns WHERE
table_name=table_train LOOP
tmp := (row1.col - row2.col);
distance += tmp*tmp;
END LOOP;
distance := sqrt(distance);
END;
$$ LANGUAGE plpgsql;
Could anyone please help me fix this function so that I can pass any two
rows of two tables (with same number of columns) and have their distance
returned.
Best regards,
Babak Alipour
--
*Babak Alipour ,*
*University of Florida*
On 04/25/2016 07:07 AM, Babak Alipour wrote:
Greetings everyone,
I'm a novice plpgsql user.
For an application, I'm trying to write a user-defined function that
takes a row of some table (let's say with k fields) and takes another
row from another table (again with k fields); then calculate the
Euclidean, Manhattan or generally Minkowski distance (with some p) and
then return an integer.
I've written this:CREATE FUNCTION euclidean_distance(row1 table_train, row2 table_test,
OUT distance DOUBLE PRECISION) AS $$
DECLARE
tmp DOUBLE PRECISION;
BEGIN
FOR col IN SELECT column_name FROM information_schema.columns WHERE
table_name=table_train LOOP
tmp := (row1.col - row2.col);
distance += tmp*tmp;
END LOOP;
distance := sqrt(distance);
END;
$$ LANGUAGE plpgsql;Could anyone please help me fix this function so that I can pass any two
rows of two tables (with same number of columns) and have their distance
returned.
You are already doing that, so do you mean any two rows of any two tables?
Best regards,
Babak Alipour--
*/Babak Alipour ,/*
*/University of Florida/*
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
That is correct. The function I've written only works when the two tables
are named table_train and table_test; is it possible to generalize that to
take in any two tables?
Thanks in advance.
Babak
On Mon, Apr 25, 2016 at 10:24 AM, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
On 04/25/2016 07:07 AM, Babak Alipour wrote:
Greetings everyone,
I'm a novice plpgsql user.
For an application, I'm trying to write a user-defined function that
takes a row of some table (let's say with k fields) and takes another
row from another table (again with k fields); then calculate the
Euclidean, Manhattan or generally Minkowski distance (with some p) and
then return an integer.
I've written this:CREATE FUNCTION euclidean_distance(row1 table_train, row2 table_test,
OUT distance DOUBLE PRECISION) AS $$
DECLARE
tmp DOUBLE PRECISION;
BEGIN
FOR col IN SELECT column_name FROM information_schema.columns WHERE
table_name=table_train LOOP
tmp := (row1.col - row2.col);
distance += tmp*tmp;
END LOOP;
distance := sqrt(distance);
END;
$$ LANGUAGE plpgsql;Could anyone please help me fix this function so that I can pass any two
rows of two tables (with same number of columns) and have their distance
returned.You are already doing that, so do you mean any two rows of any two tables?
Best regards,
Babak Alipour--
*/Babak Alipour ,/*
*/University of Florida/*--
Adrian Klaver
adrian.klaver@aklaver.com
--
*Babak Alipour ,*
*University of Florida*
On 04/25/2016 07:26 AM, Babak Alipour wrote:
That is correct. The function I've written only works when the two
tables are named table_train and table_test; is it possible to
generalize that to take in any two tables?
I'm heading out the door and off the top of my head:
1) Get tables names as text.
2) Get the row conditions as text.
3) Use EXECUTE to build a query string:
http://www.postgresql.org/docs/9.5/interactive/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
In particular:
"A cleaner approach is to use format()'s %I specification for table or
column names (strings separated by a newline are concatenated):"
http://www.postgresql.org/docs/9.5/static/functions-string.html#FUNCTIONS-STRING-FORMAT
4) SELECT the result INTO a record variable:
http://www.postgresql.org/docs/9.5/interactive/plpgsql-declarations.html#PLPGSQL-DECLARATION-RECORDS
5) Do your calculations
6) Not considered, validating that number of table columns are the same
and the types are compatible.
Thanks in advance.
Babak
On Mon, Apr 25, 2016 at 10:24 AM, Adrian Klaver
<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:On 04/25/2016 07:07 AM, Babak Alipour wrote:
Greetings everyone,
I'm a novice plpgsql user.
For an application, I'm trying to write a user-defined function that
takes a row of some table (let's say with k fields) and takes
another
row from another table (again with k fields); then calculate the
Euclidean, Manhattan or generally Minkowski distance (with some
p) and
then return an integer.
I've written this:CREATE FUNCTION euclidean_distance(row1 table_train, row2
table_test,
OUT distance DOUBLE PRECISION) AS $$
DECLARE
tmp DOUBLE PRECISION;
BEGIN
FOR col IN SELECT column_name FROM information_schema.columns WHERE
table_name=table_train LOOP
tmp := (row1.col - row2.col);
distance += tmp*tmp;
END LOOP;
distance := sqrt(distance);
END;
$$ LANGUAGE plpgsql;Could anyone please help me fix this function so that I can pass
any two
rows of two tables (with same number of columns) and have their
distance
returned.You are already doing that, so do you mean any two rows of any two
tables?Best regards,
Babak Alipour--
*/Babak Alipour ,/*
*/University of Florida/*--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>--
*/Babak Alipour ,/*
*/University of Florida/*
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Hi:
On Mon, Apr 25, 2016 at 4:26 PM, Babak Alipour <babak.alipour@gmail.com> wrote:
That is correct. The function I've written only works when the two tables
are named table_train and table_test; is it possible to generalize that to
take in any two tables?
And only when all table_train columns are numbers AND table_test
contains at least all of them AND they are numbers too.
Wouldn't it be easier to use numeric arrays to represent coordinate vectors?
Anyway, I'm not versed in all of this, wbut I would try to make a
function to turn a table record to a numeric array and then write the
numeric array version of the func and call them, divide and conquer.
Francisco Olarte.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
I am inclined to go with Francisco's solution
Daniel
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Francisco Olarte
Sent: April-25-16 10:46
To: Babak Alipour
Cc: Adrian Klaver; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Calculating Minkowski distance between two rows
Hi:
On Mon, Apr 25, 2016 at 4:26 PM, Babak Alipour <babak.alipour@gmail.com> wrote:
That is correct. The function I've written only works when the two
tables are named table_train and table_test; is it possible to
generalize that to take in any two tables?
And only when all table_train columns are numbers AND table_test contains at least all of them AND they are numbers too.
Wouldn't it be easier to use numeric arrays to represent coordinate vectors?
Anyway, I'm not versed in all of this, wbut I would try to make a function to turn a table record to a numeric array and then write the numeric array version of the func and call them, divide and conquer.
Francisco Olarte.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general