How to match sets?
Greetings!
I'm having some troubles creating a query, or rather, I can write one
that works but the approach feels wrong! The problem at hand boils
down to finding a record in a group where each result of two result-
sets matches on some columns.
The actual data I need to match isn't directly from tables but both
sides of the equation are the results of a set-returning function that
breaks up a unit string into separate tokens (base-unit & exponent).
An example of the two sets I need to "join" are, at the left hand side:
unit | token | exponent
-------+-------+----------
m.s^-1 | m | 1
m.s^-1 | s | -1
m.s^-2 | m | 1
m.s^-2 | s | -2
And at the right hand side:
token | exponent
-------+----------
m | 1
s | -2
The goal of the query is to find which unit at the left hand side
matches all the tokens and exponents at the right hand side, which
would be 'm.s^-2' in the above example. The order in which the tokens
are returned can be random, there isn't really a defined order as it
doesn't change the meaning of a unit.
I do have a possible solution using array_accum [1]array_accum is an aggregate from the documentation that transforms a set into an array.[2]The SRF's actually return a type unit_token(token text, exponent int) which makes using array_accum and comparisons easier. on an ordered
version (on unit,token,exponent) of these sets. It's not a pretty
solution though, I'm not happy with it - it's a transformation (from a
set to an array) where I feel none should be necessary. Isn't there a
better solution?
To illustrate, I'd prefer to perform a query somewhat like this:
SELECT unit
FROM unit, tokenize_unit('m.s^-2') AS token
WHERE each(unit.token) = each(token.token)
GROUP BY unit;
But I'm pretty sure it's not possible to use aggregates in the WHERE-
clause.
Definitions for the above are:
CREATE TYPE unit_token AS (
unit text,
exponent int
);
CREATE OR REPLACE FUNCTION tokenize_unit(unit text)
RETURNS SETOF unit_token
AS '@MODULE_PATH@', 'tokenize_unit_text'
LANGUAGE C IMMUTABLE STRICT;
CREATE TABLE token (
unit text NOT NULL REFERENCES unit,
token unit_token NOT NULL
);
[1]: array_accum is an aggregate from the documentation that transforms a set into an array.
a set into an array.
[2]: The SRF's actually return a type unit_token(token text, exponent int) which makes using array_accum and comparisons easier.
int) which makes using array_accum and comparisons easier.
Regards,
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.
!DSPAM:737,4aacebc413788472316367!
In article <C07F9BFD-5FC1-4B8B-BA87-C8BDC47D0136@solfertje.student.utwente.nl>,
Alban Hertroys <dalroi@solfertje.student.utwente.nl> writes:
An example of the two sets I need to "join" are, at the left hand side:
unit | token | exponent
-------+-------+----------
m.s^-1 | m | 1
m.s^-1 | s | -1
m.s^-2 | m | 1
m.s^-2 | s | -2
And at the right hand side:
token | exponent
-------+----------
m | 1
s | -2
The goal of the query is to find which unit at the left hand side
matches all the tokens and exponents at the right hand side, which
would be 'm.s^-2' in the above example. The order in which the tokens
are returned can be random, there isn't really a defined order as it
doesn't change the meaning of a unit.
I do have a possible solution using array_accum [1][2] on an ordered
version (on unit,token,exponent) of these sets. It's not a pretty
solution though, I'm not happy with it - it's a transformation (from a
set to an array) where I feel none should be necessary. Isn't there a
better solution?
Hm, how about a "double negation", i.e. return all units except those
with a non-match? In SQL:
SELECT t1.unit
FROM t1
EXCEPT
SELECT t1.unit
FROM t1
LEFT JOIN t2 ON t2.token = t1.token AND t2.exponent = t1.exponent
WHERE t2.token IS NULL
Hello,
this query on the two "tables" you suggested (named "test_left" and
"test_right") returns the correct result without transformations:
select distinct
t1.unit
from
test_left as t1 inner join
test_left as t2 on t1.unit = t2.unit and t1.token != t2.token and
t1.exponent != t2.exponent
inner join test_right as t3 on t1.token = t3.token and t1.exponent =
t3.exponent
inner join test_right as t4 on t2.token = t4.token and t2.exponent =
t4.exponent;
Regards
Ludwig Kniprath
Alban Hertroys schrieb:
Show quoted text
Greetings!
I'm having some troubles creating a query, or rather, I can write one
that works but the approach feels wrong! The problem at hand boils
down to finding a record in a group where each result of two
result-sets matches on some columns.The actual data I need to match isn't directly from tables but both
sides of the equation are the results of a set-returning function that
breaks up a unit string into separate tokens (base-unit & exponent).An example of the two sets I need to "join" are, at the left hand side:
unit | token | exponent
-------+-------+----------
m.s^-1 | m | 1
m.s^-1 | s | -1
m.s^-2 | m | 1
m.s^-2 | s | -2And at the right hand side:
token | exponent
-------+----------
m | 1
s | -2The goal of the query is to find which unit at the left hand side
matches all the tokens and exponents at the right hand side, which
would be 'm.s^-2' in the above example. The order in which the tokens
are returned can be random, there isn't really a defined order as it
doesn't change the meaning of a unit.I do have a possible solution using array_accum [1][2] on an ordered
version (on unit,token,exponent) of these sets. It's not a pretty
solution though, I'm not happy with it - it's a transformation (from a
set to an array) where I feel none should be necessary. Isn't there a
better solution?To illustrate, I'd prefer to perform a query somewhat like this:
SELECT unit
FROM unit, tokenize_unit('m.s^-2') AS token
WHERE each(unit.token) = each(token.token)
GROUP BY unit;But I'm pretty sure it's not possible to use aggregates in the
WHERE-clause.Definitions for the above are:
CREATE TYPE unit_token AS (
unit text,
exponent int
);CREATE OR REPLACE FUNCTION tokenize_unit(unit text)
RETURNS SETOF unit_token
AS '@MODULE_PATH@', 'tokenize_unit_text'
LANGUAGE C IMMUTABLE STRICT;CREATE TABLE token (
unit text NOT NULL REFERENCES unit,
token unit_token NOT NULL
);[1] array_accum is an aggregate from the documentation that transforms
a set into an array.
[2] The SRF's actually return a type unit_token(token text, exponent
int) which makes using array_accum and comparisons easier.Regards,
Alban Hertroys--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.!DSPAM:737,4aacebc413788472316367!