To all the pgsql developers..Have a look at the operators proposed by me in my research paper.
Dear All,
I am explaining here how i have defined the operators NEAR and NOT NEAR in my research paper. I want critics/comments from all of you on the operators.
These operators belong to the category of “Fuzzy Operators” proposed in the paper for ANSI SQL.
NEAR
It deals with the NUMBER and DATE datatypes simulating the human behavior and processing the information contained in NEAR in the same way as we humans take it. This is a binary operator with the syntax:
op1 NEAR op2
Here, the op1 refers to an attribute, whereas op2 is a fixed value, both of the same datatype.
Suppose we want a list of all the VGAs, price of which should be somewhere around 30$ .. the query will look like:
SELECT *
FROM accessories
WHERE prod_name = ‘VGA’
AND prod_price NEAR 30
A query for the datatype DATE will look like:
SELECT *
FROM sales
WHERE item = ’printer’
AND s_date NEAR 10-7-06
The algorithm for the NEAR operator works as follows:
The margins to the op2, i.e. m1 and m2, are added dynamically on both the sides, considering the value it contains. To keep this margin big is important for a certain reason discussed later.
The NEAR operator is supposed to obtain the values near to the op2, thus the target membership degree(md) is initially set to 0.8.
The algorithm compares the op1(column) values row by row to the elements of the set that NEAR defined, i.e. the values from md 1.0 to 0.8, adding matching tuples to the result set.
4. It is very much possible that the result set is empty since no values within the range exist in the column. Thus, the algorithm checks for empty result set, and in that case, decreases the target md by 0.2 and jumps to step 3. This is the reason big margins to the op2 are added.
5. In case there are no values in op1 that are between m1 and m2 (where the membership degree of the values with respect to NEAR becomes 0.1) and the result set is empty, the algorithm fetches the two nearest values (tuples) to op2, one smaller and one larger than the op2, as the result.
The algorithm will give an empty result only if the table referred to in the query is empty.
2. NOT NEAR
This operator is also a binary operator, dealing with the datatype NUMBER and DATE. It has the syntax:
op1 NOT NEAR op2
The op1 refers to an attribute, whereas op2 is a fixed value, both of the same data type.
A query containing the operator looks like:
SELECT id, name, age, history
FROM casualties
WHERE cause = ‘heart attack’
AND age NOT NEAR 55
Or suppose we need a list of some event that is not clashing with some commitment of ours:
SELECT *
FROM events
WHERE e_name= ‘concert’
AND date NOT NEAR 8/28/2007
The algorithm for NOT NEAR works like this:
First of all it adds the margins to the op2, i.e. m1 and m2, dynamically on both the sides, considering the value op2 contains.
op1 values outside the scope of the op2 (m1, m2) are retrieved and added to the result.
If the result set is empty, the farthest values within the op2 fuzzy set (those possessing the least membership degree) are retrieved. This is done by continuing the search from values with md=0.1 till the md=0.6, where the md for NOT NEAR reaches 0.4.
Regards,
Tasneem Ishaque Memon MS (Computer Science) Institute of Mathmetics and Computer Science University of Sindh, Jamshoro Pakistan http://www.geocities.com/tasneememon/- The process of gaining knowledge is not producing results but clearing away ignorance.
_________________________________________________________________
Discover the new Windows Vista
http://search.msn.com/results.aspx?q=windows+vista&mkt=en-US&form=QBRE
On Jun 1, 2007, at 8:24 AM, Tasneem Memon wrote:
NEAR
It deals with the NUMBER and DATE datatypes simulating the human
behavior and processing the
Why just number and date?
information contained in NEAR in the same way as we humans take it.
This is a binary operator with the syntax:
op1 NEAR op2
Here, the op1 refers to an attribute, whereas op2 is a fixed value,
both of the same datatype.
Suppose we want a list of all the VGAs, price of which should be
somewhere around 30$ .. the query will look like:SELECT *
FROM accessories
WHERE prod_name = ‘VGA’
AND prod_price NEAR 30A query for the datatype DATE will look like:
SELECT *
FROM sales
WHERE item = ’printer’
AND s_date NEAR 10-7-06The algorithm for the NEAR operator works as follows:
The margins to the op2, i.e. m1 and m2, are added dynamically on
both the sides, considering the value it contains. To keep this
margin big is important for a certain reason discussed later.
The NEAR operator is supposed to obtain the values near to the op2,
thus the target membership degree(md) is initially set to 0.8.
The algorithm compares the op1(column) values row by row to the
elements of the set that NEAR defined, i.e. the values from md 1.0
to 0.8, adding matching tuples to the result set.
How would one change 0.8 to some other value?
4. It is very much possible that the result set is empty since
no values within the range exist in the column. Thus, the algorithm
checks for empty result set, and in that case, decreases the target
md by 0.2 and jumps to step 3. This is the reason big margins to
the op2 are added.
5. In case there are no values in op1 that are between m1 and
m2 (where the membership degree of the values with respect to NEAR
becomes 0.1) and the result set is empty, the algorithm fetches the
two nearest values (tuples) to op2, one smaller and one larger than
the op2, as the result.The algorithm will give an empty result only if the table referred
to in the query is empty.2. NOT NEAR
This operator is also a binary operator, dealing with
the datatype NUMBER and DATE. It has the syntax:
op1 NOT NEAR op2
The op1 refers to an attribute, whereas op2 is a fixed value, both
of the same data type.
A query containing the operator looks like:SELECT id, name, age, history
FROM casualties
WHERE cause = ‘heart attack’
AND age NOT NEAR 55Or suppose we need a list of some event that is not clashing with
some commitment of ours:SELECT *
FROM events
WHERE e_name= ‘concert’
AND date NOT NEAR 8/28/2007The algorithm for NOT NEAR works like this:
First of all it adds the margins to the op2, i.e. m1 and m2,
dynamically on both the sides, considering the value op2 contains.
op1 values outside the scope of the op2 (m1, m2) are retrieved and
added to the result.
If the result set is empty, the farthest values within the op2
fuzzy set (those possessing the least membership degree) are
retrieved. This is done by continuing the search from values with
md=0.1 till the md=0.6, where the md for NOT NEAR reaches 0.4.
Why isn't this just the exact opposite set of NEAR?
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Tasneem,
The margins to the op2, i.e. m1 and m2, are added dynamically on
both the sides, considering the value it contains. To keep this
margin big is important for a certain reason discussed later.
The NEAR operator is supposed to obtain the values near to the op2,
thus the target membership degree(md) is initially set to 0.8.
The algorithm compares the op1(column) values row by row to the
elements of the set that NEAR defined, i.e. the values from md 1.0
to 0.8, adding matching tuples to the result set.
Are we talking about a mathematical calculation on the values, or an algorithm
against the population of the result set? I'm presuming the latter or you
could just use a function. If so, is NEAR an absolute range or based on
something logarithmic like standard deviation?
Beyond that, I would think that this mechanism would need some kind of extra
heuristics to be at all performant, otherwise you're querying the entire
table (or at least the entire index) every time you run a query. Have you
given any thought to this?
--
Josh Berkus
PostgreSQL @ Sun
San Francisco