Binary-compatible types vs. overloaded operators

Started by Tom Laneabout 27 years ago10 messageshackers

tgl@sss.pgh.pa.us

about 27 years ago

I have checked in fixes for all of the genuine bugs that I found in
pg_operator and pg_proc by means of mechanical consistency checks.

I would like to add these consistency checks to the regression tests,
but right now they still produce some bogus "failures":

QUERY: SELECT p1.oid, p1.oprname, p2.oid, p2.proname
FROM pg_operator AS p1, pg_proc AS p2
WHERE p1.oprcode = p2.oid AND
p1.oprkind = 'b' AND
(p2.pronargs != 2 OR
p1.oprresult != p2.prorettype OR
(p1.oprleft != p2.proargtypes[0] AND p2.proargtypes[0] != 0) OR
(p1.oprright != p2.proargtypes[1] AND p2.proargtypes[1] != 0));
oid|oprname| oid|proname
----+-------+----+-------------
609|< | 66|int4lt
610|> | 147|int4gt
611|<= | 149|int4le
612|>= | 150|int4ge
974||| |1258|textcat
979||| |1258|textcat
1055|~ |1254|textregexeq
1056|!~ |1256|textregexne
1063|~ |1254|textregexeq
1064|!~ |1256|textregexne
1211|~~ | 850|textlike
1212|!~~ | 851|textnlike
1213|~~ | 850|textlike
1214|!~~ | 851|textnlike
1232|~* |1238|texticregexeq
1233|!~* |1239|texticregexne
1234|~* |1238|texticregexeq
1235|!~* |1239|texticregexne
820|= | 920|network_eq
821|<> | 925|network_ne
822|< | 921|network_lt
823|<= | 922|network_le
824|> | 923|network_gt
825|>= | 924|network_ge
826|<< | 927|network_sub
827|<<= | 928|network_subeq
828|>> | 929|network_sup
1004|>>= | 930|network_supeq
(28 rows)

All of these mismatches occur because pg_operator contains more than
one entry for each of the underlying procs. For example, oid 974
is the operator for "bpchar || bpchar", which is implemented by
the same proc as "text || text". That's OK because the two types are
binary-compatible. But there's no good way for an automated test to
know that it's OK.

I see a couple of different ways to deal with this:

1. Drop all of the above pg_operator entries. They are all redundant
anyway, given that in each case the data types named by the operator
are considered binary-compatible with those named by the underlying
proc. If these entries were not present, the parser would still find
the operator, it'd just match against the pg_operator entry that names
the underlying type.

2. Make additional entries in pg_proc so that all of the above operators
can point to pg_proc entries that agree with them as to datatypes.
(These entries could still point at the same underlying C function,
of course.)

3. Extend the pg_type catalog to provide info about binary compatibility
of different types, so that the opr_sanity regress test could discover
whether a type mismatch is really a problem or not.

I like option #1 because it is the least work ;-). The only real
objection to it is that if we go down that path, we're essentially
saying that the only way to use the same proc to operate on multiple
data types is to declare the data types binary-equivalent --- that is,
to allow the data types to be substituted for each other in *every*
operation on those types. I can imagine having a couple of types that
you want to share one or two operations for, but not go so far as to
mark them binary-equivalent. But we have no examples of this --- all of
the existing cases of overloaded operators are for types that actually
are declared binary-equivalent.

Option #2 is nothing but a hack; it would get the job done, but not
elegantly.

Option #3 is the most work, and it would also imply making the regress
test a lot slower since it'd have to join more tables to discover
whether there is a problem or not. But conceptually it's the cleanest
answer, if we can figure out exactly what info has to be stored.

I think we might as well remove the above-named operators from
pg_operator in any case; they're just dead weight given the existence
of binary-compatibility declarations for the underlying data types.
The question is whether we need to allow for future operators to link
to pg_proc entries that name different data types that are *not* marked
fully binary compatible. And if so, how could we teach the regress test
not to complain? (Maybe we don't have to; just add the exceptions to
the expected output from the test. That's an ugly answer though...)

Comments?

regards, tom lane

James Thompson

jamest@math.ksu.edu

about 27 years ago

In reply to: Tom Lane (#1)

What unresolved issues are in CVS?

I'm currently using the CVS code as part of a project I'm working on and
have not run any any problems. Today at work, looking at my oracle
processes which look like

PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
6174 oradba 23 0 33M 12M sleep 0:00 0.18% 0.12% oracle
6166 oradba 33 0 35M 12M sleep 0:00 0.16% 0.11% oracle
6168 oradba 33 0 33M 10M sleep 0:00 0.13% 0.08% oracle
6170 oradba 33 0 33M 10M sleep 0:00 0.13% 0.08% oracle
6176 oradba 27 0 33M 10M sleep 0:00 0.08% 0.05% oracle
6172 oradba 33 0 33M 10M sleep 0:00 0.08% 0.05% oracle
351 oradba 33 0 11M 2944K sleep 0:00 0.03% 0.02% tnslsnr

I've started to wonder why bother with this bloated beast at all. We are
just now moving to an RDBMS and so its the perfect time to switch (Oracle
doesn't cost us a cent so no money lost). Problem is, I'd rather not go
back to 6.4.2. I'm thinking move the Oracle stuff to CVS postgresql and
just dealing with the problems if any come up. I'd hope that would
provide you guys with additional feedback and thus possibly help improve
the code base prior to 6.5.

So, are there any show stoppers in CVS postgresql. I myself haven't hit
anything, but I'm not really pounding on it yet.

->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<
James Thompson 138 Cardwell Hall Manhattan, Ks 66506 785-532-0561 Kansas
State University Department of Mathematics
->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<

Bruce Momjian

bruce@momjian.us

about 27 years ago

In reply to: James Thompson (#2)

Re: [HACKERS] What unresolved issues are in CVS?

I've started to wonder why bother with this bloated beast at all. We are
just now moving to an RDBMS and so its the perfect time to switch (Oracle
doesn't cost us a cent so no money lost). Problem is, I'd rather not go
back to 6.4.2. I'm thinking move the Oracle stuff to CVS postgresql and
just dealing with the problems if any come up. I'd hope that would
provide you guys with additional feedback and thus possibly help improve
the code base prior to 6.5.

So, are there any show stoppers in CVS postgresql. I myself haven't hit
anything, but I'm not really pounding on it yet.

The only thing I know of is that the new MVCC code doesn't vacuum
properly yet. It does vacuum, I just suspect is does not know how to
prevent vacuuming of rows that are being viewed by other backends.

Perhaps Vadim can comment on that.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

James Thompson

jamest@math.ksu.edu

about 27 years ago

In reply to: Bruce Momjian (#3)

Re: [HACKERS] What unresolved issues are in CVS?

On Thu, 11 Mar 1999, Bruce Momjian wrote:

The only thing I know of is that the new MVCC code doesn't vacuum
properly yet. It does vacuum, I just suspect is does not know how to
prevent vacuuming of rows that are being viewed by other backends.

Perhaps Vadim can comment on that.

So if other backends are not active then vacuum works OK? Nightly vacuums
on inactive databases are OK? If so then its "good enough for government
work" :)

->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<
James Thompson 138 Cardwell Hall Manhattan, Ks 66506 785-532-0561
Kansas State University Department of Mathematics
->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<

Bruce Momjian

bruce@momjian.us

about 27 years ago

In reply to: James Thompson (#4)

Re: [HACKERS] What unresolved issues are in CVS?

On Thu, 11 Mar 1999, Bruce Momjian wrote:

The only thing I know of is that the new MVCC code doesn't vacuum
properly yet. It does vacuum, I just suspect is does not know how to
prevent vacuuming of rows that are being viewed by other backends.

Perhaps Vadim can comment on that.

So if other backends are not active then vacuum works OK? Nightly vacuums
on inactive databases are OK? If so then its "good enough for government
work" :)

My guess is that this is true.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Clark Evans

clark.evans@manhattanproject.com

about 27 years ago

In reply to: James Thompson (#4)

Parameterized Types and Session Memory

Questions:

a) Parameterized Types

I was wondering if a parameter mechanism is/will be
available in 4.5 for user defined types. Currently,
VARCHAR(10), NUMBER(6,2) are parameterized types,
this is more or less what I'm after (only it'd be
great if strings could be parameters too)

Why? I'm creating types from abstract algebra
(group theory) and there are an infinite number
of them, thus I need several parameters to specify
the information.

Even if it allowed one integer, this would be cool,
since I could load a table with the options, and
then pass the OID of the tuple.

b) Session Memory

I was wondering how I can get to memory area associated
with a user's session, similar to the package level
variables in Oracle.

Thanks!

Clark

James Thompson

jamest@math.ksu.edu

about 27 years ago

In reply to: Bruce Momjian (#5)

INT64_FORMAT missing

Todays CVS
Solaris 2.5.1
GCC 2.7.2.2

make[3]: Entering directory `/home/postgres/pgsql/src/backend/utils/adt'
gcc -I../../../include -I../../../backend -Wall -Wmissing-prototypes
-I../.. -c int8.c -o int8.o
int8.c: In function `int8out':
int8.c:83: `INT64_FORMAT' undeclared (first use this function)
int8.c:83: (Each undeclared identifier is reported only once
int8.c:83: for each function it appears in.)
make: *** [int8.o] Error 1

From the configure.in file I assume this is the part you'd need to see

checking for snprintf... (cached) no
checking for vsnprintf... (cached) no
checking for isinf... (cached) no
checking for getrusage... (cached) yes
checking for srandom... (cached) yes
checking for gethostname... (cached) yes
checking for random... (cached) yes
checking for inet_aton... (cached) no
checking for strerror... (cached) yes
checking for strdup... (cached) yes
checking for strtol... (cached) yes
checking for strtoul... (cached) yes
checking for strcasecmp... (cached) yes
checking for cbrt... (cached) yes
checking for rint... (cached) yes
checking whether 'long int' is 64 bits... no
checking whether 'long long int' is 64 bits... yes

The (cached) is due to my running it a second time to clip this output.

->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<
James Thompson 138 Cardwell Hall Manhattan, Ks 66506 785-532-0561
Kansas State University Department of Mathematics
->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<

Tom Lane

tgl@sss.pgh.pa.us

about 27 years ago

In reply to: James Thompson (#7)

Re: [HACKERS] INT64_FORMAT missing

James Thompson <jamest@math.ksu.edu> writes:

Todays CVS
int8.c: In function `int8out':
int8.c:83: `INT64_FORMAT' undeclared (first use this function)
int8.c:83: (Each undeclared identifier is reported only once
int8.c:83: for each function it appears in.)

Fixed, I hope.

regards, tom lane

Import Notes

Reply to msg id not found: YourmessageofThu11Mar1999164138-0600Pine.GSO.3.96.990311153643.16146A-100000@euler.math.ksu.edu | Resolved by subject fallback

Peter T Mount

peter@retep.org.uk

about 27 years ago

In reply to: Bruce Momjian (#5)

Re: [HACKERS] What unresolved issues are in CVS?

On Thu, 11 Mar 1999, Bruce Momjian wrote:

On Thu, 11 Mar 1999, Bruce Momjian wrote:

The only thing I know of is that the new MVCC code doesn't vacuum
properly yet. It does vacuum, I just suspect is does not know how to
prevent vacuuming of rows that are being viewed by other backends.

Perhaps Vadim can comment on that.

So if other backends are not active then vacuum works OK? Nightly vacuums
on inactive databases are OK? If so then its "good enough for government
work" :)

My guess is that this is true.

It is true for me ;-)

--
Peter T Mount peter@retep.org.uk
Main Homepage: http://www.retep.org.uk
PostgreSQL JDBC Faq: http://www.retep.org.uk/postgres
Java PDF Generator: http://www.retep.org.uk/pdf

#10

Thomas Lockhart

lockhart@alumni.caltech.edu

about 27 years ago

In reply to: Tom Lane (#1)

Re: [HACKERS] Binary-compatible types vs. overloaded operators

I have checked in fixes for all of the genuine bugs that I found in
pg_operator and pg_proc by means of mechanical consistency checks.
I would like to add these consistency checks to the regression tests,
but right now they still produce some bogus "failures":

All of these mismatches occur because pg_operator contains more than
one entry for each of the underlying procs. For example, oid 974
is the operator for "bpchar || bpchar", which is implemented by
the same proc as "text || text". That's OK because the two types are
binary-compatible. But there's no good way for an automated test to
know that it's OK.
I see a couple of different ways to deal with this:
1. Drop all of the above pg_operator entries. They are all redundant
anyway, given that in each case the data types named by the operator
are considered binary-compatible with those named by the underlying
proc. If these entries were not present, the parser would still find
the operator, it'd just match against the pg_operator entry that names
the underlying type.

Just a comment: types which are brute-force allowed to be binary
compatible (brute-force because it is compiled into the code rather
than entered into a table) would not be handled exactly the same as if
there were an explicit entry for them. With explicit entries there is
an exact match on for the operator found by the parser on its first
try. With binary compatibility but no explicit entry then the parser
tries first for that explicit match, fails, and then tries some
heuristics to get a good alternative. I would think that for anything
other than *very* small queries and tables the extra time would be
negligible.

3. Extend the pg_type catalog to provide info about binary
compatibility of different types, so that the opr_sanity regress test
could discover whether a type mismatch is really a problem or not.

This is the elegant solution of course, and probably a lot of work :)

- Tom