BYTEA, indexes and "like"

Started by Alvar Freudeover 23 years ago23 messagesgeneral
Jump to latest
#1Alvar Freude
alvar@a-blast.org

Hi,

it seems, that a BYTEA fiels doesn't support indexes in WHERE-Statement
with a "like" condition:

test=# EXPLAIN ANALYZE SELECT COUNT(*) FROM bytea_test
WHERE bytea_field like '\\000\\000\\001%';
NOTICE: QUERY PLAN:

Aggregate (cost=668.72..668.72 rows=1 width=0) (actual time=16.63..16.64
rows=1 loops=1)
-> Seq Scan on bytea_test (cost=0.00..668.71 rows=1 width=0) (actual
time=6.65..15.69 rows=145 loops=1)
Total runtime: 16.77 msec

But an equal condition uses indexes:

test=# EXPLAIN ANALYZE SELECT COUNT(*) FROM bytea_test
WHERE bytea_field = '\\000\\000\\001';
NOTICE: QUERY PLAN:

Aggregate (cost=5.93..5.93 rows=1 width=0) (actual time=0.10..0.10
rows=1 loops=1)
-> Index Scan using bytea_test_bytea_field_idx on bytea_test
(cost=0.00..5.93 rows=1 width=0) (actual time=0.05..0.07 rows=1 loops=1)
Total runtime: 0.24 msec

With text fields instead of bytea indexes are used also with like, but
there are no nullbytes allowed and they are sorted using locales (if not
switched of).

Are there plans to allow also the "like" conditions on bytea indexes?

Thnx and Ciao
Alvar

--
** ODEM ist für den poldi Award nominiert! http://www.poldiaward.de/
** http://www.poldiaward.de/index.php?display=detail&cat=audi&item=24
** http://odem.org/
** Mehr Projekte: http://alvar.a-blast.org/

#2Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
bytea operator bugs (was Re: [GENERAL] BYTEA, indexes and "like")

Alvar Freude wrote:

it seems, that a BYTEA fiels doesn't support indexes in WHERE-Statement
with a "like" condition:

I started to look at this issue and ran into two possibly unrelated bugs.

The first was an assert failure in patternsel(). It was looking for
strictly TEXT as the right-hand const. I guess when I originally did the
bytea comparison operators last year I didn't have assert checking on :(
In any case attached is a small patch for that one.

The second one I need help with. The basic problem is that whenever an
index is used on bytea, and no matching records are found, I get "ERROR:
Index bombytea_idx1 is not a btree". E.g.

parts=# explain select * from bombytea where parent_part = '02';
QUERY PLAN
----------------------------------------------------------------------------------
Index Scan using bombytea_idx1 on bombytea (cost=0.00..147.83 rows=37
width=34)
Index Cond: (parent_part = '02'::bytea)
(2 rows)

parts=# explain analyze select * from bombytea where parent_part = '02';
ERROR: Index bombytea_idx1 is not a btree

In fact, a little further testing shows any bytea index use now fails,
so maybe this is new in development sources:

parts=# explain analyze select * from bombytea where parent_part >=
'02-05000-0' limit 1;
QUERY PLAN
------------------------------------------------------------------------
Limit (cost=0.00..0.02 rows=1 width=34) (actual time=0.03..0.04
rows=1 loops=1)
-> Seq Scan on bombytea (cost=0.00..4677.85 rows=213026 width=34)
(actual time=0.03..0.03 rows=2 loops=1)
Filter: (parent_part >= '02-05000-0'::bytea)
Total runtime: 0.17 msec
(4 rows)

parts=# select * from bombytea where parent_part >= '02-05000-0' limit 1;
parent_part | child_part | child_part_qty
--------------+------------+----------------
FM04-13100-1 | NULL | 0
(1 row)

parts=# explain select * from bombytea where parent_part =
'FM04-13100-1' limit 1;
QUERY PLAN
----------------------------------------------------------------------------------------
Limit (cost=0.00..4.03 rows=1 width=34)
-> Index Scan using bombytea_idx1 on bombytea (cost=0.00..147.83
rows=37 width=34)
Index Cond: (parent_part = 'FM04-13100-1'::bytea)
(3 rows)

parts=# select * from bombytea where parent_part = 'FM04-13100-1' limit 1;
ERROR: Index bombytea_idx1 is not a btree

I've isolated this down to _bt_getroot() to the following line (about
line 125 in nbtpage.c):

if (!(metaopaque->btpo_flags & BTP_META) ||
metad->btm_magic != BTREE_MAGIC)
elog(ERROR, "Index %s is not a btree",
RelationGetRelationName(rel));

and more specifically to "!(metaopaque->btpo_flags & BTP_META)".
But I haven't been able to see any difference between the bytea case
which fails, and text or varchar which do not.

Any ideas what else I should be looking at?

Thanks,

Joe

Attachments:

selfuncs.c.2002.08.18.patchtext/plain; name=selfuncs.c.2002.08.18.patchDownload+10-10
#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvar Freude (#1)
Re: BYTEA, indexes and "like"

Alvar Freude <alvar@a-blast.org> writes:

it seems, that a BYTEA fiels doesn't support indexes in WHERE-Statement
with a "like" condition:

I was somewhat bemused to discover that we even *have* a LIKE for bytea.
It does not seem to work real well:

regression=# create table foo (f1 bytea);
CREATE TABLE
regression=# select * from foo where f1 like 'z';
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

The server log shows
TRAP: FailedAssertion("!(((Const *) other)->consttype == 25)", File: "selfuncs.c", Line: 857)
LOG: server process (pid 8487) was terminated by signal 6

which traces down to the fact that patternsel() expects to see only TEXT
patterns. Joe, was it you that put this stuff in? If so, shame on you
for not using --enable-cassert when doing backend work.

regards, tom lane

#4Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
Re: BYTEA, indexes and "like"

Tom Lane wrote:

which traces down to the fact that patternsel() expects to see only TEXT
patterns. Joe, was it you that put this stuff in? If so, shame on you
for not using --enable-cassert when doing backend work.

Shame already acknowledged -- you must have written this before reading
my last post. I know better now. Anyway, patch was sent in to patches.

A new problem exists though. Again see my post to patches. Indexed
lookups on bytea are completely broken on cvs tip. On 7.2.1:

test=# explain select * from bombytea where parent_part = 'FM04-13100-1';
NOTICE: QUERY PLAN:

Index Scan using bombytea_idx1 on bombytea (cost=0.00..148.37 rows=37
width=34)

EXPLAIN
test=# select * from bombytea where parent_part = 'FM04-13100-1';
parent_part | child_part | child_part_qty
--------------+------------+----------------
FM04-13100-1 | NULL | 0
(1 row)

and on cvs tip:

parts=# explain select * from bombytea where parent_part = 'FM04-13100-1';
QUERY PLAN
----------------------------------------------------------------------------------
Index Scan using bombytea_idx1 on bombytea (cost=0.00..147.83 rows=37
width=34)
Index Cond: (parent_part = 'FM04-13100-1'::bytea)
(2 rows)

parts=# select * from bombytea where parent_part = 'FM04-13100-1';
ERROR: Index bombytea_idx1 is not a btree

I'm trying to figure out why right now.

Joe

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#2)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes and "like")

Joe Conway <mail@joeconway.com> writes:

The first was an assert failure in patternsel(). It was looking for
strictly TEXT as the right-hand const. I guess when I originally did the
bytea comparison operators last year I didn't have assert checking on :(
In any case attached is a small patch for that one.

This is not gonna fly: the reason why that Assert is there is that
the rest of the pattern-analysis code assumes it can work with
null-terminated strings. If you want a one-line patch then the patch
is to change the right-hand argument of bytealike to TEXT. A "real"
patch would probably require changing all the patternsel routines to
use counted strings, which seems like it will add major uglification.

I've isolated this down to _bt_getroot() to the following line (about
line 125 in nbtpage.c):

if (!(metaopaque->btpo_flags & BTP_META) ||
metad->btm_magic != BTREE_MAGIC)
elog(ERROR, "Index %s is not a btree",
RelationGetRelationName(rel));

and more specifically to "!(metaopaque->btpo_flags & BTP_META)".
But I haven't been able to see any difference between the bytea case
which fails, and text or varchar which do not.

I'm betting on a memory-clobber kind of problem --- I think something in
the bytea-related code is stomping on the disk buffer that holds the
btree metapage. Are you testing with --enable-cassert now? (That turns
on MEMORY_CHECKING which might be helpful...)

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#4)
Re: BYTEA, indexes and "like"

Joe Conway <mail@joeconway.com> writes:

parts=# select * from bombytea where parent_part = 'FM04-13100-1';
ERROR: Index bombytea_idx1 is not a btree

I'm trying to figure out why right now.

FWIW, I'm not seeing that here... perhaps a clean rebuild is indicated?

regards, tom lane

#7Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Tom Lane wrote:

I'm betting on a memory-clobber kind of problem --- I think something in
the bytea-related code is stomping on the disk buffer that holds the
btree metapage. Are you testing with --enable-cassert now? (That turns
on MEMORY_CHECKING which might be helpful...)

My standard configure line these days is:
./configure --enable-integer-datetimes --enable-locale --enable-debug
--enable-cassert --enable-multibyte --enable-syslog --enable-nls
--enable-depend

and:

parts=# show debug_assertions;
debug_assertions
------------------
on
(1 row)

And I'm definitely seeing the asserts. I'll work on a proper fix for the
first item and continue to investigate the second.

Thanks,

Joe

#8Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Tom Lane wrote:

This is not gonna fly: the reason why that Assert is there is that
the rest of the pattern-analysis code assumes it can work with
null-terminated strings. If you want a one-line patch then the patch
is to change the right-hand argument of bytealike to TEXT. A "real"
patch would probably require changing all the patternsel routines to
use counted strings, which seems like it will add major uglification.

OK. I'll look at both options and make another diff -c proposal ;-) Once
that's resolved I'll go back to original issue Alvar raised.

BTW, you were right-on on the other issue. I started with a fresh sync
up from cvs, then configure, make all, make install, initdb. Now the
problem is gone.

As always, thanks for your help and guidance.

Joe

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#8)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes and "like")

Joe Conway <mail@joeconway.com> writes:

OK. I'll look at both options and make another diff -c proposal ;-) Once
that's resolved I'll go back to original issue Alvar raised.

Okay. When you get back to the original issue, the gold is hidden in
src/backend/optimizer/path/indxpath.c; see the "special indexable
operators" stuff near the bottom of that file. (It's a bit of a crock
that this code is hardwired there, and not somehow accessed through a
system catalog, but it's what we've got at the moment.)

regards, tom lane

#10Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Tom Lane wrote:

Joe Conway <mail@joeconway.com> writes:

OK. I'll look at both options and make another diff -c proposal ;-) Once
that's resolved I'll go back to original issue Alvar raised.

Okay. When you get back to the original issue, the gold is hidden in
src/backend/optimizer/path/indxpath.c; see the "special indexable
operators" stuff near the bottom of that file. (It's a bit of a crock
that this code is hardwired there, and not somehow accessed through a
system catalog, but it's what we've got at the moment.)

Thanks.

Back on the pattern selectivity issue. With some more study I can
clearly see what you were referring to. Dragging string length through
the maze of function calls that would need it would be a mess.

In the longer term (i.e. not for 7.3) it might make sense to create a
set of pattern selectivity functions, just for bytea, that are careful
to avoid the null-terminated string assumption. But, for now, I'm
leaning toward restricting the right-hand argument of bytealike to TEXT,
as you suggested.

Joe

#11Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Joe Conway wrote:

BTW, you were right-on on the other issue. I started with a fresh sync
up from cvs, then configure, make all, make install, initdb. Now the
problem is gone.

Hmmm -- I might have to take that back. Take a look at the following
session:

parts=# SELECT * from bombytea where parent_part > '02-' and parent_part
< '02-1';
parent_part | child_part | child_part_qty
-------------+--------------+----------------
02-00010-0 | NULL | 0
02-00015-0 | NULL | 0
02-00015-1 |
...
02-06360-00 | 46-01-12700 | 0
02-06360-00 | 02-03360-0 | 1
(492 rows)

parts=# SELECT * from bombytea where parent_part like '02-%';
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
parts=# \q
[root@jec-linux pgsql]# psql -U postgres parts
Welcome to psql 7.3devel, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit

parts=# SELECT * from bombytea where parent_part > '02-' and parent_part
< '02-1';
ERROR: Index bombytea_idx1 is not a btree
parts=# \q
[root@jec-linux pgsql]# /etc/init.d/postgresql restart
Stopping postgresql service: [ OK ]
Starting postgresql service: [ OK ]
[root@jec-linux pgsql]# psql -U postgres parts
Welcome to psql 7.3devel, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit

parts=# SELECT * from bombytea where parent_part > '02-' and parent_part
< '02-1';
ERROR: Index bombytea_idx1 is not a btree
parts=# drop index bombytea_idx1;
DROP INDEX
parts=# create index bombytea_idx1 on bombytea (parent_part);
CREATE INDEX
parts=# SELECT * from bombytea where parent_part > '02-' and parent_part
< '02-1';
parent_part | child_part | child_part_qty
-------------+--------------+----------------
02-00010-0 | NULL | 0
02-00015-0 |
...
02-06360-00 | 46-01-12700 | 0
02-06360-00 | 02-03360-0 | 1
(492 rows)

So it appears that "SELECT * from bombytea where parent_part like
'02-%'; server closed the connection unexpectedly" somehow physically
corrupts the index!? This is starting from `make clean` with an initdb
just prior.

Any ideas?

Joe

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#11)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Joe Conway <mail@joeconway.com> writes:

parts=# SELECT * from bombytea where parent_part like '02-%';
server closed the connection unexpectedly

You should stop right here and investigate the reason for the crash.

So it appears that "SELECT * from bombytea where parent_part like
'02-%'; server closed the connection unexpectedly" somehow physically
corrupts the index!

Hm, I'd have expected WAL recovery to fix that. But the easiest avenue
of attack is to isolate the reason for the crash. Please post a test
case if you want help.

regards, tom lane

#13Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Tom Lane wrote:

Joe Conway <mail@joeconway.com> writes:

parts=# SELECT * from bombytea where parent_part like '02-%';
server closed the connection unexpectedly

You should stop right here and investigate the reason for the crash.

It's the same reason as before -- i.e. the TEXTOID assertion in
patternsel(). I just didn't expect to see physical index corruption as a
result.

So it appears that "SELECT * from bombytea where parent_part like
'02-%'; server closed the connection unexpectedly" somehow physically
corrupts the index!

Hm, I'd have expected WAL recovery to fix that. But the easiest avenue
of attack is to isolate the reason for the crash. Please post a test
case if you want help.

OK. I'll be back at this again this morning on a different development
machine. I'll see if I can get a reliable test case and post it.

Joe

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#13)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

You should stop right here and investigate the reason for the crash.

It's the same reason as before -- i.e. the TEXTOID assertion in
patternsel(). I just didn't expect to see physical index corruption as a
result.

Oh really? Yeah, I'm surprised too (and not happy). Let's definitely
see that test case...

regards, tom lane

#15Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Tom Lane wrote:

Oh really? Yeah, I'm surprised too (and not happy). Let's definitely
see that test case...

Here is a test case. Prior to running the following two sessions, I did
the following:

/etc/init.d/postgresql stop
rm -rf pgsql
cvsup to resync with cvs
cvs co pgsql
cd pgsql
./configure --enable-integer-datetimes --enable-locale --enable-debug
--enable-cassert --enable-multibyte --enable-syslog --enable-nls
--enable-depend
make all
make install
initdb
/etc/init.d/postgresql start

Note this this is without any attempt to fix the TEXTOID assertion in
patternsel() -- this is unaltered cvs tip. The two sequntial psql
sessions are below.

Thanks,

Joe

First session:
==============
[root@jec-linux-1 pgsql]# psql -U postgres test
Welcome to psql 7.3devel, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit

test=# ------------------------------------------------------
test=# -- Session 1:
test=# ------------------------------------------------------
test=# drop table foobytea;
DROP TABLE
test=# create table foobytea(f1 bytea);
CREATE TABLE
test=# insert into foobytea values('crash');
INSERT 698676 1
test=# create index foobytea_idx on foobytea(f1);
CREATE INDEX
test=# set enable_seqscan = off;
SET
test=# select f1 from foobytea where f1 = 'crash';
f1
-------
crash
(1 row)

test=# explain select f1 from foobytea where f1 like 'cr%';
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
test=# \q

Immediately followed by second session:
=======================================
[root@jec-linux-1 pgsql]# psql -U postgres test
Welcome to psql 7.3devel, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit

test=# ------------------------------------------------------
test=# -- Session 2:
test=# ------------------------------------------------------
test=# set enable_seqscan = off;
SET
test=# explain select f1 from foobytea where f1 = 'crash';
QUERY PLAN
------------------------------------------------------------------------------
Index Scan using foobytea_idx on foobytea (cost=0.00..4.68 rows=1
width=32)
Index Cond: (f1 = 'crash'::bytea)
(2 rows)

test=# select f1 from foobytea where f1 = 'crash';
ERROR: Index foobytea_idx is not a btree
test=# drop index foobytea_idx;
DROP INDEX
test=# create index foobytea_idx on foobytea(f1);
CREATE INDEX
test=# explain select f1 from foobytea where f1 = 'crash';
QUERY PLAN
------------------------------------------------------------------------------
Index Scan using foobytea_idx on foobytea (cost=0.00..4.68 rows=1
width=32)
Index Cond: (f1 = 'crash'::bytea)
(2 rows)

test=# select f1 from foobytea where f1 = 'crash';
f1
-------
crash
(1 row)

test=#

#16Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Joe Conway wrote:

Back on the pattern selectivity issue. With some more study I can
clearly see what you were referring to. Dragging string length through
the maze of function calls that would need it would be a mess.

In the longer term (i.e. not for 7.3) it might make sense to create a
set of pattern selectivity functions, just for bytea, that are careful
to avoid the null-terminated string assumption. But, for now, I'm
leaning toward restricting the right-hand argument of bytealike to TEXT,
as you suggested.

As suggested by Tom, this patch restricts the right-hand argument of
bytealike to TEXT.

This leaves like_escape_bytea() without anything to do, but I left it in
place in anticipation of the eventual bytea pattern selectivity
functions. If there is agreement that this would be the best long term
solution, I'll take it as a TODO for 7.4.

I'll look around the docs to see if there is someplace where a note wrt
this is appropriate.

If there are no objections, please apply.

Thanks,

Joe

Attachments:

bytealike.2002.08.19.1.patchtext/plain; name=bytealike.2002.08.19.1.patchDownload+16-16
#17Bruce Momjian
bruce@momjian.us
In reply to: Joe Conway (#16)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Your patch has been added to the PostgreSQL unapplied patches list at:

http://candle.pha.pa.us/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.

---------------------------------------------------------------------------

Joe Conway wrote:

Joe Conway wrote:

Back on the pattern selectivity issue. With some more study I can
clearly see what you were referring to. Dragging string length through
the maze of function calls that would need it would be a mess.

In the longer term (i.e. not for 7.3) it might make sense to create a
set of pattern selectivity functions, just for bytea, that are careful
to avoid the null-terminated string assumption. But, for now, I'm
leaning toward restricting the right-hand argument of bytealike to TEXT,
as you suggested.

As suggested by Tom, this patch restricts the right-hand argument of
bytealike to TEXT.

This leaves like_escape_bytea() without anything to do, but I left it in
place in anticipation of the eventual bytea pattern selectivity
functions. If there is agreement that this would be the best long term
solution, I'll take it as a TODO for 7.4.

I'll look around the docs to see if there is someplace where a note wrt
this is appropriate.

If there are no objections, please apply.

Thanks,

Joe

Index: src/backend/utils/adt/like.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/utils/adt/like.c,v
retrieving revision 1.49
diff -c -r1.49 like.c
*** src/backend/utils/adt/like.c	20 Jun 2002 20:29:37 -0000	1.49
--- src/backend/utils/adt/like.c	19 Aug 2002 17:06:10 -0000
***************
*** 264,270 ****
bytealike(PG_FUNCTION_ARGS)
{
bytea	   *str = PG_GETARG_BYTEA_P(0);
! 	bytea	   *pat = PG_GETARG_BYTEA_P(1);
bool		result;
unsigned char *s,
*p;
--- 264,270 ----
bytealike(PG_FUNCTION_ARGS)
{
bytea	   *str = PG_GETARG_BYTEA_P(0);
! 	text	   *pat = PG_GETARG_TEXT_P(1);
bool		result;
unsigned char *s,
*p;
***************
*** 285,291 ****
byteanlike(PG_FUNCTION_ARGS)
{
bytea	   *str = PG_GETARG_BYTEA_P(0);
! 	bytea	   *pat = PG_GETARG_BYTEA_P(1);
bool		result;
unsigned char *s,
*p;
--- 285,291 ----
byteanlike(PG_FUNCTION_ARGS)
{
bytea	   *str = PG_GETARG_BYTEA_P(0);
! 	text	   *pat = PG_GETARG_TEXT_P(1);
bool		result;
unsigned char *s,
*p;
Index: src/include/catalog/pg_operator.h
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/include/catalog/pg_operator.h,v
retrieving revision 1.106
diff -c -r1.106 pg_operator.h
*** src/include/catalog/pg_operator.h	24 Jul 2002 19:11:12 -0000	1.106
--- src/include/catalog/pg_operator.h	19 Aug 2002 17:43:31 -0000
***************
*** 827,835 ****
DATA(insert OID = 1958 ( "<="	   PGNSP PGUID b f 17 17	16 1960 1959 0	  0   0   0 byteale scalarltsel scalarltjoinsel ));
DATA(insert OID = 1959 ( ">"	   PGNSP PGUID b f 17 17	16 1957 1958 0	  0   0   0 byteagt scalargtsel scalargtjoinsel ));
DATA(insert OID = 1960 ( ">="	   PGNSP PGUID b f 17 17	16 1958 1957 0	  0   0   0 byteage scalargtsel scalargtjoinsel ));
! DATA(insert OID = 2016 (  "~~"	   PGNSP PGUID b f 17 17	16 0	2017 0	  0   0   0 bytealike likesel likejoinsel ));
#define OID_BYTEA_LIKE_OP		2016
! DATA(insert OID = 2017 (  "!~~"    PGNSP PGUID b f 17 17	16 0	2016 0	  0   0   0 byteanlike nlikesel nlikejoinsel ));
DATA(insert OID = 2018 (  "||"	   PGNSP PGUID b f 17 17	17 0	0	 0	  0   0   0 byteacat - - ));
/* timestamp operators */
--- 827,835 ----
DATA(insert OID = 1958 ( "<="	   PGNSP PGUID b f 17 17	16 1960 1959 0	  0   0   0 byteale scalarltsel scalarltjoinsel ));
DATA(insert OID = 1959 ( ">"	   PGNSP PGUID b f 17 17	16 1957 1958 0	  0   0   0 byteagt scalargtsel scalargtjoinsel ));
DATA(insert OID = 1960 ( ">="	   PGNSP PGUID b f 17 17	16 1958 1957 0	  0   0   0 byteage scalargtsel scalargtjoinsel ));
! DATA(insert OID = 2016 (  "~~"	   PGNSP PGUID b f 17 25	16 0	2017 0	  0   0   0 bytealike likesel likejoinsel ));
#define OID_BYTEA_LIKE_OP		2016
! DATA(insert OID = 2017 (  "!~~"    PGNSP PGUID b f 17 25	16 0	2016 0	  0   0   0 byteanlike nlikesel nlikejoinsel ));
DATA(insert OID = 2018 (  "||"	   PGNSP PGUID b f 17 17	17 0	0	 0	  0   0   0 byteacat - - ));
/* timestamp operators */
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/include/catalog/pg_proc.h,v
retrieving revision 1.256
diff -c -r1.256 pg_proc.h
*** src/include/catalog/pg_proc.h	17 Aug 2002 13:04:15 -0000	1.256
--- src/include/catalog/pg_proc.h	19 Aug 2002 17:06:10 -0000
***************
*** 2766,2778 ****
DATA(insert OID = 1969 (  timetz		   PGNSP PGUID 12 f f t f i 2 1266 "1266 23"	timetz_scale - _null_ ));
DESCR("adjust time with time zone precision");
! DATA(insert OID = 2005 (  bytealike		   PGNSP PGUID 12 f f t f i 2 16 "17 17" bytealike - _null_ ));
DESCR("matches LIKE expression");
! DATA(insert OID = 2006 (  byteanlike	   PGNSP PGUID 12 f f t f i 2 16 "17 17" byteanlike - _null_ ));
DESCR("does not match LIKE expression");
! DATA(insert OID = 2007 (  like			   PGNSP PGUID 12 f f t f i 2 16 "17 17"	bytealike - _null_ ));
DESCR("matches LIKE expression");
! DATA(insert OID = 2008 (  notlike		   PGNSP PGUID 12 f f t f i 2 16 "17 17"	byteanlike - _null_ ));
DESCR("does not match LIKE expression");
DATA(insert OID = 2009 (  like_escape	   PGNSP PGUID 12 f f t f i 2 17 "17 17" like_escape_bytea - _null_ ));
DESCR("convert match pattern to use backslash escapes");
--- 2766,2778 ----
DATA(insert OID = 1969 (  timetz		   PGNSP PGUID 12 f f t f i 2 1266 "1266 23"	timetz_scale - _null_ ));
DESCR("adjust time with time zone precision");

! DATA(insert OID = 2005 ( bytealike PGNSP PGUID 12 f f t f i 2 16 "17 25" bytealike - _null_ ));
DESCR("matches LIKE expression");
! DATA(insert OID = 2006 ( byteanlike PGNSP PGUID 12 f f t f i 2 16 "17 25" byteanlike - _null_ ));
DESCR("does not match LIKE expression");
! DATA(insert OID = 2007 ( like PGNSP PGUID 12 f f t f i 2 16 "17 25" bytealike - _null_ ));
DESCR("matches LIKE expression");
! DATA(insert OID = 2008 ( notlike PGNSP PGUID 12 f f t f i 2 16 "17 25" byteanlike - _null_ ));
DESCR("does not match LIKE expression");
DATA(insert OID = 2009 ( like_escape PGNSP PGUID 12 f f t f i 2 17 "17 17" like_escape_bytea - _null_ ));
DESCR("convert match pattern to use backslash escapes");

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#18Bruce Momjian
bruce@momjian.us
In reply to: Joe Conway (#16)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Patch applied. Thanks.

Added to TODO:

* Allow bytea to handle LIKE with non-TEXT patterns

---------------------------------------------------------------------------

Joe Conway wrote:

Joe Conway wrote:

Back on the pattern selectivity issue. With some more study I can
clearly see what you were referring to. Dragging string length through
the maze of function calls that would need it would be a mess.

In the longer term (i.e. not for 7.3) it might make sense to create a
set of pattern selectivity functions, just for bytea, that are careful
to avoid the null-terminated string assumption. But, for now, I'm
leaning toward restricting the right-hand argument of bytealike to TEXT,
as you suggested.

As suggested by Tom, this patch restricts the right-hand argument of
bytealike to TEXT.

This leaves like_escape_bytea() without anything to do, but I left it in
place in anticipation of the eventual bytea pattern selectivity
functions. If there is agreement that this would be the best long term
solution, I'll take it as a TODO for 7.4.

I'll look around the docs to see if there is someplace where a note wrt
this is appropriate.

If there are no objections, please apply.

Thanks,

Joe

Index: src/backend/utils/adt/like.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/utils/adt/like.c,v
retrieving revision 1.49
diff -c -r1.49 like.c
*** src/backend/utils/adt/like.c	20 Jun 2002 20:29:37 -0000	1.49
--- src/backend/utils/adt/like.c	19 Aug 2002 17:06:10 -0000
***************
*** 264,270 ****
bytealike(PG_FUNCTION_ARGS)
{
bytea	   *str = PG_GETARG_BYTEA_P(0);
! 	bytea	   *pat = PG_GETARG_BYTEA_P(1);
bool		result;
unsigned char *s,
*p;
--- 264,270 ----
bytealike(PG_FUNCTION_ARGS)
{
bytea	   *str = PG_GETARG_BYTEA_P(0);
! 	text	   *pat = PG_GETARG_TEXT_P(1);
bool		result;
unsigned char *s,
*p;
***************
*** 285,291 ****
byteanlike(PG_FUNCTION_ARGS)
{
bytea	   *str = PG_GETARG_BYTEA_P(0);
! 	bytea	   *pat = PG_GETARG_BYTEA_P(1);
bool		result;
unsigned char *s,
*p;
--- 285,291 ----
byteanlike(PG_FUNCTION_ARGS)
{
bytea	   *str = PG_GETARG_BYTEA_P(0);
! 	text	   *pat = PG_GETARG_TEXT_P(1);
bool		result;
unsigned char *s,
*p;
Index: src/include/catalog/pg_operator.h
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/include/catalog/pg_operator.h,v
retrieving revision 1.106
diff -c -r1.106 pg_operator.h
*** src/include/catalog/pg_operator.h	24 Jul 2002 19:11:12 -0000	1.106
--- src/include/catalog/pg_operator.h	19 Aug 2002 17:43:31 -0000
***************
*** 827,835 ****
DATA(insert OID = 1958 ( "<="	   PGNSP PGUID b f 17 17	16 1960 1959 0	  0   0   0 byteale scalarltsel scalarltjoinsel ));
DATA(insert OID = 1959 ( ">"	   PGNSP PGUID b f 17 17	16 1957 1958 0	  0   0   0 byteagt scalargtsel scalargtjoinsel ));
DATA(insert OID = 1960 ( ">="	   PGNSP PGUID b f 17 17	16 1958 1957 0	  0   0   0 byteage scalargtsel scalargtjoinsel ));
! DATA(insert OID = 2016 (  "~~"	   PGNSP PGUID b f 17 17	16 0	2017 0	  0   0   0 bytealike likesel likejoinsel ));
#define OID_BYTEA_LIKE_OP		2016
! DATA(insert OID = 2017 (  "!~~"    PGNSP PGUID b f 17 17	16 0	2016 0	  0   0   0 byteanlike nlikesel nlikejoinsel ));
DATA(insert OID = 2018 (  "||"	   PGNSP PGUID b f 17 17	17 0	0	 0	  0   0   0 byteacat - - ));
/* timestamp operators */
--- 827,835 ----
DATA(insert OID = 1958 ( "<="	   PGNSP PGUID b f 17 17	16 1960 1959 0	  0   0   0 byteale scalarltsel scalarltjoinsel ));
DATA(insert OID = 1959 ( ">"	   PGNSP PGUID b f 17 17	16 1957 1958 0	  0   0   0 byteagt scalargtsel scalargtjoinsel ));
DATA(insert OID = 1960 ( ">="	   PGNSP PGUID b f 17 17	16 1958 1957 0	  0   0   0 byteage scalargtsel scalargtjoinsel ));
! DATA(insert OID = 2016 (  "~~"	   PGNSP PGUID b f 17 25	16 0	2017 0	  0   0   0 bytealike likesel likejoinsel ));
#define OID_BYTEA_LIKE_OP		2016
! DATA(insert OID = 2017 (  "!~~"    PGNSP PGUID b f 17 25	16 0	2016 0	  0   0   0 byteanlike nlikesel nlikejoinsel ));
DATA(insert OID = 2018 (  "||"	   PGNSP PGUID b f 17 17	17 0	0	 0	  0   0   0 byteacat - - ));
/* timestamp operators */
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/include/catalog/pg_proc.h,v
retrieving revision 1.256
diff -c -r1.256 pg_proc.h
*** src/include/catalog/pg_proc.h	17 Aug 2002 13:04:15 -0000	1.256
--- src/include/catalog/pg_proc.h	19 Aug 2002 17:06:10 -0000
***************
*** 2766,2778 ****
DATA(insert OID = 1969 (  timetz		   PGNSP PGUID 12 f f t f i 2 1266 "1266 23"	timetz_scale - _null_ ));
DESCR("adjust time with time zone precision");
! DATA(insert OID = 2005 (  bytealike		   PGNSP PGUID 12 f f t f i 2 16 "17 17" bytealike - _null_ ));
DESCR("matches LIKE expression");
! DATA(insert OID = 2006 (  byteanlike	   PGNSP PGUID 12 f f t f i 2 16 "17 17" byteanlike - _null_ ));
DESCR("does not match LIKE expression");
! DATA(insert OID = 2007 (  like			   PGNSP PGUID 12 f f t f i 2 16 "17 17"	bytealike - _null_ ));
DESCR("matches LIKE expression");
! DATA(insert OID = 2008 (  notlike		   PGNSP PGUID 12 f f t f i 2 16 "17 17"	byteanlike - _null_ ));
DESCR("does not match LIKE expression");
DATA(insert OID = 2009 (  like_escape	   PGNSP PGUID 12 f f t f i 2 17 "17 17" like_escape_bytea - _null_ ));
DESCR("convert match pattern to use backslash escapes");
--- 2766,2778 ----
DATA(insert OID = 1969 (  timetz		   PGNSP PGUID 12 f f t f i 2 1266 "1266 23"	timetz_scale - _null_ ));
DESCR("adjust time with time zone precision");

! DATA(insert OID = 2005 ( bytealike PGNSP PGUID 12 f f t f i 2 16 "17 25" bytealike - _null_ ));
DESCR("matches LIKE expression");
! DATA(insert OID = 2006 ( byteanlike PGNSP PGUID 12 f f t f i 2 16 "17 25" byteanlike - _null_ ));
DESCR("does not match LIKE expression");
! DATA(insert OID = 2007 ( like PGNSP PGUID 12 f f t f i 2 16 "17 25" bytealike - _null_ ));
DESCR("matches LIKE expression");
! DATA(insert OID = 2008 ( notlike PGNSP PGUID 12 f f t f i 2 16 "17 25" byteanlike - _null_ ));
DESCR("does not match LIKE expression");
DATA(insert OID = 2009 ( like_escape PGNSP PGUID 12 f f t f i 2 17 "17 17" like_escape_bytea - _null_ ));
DESCR("convert match pattern to use backslash escapes");

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#19Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA, indexes

Tom Lane wrote:

Joe Conway <mail@joeconway.com> writes:

OK. I'll look at both options and make another diff -c proposal ;-) Once
that's resolved I'll go back to original issue Alvar raised.

Okay. When you get back to the original issue, the gold is hidden in
src/backend/optimizer/path/indxpath.c; see the "special indexable
operators" stuff near the bottom of that file. (It's a bit of a crock
that this code is hardwired there, and not somehow accessed through a
system catalog, but it's what we've got at the moment.)

The attached patch re-enables a bytea right hand argument (as compared
to a text right hand argument), and enables index usage, for bytea LIKE
-- e.g.:

parts=# explain select * from bombytea where parent_part like '05-05%';
QUERY PLAN
-------------------------------------------------------------------------------------
Index Scan using bombytea_idx1 on bombytea (cost=0.00..3479.67
rows=1118 width=34)
Index Cond: ((parent_part >= '05-05'::bytea) AND (parent_part <
'05-06'::bytea))
Filter: (parent_part ~~ '05-05%'::bytea)
(3 rows)

Passes all regression tests, and as far as I can tell does not break or
change the behavior of anything else. Please review and apply if there
are no objections (I'd like to see this applied for 7.3, before the
freeze, if possible, but I'll certainly understand if I'm told there's
not enough time left).

Thanks,

Joe

Attachments:

patternsel.3.patchtext/plain; name=patternsel.3.patchDownload+340-311
#20Alvar Freude
alvar@a-blast.org
In reply to: Joe Conway (#19)
Re: bytea operator bugs (was Re: [GENERAL] BYTEA,

Hi,

-- Joe Conway <mail@joeconway.com> wrote:

The attached patch re-enables a bytea right hand argument (as compared
to a text right hand argument), and enables index usage, for bytea LIKE
-- e.g.:

parts=# explain select * from bombytea where parent_part like '05-05%';

With this, the limits in Bytea indexes are removed? Great!

As far is I followed the discussion, there where some additionsl limits
on bytea indexes (and like); or are they removed with this patch?

I'll try it sometime later and make some performance checks -- in the
hopem that bytea is faster then texts ;-) (at least the base 255 encoding
i can throw out of my application).

Thanks && Ciao

Alvar

--
** Alvar C.H. Freude
** http://alvar.a-blast.org/
**
** http://odem.org/

#21Joe Conway
mail@joeconway.com
In reply to: Alvar Freude (#1)
#22Bruce Momjian
bruce@momjian.us
In reply to: Joe Conway (#19)
#23Bruce Momjian
bruce@momjian.us
In reply to: Joe Conway (#19)