Best way to use indexes for partial match at beginning
I have 10 CHARACTER columns in table mytable. Mytable is indexed by some of
those columns.
CREATE TABLE mytable ( col1 CHARACTER(10),
col2 CHARACTER(10),col3 CHARACTER(10),col4 CHARACTER(10),col5
CHARACTER(10),col6 CHARACTER(10),col7 CHARACTER(10),col8 CHARACTER(10),col9
CHARACTER(10), col10 CHARACTER(10) );
CREATE INDEX i1 ON mytable(col1);
CREATE INDEX i2 ON mytable(col2);
I need to select records by knowing some characters from beginning.
I know always 1-10 first characters of col1. So my LIKE pattern starts
always with constant characters and ends with % .
I can use LIKE:
SELECT * FROM mytable
WHERE col1 LIKE 'A%'
AND col2 LIKE 'BC%'
AND col3 LIKE 'DEF%'
AND col4 LIKE 'G%';
or substring():
SELECT * FROM mytable
WHERE substring(col1 for 1)='A'
AND substring(col2 for 2)= 'BC'
AND substring(col3 for 3)='DEF'
AND substring(col4 for 1) ='G';
Can Postgres 8.1 use indexes to speed the queries above ?
Which is the best way to to write the where clause in this case so that
index is used ?
Andrus.
Well, for starters, see if PostgreSQL is currently using any indexes via
EXPLAIN. First rule of performance tuning: don't.
If it's not (which is probably the case), then your best bet is to
create functional indexes; ie:
CREATE INDEX mytable__col1_4 ON mytable( substring( col1 for 4 ) );
You can then either
SELECT ... WHERE substring( col1 for 4 ) = blah
or
SELECT ... WHERE substring( col1 for 4 ) LIKE 'bla%'
Though that last one might not use the index; you'll have to check and
see.
Also, keep in mind that PostgreSQL doesn't store CHAR the same as most
other databases; the internal storage is the same as what's used for
VARCHAR and TEXT.
On Sun, Nov 06, 2005 at 11:03:01PM +0200, Andrus Moor wrote:
I have 10 CHARACTER columns in table mytable. Mytable is indexed by some of
those columns.CREATE TABLE mytable ( col1 CHARACTER(10),
col2 CHARACTER(10),col3 CHARACTER(10),col4 CHARACTER(10),col5
CHARACTER(10),col6 CHARACTER(10),col7 CHARACTER(10),col8 CHARACTER(10),col9
CHARACTER(10), col10 CHARACTER(10) );CREATE INDEX i1 ON mytable(col1);
CREATE INDEX i2 ON mytable(col2);I need to select records by knowing some characters from beginning.
I know always 1-10 first characters of col1. So my LIKE pattern starts
always with constant characters and ends with % .I can use LIKE:
SELECT * FROM mytable
WHERE col1 LIKE 'A%'
AND col2 LIKE 'BC%'
AND col3 LIKE 'DEF%'
AND col4 LIKE 'G%';or substring():
SELECT * FROM mytable
WHERE substring(col1 for 1)='A'
AND substring(col2 for 2)= 'BC'
AND substring(col3 for 3)='DEF'
AND substring(col4 for 1) ='G';Can Postgres 8.1 use indexes to speed the queries above ?
Which is the best way to to write the where clause in this case so that
index is used ?Andrus.
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Well, for starters, see if PostgreSQL is currently using any indexes via
EXPLAIN. First rule of performance tuning: don't.
I'm designing a new application. Data is not available yet.
I'm using Postgres 8.1 in Windows. Database encoding is UTF-8
lc_ctype is Estonian_Estonia.1257.
lc_collate is Estonian currently. However I can set lc_collate to C if this
solves this issue.
Doc says that
" to allow PostgreSQL to use indexes with LIKE clauses under a non-C locale,
several custom operator classes exist"
I don't understand "non-C locale". Does this mean lc_collate or also some
other lc_ setting ?
If it's not (which is probably the case), then your best bet is to
create functional indexes; ie:CREATE INDEX mytable__col1_4 ON mytable( substring( col1 for 4 ) );
You can then either
SELECT ... WHERE substring( col1 for 4 ) = blah
I need to optimize queries with variable number of characters in beginning
like
SELECT ... WHERE substring( col1 for 1 ) = 'f'
SELECT ... WHERE substring( col1 for 2 ) = 'fo'
SELECT ... WHERE substring( col1 for 3 ) = 'foo'
etc
This approach requires creating 10 indexes for each column which is
unreasonable.
In my current dbms, Microsoft Visual FoxPro I have a single index
CREATE INDEX i1 ON mytable(col1)
I can use queries:
WHERE col1 BETWEEN 'f' and 'f'+CHR(255)
WHERE col1 BETWEEN 'fo' and 'fo'+CHR(255)
WHERE col1 BETWEEN 'foo' and 'foo'+CHR(255)
All those queries can use same index automatically in all locales. CHR(255)
is last character in any lc_collate sequence. CHR(255) is not used in col1
data.
How to get same functionality in Postgres ?
Does there exist unicode special character which is greater than all other
chars ?
Andrus.
On Wed, Nov 09, 2005 at 12:37:25PM +0200, Andrus wrote:
I'm using Postgres 8.1 in Windows. Database encoding is UTF-8
lc_ctype is Estonian_Estonia.1257.
lc_collate is Estonian currently. However I can set lc_collate to C if this
solves this issue.Doc says that
" to allow PostgreSQL to use indexes with LIKE clauses under a non-C locale,
several custom operator classes exist"I don't understand "non-C locale". Does this mean lc_collate or also some
other lc_ setting ?
lc == locale. There are several different locale settings but collation
affects ordering. And Estonian is not C (obviously).
I need to optimize queries with variable number of characters in beginning
likeSELECT ... WHERE substring( col1 for 1 ) = 'f'
SELECT ... WHERE substring( col1 for 2 ) = 'fo'
SELECT ... WHERE substring( col1 for 3 ) = 'foo'
etc
If you use queries like:
SELECT ... WHERE col1 LIKE 'fo%'
it can use an index declared like:
CREATE INDEX myindex on mytable(col1 text_pattern_ops);
In my current dbms, Microsoft Visual FoxPro I have a single index
CREATE INDEX i1 ON mytable(col1)
I can use queries:
WHERE col1 BETWEEN 'f' and 'f'+CHR(255)
WHERE col1 BETWEEN 'fo' and 'fo'+CHR(255)
WHERE col1 BETWEEN 'foo' and 'foo'+CHR(255)
Well, you could do that in postgresql too, you just need to use the SQL
standard concatination operator.
WHERE col1 BETWEEN 'f' and 'f' || chr(255);
How to get same functionality in Postgres ?
Does there exist unicode special character which is greater than all other
chars ?
Umm, I don't think so. Order is defined by the locale, not the
character set. My guess is that text_pattern_ops is the way to go.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.
Martijn,
I can use queries:
WHERE col1 BETWEEN 'f' and 'f'+CHR(255)
Well, you could do that in postgresql too, you just need to use the SQL
standard concatination operator.
WHERE col1 BETWEEN 'f' and 'f' || chr(255);
thank you.
I think it is best to use regular indexes since regular indexes since they
can be used in other types of queries also.
It seems that only way is to use BETWEEN comparison for this in Postgres
8.1.
I tried
CREATE TABLE foo ( col1 CHAR(20));
CREATE INDEX i1 ON foo(col1);
INSERT INTO foo VALUES ('bar');
SELECT * FROM foo WHERE col1 BETWEEN 'b' and 'b' || chr(255);
But this does not return any data.
How to write index optimizable WHERE clause when only some (variable number)
of characters from beginning of col1 are known ?
Only way seems to use BETWEEN comparison by concatenating character greater
than all other characters in locale. Since CHR(255) does not work this is
not possible.
So
CREATE INDEX i1 ON foo(col1);
cannot be used to optimize queries of type "get all rows where first n
charaters of col1 are known" in Postgres.
Andrus.
Andrus wrote:
So
CREATE INDEX i1 ON foo(col1);
cannot be used to optimize queries of type "get all rows where first n
charaters of col1 are known" in Postgres.
Of course it will! Any btree based index will let you do that. Re-read
the previous answers and make sure you pay attention to the bit about
text_pattern_ops and LIKE in non-C locales.
--
Richard Huxton
Archonet Ltd
On 11/9/05, Andrus <eetasoft@online.ee> wrote:
Martijn,
I can use queries:
WHERE col1 BETWEEN 'f' and 'f'+CHR(255)
Well, you could do that in postgresql too, you just need to use the SQL
standard concatination operator.WHERE col1 BETWEEN 'f' and 'f' || chr(255);
thank you.
I think it is best to use regular indexes since regular indexes since they
can be used in other types of queries also.It seems that only way is to use BETWEEN comparison for this in Postgres
8.1.I tried
CREATE TABLE foo ( col1 CHAR(20));
CREATE INDEX i1 ON foo(col1);
INSERT INTO foo VALUES ('bar');
SELECT * FROM foo WHERE col1 BETWEEN 'b' and 'b' || chr(255);But this does not return any data.
How to write index optimizable WHERE clause when only some (variable number)
of characters from beginning of col1 are known ?Only way seems to use BETWEEN comparison by concatenating character greater
than all other characters in locale. Since CHR(255) does not work this is
not possible.So
CREATE INDEX i1 ON foo(col1);
cannot be used to optimize queries of type "get all rows where first n
charaters of col1 are known" in Postgres.Andrus.
you can create two indexes:
CREATE INDEX myindex_lc on mytable(col1 text_pattern_ops);
and
CREATE INDEX myindex_normal ON foo(col1);
the first one will be used when using LIKE and the other for normal
comparisons .
--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
you can create two indexes:
CREATE INDEX myindex_lc on mytable(col1 text_pattern_ops);
and
CREATE INDEX myindex_normal ON foo(col1);the first one will be used when using LIKE and the other for normal
comparisons .
Jaime,
CREATE INDEX myindex_normal ON foo(col1);
Creates btree structure. In other dbm system btree structure can be used for
searches where only some first characters in index key are known.
So I see no reason to create second index using text_pattern_ops for this
purpose.
I'm searching a way to use Postgres regular index for this.
Andrus.
On Wed, 2005-11-09 at 14:56, Andrus wrote:
you can create two indexes:
CREATE INDEX myindex_lc on mytable(col1 text_pattern_ops);
and
CREATE INDEX myindex_normal ON foo(col1);the first one will be used when using LIKE and the other for normal
comparisons .Jaime,
CREATE INDEX myindex_normal ON foo(col1);
Creates btree structure. In other dbm system btree structure can be used for
searches where only some first characters in index key are known.So I see no reason to create second index using text_pattern_ops for this
purpose.I'm searching a way to use Postgres regular index for this.
Easy, do what those other databases do. Setup your database to not use
a locale.
initdb --locale=C
and you're golden.
On Wed, Nov 09, 2005 at 10:46:27PM +0200, Andrus wrote:
thank you. I try to formulate my problem more presicely.
I have tableCREATE TABLE foo ( bar CHAR(10) PRIMARY KEY);
Cluster locale is non-C. Database encoding is UTF-8. Postgres vers is 8.1
Do this instead:
CREATE TABLE foo ( bar CHAR(10) NOT NULL );
CREATE UNIQUE INDEX foo_bar ON foo(bar char_pattern_ops);
I want to run fast queries by knowing first characters of bar like :
1. Select records from foo where first character of bar is A
2. Select records from foo where first character of bar is B
3. Select records from foo where first two characters of bar are BC
4. Select records from foo where first three characters of bar are ABC
SELECT * FROM foo WHERE bar LIKE 'A%';
SELECT * FROM foo WHERE bar LIKE 'B%';
SELECT * FROM foo WHERE bar LIKE 'BC%';
SELECT * FROM foo WHERE bar LIKE 'ABC%';
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.
Import Notes
Reply to msg id not found: dktnpk$2jrl$1@news.hub.org
On 11/9/05, Andrus <eetasoft@online.ee> wrote:
you can create two indexes:
CREATE INDEX myindex_lc on mytable(col1 text_pattern_ops);
and
CREATE INDEX myindex_normal ON foo(col1);the first one will be used when using LIKE and the other for normal
comparisons .Jaime,
CREATE INDEX myindex_normal ON foo(col1);
Creates btree structure. In other dbm system btree structure can be used for
searches where only some first characters in index key are known.
and the same is true for postgres when you use C LOCALE, but because
some implementation details i don't know so deep when using non-C
LOCALE you need the class operator in order to use the index with LIKE
'pattern%'
So I see no reason to create second index using text_pattern_ops for this
purpose.
the reason is that you want to use the index in the search... and, at
least you go and solve the problem with code, that's the way to do
it...
--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
On 2005-11-09 13:08, Martijn van Oosterhout wrote:
I want to run fast queries by knowing first characters of bar like :
1. Select records from foo where first character of bar is A
2. Select records from foo where first character of bar is B
3. Select records from foo where first two characters of bar are BC
4. Select records from foo where first three characters of bar are ABCSELECT * FROM foo WHERE bar LIKE 'A%';
SELECT * FROM foo WHERE bar LIKE 'B%';
SELECT * FROM foo WHERE bar LIKE 'BC%';
SELECT * FROM foo WHERE bar LIKE 'ABC%';Have a nice day,
Or:
SELECT * FROM foo WHERE bar::CHAR(1) = 'A';
SELECT * FROM foo WHERE bar::CHAR(1) = 'B';
SELECT * FROM foo WHERE bar::CHAR(2) = 'BC';
SELECT * FROM foo WHERE bar::CHAR(3) = 'ABC';
-- Dean
Scott,
I'm searching a way to use Postgres regular index for this.
Easy, do what those other databases do. Setup your database to not use
a locale.initdb --locale=C
and you're golden.
thank you.
My language has letters in ������ with correstonding upper case letters
������
I need to prevent entering of duplicate customer names into database.
For this I created unique index on UPPER(customer_name)
I need also search for customer name in case-insensitive way. For this I use
ILIKE operator.
Those two features are working in Postgres 8.1 when I use non-C locale.
If I switch to C locale, they will not work.
My current database, Microsoft Visual Foxpro implements this functionality.
How to implement this functionality in Postgres if I switch to C locale ?
Andrus.
CREATE TABLE foo ( bar CHAR(10) PRIMARY KEY);
Cluster locale is non-C. Database encoding is UTF-8. Postgres vers is 8.1
Do this instead:
CREATE TABLE foo ( bar CHAR(10) NOT NULL );
CREATE UNIQUE INDEX foo_bar ON foo(bar char_pattern_ops);
Martijn,
Thank you. I have CHAR columns and need a primary key also. So I tried the
code
CREATE TABLE foo ( bar CHAR(10) NOT NULL );
CREATE UNIQUE INDEX foo_bar ON foo(bar bpchar_pattern_ops);
ALTER TABLE foo ADD PRIMARY KEY (bar);
I found that adding primary key creates another index.
How to create primary key without duplicate index on bar column ?
Andrus.
On Wed, 2005-11-09 at 15:30, Andrus wrote:
Scott,
I'm searching a way to use Postgres regular index for this.
Easy, do what those other databases do. Setup your database to not use
a locale.initdb --locale=C
and you're golden.
thank you.
My language has letters in with correstonding upper case letters
I need to prevent entering of duplicate customer names into database.
For this I created unique index on UPPER(customer_name)I need also search for customer name in case-insensitive way. For this I use
ILIKE operator.Those two features are working in Postgres 8.1 when I use non-C locale.
If I switch to C locale, they will not work.My current database, Microsoft Visual Foxpro implements this functionality.
How to implement this functionality in Postgres if I switch to C locale ?
You can't. You have conflicting desires. PostgreSQL IS NOT FOXPRO. If
you want to use foxpro, then do so. If you want to use PostgreSQL, then
you'll either have to accept that you need to make certain accomodations
to use it with a non-C locale, or accept a C locale and its limitations.
You say that Foxpro implements this functionality, but are you sure that
it gets things like collation correct? I.e. does it truly understand
all the rules for what comes before something else in your language?
Locales are a complex and difficult thing to get exactly right, and
while, at first blush, Foxpro may seem to do the right thing, you may
find it isn't doing EVERYTHING exactly right, and still having good
performance.
Then again, maybe it is.
But PostgreSQL is limited to working either in a C locale and
automatically using indexes for like 'abc%' queries but getting
collation wrong, or working in the correct locale, not using indexes for
like 'abc%', having to use the special class operator if you want likes
to work, and getting the collation correct.
If that doesn't work for you, your only real choice is to either use
another database, or start hacking to make PostgreSQL the database you
want it to be.
It's not a simple problem, and there is no simple answer. And if you
expect any database to not have things like this in it to deal with, you
just haven't looked very hard at any of them. They've all got warts.
And sometimes, one db is just not a good fit.
Perhaps full text searching could help you out here? Not sure.
SELECT * FROM foo WHERE bar::CHAR(1) = 'A';
SELECT * FROM foo WHERE bar::CHAR(1) = 'B';
SELECT * FROM foo WHERE bar::CHAR(2) = 'BC';
SELECT * FROM foo WHERE bar::CHAR(3) = 'ABC';
Dean,
thank you. That would be EXCELLENT solution!
Can you confirm that in this case Postgres 8.1 can use index created by
CREATE TABLE foo ( bar CHAR(10) PRIMARY KEY );
even on non-C locale ?
Andrus.
On 11/9/05, Andrus <eetasoft@online.ee> wrote:
CREATE TABLE foo ( bar CHAR(10) PRIMARY KEY);
Cluster locale is non-C. Database encoding is UTF-8. Postgres vers is 8.1
Do this instead:
CREATE TABLE foo ( bar CHAR(10) NOT NULL );
CREATE UNIQUE INDEX foo_bar ON foo(bar char_pattern_ops);Martijn,
Thank you. I have CHAR columns and need a primary key also. So I tried the
codeCREATE TABLE foo ( bar CHAR(10) NOT NULL );
CREATE UNIQUE INDEX foo_bar ON foo(bar bpchar_pattern_ops);
ALTER TABLE foo ADD PRIMARY KEY (bar);I found that adding primary key creates another index.
How to create primary key without duplicate index on bar column ?
Andrus.
you can't.
postgresql implements primary keys creating unique indexes and not
null constraints on the pk columns.
--
Atentamente,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
On Wed, 2005-11-09 at 16:23, Jaime Casanova wrote:
On 11/9/05, Andrus <eetasoft@online.ee> wrote:
CREATE TABLE foo ( bar CHAR(10) PRIMARY KEY);
Cluster locale is non-C. Database encoding is UTF-8. Postgres vers is 8.1
Do this instead:
CREATE TABLE foo ( bar CHAR(10) NOT NULL );
CREATE UNIQUE INDEX foo_bar ON foo(bar char_pattern_ops);Martijn,
Thank you. I have CHAR columns and need a primary key also. So I tried the
codeCREATE TABLE foo ( bar CHAR(10) NOT NULL );
CREATE UNIQUE INDEX foo_bar ON foo(bar bpchar_pattern_ops);
ALTER TABLE foo ADD PRIMARY KEY (bar);I found that adding primary key creates another index.
How to create primary key without duplicate index on bar column ?
Andrus.
you can't.
postgresql implements primary keys creating unique indexes and not
null constraints on the pk columns.
But, of course, you CAN delete that other index now that it's redundant.
Andrus wrote:
SELECT * FROM foo WHERE bar::CHAR(1) = 'A';
SELECT * FROM foo WHERE bar::CHAR(1) = 'B';
SELECT * FROM foo WHERE bar::CHAR(2) = 'BC';
SELECT * FROM foo WHERE bar::CHAR(3) = 'ABC';
Can you confirm that in this case Postgres 8.1 can use index created
by
CREATE TABLE foo ( bar CHAR(10) PRIMARY KEY );
even on non-C locale ?
It will not use the index no matter what locale. You would in these
cases need to create additional expression indexes on bar::char(1) etc.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
How to create primary key without duplicate index on bar column ?
Andrus.
you can't.
postgresql implements primary keys creating unique indexes and not
null constraints on the pk columns.But, of course, you CAN delete that other index now that it's redundant.
Scott,
thank you. I don't understand how to delete primary key index without
removing primary key constraint or how to force primary key to use foo_bar
index.
I tried
CREATE TABLE foo ( bar CHAR(10) PRIMARY KEY );
CREATE UNIQUE INDEX foo_bar ON foo(bar bpchar_pattern_ops);
DROP INDEX foo_pkey;
but got
ERROR: cannot drop index foo_pkey because constraint foo_pkey on table foo
requires it
Andrus.