[PATCH] Add citext_pattern_ops to citext contrib module

Started by Alexey Chernyshovover 8 years ago6 messageshackers
Jump to latest
#1Alexey Chernyshov
a.chernyshov@postgrespro.ru

Hi all,

The attached patch introduces citext_pattern_ops for citext extension
type like text_pattern_ops for text type. Here are operators ~<~, ~<=~,
~>~, ~>=~ combined into citext_pattern_ops operator class. These
operators simply compare underlying citext values as C strings with
memcmp() function. This operator class isn’t supported by B-Tree index
yet, but it is a first step to do it.

Patch includes regression tests and is applicable to the latest commit
(c85ec643ff2586e2d144374f51f93bfa215088a2).

The problem of citext support for LIKE operator with B-Tree index was
mentioned in [1]. Briefly, the planner doesn’t use B-Tree index for
queries text_col LIKE ‘abc%’. I’d like to investigate how to improve it
and make another patch. I think the start point is
match_special_index_operator() function which doesn’t support custom
types. I would appreciate hearing your opinion on this.

1. /messages/by-id/3924.1480351187@sss.pgh.pa.us

--
Alexey Chernyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

citext_pattern_ops-v1.patchtext/x-patch; name=citext_pattern_ops-v1.patchDownload+1519-459
#2Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Alexey Chernyshov (#1)
Re: [PATCH] Add citext_pattern_ops to citext contrib module

On Tue, Jul 18, 2017 at 5:18 AM, Alexey Chernyshov
<a.chernyshov@postgrespro.ru> wrote:

Hi all,

Hi Alexey, I took a look at your patch. Builds fine here, and passes
the new tests.

I'm new to this code, so take my review with a grain of salt.

The attached patch introduces citext_pattern_ops for citext extension type
like text_pattern_ops for text type. Here are operators ~<~, ~<=~, ~>~, ~>=~
combined into citext_pattern_ops operator class. These operators simply
compare underlying citext values as C strings with memcmp() function.

Are there any cases where performing the str_tolower with the default
collation, then comparing byte-by-byte, could backfire? The added test
cases don't make use of any multibyte/accented characters, so it's not
clear to me yet what *should* be happening in those cases.

It also might be a good idea to add some test cases that compare
strings of different lengths, to exercise all the paths in
internal_citext_pattern_cmp().

+-- test citext_pattern_cmp() function explicetily.

Spelling nitpick in the new SQL: s/explicetily/explicitly .

--Jacob

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexey Chernyshov (#1)
Re: [PATCH] Add citext_pattern_ops to citext contrib module

Alexey Chernyshov <a.chernyshov@postgrespro.ru> writes:

The attached patch introduces citext_pattern_ops for citext extension
type like text_pattern_ops for text type. Here are operators ~<~, ~<=~,
~>~, ~>=~ combined into citext_pattern_ops operator class. These
operators simply compare underlying citext values as C strings with
memcmp() function.

Hi Alexey,

Quick comment on this patch: recently, we've decided that having patches
replace the whole base script for an extension is too much of a
maintenance problem, especially when there are several patches in the
pipeline for the same contrib module. The new style is to provide only
a version update script (which you'd have to write anyway), and then
rely on CREATE EXTENSION to apply the old base script plus the update(s).
You can see some examples in the patch I just posted at

/messages/by-id/24721.1505229713@sss.pgh.pa.us

Also, since that patch is probably going to get committed pretty soon, you
could reformulate your patch as an add-on to its citext--1.4--1.5.sql
script. We don't really need to have a separate version of the extension
for states that are intermediate between two PG major releases. Only
if your patch doesn't get in by v11 freeze would you need to make it a
separate citext--1.5--1.6.sql script.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Alexey Chernyshov
a.chernyshov@postgrespro.ru
In reply to: Tom Lane (#3)
Re: [PATCH] Add citext_pattern_ops to citext contrib module

On Tue, 12 Sep 2017 12:59:20 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

Quick comment on this patch: recently, we've decided that having
patches replace the whole base script for an extension is too much of
a maintenance problem, especially when there are several patches in
the pipeline for the same contrib module. The new style is to
provide only a version update script (which you'd have to write
anyway), and then rely on CREATE EXTENSION to apply the old base
script plus the update(s). You can see some examples in the patch I
just posted at

/messages/by-id/24721.1505229713@sss.pgh.pa.us

Also, since that patch is probably going to get committed pretty
soon, you could reformulate your patch as an add-on to its
citext--1.4--1.5.sql script. We don't really need to have a separate
version of the extension for states that are intermediate between two
PG major releases. Only if your patch doesn't get in by v11 freeze
would you need to make it a separate citext--1.5--1.6.sql script.

regards, tom lane

Accented characters and different length strings tests added.
Since patch
(ttps://www.postgresql.org/message-id/24721.1505229713@sss.pgh.pa.us)
is committed, I changed the patch as you said. Thanks for your notes.
Do we need expected/citext.out? It seems that only
expected/citext_1.out has correct output.

--
Alexey Chernyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

citext_pattern_ops-v2.patchtext/x-patchDownload+1012-1
#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexey Chernyshov (#4)
Re: [PATCH] Add citext_pattern_ops to citext contrib module

Alexey Chernyshov <a.chernyshov@postgrespro.ru> writes:

Do we need expected/citext.out? It seems that only
expected/citext_1.out has correct output.

Well, for me, citext.out matches the results in C locale,
and citext_1.out matches the results in en_US. If you don't
satisfy both of those cases, the buildfarm will not like you.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#5)
Re: [PATCH] Add citext_pattern_ops to citext contrib module

On 09/18/2017 12:50 PM, Tom Lane wrote:

Alexey Chernyshov <a.chernyshov@postgrespro.ru> writes:

Do we need expected/citext.out? It seems that only
expected/citext_1.out has correct output.

Well, for me, citext.out matches the results in C locale,
and citext_1.out matches the results in en_US. If you don't
satisfy both of those cases, the buildfarm will not like you.

I'm about to pick this one up - I will handle the expected file issue.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers