Custom text type for title text

Started by Daniel Heathabout 7 years ago6 messages
#1Daniel Heath
daniel@heath.cc

Hi All,

I've frequently seen an issue in applications which store titles (eg of books, events, user profiles) where duplicate values are not properly vetted.

The 'citext' type is helpful here, but I'd be keen to go further.

I propose a 'titletext' type, which has the following properties when compared for equality:
* Case insensitivity (like 'citext')
* Only considers characters in [:alnum:] (that is, ignores spaces, punctuation, etc)

This would be useful for a range of situations where it's important to avoid entering duplicate values.

Given the discussion at /messages/by-id/CAKFQuwY9u14TqG8Yzj=fAB0tydvvtK7ibgFEx3tegbPWsGjJpg@mail.gmail.com </messages/by-id/CAKFQuwY9u14TqG8Yzj=fAB0tydvvtK7ibgFEx3tegbPWsGjJpg@mail.gmail.com&gt; I'd lean towards making this type not automatically coerce to text (to avoid surprising behaviour when comparing text to titletext).

Is a suitable patch likely to be accepted?

Thanks,

Daniel Heath

#2Fabrízio de Royes Mello
fabriziomello@gmail.com
In reply to: Daniel Heath (#1)
Re: Custom text type for title text

Em qui, 3 de jan de 2019 às 20:22, Daniel Heath <daniel@heath.cc> escreveu:

Hi All,

I've frequently seen an issue in applications which store titles (eg of
books, events, user profiles) where duplicate values are not properly
vetted.

The 'citext' type is helpful here, but I'd be keen to go further.

I propose a 'titletext' type, which has the following properties when
compared for equality:
* Case insensitivity (like 'citext')
* Only considers characters in [:alnum:] (that is, ignores spaces,
punctuation, etc)

This would be useful for a range of situations where it's important to
avoid entering duplicate values.

Given the discussion at
/messages/by-id/CAKFQuwY9u14TqG8Yzj=fAB0tydvvtK7ibgFEx3tegbPWsGjJpg@mail.gmail.com
</messages/by-id/CAKFQuwY9u14TqG8Yzj=fAB0tydvvtK7ibgFEx3tegbPWsGjJpg@mail.gmail.com&gt; I'd
lean towards making this type not automatically coerce to text (to avoid
surprising behaviour when comparing text to titletext).

Is a suitable patch likely to be accepted?

You don’t need touch the core to do that. Just implement it as an

extension and share throught some channel like pgxn.org.

Note that citext also is an extension and released as a contrib module.

Regards,

--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#3Daniel Heath
daniel@heath.cc
In reply to: Fabrízio de Royes Mello (#2)
Re: Custom text type for title text

Would this also be appropriate for inclusion as contrib? I'm unfamiliar
with the policy for what is / is not included there.
Thanks,
Daniel Heath

On Fri, Jan 4, 2019, at 9:47 AM, Fabrízio de Royes Mello wrote:

Em qui, 3 de jan de 2019 às 20:22, Daniel Heath <daniel@heath.cc>
escreveu:>> Hi All,

I've frequently seen an issue in applications which store titles (eg
of books, events, user profiles) where duplicate values are not
properly vetted.>>
The 'citext' type is helpful here, but I'd be keen to go further.

I propose a 'titletext' type, which has the following properties when
compared for equality:>> * Case insensitivity (like 'citext')
* Only considers characters in [:alnum:] (that is, ignores spaces,
punctuation, etc)>>
This would be useful for a range of situations where it's important
to avoid entering duplicate values.>>
Given the discussion at
/messages/by-id/CAKFQuwY9u14TqG8Yzj=fAB0tydvvtK7ibgFEx3tegbPWsGjJpg@mail.gmail.com%5B1%5D
I'd lean towards making this type not automatically coerce to text
(to avoid surprising behaviour when comparing text to titletext).>>
Is a suitable patch likely to be accepted?

You don’t need touch the core to do that. Just implement it as an
extension and share throught some channel like pgxn.org.>
Note that citext also is an extension and released as a contrib
module.>
Regards,

--
Fabrízio de Royes Mello Timbira -
http://www.timbira.com.br/&gt; PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e
Treinamento

Links:

1. /messages/by-id/CAKFQuwY9u14TqG8Yzj=fAB0tydvvtK7ibgFEx3tegbPWsGjJpg@mail.gmail.com

#4Fabrízio de Royes Mello
fabriziomello@gmail.com
In reply to: Daniel Heath (#3)
Re: Custom text type for title text

Em qui, 3 de jan de 2019 às 20:53, Daniel Heath <daniel@heath.cc> escreveu:

Would this also be appropriate for inclusion as contrib? I'm unfamiliar
with the policy for what is / is not included there.

Please do not top post.

At first I recommend you implement it as an extension (using gitlab,
github, bitbucket or something else) and after you have a stable working
code maybe you should try to send it as a contrib module and then the
community will decide to accept it or not.

PostgreSQL is extensible enough to you provide this piece of work without
care with the community decisions. What I mean is you necessarily don’t
need to send it as a contrib module, just maintain it as a separate
extension project.

Regards,

--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#5Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Daniel Heath (#1)
Re: Custom text type for title text

On 03/01/2019 23:22, Daniel Heath wrote:

I propose a 'titletext' type, which has the following properties when
compared for equality:
�* Case insensitivity (like 'citext')
�* Only considers characters in [:alnum:] (that is, ignores spaces,
punctuation, etc)

My work on insensitive/non-deterministic collations[0]/messages/by-id/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com might cover this.

[0]: /messages/by-id/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com
/messages/by-id/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com

For example:

CREATE COLLATION yournamehere (provider = icu,
locale = 'und-u-ks-level2-ka-shifted', deterministic = false);

(Roughly, ks-level2 means ignore case, ka-shifted means ignore punctuation.)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6Greg Stark
stark@mit.edu
In reply to: Daniel Heath (#3)
Re: Custom text type for title text

Policy on contrib has shifted over time. But generally we want to encourage
a lively ecosystem of extensions maintained outside of the Postgres source
tree so we avoid adding things to contrib when there's no particular
advantage.

The most common reason things are added to contrib is when the extension is
closely tied to internals and needs to be maintained along with changes to
internals. Modules like that are hard to maintain separately. But modules
that use documented general extensibility APIs should be able to be stable
across versions and live outside contrib.

On Thu 3 Jan 2019, 23:54 Daniel Heath <daniel@heath.cc wrote:

Show quoted text

Would this also be appropriate for inclusion as contrib? I'm unfamiliar
with the policy for what is / is not included there.

Thanks,
Daniel Heath

On Fri, Jan 4, 2019, at 9:47 AM, Fabrízio de Royes Mello wrote:

Em qui, 3 de jan de 2019 às 20:22, Daniel Heath <daniel@heath.cc>
escreveu:

Hi All,

I've frequently seen an issue in applications which store titles (eg of
books, events, user profiles) where duplicate values are not properly
vetted.

The 'citext' type is helpful here, but I'd be keen to go further.

I propose a 'titletext' type, which has the following properties when
compared for equality:
* Case insensitivity (like 'citext')
* Only considers characters in [:alnum:] (that is, ignores spaces,
punctuation, etc)

This would be useful for a range of situations where it's important to
avoid entering duplicate values.

Given the discussion at
/messages/by-id/CAKFQuwY9u14TqG8Yzj=fAB0tydvvtK7ibgFEx3tegbPWsGjJpg@mail.gmail.com
</messages/by-id/CAKFQuwY9u14TqG8Yzj=fAB0tydvvtK7ibgFEx3tegbPWsGjJpg@mail.gmail.com&gt; I'd
lean towards making this type not automatically coerce to text (to avoid
surprising behaviour when comparing text to titletext).

Is a suitable patch likely to be accepted?

You don’t need touch the core to do that. Just implement it as an
extension and share throught some channel like pgxn.org.

Note that citext also is an extension and released as a contrib module.

Regards,

--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento