Allowing GIN array_ops to work on anyarray

Started by Tom Laneover 9 years ago8 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

In
/messages/by-id/15293.1466536829@sss.pgh.pa.us
I speculated that it might not take too much to replace all the variants
of GIN array_ops with a single polymorphic opclass over anyarray.
Attached is a proposed patch that does that.

There are two bits of added functionality needed to make this work:

1. We need to abstract the storage type. The patch does this by teaching
catalog/index.c to recognize an opckeytype specified as ANYELEMENT with an
opcintype of ANYARRAY, and doing the array element type lookup at index
creation time.

2. We need to abstract the key comparator. The patch does this by
teaching gin/ginutil.c that if the opclass omits a GIN_COMPARE_PROC,
it should look up the default btree comparator for the index key type.

Both of these seem to me to be reasonable general-purpose behaviors with
potential application to other opclasses.

In the aforementioned message I worried that a core opclass defined this
way might conflict with user-built opclasses for specific array types,
but it seems to work out fine without any additional tweaks: CREATE INDEX
already prefers an exact match if it finds one, and only falls back to
matching anyarray when it doesn't. Also, all the replaced opclasses are
presently default for their types, which means that pg_dump won't print
them explicitly in CREATE INDEX commands, so we don't have a dump/reload
or pg_upgrade hazard from them disappearing.

A potential downside is that for an opclass defined this way, we add a
lookup_type_cache() call to each initGinState() call. That's basically
just a single dynahash lookup once the caches are populated, so it's not
much added cost, but conceivably it could be measurable in bulk insert
operations. If it does prove objectionable my inclination would be to
look into ways to avoid the repetitive function lookups of initGinState,
perhaps by letting it cache that stuff in the index's relcache entry.

I'll put this on the September commitfest docket.

regards, tom lane

Attachments:

gin-true-anyarray-opclass-1.patchtext/x-diff; charset=us-ascii; name=gin-true-anyarray-opclass-1.patchDownload+284-234
#2M Enrique
enrique.mailing.lists@gmail.com
In reply to: Tom Lane (#1)
Re: Allowing GIN array_ops to work on anyarray

This is awesome. I will build it to start using and testing it in my
development environment. Thank you so much for making this change.

On Thu, Aug 11, 2016 at 11:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

In
/messages/by-id/15293.1466536829@sss.pgh.pa.us
I speculated that it might not take too much to replace all the variants
of GIN array_ops with a single polymorphic opclass over anyarray.
Attached is a proposed patch that does that.

There are two bits of added functionality needed to make this work:

1. We need to abstract the storage type. The patch does this by teaching
catalog/index.c to recognize an opckeytype specified as ANYELEMENT with an
opcintype of ANYARRAY, and doing the array element type lookup at index
creation time.

2. We need to abstract the key comparator. The patch does this by
teaching gin/ginutil.c that if the opclass omits a GIN_COMPARE_PROC,
it should look up the default btree comparator for the index key type.

Both of these seem to me to be reasonable general-purpose behaviors with
potential application to other opclasses.

In the aforementioned message I worried that a core opclass defined this
way might conflict with user-built opclasses for specific array types,
but it seems to work out fine without any additional tweaks: CREATE INDEX
already prefers an exact match if it finds one, and only falls back to
matching anyarray when it doesn't. Also, all the replaced opclasses are
presently default for their types, which means that pg_dump won't print
them explicitly in CREATE INDEX commands, so we don't have a dump/reload
or pg_upgrade hazard from them disappearing.

A potential downside is that for an opclass defined this way, we add a
lookup_type_cache() call to each initGinState() call. That's basically
just a single dynahash lookup once the caches are populated, so it's not
much added cost, but conceivably it could be measurable in bulk insert
operations. If it does prove objectionable my inclination would be to
look into ways to avoid the repetitive function lookups of initGinState,
perhaps by letting it cache that stuff in the index's relcache entry.

I'll put this on the September commitfest docket.

regards, tom lane

#3Enrique Meneses
emmeneses@gmail.com
In reply to: Tom Lane (#1)
Re: Allowing GIN array_ops to work on anyarray

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed

I built and installed (make world / make install-world) github branch REL9_6_STABLE and applied the patch (patch -p1 < patch/gin-true-anyarray-opclass-1.patch).

I then upgraded my development database (postgres 9.5) using pg_upgrade. This database has one table with an UUID array gin index. The database was upgraded correctly to postgresql 9.6 and I was able to successfully connect to it from a web application which uses the database. There were no conflicts so I expect others to be able to upgrade without issues.

I then dropped the database and re-created it... I made sure that I no longer used my prior operator class definition. I re-started my web application and confirmed it works. This verifies the feature works as designed.

The following is no longer required:

CREATE OPERATOR CLASS _uuid_ops DEFAULT
FOR TYPE _uuid USING gin AS
OPERATOR 1 &&(anyarray, anyarray),
OPERATOR 2 @>(anyarray, anyarray),
OPERATOR 3 <@(anyarray, anyarray),
OPERATOR 4 =(anyarray, anyarray),
FUNCTION 1 uuid_cmp(uuid, uuid),
FUNCTION 2 ginarrayextract(anyarray, internal, internal),
FUNCTION 3 ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal),
FUNCTION 4 ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal),
STORAGE uuid;

I also ran "make installcheck-world" and all the tests passed.

The new status of this patch is: Waiting on Author

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Enrique Meneses
emmeneses@gmail.com
In reply to: M Enrique (#2)
Re: Allowing GIN array_ops to work on anyarray

I was not sure what "Spec compliant means"... so I did not select as tested or passed. What should I do to validate that this change is "Spec compliant"?
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Enrique Meneses
emmeneses@gmail.com
In reply to: Tom Lane (#1)
Re: Allowing GIN array_ops to work on anyarray

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed

I am resending this due to an earlier error message from the mailing list:
-----
These are my comments for the review of https://commitfest.postgresql.org/10/708/

I built and installed (make world / make install-world) github branch REL9_6_STABLE and applied the patch (patch -p1 < patch/gin-true-anyarray-opclass-1.patch).

I then upgraded my development database (postgres 9.5) using pg_upgrade. This database has one table with an UUID array gin index. The database was upgraded correctly to postgresql 9.6 and I was able to successfully connect to it from a web application which uses the database. There were no conflicts so I expect others to be able to upgrade without issues.

I then dropped the database and re-created it... I made sure that I no longer used my prior operator class definition. I re-started my web application and confirmed it works. This verifies the feature works as designed.

The following is no longer required:

CREATE OPERATOR CLASS _uuid_ops DEFAULT
FOR TYPE _uuid USING gin AS
OPERATOR 1 &&(anyarray, anyarray),
OPERATOR 2 @>(anyarray, anyarray),
OPERATOR 3 <@(anyarray, anyarray),
OPERATOR 4 =(anyarray, anyarray),
FUNCTION 1 uuid_cmp(uuid, uuid),
FUNCTION 2 ginarrayextract(anyarray, internal, internal),
FUNCTION 3 ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal),
FUNCTION 4 ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal),
STORAGE uuid;

I also ran "make installcheck-world" and all the tests passed.

I am not sure what "Spec compliant means"... so I did not select as tested or passed. What should I do to validate that this change is "Spec compliant"?

The new status of this patch is: Waiting on Author

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Enrique Meneses (#4)
Re: Allowing GIN array_ops to work on anyarray

Enrique Meneses <emmeneses@gmail.com> writes:

I was not sure what "Spec compliant means"... so I did not select as tested or passed. What should I do to validate that this change is "Spec compliant"?

It's irrelevant to this patch, AFAICS. The SQL standard doesn't discuss
indexes at all, much less legislate on which operator classes ought to
exist for GIN indexes.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Enrique Meneses
emmeneses@gmail.com
In reply to: Tom Lane (#6)
Re: Allowing GIN array_ops to work on anyarray

Great, given it does not apply to this patch, then all the other tests
passed and the change looks good.

Thank you,
Enrique

On Mon, Sep 26, 2016 at 6:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Enrique Meneses <emmeneses@gmail.com> writes:

I was not sure what "Spec compliant means"... so I did not select as

tested or passed. What should I do to validate that this change is "Spec
compliant"?

It's irrelevant to this patch, AFAICS. The SQL standard doesn't discuss
indexes at all, much less legislate on which operator classes ought to
exist for GIN indexes.

regards, tom lane

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Enrique Meneses (#7)
Re: Allowing GIN array_ops to work on anyarray

Enrique Meneses <emmeneses@gmail.com> writes:

Great, given it does not apply to this patch, then all the other tests
passed and the change looks good.

Pushed, thanks for the review!

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers