btree_gin and ranges
Suggested patch adds GIN support contains operator for ranges over scalar column.
It allows more effective GIN scan. Currently, queries like
SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
will be excuted by GIN with two scans: one is from mines infinity to 1 and
another is from -1 to plus infinity. That's because GIN is "generalized" and it
doesn't know a semantics of operation.
With patch it's possible to rewrite query with ranges
SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
and GIN index will support this query with single scan from -1 to 1.
Patch provides index support only for existing range types: int4, int8, numeric,
date and timestamp with and without time zone.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
Hi
On Wed, Oct 22, 2014 at 1:55 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
With patch it's possible to rewrite query with ranges
SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
and GIN index will support this query with single scan from -1 to 1.
Shouldn't this be implemented in a more generic manner? An ordinary
btree index could very well support <@ queries too, but your patch
only adds this capability to btree-gin.
The documentation describes btree-gin as providing "GIN operator
classes that implement B-tree equivalent behavior", but now the
behavior diverges.
Regards,
Marti
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Shouldn't this be implemented in a more generic manner? An ordinary
Unfortunately I don't see a way for that. GIN is generalized - and it doesn't
know semantic. Actually, we could fix first five strategy numbers for BTree's
strategies, and then we could teach GIN core to use BTRee semantic, but what
about with already existed operator classes?
The documentation describes btree-gin as providing "GIN operator
classes that implement B-tree equivalent behavior", but now the
behavior diverges.
Anyway GIN couldn't be used for ORDER BY clause.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/22/2014 03:01 PM, Teodor Sigaev wrote:
Anyway GIN couldn't be used for ORDER BY clause.
which would also be nice to fix...
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/22/2014 03:01 PM, Teodor Sigaev wrote:
Anyway GIN couldn't be used for ORDER BY clause.
which would also be nice to fix...
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
which would also be nice to fix..
Of course, agree. With rbtree usage instead of tidbitmap hash and semantic
knowledge about operators in GIN...
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/22/2014 01:55 PM, Teodor Sigaev wrote:
Suggested patch adds GIN support contains operator for ranges over scalar column.
It allows more effective GIN scan. Currently, queries like
SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
will be excuted by GIN with two scans: one is from mines infinity to 1 and
another is from -1 to plus infinity. That's because GIN is "generalized" and it
doesn't know a semantics of operation.With patch it's possible to rewrite query with ranges
SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
and GIN index will support this query with single scan from -1 to 1.Patch provides index support only for existing range types: int4, int8, numeric,
date and timestamp with and without time zone.
I started to look at this, but very quickly got carried away into
refactoring away the macros. Defining long functions as macros makes
debugging quite difficult, and it's not nice to read or edit the source
code either.
I propose the attached refactoring (it doesn't include your range-patch
yet, just refactoring the existing code). It turns the bulk of the
macros into static functions. GIN_SUPPORT macro still defines the
datatype-specific functions, but they are now very thin wrappers that
just call the corresponding generic static functions.
It's annoying that we need the concept of a leftmost value, for the <
and <= queries. Without that, we could have the support functions look
up the required datatype information from the type cache, based on the
datatype of the passed argument. Then you could easily use the btree_gin
support functions also for user-defined datatypes. But that needs bigger
changes, and this is a step in the right direction anyway.
- Heikki
Attachments:
0001-Turn-much-of-the-btree_gin-macros-into-real-function.patchtext/x-diff; name=0001-Turn-much-of-the-btree_gin-macros-into-real-function.patchDownload+173-203
On Thu, Dec 18, 2014 at 4:13 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
On 10/22/2014 01:55 PM, Teodor Sigaev wrote:
Suggested patch adds GIN support contains operator for ranges over scalar
column.It allows more effective GIN scan. Currently, queries like
SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
will be excuted by GIN with two scans: one is from mines infinity to 1 and
another is from -1 to plus infinity. That's because GIN is "generalized"
and it
doesn't know a semantics of operation.With patch it's possible to rewrite query with ranges
SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
and GIN index will support this query with single scan from -1 to 1.Patch provides index support only for existing range types: int4, int8,
numeric,
date and timestamp with and without time zone.I started to look at this, but very quickly got carried away into
refactoring away the macros. Defining long functions as macros makes
debugging quite difficult, and it's not nice to read or edit the source code
either.I propose the attached refactoring (it doesn't include your range-patch yet,
just refactoring the existing code). It turns the bulk of the macros into
static functions. GIN_SUPPORT macro still defines the datatype-specific
functions, but they are now very thin wrappers that just call the
corresponding generic static functions.It's annoying that we need the concept of a leftmost value, for the < and <=
queries. Without that, we could have the support functions look up the
required datatype information from the type cache, based on the datatype of
the passed argument. Then you could easily use the btree_gin support
functions also for user-defined datatypes. But that needs bigger changes,
and this is a step in the right direction anyway.
I just had a look at the refactoring patch and ISTM that this is a
good step forward in terms of readability. Teodor, I am noticing that
your patch cannot apply once the refactoring is done. Could you rebase
your patch once the refactoring is pushed?s
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 12/22/2014 03:15 AM, Michael Paquier wrote:
On Thu, Dec 18, 2014 at 4:13 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:On 10/22/2014 01:55 PM, Teodor Sigaev wrote:
Suggested patch adds GIN support contains operator for ranges over scalar
column.It allows more effective GIN scan. Currently, queries like
SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
will be excuted by GIN with two scans: one is from mines infinity to 1 and
another is from -1 to plus infinity. That's because GIN is "generalized"
and it
doesn't know a semantics of operation.With patch it's possible to rewrite query with ranges
SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
and GIN index will support this query with single scan from -1 to 1.Patch provides index support only for existing range types: int4, int8,
numeric,
date and timestamp with and without time zone.I started to look at this, but very quickly got carried away into
refactoring away the macros. Defining long functions as macros makes
debugging quite difficult, and it's not nice to read or edit the source code
either.I propose the attached refactoring (it doesn't include your range-patch yet,
just refactoring the existing code). It turns the bulk of the macros into
static functions. GIN_SUPPORT macro still defines the datatype-specific
functions, but they are now very thin wrappers that just call the
corresponding generic static functions.It's annoying that we need the concept of a leftmost value, for the < and <=
queries. Without that, we could have the support functions look up the
required datatype information from the type cache, based on the datatype of
the passed argument. Then you could easily use the btree_gin support
functions also for user-defined datatypes. But that needs bigger changes,
and this is a step in the right direction anyway.I just had a look at the refactoring patch and ISTM that this is a
good step forward in terms of readability. Teodor, I am noticing that
your patch cannot apply once the refactoring is done. Could you rebase
your patch once the refactoring is pushed?s
I've pushed the refactoring patch. Here's a version of Teodor's patch,
rebased over the pushed patch.
Teodor's patch could use some more comments. The
STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are a good idea, but they probably
should go into src/include/access/gin.h so that they can be used in all
compare_partial implementations.
- Heikki
Attachments:
btree_gin_range-2.patchtext/x-diff; name=btree_gin_range-2.patchDownload+1288-709
Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN
macros are a good idea, but they probably should go into
src/include/access/gin.h so that they can be used in all compare_partial
implementations.
STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and comments
are improved.
Split patch to two: gin and module
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
On Fri, Dec 26, 2014 at 01:33:26PM +0300, Teodor Sigaev wrote:
Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN
macros are a good idea, but they probably should go into
src/include/access/gin.h so that they can be used in all compare_partial
implementations.STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and
comments are improved.Split patch to two: gin and module
Where are we on this patch?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Teodor Sigaev wrote:
Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN
macros are a good idea, but they probably should go into
src/include/access/gin.h so that they can be used in all compare_partial
implementations.STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and
comments are improved.Split patch to two: gin and module
Here you forgot to "git add" the two .sql files for the extension. They
are present in the patch Heikki posted upthread but not here.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers