Review: B-Tree emulation for GIN
Hi Teodor Sigaev,
I am getting server crash in contrib regression. May be I am doing something
wrong here. Regression diff is attached.
BTW these queries work fine outside the regression.
--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com
Attachments:
regression.diffsapplication/octet-stream; name=regression.diffsDownload+1042-1042
will see, may be it's needed to update the patch
Ibrar Ahmed wrote:
Hi Teodor Sigaev,
I am getting server crash in contrib regression. May be I am doing
something wrong here. Regression diff is attached.BTW these queries work fine outside the regression.
--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com------------------------------------------------------------------------
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Updated patch.
Ibrar Ahmed wrote:
Hi Teodor Sigaev,
I am getting server crash in contrib regression. May be I am doing
something wrong here. Regression diff is attached.BTW these queries work fine outside the regression.
--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com------------------------------------------------------------------------
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
Thanks,
On Fri, Dec 19, 2008 at 3:26 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
Updated patch.
Ibrar Ahmed wrote:
Hi Teodor Sigaev,
I am getting server crash in contrib regression. May be I am doing
something wrong here. Regression diff is attached.BTW these queries work fine outside the regression.
--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com------------------------------------------------------------------------
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW:
http://www.sigaev.ru/
--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com
On Fri, 2008-12-19 at 13:26 +0300, Teodor Sigaev wrote:
Updated patch.
I have merged this with HEAD, written a brief document (which is mostly
a copy of the btree-gist page), added the module to the contrib
Makefile, and made some very minor changes. Patch attached.
A couple comments:
1. Right now, to implement "less than" you need to start at the
beginning of the index and scan until you reach the supplied query
datum. This is because GIN doesn't support backwards scans
( http://archives.postgresql.org/pgsql-hackers/2008-07/msg01146.php ).
Unfortunately, that means numeric is not supportable for btree-gin
because there is no left-most value from which to start the scan. Do you
see an easy workaround to support numeric?
2. Why do you create a shell type "int32" from btree_gin.sql? I tried
doing a search-and-replace to use "int4" instead, and it seems to work
fine (and eliminates the warnings). I left it with int32 in my version
of the patch because I thought you may have some reason for using it.
Regards,
Jeff Davis
Attachments:
On Sun, 2008-12-28 at 23:41 -0800, Jeff Davis wrote:
2. Why do you create a shell type "int32" from btree_gin.sql? I tried
doing a search-and-replace to use "int4" instead, and it seems to work
fine (and eliminates the warnings). I left it with int32 in my version
of the patch because I thought you may have some reason for using it.
Attached an updated patch with this change.
Regards,
Jeff Davis
Attachments:
I have merged this with HEAD, written a brief document (which is mostly
a copy of the btree-gist page), added the module to the contrib
Makefile, and made some very minor changes. Patch attached.
Thank you
A couple comments:
1. Right now, to implement "less than" you need to start at the
beginning of the index and scan until you reach the supplied query
datum. This is because GIN doesn't support backwards scans
( http://archives.postgresql.org/pgsql-hackers/2008-07/msg01146.php ).Unfortunately, that means numeric is not supportable for btree-gin
because there is no left-most value from which to start the scan. Do you
see an easy workaround to support numeric?
Hmm, let we use minimal varlena struct (with size equal to VARHDRSZ) as
left-most (and fake) value. It is never used for any actual argument except
compare function, so, new compare function is provided. New version of patch is
attached and it based on on your patch (btree-gin.20090111.patch.gz).
2. Why do you create a shell type "int32" from btree_gin.sql? I tried
doing a search-and-replace to use "int4" instead, and it seems to work
fine (and eliminates the warnings). I left it with int32 in my version
of the patch because I thought you may have some reason for using it.
My mistake, thank you for fix
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
On Fri, 2009-01-16 at 17:42 +0300, Teodor Sigaev wrote:
Unfortunately, that means numeric is not supportable for btree-gin
because there is no left-most value from which to start the scan. Do you
see an easy workaround to support numeric?Hmm, let we use minimal varlena struct (with size equal to VARHDRSZ) as
left-most (and fake) value. It is never used for any actual argument except
compare function, so, new compare function is provided. New version of patch is
attached and it based on on your patch (btree-gin.20090111.patch.gz).
Looks good to me. I updated the CommitFest wiki to point to this patch.
I like the fact that this patch does not modify the numeric ADT. It
still relies on the fact that the numeric type will never make use of
the minimal varlena struct, however. I bring this up in case someone
sees it as a problem.
Thanks,
Jeff Davis
Jeff Davis escribi�:
I like the fact that this patch does not modify the numeric ADT. It
still relies on the fact that the numeric type will never make use of
the minimal varlena struct, however. I bring this up in case someone
sees it as a problem.
Greg Stark was working on a patch to make certain values use the short
representation.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
still relies on the fact that the numeric type will never make use of
the minimal varlena struct, however. I bring this up in case someone
sees it as a problem.Greg Stark was working on a patch to make certain values use the short
representation.
Fake value contains only VARHDRSZ bytes.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Jeff Davis <pgsql@j-davis.com> writes:
I like the fact that this patch does not modify the numeric ADT. It
still relies on the fact that the numeric type will never make use of
the minimal varlena struct, however. I bring this up in case someone
sees it as a problem.
I'm pretty certain I recall Greg Stark recommending that we adopt
something like that as the standard numeric representation of zero.
It didn't get done yet, but it might happen someday.
regards, tom lane
On Mon, 2009-01-19 at 11:35 -0500, Tom Lane wrote:
Jeff Davis <pgsql@j-davis.com> writes:
I like the fact that this patch does not modify the numeric ADT. It
still relies on the fact that the numeric type will never make use of
the minimal varlena struct, however. I bring this up in case someone
sees it as a problem.I'm pretty certain I recall Greg Stark recommending that we adopt
something like that as the standard numeric representation of zero.
It didn't get done yet, but it might happen someday.
Then we should use the previous version of the patch here:
http://archives.postgresql.org/message-id/1231709713.25019.129.camel@jdavis
Was there any talk of supporting a +/- infinity in numeric? If we did
that, it would allow numeric to be supported for btree-gin.
Regards,
Jeff Davis
I'm pretty certain I recall Greg Stark recommending that we adopt
something like that as the standard numeric representation of zero.
It didn't get done yet, but it might happen someday.
Changes:
- use NULL as left-most value. It's safe because NULL numeric value
cannot be an argument for any function except gin_numeric_cmp and it
cannot be returned in regular SQL query.
- fix uninstall script for support numeric type.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
On Mon, 2009-01-19 at 20:15 +0300, Teodor Sigaev wrote:
Changes:
- use NULL as left-most value. It's safe because NULL numeric value
cannot be an argument for any function except gin_numeric_cmp and it
cannot be returned in regular SQL query.
gin_numeric_cmp() can be called from regular SQL. I missed this before,
but that function will segfault if you call gin_numeric_cmp(NULL, 1) (in
v0.7 at least).
I know you mean a C NULL, not a SQL NULL, but it reminded me to test SQL
NULL.
And how does GIN handle SQL NULL values in the column? Does it index
them at all, or just ignore them?
Regards,
Jeff Davis
gin_numeric_cmp() can be called from regular SQL. I missed this before,
but that function will segfault if you call gin_numeric_cmp(NULL, 1) (in
v0.7 at least).
Fixed, gin_numeric_cmp is marked as strict.
And how does GIN handle SQL NULL values in the column? Does it index
them at all, or just ignore them?
SQL NULL: GIN doesn't support it (amindexnulls/amsearchnulls == false)
C NULL: NULL-numeric could be returned only by gin_extract_query_numeric which
cannot be called by user directly because of internal type of argument.
GIN doesn't do anything with values returned by gin_extract_query_numeric except
providing they as an argument for comparing functions.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
On Mon, 2009-01-19 at 21:41 +0300, Teodor Sigaev wrote:
And how does GIN handle SQL NULL values in the column? Does it index
them at all, or just ignore them?SQL NULL: GIN doesn't support it (amindexnulls/amsearchnulls == false)
C NULL: NULL-numeric could be returned only by gin_extract_query_numeric which
cannot be called by user directly because of internal type of argument.
GIN doesn't do anything with values returned by gin_extract_query_numeric except
providing they as an argument for comparing functions.
Ok, I understand now. I will look at this later.
Regards,
Jeff Davis
On Mon, 2009-01-19 at 21:41 +0300, Teodor Sigaev wrote:
gin_numeric_cmp() can be called from regular SQL. I missed this before,
but that function will segfault if you call gin_numeric_cmp(NULL, 1) (in
v0.7 at least).Fixed, gin_numeric_cmp is marked as strict.
And how does GIN handle SQL NULL values in the column? Does it index
them at all, or just ignore them?SQL NULL: GIN doesn't support it (amindexnulls/amsearchnulls == false)
C NULL: NULL-numeric could be returned only by gin_extract_query_numeric which
cannot be called by user directly because of internal type of argument.
GIN doesn't do anything with values returned by gin_extract_query_numeric except
providing they as an argument for comparing functions.
Ok, looks good. I updated the wiki to show this as the latest version of
the patch.
Thanks,
Jeff Davis
Looked at this a bit ... do you think it's really a good idea to remove
the strategy number argument of comparePartial? The argument given in
the docs for it is that it might be needed to determine when to end the
scan, and that still seems plausible to me.
The description of extractQuery's extra_data parameter seems confusing
too. AFAICS it is incorrect, or at least misleading, to describe it as
void ** extra_data[]; it is really void ***extra_data, because there is
only one object there not an array.
regards, tom lane
Looked at this a bit ... do you think it's really a good idea to remove
the strategy number argument of comparePartial? The argument given in
the docs for it is that it might be needed to determine when to end the
scan, and that still seems plausible to me.
Strategy number is not removed, it's replaced by pointer to opclass-specific
data on per key basis. Actually, partial match feature is new for 8.4 and there
is no any compatibility problem. None of existing opclasses (except
contrib/btree_gin) uses strategy number in comparePartial.
The description of extractQuery's extra_data parameter seems confusing
too. AFAICS it is incorrect, or at least misleading, to describe it as
void ** extra_data[]; it is really void ***extra_data, because there is
only one object there not an array.
It's really array, see fillScanKey() in ginscan.c, line 61 of patched file.
extractQuery could provide data for each returned key.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
On Wed, 2009-02-04 at 20:22 +0300, Teodor Sigaev wrote:
The description of extractQuery's extra_data parameter seems confusing
too. AFAICS it is incorrect, or at least misleading, to describe it as
void ** extra_data[]; it is really void ***extra_data, because there is
only one object there not an array.It's really array, see fillScanKey() in ginscan.c, line 61 of patched file.
extractQuery could provide data for each returned key.
If I understand correctly, the extra_data parameter to extractQuery()
should mean "a pointer to an array of void*". "void **extra_data[]"
might be misinterpreted as "an array of void**".
Would "void *(*extra_data)[]" be more descriptive? I'm not an expert on
style questions like this, but it seems to match the parameter lists for
comparePartial() and consistent().
"void ***extra_data" seems reasonable, too.
Regards,
Jeff Davis