[PATCH] btree_gist: fix union implementation for variable length columns

Started by Pavel Raiskupover 7 years ago7 messages
#1Pavel Raiskup
praiskup@redhat.com
1 attachment(s)

Hi hackers,

while I tried to debug 'gcc -fstack-protector -O3' problems in [1]https://bugzilla.redhat.com/show_bug.cgi?id=1544349 Pavel, I noticed
that gbt_var_union() mistreats the first vector element. Patch is attached.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1544349 Pavel
Pavel

Attachments:

0001-Fix-btree_gist-union-for-variable-length-types.patchtext/x-patch; charset=UTF-8; name=0001-Fix-btree_gist-union-for-variable-length-types.patchDownload
From 4e4f0fe8d2f74a85f37abdf095ab8aecf9329596 Mon Sep 17 00:00:00 2001
From: Pavel Raiskup <praiskup@redhat.com>
Date: Wed, 4 Jul 2018 11:39:38 +0200
Subject: [PATCH] Fix btree_gist "union" for variable length types

The gbt_var_union() did not expect that first key in the entryvec
could represent leaf node, while it many times is.  So try call
f_l2n() even for the first element, same as we do for all the
other elements in gbt_var_bin_union().

This bug could create wrong indexes, so we it's suggested to
REINDEX.

diff --git a/contrib/btree_gist/btree_bit.c b/contrib/btree_gist/btree_bit.c
index 2225244ded..432b788204 100644
*** a/contrib/btree_gist/btree_bit.c
--- b/contrib/btree_gist/btree_bit.c
***************
*** 75,81 **** gbt_bitcmp(const void *a, const void *b, Oid collation, FmgrInfo *flinfo)
  static bytea *
  gbt_bit_xfrm(bytea *leaf)
  {
! 	bytea	   *out = leaf;
  	int			sz = VARBITBYTES(leaf) + VARHDRSZ;
  	int			padded_sz = INTALIGN(sz);
  
--- 75,81 ----
  static bytea *
  gbt_bit_xfrm(bytea *leaf)
  {
! 	bytea	   *out;
  	int			sz = VARBITBYTES(leaf) + VARHDRSZ;
  	int			padded_sz = INTALIGN(sz);
  
diff --git a/contrib/btree_gist/btreeindex 670c879e77..6d8299d9b5 100644
*** a/contrib/btree_gist/btree_utils_var.c
--- b/contrib/btree_gist/btree_utils_var.c
***************
*** 230,252 **** gbt_var_node_truncate(const GBT_VARKEY *node, int32 cpf_length, const gbtree_vin
  }
  
  
  
  void
  gbt_var_bin_union(Datum *u, GBT_VARKEY *e, Oid collation,
  				  const gbtree_vinfo *tinfo, FmgrInfo *flinfo)
  {
! 	GBT_VARKEY_R eo = gbt_var_key_readable(e);
  	GBT_VARKEY_R nr;
  
- 	if (eo.lower == eo.upper)	/* leaf */
- 	{
- 		GBT_VARKEY *tmp;
- 
- 		tmp = gbt_var_leaf2node(e, tinfo, flinfo);
- 		if (tmp != e)
- 			eo = gbt_var_key_readable(tmp);
- 	}
- 
  	if (DatumGetPointer(*u))
  	{
  		GBT_VARKEY_R ro = gbt_var_key_readable((GBT_VARKEY *) DatumGetPointer(*u));
--- 230,258 ----
  }
  
  
+ static GBT_VARKEY_R
+ gbt_var_key_readable_resolved(GBT_VARKEY *key, const gbtree_vinfo *tinfo,
+ 							  FmgrInfo *flinfo)
+ {
+ 	GBT_VARKEY_R out = gbt_var_key_readable(key);
+ 	if (out.lower == out.upper) /* leaf */
+ 	{
+ 		GBT_VARKEY *tmp;
+ 		tmp = gbt_var_leaf2node(key, tinfo, flinfo);
+ 		if (tmp != key)
+ 			out = gbt_var_key_readable(tmp);
+ 	}
+ 	return out;
+ }
+ 
  
  void
  gbt_var_bin_union(Datum *u, GBT_VARKEY *e, Oid collation,
  				  const gbtree_vinfo *tinfo, FmgrInfo *flinfo)
  {
! 	GBT_VARKEY_R eo = gbt_var_key_readable_resolved(e, tinfo, flinfo);
  	GBT_VARKEY_R nr;
  
  	if (DatumGetPointer(*u))
  	{
  		GBT_VARKEY_R ro = gbt_var_key_readable((GBT_VARKEY *) DatumGetPointer(*u));
***************
*** 333,339 **** gbt_var_union(const GistEntryVector *entryvec, int32 *size, Oid collation,
  	*size = sizeof(GBT_VARKEY);
  
  	cur = (GBT_VARKEY *) DatumGetPointer(entryvec->vector[0].key);
! 	rk = gbt_var_key_readable(cur);
  	out = PointerGetDatum(gbt_var_key_copy(&rk));
  
  	for (i = 1; i < numranges; i++)
--- 339,345 ----
  	*size = sizeof(GBT_VARKEY);
  
  	cur = (GBT_VARKEY *) DatumGetPointer(entryvec->vector[0].key);
! 	rk = gbt_var_key_readable_resolved(cur, tinfo, flinfo);
  	out = PointerGetDatum(gbt_var_key_copy(&rk));
  
  	for (i = 1; i < numranges; i++)
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Raiskup (#1)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

Pavel Raiskup <praiskup@redhat.com> writes:

while I tried to debug 'gcc -fstack-protector -O3' problems in [1], I noticed
that gbt_var_union() mistreats the first vector element. Patch is attached.

Hi Pavel! For patches that purport to resolve bugs, we usually like to
add a regression test case that demonstrates the bug in unpatched code.
Can you provide a small test case that does so? (The BZ you pointed to
doesn't seem to address this...)

regards, tom lane

#3Pavel Raiskup
praiskup@redhat.com
In reply to: Tom Lane (#2)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

Hi Tom,

On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote:

Pavel Raiskup <praiskup@redhat.com> writes:

while I tried to debug 'gcc -fstack-protector -O3' problems in [1], I noticed
that gbt_var_union() mistreats the first vector element. Patch is attached.

Hi Pavel! For patches that purport to resolve bugs, we usually like to
add a regression test case that demonstrates the bug in unpatched code.
Can you provide a small test case that does so? (The BZ you pointed to
doesn't seem to address this...)

I've been looking at the reproducer for a while now... and I probably
need a hint if that's necessary.

Turns out the problem is only related to bit/bit varying type (so the
patch comments need to be reworded properly, at least) since those are the
only types which have implemented the f_l2n() callback.

Considering testcase 'bit.sql' (bit(33) type), it's pretty clear that on
little endian boxes the problematic strings would be '00100001*' (int32
value for '33', because there's varbit header). Big endian boxes would
have problems with char[] like {0, 0, 0, 33, ...}. But I fail to
construct right order of correctly picked elements into 'bit.data'.
The problem probably is (a) that picksplit reorders the elements very
often, and probably that (b) random() brings non-determinism into the
final tree shape (if the reproducer was to be based on many equal keys in
the index). It's amazing to see how the index fixes itself :-) for this
bug.

Can you help me with the reproducer idea, resp. can we live without the
reproducer in this particular case?

Pavel

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Raiskup (#3)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

Pavel Raiskup <praiskup@redhat.com> writes:

On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote:

Hi Pavel! For patches that purport to resolve bugs, we usually like to
add a regression test case that demonstrates the bug in unpatched code.
Can you provide a small test case that does so? (The BZ you pointed to
doesn't seem to address this...)

Turns out the problem is only related to bit/bit varying type (so the
patch comments need to be reworded properly, at least) since those are the
only types which have implemented the f_l2n() callback.

What I'm failing to wrap my head around is why this code exists at all.
AFAICS, gbt_bit_xfrm just forces the bitstring to be zero-padded out to
an INTALIGN boundary, which it wouldn't necessarily be otherwise. But
why bother? That should have no effect on the behavior of bit_cmp().
So I'm speculating that the reason nobody has noticed a problem is that
there is no problem because this code is useless and could be ripped out.

It would be easier to figure this out if the btree_gist code weren't
so desperately undocumented. Teodor, do you remember why it's like
this?

regards, tom lane

#5Pavel Raiskup
praiskup@redhat.com
In reply to: Tom Lane (#4)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

On Wednesday, July 11, 2018 7:26:40 PM CEST Tom Lane wrote:

Pavel Raiskup <praiskup@redhat.com> writes:

On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote:

Hi Pavel! For patches that purport to resolve bugs, we usually like to
add a regression test case that demonstrates the bug in unpatched code.
Can you provide a small test case that does so? (The BZ you pointed to
doesn't seem to address this...)

Turns out the problem is only related to bit/bit varying type (so the
patch comments need to be reworded properly, at least) since those are the
only types which have implemented the f_l2n() callback.

What I'm failing to wrap my head around is why this code exists at all.
AFAICS, gbt_bit_xfrm just forces the bitstring to be zero-padded out to
an INTALIGN boundary, which it wouldn't necessarily be otherwise.
But why bother? That should have no effect on the behavior of bit_cmp().

The gbt_bit_xfrm() method actually also skips the varbit header (number of
bits, stored in VarBit.bit_len), the magic is done by VARBITS macro. If
the header is dropped, it's OK to just use existing byteacmp() for bit
array comparison.

Pavel

#6Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#4)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

It would be easier to figure this out if the btree_gist code weren't
so desperately undocumented. Teodor, do you remember why it's like
this?

Will look.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#7Pavel Raiskup
praiskup@redhat.com
In reply to: Teodor Sigaev (#6)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

On Thursday, July 12, 2018 5:53:18 PM CET Teodor Sigaev wrote:

It would be easier to figure this out if the btree_gist code weren't
so desperately undocumented. Teodor, do you remember why it's like
this?

Will look.

Ping on this issue. I guess the patch I proposed isn't wrong in this
case, I'm just not sure about the commit message.

Pavel