[PATCH] btree_gist: fix union implementation for variable length columns

Started by Pavel Raiskupalmost 8 years ago7 messageshackers
Jump to latest
#1Pavel Raiskup
praiskup@redhat.com

Hi hackers,

while I tried to debug 'gcc -fstack-protector -O3' problems in [1]https://bugzilla.redhat.com/show_bug.cgi?id=1544349 Pavel, I noticed
that gbt_var_union() mistreats the first vector element. Patch is attached.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1544349 Pavel
Pavel

Attachments:

0001-Fix-btree_gist-union-for-variable-length-types.patchtext/x-patch; charset=UTF-8; name=0001-Fix-btree_gist-union-for-variable-length-types.patchDownload+21-15
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Raiskup (#1)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

Pavel Raiskup <praiskup@redhat.com> writes:

while I tried to debug 'gcc -fstack-protector -O3' problems in [1], I noticed
that gbt_var_union() mistreats the first vector element. Patch is attached.

Hi Pavel! For patches that purport to resolve bugs, we usually like to
add a regression test case that demonstrates the bug in unpatched code.
Can you provide a small test case that does so? (The BZ you pointed to
doesn't seem to address this...)

regards, tom lane

#3Pavel Raiskup
praiskup@redhat.com
In reply to: Tom Lane (#2)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

Hi Tom,

On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote:

Pavel Raiskup <praiskup@redhat.com> writes:

while I tried to debug 'gcc -fstack-protector -O3' problems in [1], I noticed
that gbt_var_union() mistreats the first vector element. Patch is attached.

Hi Pavel! For patches that purport to resolve bugs, we usually like to
add a regression test case that demonstrates the bug in unpatched code.
Can you provide a small test case that does so? (The BZ you pointed to
doesn't seem to address this...)

I've been looking at the reproducer for a while now... and I probably
need a hint if that's necessary.

Turns out the problem is only related to bit/bit varying type (so the
patch comments need to be reworded properly, at least) since those are the
only types which have implemented the f_l2n() callback.

Considering testcase 'bit.sql' (bit(33) type), it's pretty clear that on
little endian boxes the problematic strings would be '00100001*' (int32
value for '33', because there's varbit header). Big endian boxes would
have problems with char[] like {0, 0, 0, 33, ...}. But I fail to
construct right order of correctly picked elements into 'bit.data'.
The problem probably is (a) that picksplit reorders the elements very
often, and probably that (b) random() brings non-determinism into the
final tree shape (if the reproducer was to be based on many equal keys in
the index). It's amazing to see how the index fixes itself :-) for this
bug.

Can you help me with the reproducer idea, resp. can we live without the
reproducer in this particular case?

Pavel

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Raiskup (#3)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

Pavel Raiskup <praiskup@redhat.com> writes:

On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote:

Hi Pavel! For patches that purport to resolve bugs, we usually like to
add a regression test case that demonstrates the bug in unpatched code.
Can you provide a small test case that does so? (The BZ you pointed to
doesn't seem to address this...)

Turns out the problem is only related to bit/bit varying type (so the
patch comments need to be reworded properly, at least) since those are the
only types which have implemented the f_l2n() callback.

What I'm failing to wrap my head around is why this code exists at all.
AFAICS, gbt_bit_xfrm just forces the bitstring to be zero-padded out to
an INTALIGN boundary, which it wouldn't necessarily be otherwise. But
why bother? That should have no effect on the behavior of bit_cmp().
So I'm speculating that the reason nobody has noticed a problem is that
there is no problem because this code is useless and could be ripped out.

It would be easier to figure this out if the btree_gist code weren't
so desperately undocumented. Teodor, do you remember why it's like
this?

regards, tom lane

#5Pavel Raiskup
praiskup@redhat.com
In reply to: Tom Lane (#4)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

On Wednesday, July 11, 2018 7:26:40 PM CEST Tom Lane wrote:

Pavel Raiskup <praiskup@redhat.com> writes:

On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote:

Hi Pavel! For patches that purport to resolve bugs, we usually like to
add a regression test case that demonstrates the bug in unpatched code.
Can you provide a small test case that does so? (The BZ you pointed to
doesn't seem to address this...)

Turns out the problem is only related to bit/bit varying type (so the
patch comments need to be reworded properly, at least) since those are the
only types which have implemented the f_l2n() callback.

What I'm failing to wrap my head around is why this code exists at all.
AFAICS, gbt_bit_xfrm just forces the bitstring to be zero-padded out to
an INTALIGN boundary, which it wouldn't necessarily be otherwise.
But why bother? That should have no effect on the behavior of bit_cmp().

The gbt_bit_xfrm() method actually also skips the varbit header (number of
bits, stored in VarBit.bit_len), the magic is done by VARBITS macro. If
the header is dropped, it's OK to just use existing byteacmp() for bit
array comparison.

Pavel

#6Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#4)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

It would be easier to figure this out if the btree_gist code weren't
so desperately undocumented. Teodor, do you remember why it's like
this?

Will look.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#7Pavel Raiskup
praiskup@redhat.com
In reply to: Teodor Sigaev (#6)
Re: [PATCH] btree_gist: fix union implementation for variable length columns

On Thursday, July 12, 2018 5:53:18 PM CET Teodor Sigaev wrote:

It would be easier to figure this out if the btree_gist code weren't
so desperately undocumented. Teodor, do you remember why it's like
this?

Will look.

Ping on this issue. I guess the patch I proposed isn't wrong in this
case, I'm just not sure about the commit message.

Pavel