Improve performance of pg_strtointNN functions

Started by David Rowleyover 3 years ago9 messageshackers
Jump to latest
#1David Rowley
dgrowleyml@gmail.com

Over on [1]/messages/by-id/CAApHDvrL6_+wKgPqRHr7gH_6xy3hXM6a3QCsZ5ForurjDFfenA@mail.gmail.com, Dean and I were discussing why both gcc and clang don't
seem to want to optimize the multiplication that we're doing in
pg_strtoint16, pg_strtoint32 and pg_strtoint64 into something more
efficient. It seems that this is down to the overflow checks.
Removing seems to allow the compiler to better optimize the compiled
code. See the use of LEA instead of IMUL in [2]https://godbolt.org/z/7YoMT63q1.

Instead of using the pg_mul_sNN_overflow functions, we can just
accumulate the number in an unsigned version of the type and do an
overflow check by checking if the accumulated value has gone beyond a
10th of the maximum *signed* value for the type. We then just need to
do some final overflow checks after the accumulation is done. What
those depend on if it's a negative or positive number.

I ran a few microbenchmarks with the attached str2int.c file and I see
about a 10-20% performance increase:

$ ./str2int -100000000 100000000
n = -100000000, e = 100000000
pg_strtoint32 in 3.207926 seconds
pg_strtoint32_v2 in 2.763062 seconds (16.100399% faster)

v2 is the updated version of the function

On Windows, the gains are generally a bit more. I think this must be
due to the lack of overflow intrinsic functions.

str2int.exe -100000000 100000000

n = -100000000, e = 100000000
pg_strtoint32 in 9.416000 seconds
pg_strtoint32_v2 in 8.099000 seconds (16.261267% faster)

I was thinking that we should likely apply this before doing the hex
literals, which is the main focus of [1]/messages/by-id/CAApHDvrL6_+wKgPqRHr7gH_6xy3hXM6a3QCsZ5ForurjDFfenA@mail.gmail.com. The reason being is so that
that patch can immediately have faster conversions by allowing the
compiler to use bit shifting instead of other means of multiplying by
a power-of-2 number. I'm hoping this removes a barrier for Peter from
the small gripe I raised on that thread about the patch having slower
than required hex, octal and binary string parsing.

David

[1]: /messages/by-id/CAApHDvrL6_+wKgPqRHr7gH_6xy3hXM6a3QCsZ5ForurjDFfenA@mail.gmail.com
[2]: https://godbolt.org/z/7YoMT63q1

Attachments:

str2int.ctext/plain; charset=US-ASCII; name=str2int.cDownload
v1-0001-Improve-performance-of-pg_strtointNN-functions.patchtext/plain; charset=US-ASCII; name=v1-0001-Improve-performance-of-pg_strtointNN-functions.patchDownload+39-45
#2John Naylor
john.naylor@enterprisedb.com
In reply to: David Rowley (#1)
Re: Improve performance of pg_strtointNN functions

On Thu, Dec 1, 2022 at 6:42 AM David Rowley <dgrowleyml@gmail.com> wrote:

I was thinking that we should likely apply this before doing the hex
literals, which is the main focus of [1]. The reason being is so that
that patch can immediately have faster conversions by allowing the
compiler to use bit shifting instead of other means of multiplying by
a power-of-2 number. I'm hoping this removes a barrier for Peter from
the small gripe I raised on that thread about the patch having slower
than required hex, octal and binary string parsing.

I don't see why the non-decimal literal patch needs to be "immediately"
faster? If doing this first leads to less code churn, that's another
consideration, but you haven't made that argument.

--
John Naylor
EDB: http://www.enterprisedb.com

#3David Rowley
dgrowleyml@gmail.com
In reply to: John Naylor (#2)
Re: Improve performance of pg_strtointNN functions

On Thu, 1 Dec 2022 at 18:27, John Naylor <john.naylor@enterprisedb.com> wrote:

I don't see why the non-decimal literal patch needs to be "immediately" faster? If doing this first leads to less code churn, that's another consideration, but you haven't made that argument.

My view is that Peter wants to keep the code he's adding for the hex,
octal and binary parsing as similar to the existing code as possible.
I very much understand Peter's point of view on that. Consistency is
good. However, if we commit the hex literals patch first, people might
ask "why don't we use bit-wise operators to make the power-of-2 bases
faster?", which seems like a very legitimate question. I asked it,
anyway... On the other hand, if Peter adds the bit-wise operators
then the problem of code inconsistency remains.

As an alternative to those 2 options, I'm proposing we commit this
first then the above dilemma disappears completely.

If this was going to cause huge conflicts with Peter's patch then I
might think differently. I feel like it's a fairly trivial task to
rebase.

If the consensus is that we should fix this afterwards, then I'm happy to delay.

David

#4Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: David Rowley (#3)
Re: Improve performance of pg_strtointNN functions

On Thu, 1 Dec 2022 at 05:38, David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 1 Dec 2022 at 18:27, John Naylor <john.naylor@enterprisedb.com> wrote:

I don't see why the non-decimal literal patch needs to be "immediately" faster? If doing this first leads to less code churn, that's another consideration, but you haven't made that argument.

My view is that Peter wants to keep the code he's adding for the hex,
octal and binary parsing as similar to the existing code as possible.
I very much understand Peter's point of view on that. Consistency is
good. However, if we commit the hex literals patch first, people might
ask "why don't we use bit-wise operators to make the power-of-2 bases
faster?", which seems like a very legitimate question. I asked it,
anyway... On the other hand, if Peter adds the bit-wise operators
then the problem of code inconsistency remains.

As an alternative to those 2 options, I'm proposing we commit this
first then the above dilemma disappears completely.

If this was going to cause huge conflicts with Peter's patch then I
might think differently. I feel like it's a fairly trivial task to
rebase.

If the consensus is that we should fix this afterwards, then I'm happy to delay.

I feel like it should be done afterwards, so that any performance
gains can be measured for all bases. Otherwise, we won't really know,
or have any record of, how much faster this was for other bases, or be
able to go back and test that.

Regards,
Dean

#5Peter Eisentraut
peter_e@gmx.net
In reply to: David Rowley (#3)
Re: Improve performance of pg_strtointNN functions

On 01.12.22 06:38, David Rowley wrote:

If this was going to cause huge conflicts with Peter's patch then I
might think differently. I feel like it's a fairly trivial task to
rebase.

If the consensus is that we should fix this afterwards, then I'm happy to delay.

If we are happy with this patch, then it's okay with me if you push this
first. I'll probably need to do another pass over my patch anyway, so a
bit more work isn't a problem.

#6David Rowley
dgrowleyml@gmail.com
In reply to: Peter Eisentraut (#5)
Re: Improve performance of pg_strtointNN functions

On Fri, 2 Dec 2022 at 20:35, Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

If we are happy with this patch, then it's okay with me if you push this
first. I'll probably need to do another pass over my patch anyway, so a
bit more work isn't a problem.

Thanks. I'll start looking at the patch again now. If I don't find any
problems then I'll push it.

I just did some performance tests by COPYing in 40 million INTs
ranging from -/+ 20 million.

create table ab(a int, b int);
insert into ab select x,x from generate_series(-20000000,20000000)x;
copy ab to '/tmp/ab.csv';

-- master
truncate ab; copy ab from '/tmp/ab.csv';
Time: 10219.386 ms (00:10.219)
Time: 10252.572 ms (00:10.253)
Time: 10202.940 ms (00:10.203)

-- patched
truncate ab; copy ab from '/tmp/ab.csv';
Time: 9522.020 ms (00:09.522)
Time: 9441.294 ms (00:09.441)
Time: 9432.834 ms (00:09.433)

About ~8% faster

David

#7David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#6)
Re: Improve performance of pg_strtointNN functions

On Sun, 4 Dec 2022 at 13:52, David Rowley <dgrowleyml@gmail.com> wrote:

Thanks. I'll start looking at the patch again now. If I don't find any
problems then I'll push it.

Pushed with some small adjustments.

David

#8Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: David Rowley (#7)
Re: Improve performance of pg_strtointNN functions

On Sun, 4 Dec 2022 at 03:19, David Rowley <dgrowleyml@gmail.com> wrote:

Pushed with some small adjustments.

Ah, I see that you changed the overflow test, and I realise that I
forgot to answer your question about why I wrote that as 1 - INT_MIN /
10 over on the other thread.

The reason is that we need to detect whether tmp * base will exceed
-INT_MIN, not INT_MAX, since we're accumulating the absolute value of
a signed integer. So the right test is

tmp >= 1 - INT_MIN / base

or equivalently

tmp > -(INT_MIN / base)

I used the first form, because it didn't require extra parentheses,
but that doesn't really matter. The point is that, in general, that's
not the same as

tmp > INT_MAX / base

though it happens to be the same for base = 10, because INT_MIN and
INT_MAX aren't divisible by 10. It will break when base is a power of
2 though, so although it's not broken now, it's morally wrong, and it
risks breaking when Peter commits his patch.

Regards,
Dean

#9David Rowley
dgrowleyml@gmail.com
In reply to: Dean Rasheed (#8)
Re: Improve performance of pg_strtointNN functions

On Sun, 4 Dec 2022 at 22:53, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

Ah, I see that you changed the overflow test, and I realise that I
forgot to answer your question about why I wrote that as 1 - INT_MIN /
10 over on the other thread.

The reason is that we need to detect whether tmp * base will exceed
-INT_MIN, not INT_MAX, since we're accumulating the absolute value of
a signed integer.

I think I'd been too focused on the simplicity of that expression and
also the base 10 part. I saw that everything worked in base 10 and
failed to give enough forward thought to other bases.

I now see that it was wrong-headed to code it the way I had it.
Thanks for pointing this out. I've just pushed a fix.

David