Cleaning up and speeding up string functions

Started by David Rowleyalmost 7 years ago11 messageshackers
Jump to latest
#1David Rowley
dgrowleyml@gmail.com

Here's a small patch series aimed to both clean up a few misuses of
string functions and also to optimise a few things along the way.

0001: Converts various call that use appendPQExpBuffer() that really
should use appendPQExrBufferStr(). If there's no formatting then
using the former function is a waste of effort.

0002: Similar to 0001 but replaces various appendStringInfo calls with
appendStringInfoString calls.

0003: Adds a new function named appendStringInfoStringInfo() which
appends one StringInfo onto another. Various places did this using
appendStringInfoString(), but that required a needless strlen() call.
The length is already known and stored in the StringInfo's len field.
Not sure if this is the best name for this function, but can't think
of a better one right now.

0004: inlines appendStringInfoString so that any callers that pass in
a string constant (most of them) can have the strlen() call optimised
out.

I don't have any benchmarks to show workloads that this improves,
Likely the chances that it'll slow anything down are pretty remote.

I'll park this here until July.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

0001-Use-appendPQExpBufferStr-instead-of-appendPQExpBuffe.patchtext/x-patch; charset=US-ASCII; name=0001-Use-appendPQExpBufferStr-instead-of-appendPQExpBuffe.patchDownload+183-184
0002-Use-appendStringInfoString-instead-of-appendStringIn.patchtext/x-patch; charset=US-ASCII; name=0002-Use-appendStringInfoString-instead-of-appendStringIn.patchDownload+10-11
0004-Add-appendStringInfoStringInfo-to-append-one-StringI.patchtext/x-patch; charset=US-ASCII; name=0004-Add-appendStringInfoStringInfo-to-append-one-StringI.patchDownload+26-17
0003-Inline-appendStringInfoString.patchtext/x-patch; charset=US-ASCII; name=0003-Inline-appendStringInfoString.patchDownload+16-24
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#1)
Re: Cleaning up and speeding up string functions

David Rowley <david.rowley@2ndquadrant.com> writes:

Here's a small patch series aimed to both clean up a few misuses of
string functions and also to optimise a few things along the way.

0001: Converts various call that use appendPQExpBuffer() that really
should use appendPQExrBufferStr(). If there's no formatting then
using the former function is a waste of effort.

0002: Similar to 0001 but replaces various appendStringInfo calls with
appendStringInfoString calls.

Agreed on these; we've applied such transformations before.

0003: Adds a new function named appendStringInfoStringInfo() which
appends one StringInfo onto another. Various places did this using
appendStringInfoString(), but that required a needless strlen() call.

I can't get excited about this one unless you can point to places
where the savings is meaningful. Otherwise it's just adding mental
burden.

0004: inlines appendStringInfoString so that any callers that pass in
a string constant (most of them) can have the strlen() call optimised
out.

Here the cost is code space rather than programmer-visible complexity,
but I still doubt that it's worth it.

regards, tom lane

#3David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#2)
Re: Cleaning up and speeding up string functions

On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

0003: Adds a new function named appendStringInfoStringInfo() which
appends one StringInfo onto another. Various places did this using
appendStringInfoString(), but that required a needless strlen() call.

I can't get excited about this one unless you can point to places
where the savings is meaningful. Otherwise it's just adding mental
burden.

The original idea was just to use appendBinaryStringInfo and make use
of the StringInfo's len field. Peter mentioned he'd rather seen a
wrapper function here [1]/messages/by-id/5567B7F5.7050705@gmx.net.

0004: inlines appendStringInfoString so that any callers that pass in
a string constant (most of them) can have the strlen() call optimised
out.

Here the cost is code space rather than programmer-visible complexity,
but I still doubt that it's worth it.

I see on today's master the postgres binary did grow from 8633960
bytes to 8642504 on my machine using GCC 8.3, so you might be right.
pg_receivewal grew from 96376 to 96424 bytes.

[1]: /messages/by-id/5567B7F5.7050705@gmx.net

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: David Rowley (#3)
Re: Cleaning up and speeding up string functions

On 2019-May-26, David Rowley wrote:

On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Here the cost is code space rather than programmer-visible complexity,
but I still doubt that it's worth it.

I see on today's master the postgres binary did grow from 8633960
bytes to 8642504 on my machine using GCC 8.3, so you might be right.
pg_receivewal grew from 96376 to 96424 bytes.

I suppose one place that could be affected visibly is JSON object
construction (json.c, jsonfuncs.c) that could potentially deal with
millions of stringinfo manipulations, but most of those calls don't
actually use appendStringInfoString with constant values, so it's
probably not worth bothering with.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5David Rowley
dgrowleyml@gmail.com
In reply to: Alvaro Herrera (#4)
Re: Cleaning up and speeding up string functions

On Thu, 6 Jun 2019 at 08:54, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2019-May-26, David Rowley wrote:

On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Here the cost is code space rather than programmer-visible complexity,
but I still doubt that it's worth it.

I see on today's master the postgres binary did grow from 8633960
bytes to 8642504 on my machine using GCC 8.3, so you might be right.
pg_receivewal grew from 96376 to 96424 bytes.

I suppose one place that could be affected visibly is JSON object
construction (json.c, jsonfuncs.c) that could potentially deal with
millions of stringinfo manipulations, but most of those calls don't
actually use appendStringInfoString with constant values, so it's
probably not worth bothering with.

We could probably get the best of both worlds by using a macro and
__builtin_constant_p() to detect if the string is a const, but I won't
be pushing for that unless I find something to make it worthwhile.

For patch 0004, I think it's likely worth revising so instead of
adding a new function, make use of appendBinaryStringInfo() and pass
in the known length. Likely mostly for the xml.c calls.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#6David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#2)
Re: Cleaning up and speeding up string functions

On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

Here's a small patch series aimed to both clean up a few misuses of
string functions and also to optimise a few things along the way.

0001: Converts various call that use appendPQExpBuffer() that really
should use appendPQExrBufferStr(). If there's no formatting then
using the former function is a waste of effort.

0002: Similar to 0001 but replaces various appendStringInfo calls with
appendStringInfoString calls.

Agreed on these; we've applied such transformations before.

I've pushed 0001 and 0002.

Instead of having 0004, how about the attached?

Most of the calls won't improve much performance-wise since they're so
cheap anyway, but there is xmlconcat(), I imagine that should see some
speedup.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

use_binary_string_info_when_len_is_known.patchapplication/octet-stream; name=use_binary_string_info_when_len_is_known.patchDownload+18-16
#7David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#6)
Re: Cleaning up and speeding up string functions

On Thu, 4 Jul 2019 at 13:51, David Rowley <david.rowley@2ndquadrant.com> wrote:

Instead of having 0004, how about the attached?

Most of the calls won't improve much performance-wise since they're so
cheap anyway, but there is xmlconcat(), I imagine that should see some
speedup.

I've pushed this after having found a couple more places where the
length is known.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In reply to: David Rowley (#7)
Re: Cleaning up and speeding up string functions

David Rowley <david.rowley@2ndquadrant.com> writes:

On Thu, 4 Jul 2019 at 13:51, David Rowley <david.rowley@2ndquadrant.com> wrote:

Instead of having 0004, how about the attached?

Most of the calls won't improve much performance-wise since they're so
cheap anyway, but there is xmlconcat(), I imagine that should see some
speedup.

I've pushed this after having found a couple more places where the
length is known.

I noticed a lot of these are appending one StringInfo onto another;
would it make sense to introduce a helper funciton
appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
`str.data, str2.len` repetition?

- ilmari
--
"A disappointingly low fraction of the human race is,
at any given time, on fire." - Stig Sandbeck Mathisen

In reply to: Dagfinn Ilmari Mannsåker (#8)
Re: Cleaning up and speeding up string functions

ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes:

David Rowley <david.rowley@2ndquadrant.com> writes:

On Thu, 4 Jul 2019 at 13:51, David Rowley <david.rowley@2ndquadrant.com> wrote:

Instead of having 0004, how about the attached?

Most of the calls won't improve much performance-wise since they're so
cheap anyway, but there is xmlconcat(), I imagine that should see some
speedup.

I've pushed this after having found a couple more places where the
length is known.

I noticed a lot of these are appending one StringInfo onto another;
would it make sense to introduce a helper funciton
appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
`str.data, str2.len` repetition?

A bit of grepping only turned up 18 uses, but I was bored and whipped up
the attached anyway, in case we decide it's worth it.

- ilmari
--
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
the consequences of." -- Skud's Meta-Law

Attachments:

0001-Add-appendStringInfoStringInfo-function.patchtext/x-diffDownload+37-21
#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Dagfinn Ilmari Mannsåker (#9)
Re: Cleaning up and speeding up string functions

On 2019-Jul-22, Dagfinn Ilmari Manns�ker wrote:

ilmari@ilmari.org (Dagfinn Ilmari Manns�ker) writes:

I noticed a lot of these are appending one StringInfo onto another;
would it make sense to introduce a helper funciton
appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
`str.data, str2.len` repetition?

A bit of grepping only turned up 18 uses, but I was bored and whipped up
the attached anyway, in case we decide it's worth it.

David had already submitted the same thing upthread, and it was rejected
on the grounds that it increases the code space.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In reply to: Alvaro Herrera (#10)
Re: Cleaning up and speeding up string functions

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2019-Jul-22, Dagfinn Ilmari Mannsåker wrote:

ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes:

I noticed a lot of these are appending one StringInfo onto another;
would it make sense to introduce a helper funciton
appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
`str.data, str2.len` repetition?

A bit of grepping only turned up 18 uses, but I was bored and whipped up
the attached anyway, in case we decide it's worth it.

David had already submitted the same thing upthread, and it was rejected
on the grounds that it increases the code space.

Oops, sorry, I missed that. Never mind then.

- ilmari
--
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
the consequences of." -- Skud's Meta-Law