Replace some cstring_to_text to cstring_to_text_with_len
Hi,
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.
Is it worth the effort to avoid this, where do we know the size of the
parameter?
best regards,
Ranier Vilela
Attachments:
0003-Avoid-unecessary-calls-to-strlen-function.patchapplication/octet-stream; name=0003-Avoid-unecessary-calls-to-strlen-function.patchDownload+20-20
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size of the
parameter?
Are there workloads where this matters?
--
Michael
Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier <michael@paquier.xyz>
escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size of the
parameter?Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc
<https://github.com/postgres/postgres/commit/8b26769bc441fffa8aad31dddc484c2f4043d2c9>
.
best regards,
Ranier Vilela
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier <michael@paquier.xyz>
escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size of the
parameter?Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text("");
+ return cstring_to_text_with_len("", 0);
This looks worse, so we'd better be getting something in return.
@@ -360,7 +360,7 @@ pg_tablespace_location(PG_FUNCTION_ARGS)
sourcepath)));
targetpath[rllen] = '\0';
- PG_RETURN_TEXT_P(cstring_to_text(targetpath));
+ PG_RETURN_TEXT_P(cstring_to_text_with_len(targetpath, rllen));
This could be a worthwhile cosmetic improvement if the nul-terminator (and
space reserved for it, and comment explaining that) is taken out as well,
but the patch didn't bother to do that.
--
John Naylor
EDB: http://www.enterprisedb.com
On 2023-08-31 Th 07:41, John Naylor wrote:
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
<michael@paquier.xyz> escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size
of the
parameter?
Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text(""); + return cstring_to_text_with_len("", 0);This looks worse, so we'd better be getting something in return.
I agree this is a bit ugly. I wonder if we'd be better off creating a
function that returned an empty text value, so we'd just avoid
converting the empty cstring altogether and say:
return empty_text();
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
On 2023-08-31 Th 07:41, John Naylor wrote:
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
<michael@paquier.xyz> escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size
of the
parameter?
Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text(""); + return cstring_to_text_with_len("", 0);This looks worse, so we'd better be getting something in return.
I agree this is a bit ugly. I wonder if we'd be better off creating a
function that returned an empty text value, so we'd just avoid
converting the empty cstring altogether and say:return empty_text();
Or we could generalise it for any string literal (of which there are
slightly more¹ non-empty than empty in calls to
cstring_to_text(_with_len)):
#define literal_to_text(str) cstring_to_text_with_len("" str "", sizeof(str)-1)
[1]: ~/src/postgresql $ rg 'cstring_to_text(_with_len)?\("[^"]+"' | wc -l 17 ~/src/postgresql $ rg 'cstring_to_text(_with_len)?\(""' | wc -l 15
~/src/postgresql $ rg 'cstring_to_text(_with_len)?\("[^"]+"' | wc -l
17
~/src/postgresql $ rg 'cstring_to_text(_with_len)?\(""' | wc -l
15
- ilmari
Em qui., 31 de ago. de 2023 às 08:41, John Naylor <
john.naylor@enterprisedb.com> escreveu:
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier <
michael@paquier.xyz> escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size of the
parameter?Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text(""); + return cstring_to_text_with_len("", 0);This looks worse, so we'd better be getting something in return.
Per suggestion by Andrew Dustan, I provided a new function.
@@ -360,7 +360,7 @@ pg_tablespace_location(PG_FUNCTION_ARGS)
sourcepath)));
targetpath[rllen] = '\0';- PG_RETURN_TEXT_P(cstring_to_text(targetpath)); + PG_RETURN_TEXT_P(cstring_to_text_with_len(targetpath, rllen));This could be a worthwhile cosmetic improvement if the nul-terminator (and
space reserved for it, and comment explaining that) is taken out as well,
but the patch didn't bother to do that.
Thanks for the tip.
Please see a new version of the patch in the Andrew Dunstan, reply.
best regards,
Ranier Vilela
Em qui., 31 de ago. de 2023 às 09:51, Andrew Dunstan <andrew@dunslane.net>
escreveu:
On 2023-08-31 Th 07:41, John Naylor wrote:
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier <
michael@paquier.xyz> escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size of the
parameter?Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text(""); + return cstring_to_text_with_len("", 0);This looks worse, so we'd better be getting something in return.
I agree this is a bit ugly. I wonder if we'd be better off creating a
function that returned an empty text value, so we'd just avoid converting
the empty cstring altogether and say:return empty_text();
Hi,
Thanks for the suggestion, I agreed.
New patch is attached.
best regards,
Ranier Vilela
Show quoted text
Attachments:
0001-Replace-cstring_to_text-with-empty_text.patchapplication/octet-stream; name=0001-Replace-cstring_to_text-with-empty_text.patchDownload+33-16
0002-Avoid-unecessary-calls-to-strlen-function.patchapplication/octet-stream; name=0002-Avoid-unecessary-calls-to-strlen-function.patchDownload+8-9
Em qui., 31 de ago. de 2023 às 10:12, Dagfinn Ilmari Mannsåker <
ilmari@ilmari.org> escreveu:
Andrew Dunstan <andrew@dunslane.net> writes:
On 2023-08-31 Th 07:41, John Naylor wrote:
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com>
wrote:
Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
<michael@paquier.xyz> escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size
of the
parameter?
Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text(""); + return cstring_to_text_with_len("", 0);This looks worse, so we'd better be getting something in return.
I agree this is a bit ugly. I wonder if we'd be better off creating a
function that returned an empty text value, so we'd just avoid
converting the empty cstring altogether and say:return empty_text();
Or we could generalise it for any string literal (of which there are
slightly more¹ non-empty than empty in calls to
cstring_to_text(_with_len)):#define literal_to_text(str) cstring_to_text_with_len("" str "",
sizeof(str)-1)
I do not agree, I think this will get worse.
best regards,
Ranier Vilela
Ranier Vilela <ranier.vf@gmail.com> writes:
Em qui., 31 de ago. de 2023 às 10:12, Dagfinn Ilmari Mannsåker <
ilmari@ilmari.org> escreveu:Andrew Dunstan <andrew@dunslane.net> writes:
On 2023-08-31 Th 07:41, John Naylor wrote:
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com>
wrote:
Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
<michael@paquier.xyz> escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size
of the
parameter?
Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text(""); + return cstring_to_text_with_len("", 0);This looks worse, so we'd better be getting something in return.
I agree this is a bit ugly. I wonder if we'd be better off creating a
function that returned an empty text value, so we'd just avoid
converting the empty cstring altogether and say:return empty_text();
Or we could generalise it for any string literal (of which there are
slightly more¹ non-empty than empty in calls to
cstring_to_text(_with_len)):#define literal_to_text(str) cstring_to_text_with_len("" str "",
sizeof(str)-1)I do not agree, I think this will get worse.
How exactly will it get worse? It's exactly equivalent to
cstring_to_text_with_len("", 0), since sizeof() is a compile-time
construct, and the string concatenation makes it fail if the argument is
not a literal string.
Whether we want an even-more-optimised version for an empty text value
is another matter, but I doubt it'd be worth it. Another option would
be to make cstring_to_text(_with_len) static inline functions, which
lets the compiler eliminate the memcpy() call when len == 0.
In fact, after playing around a bit (https://godbolt.org/z/x51aYGadh),
it seems like GCC, Clang and MSVC all eliminate the strlen() and
memcpy() calls for cstring_to_text("") under -O2 if it's static inline.
- ilmari
On 31.08.23 16:10, Ranier Vilela wrote:
Em qui., 31 de ago. de 2023 às 09:51, Andrew Dunstan
<andrew@dunslane.net <mailto:andrew@dunslane.net>> escreveu:On 2023-08-31 Th 07:41, John Naylor wrote:
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com
<mailto:ranier.vf@gmail.com>> wrote:Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
<michael@paquier.xyz <mailto:michael@paquier.xyz>> escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the
size of the
parameter?
Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text(""); + return cstring_to_text_with_len("", 0);This looks worse, so we'd better be getting something in return.
I agree this is a bit ugly. I wonder if we'd be better off creating
a function that returned an empty text value, so we'd just avoid
converting the empty cstring altogether and say:return empty_text();
Hi,
Thanks for the suggestion, I agreed.New patch is attached.
I think these patches make the code uniformly uglier and harder to
understand.
If a performance benefit could be demonstrated, then making
cstring_to_text() an inline function could be sensible. But I wouldn't
go beyond that.
Em qui., 31 de ago. de 2023 às 12:12, Dagfinn Ilmari Mannsåker <
ilmari@ilmari.org> escreveu:
Ranier Vilela <ranier.vf@gmail.com> writes:
Em qui., 31 de ago. de 2023 às 10:12, Dagfinn Ilmari Mannsåker <
ilmari@ilmari.org> escreveu:Andrew Dunstan <andrew@dunslane.net> writes:
On 2023-08-31 Th 07:41, John Naylor wrote:
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com>
wrote:
Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
<michael@paquier.xyz> escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size
of the
parameter?
Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text(""); + return cstring_to_text_with_len("", 0);This looks worse, so we'd better be getting something in return.
I agree this is a bit ugly. I wonder if we'd be better off creating a
function that returned an empty text value, so we'd just avoid
converting the empty cstring altogether and say:return empty_text();
Or we could generalise it for any string literal (of which there are
slightly more¹ non-empty than empty in calls to
cstring_to_text(_with_len)):#define literal_to_text(str) cstring_to_text_with_len("" str "",
sizeof(str)-1)I do not agree, I think this will get worse.
How exactly will it get worse? It's exactly equivalent to
cstring_to_text_with_len("", 0), since sizeof() is a compile-time
construct, and the string concatenation makes it fail if the argument is
not a literal string.
I think that concatenation makes the strings twice bigger, doesn't it?
Whether we want an even-more-optimised version for an empty text value
is another matter, but I doubt it'd be worth it. Another option would
be to make cstring_to_text(_with_len) static inline functions, which
lets the compiler eliminate the memcpy() call when len == 0.
In fact, after playing around a bit (https://godbolt.org/z/x51aYGadh),
it seems like GCC, Clang and MSVC all eliminate the strlen() and
memcpy() calls for cstring_to_text("") under -O2 if it's static inline.
In that case, it seems to me that would be good too.
Compilers removing memcpy would have the same as empty_text.
Without the burden of a new function and all its future maintenance.
best regards,
Ranier Vilela
Ranier Vilela <ranier.vf@gmail.com> writes:
Em qui., 31 de ago. de 2023 às 12:12, Dagfinn Ilmari Mannsåker <
ilmari@ilmari.org> escreveu:Ranier Vilela <ranier.vf@gmail.com> writes:
Em qui., 31 de ago. de 2023 às 10:12, Dagfinn Ilmari Mannsåker <
ilmari@ilmari.org> escreveu:Andrew Dunstan <andrew@dunslane.net> writes:
On 2023-08-31 Th 07:41, John Naylor wrote:
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com>
wrote:
Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
<michael@paquier.xyz> escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the size
of the
parameter?
Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text(""); + return cstring_to_text_with_len("", 0);This looks worse, so we'd better be getting something in return.
I agree this is a bit ugly. I wonder if we'd be better off creating a
function that returned an empty text value, so we'd just avoid
converting the empty cstring altogether and say:return empty_text();
Or we could generalise it for any string literal (of which there are
slightly more¹ non-empty than empty in calls to
cstring_to_text(_with_len)):#define literal_to_text(str) cstring_to_text_with_len("" str "",
sizeof(str)-1)I do not agree, I think this will get worse.
How exactly will it get worse? It's exactly equivalent to
cstring_to_text_with_len("", 0), since sizeof() is a compile-time
construct, and the string concatenation makes it fail if the argument is
not a literal string.I think that concatenation makes the strings twice bigger, doesn't it?
No, it's just taking advantage of the fact that C string literals can be
split into multiple pieces separated by whitespace (like in SQL, but
without requiring a newline between them). E.g. "foo" "bar" is exactly
equivalent to "foobar" after parsing. The reason to use it in the macro
is to make it a syntax error if the argument is not a literal string but
instead a string variable, becuause in that case the sizeof() would
return the size of the pointer, not the string.
- ilmari
Peter Eisentraut <peter@eisentraut.org> writes:
On 31.08.23 16:10, Ranier Vilela wrote:
Em qui., 31 de ago. de 2023 às 09:51, Andrew Dunstan
<andrew@dunslane.net <mailto:andrew@dunslane.net>> escreveu:On 2023-08-31 Th 07:41, John Naylor wrote:
On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela <ranier.vf@gmail.com
<mailto:ranier.vf@gmail.com>> wrote:Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
<michael@paquier.xyz <mailto:michael@paquier.xyz>> escreveu:
On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
cstring_to_text has a small overhead, because call strlen for
pointer to char parameter.Is it worth the effort to avoid this, where do we know the
size of the
parameter?
Are there workloads where this matters?
None, but note this change has the same spirit of 8b26769bc.
- return cstring_to_text(""); + return cstring_to_text_with_len("", 0);This looks worse, so we'd better be getting something in return.
I agree this is a bit ugly. I wonder if we'd be better off creating
a function that returned an empty text value, so we'd just avoid
converting the empty cstring altogether and say:
return empty_text();
Hi,
Thanks for the suggestion, I agreed.
New patch is attached.I think these patches make the code uniformly uglier and harder to
understand.If a performance benefit could be demonstrated, then making
cstring_to_text() an inline function could be sensible. But I wouldn't
go beyond that.
I haven't benchmarked it yet, but here's a patch that inlines them and
changes callers of cstring_to_text_with_len() with a aliteral string and
constant length to cstring_to_text().
On an x86-64 Linux build (meson with -Dbuildtype=debugoptimized
-Dcassert=true), the inlining increases the size of the text section of
the postgres binary from 9719722 bytes to 9750557, i.e. an increase of
30KiB or 0.3%, while the change to cstring_to_text() makes zero
difference (as expected from my investigation).
- ilmari