Faster StrNCpy

Started by Tom Laneover 19 years ago49 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

David Strong points out here
http://archives.postgresql.org/pgsql-hackers/2006-09/msg02071.php
that some popular implementations of strncpy(dst,src,n) are quite
inefficient when strlen(src) is much less than n, because they don't
optimize the zero-pad step that is required by the standard.

It looks to me like we have a good number of places that are using
either StrNCpy or strncpy directly to copy into large buffers that
we do not need full zero-padding in, only a single guaranteed null
byte. While not all of these places are in performance-critical
paths, some are. David identified set_ps_display, and the other
thing that's probably significant is unnecessary use of strncpy
for keys of string-keyed hash tables. (We used to actually need
zero padding for string-keyed hash keys, but that was a long time ago.)

I propose adding an additional macro in c.h, along the lines of

#define StrNCopy(dst,src,len) \
do \
{ \
char * _dst = (dst); \
Size _len = (len); \
\
if (_len > 0) \
{ \
const char * _src = (src); \
Size _src_len = strlen(_src); \
\
if (_src_len < _len) \
memcpy(_dst, _src, _src_len + 1); \
else \
{ \
memcpy(_dst, _src, _len - 1); \
_dst[_len-1] = '\0'; \
} \
} \
} while (0)

Unlike StrNCpy, this requires that the source string be null-terminated,
so it would not be a drop-in replacement everywhere. Also, it could be
a performance loss if strlen(src) is much larger than len ... but that
is not usually the case for the places we'd want to apply it.

Thoughts, objections? In particular, is the name OK, or do we need
something a bit further away from StrNCpy?

regards, tom lane

#2Martijn van Oosterhout
kleptog@svana.org
In reply to: Tom Lane (#1)
Re: Faster StrNCpy

On Tue, Sep 26, 2006 at 04:24:51PM -0400, Tom Lane wrote:

David Strong points out here
http://archives.postgresql.org/pgsql-hackers/2006-09/msg02071.php
that some popular implementations of strncpy(dst,src,n) are quite
inefficient when strlen(src) is much less than n, because they don't
optimize the zero-pad step that is required by the standard.

I think that's why strlcpy was invented, to deal with the issues with
strncpy.

http://www.gratisoft.us/todd/papers/strlcpy.html

There's an implementation here (used in glib), though you could
probably find more.

http://mail.gnome.org/archives/gtk-devel-list/2000-May/msg00029.html

Do you really think it's worth making a macro rather than just a normal
function?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

From each according to his ability. To each according to his ability to litigate.

#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Martijn van Oosterhout (#2)
Re: Faster StrNCpy

Martijn van Oosterhout wrote:

On Tue, Sep 26, 2006 at 04:24:51PM -0400, Tom Lane wrote:

David Strong points out here
http://archives.postgresql.org/pgsql-hackers/2006-09/msg02071.php
that some popular implementations of strncpy(dst,src,n) are quite
inefficient when strlen(src) is much less than n, because they don't
optimize the zero-pad step that is required by the standard.

I think that's why strlcpy was invented, to deal with the issues with
strncpy.

http://www.gratisoft.us/todd/papers/strlcpy.html

There's an implementation here (used in glib), though you could
probably find more.

http://mail.gnome.org/archives/gtk-devel-list/2000-May/msg00029.html

That one would be LGPL (glib's license). Here is OpenBSD's version,
linked from that one:

ftp://ftp.openbsd.org/pub/OpenBSD/src/lib/libc/string/strlcpy.c

You'll notice that it iterates once per char. Between that and the
strlen() call in Tom's version, not sure which is the lesser evil.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#2)
Re: Faster StrNCpy

Martijn van Oosterhout <kleptog@svana.org> writes:

I think that's why strlcpy was invented, to deal with the issues with
strncpy.
http://www.gratisoft.us/todd/papers/strlcpy.html

strlcpy does more than we need (note that none of the existing uses care
about counting the overflowed bytes). Not sure if it's worth adopting
those semantics when they're not really standard, but if you think a lot
of people would be familiar with strlcpy, maybe we should.

Do you really think it's worth making a macro rather than just a normal
function?

Only in that a macro in c.h is less work than a configure test plus a
replacement file in src/port. But if we want to consider this a
standard function that just doesn't happen to exist everywhere, I
suppose we should use configure.

regards, tom lane

#5Neil Conway
neilc@samurai.com
In reply to: Tom Lane (#4)
Re: Faster StrNCpy

On Tue, 2006-09-26 at 16:53 -0400, Tom Lane wrote:

strlcpy does more than we need (note that none of the existing uses care
about counting the overflowed bytes). Not sure if it's worth adopting
those semantics when they're not really standard, but if you think a lot
of people would be familiar with strlcpy, maybe we should.

I think we should -- while strlcpy() is not standardized, it is widely
used (in libc on all the BSDs, Solaris and OS X, as well as private
copies in Linux, glib, etc.).

A wholesale replacement of strncpy() calls is probably worth doing --
replacing them with strlcpy() if the source string is NUL-terminated,
and I suppose memcpy() otherwise.

-Neil

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#3)
Re: Faster StrNCpy

Alvaro Herrera <alvherre@commandprompt.com> writes:

You'll notice that it iterates once per char. Between that and the
strlen() call in Tom's version, not sure which is the lesser evil.

Yeah, I was wondering that too. My code would require two scans of the
source string (one inside strlen and one in memcpy), but in much of our
usage the source and dest should be reasonably well aligned and one
could expect memcpy to be using word rather than byte operations, so you
might possibly make it back on the strength of fewer write cycles. And
on the third hand, for short source strings none of this matters and
the extra function call involved for strlen/memcpy probably dominates.

I'm happy to just use the OpenBSD version as a src/port module.
Any objections?

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Neil Conway (#5)
Re: Faster StrNCpy

Neil Conway <neilc@samurai.com> writes:

A wholesale replacement of strncpy() calls is probably worth doing --
replacing them with strlcpy() if the source string is NUL-terminated,
and I suppose memcpy() otherwise.

What I'd like to do immediately is put in strlcpy() and hit the two or
three places I think are performance-relevant. I agree with trying to
get rid of StrNCpy/strncpy calls over the long run, but it's just code
beautification and probably not appropriate for beta.

regards, tom lane

#8Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#7)
Re: Faster StrNCpy

Tom,

What I'd like to do immediately is put in strlcpy() and hit the two or
three places I think are performance-relevant. I agree with trying to
get rid of StrNCpy/strncpy calls over the long run, but it's just code
beautification and probably not appropriate for beta.

Immediately? Presumably you mean for 8.3?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#8)
Re: Faster StrNCpy

Josh Berkus <josh@agliodbs.com> writes:

What I'd like to do immediately is put in strlcpy() and hit the two or
three places I think are performance-relevant.

Immediately? Presumably you mean for 8.3?

No, I mean now. This is a performance bug and it's still open season on
bugs. If we were close to having a release-candidate version, I'd hold
off, but the above proposal seems sufficiently low-risk for the current
stage of the cycle.

regards, tom lane

#10Strong, David
david.strong@unisys.com
In reply to: Tom Lane (#1)
Re: Faster StrNCpy

Tom,

Let us know when you've added strlcpy () and we'll be happy to run some tests on the new code.

David

________________________________

From: pgsql-hackers-owner@postgresql.org on behalf of Tom Lane
Sent: Tue 9/26/2006 7:25 PM
To: josh@agliodbs.com
Cc: pgsql-hackers@postgresql.org; Neil Conway; Martijn van Oosterhout
Subject: Re: [HACKERS] Faster StrNCpy

Josh Berkus <josh@agliodbs.com> writes:

What I'd like to do immediately is put in strlcpy() and hit the two or
three places I think are performance-relevant.

Immediately? Presumably you mean for 8.3?

No, I mean now. This is a performance bug and it's still open season on
bugs. If we were close to having a release-candidate version, I'd hold
off, but the above proposal seems sufficiently low-risk for the current
stage of the cycle.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

#11Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#9)
Re: Faster StrNCpy

Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

What I'd like to do immediately is put in strlcpy() and hit the two or
three places I think are performance-relevant.

Immediately? Presumably you mean for 8.3?

No, I mean now. This is a performance bug and it's still open season on
bugs. If we were close to having a release-candidate version, I'd hold
off, but the above proposal seems sufficiently low-risk for the current
stage of the cycle.

What are the other hotspots?

cheers

andrew

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#11)
Re: Faster StrNCpy

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

What I'd like to do immediately is put in strlcpy() and hit the two or
three places I think are performance-relevant.

What are the other hotspots?

The ones I can think of offhand are set_ps_display and use of strncpy as
a HashCopyFunc.

regards, tom lane

#13Strong, David
david.strong@unisys.com
In reply to: Tom Lane (#1)
Re: Faster StrNCpy

We sometimes see TupleDescInitEntry () taking high CPU times via OProfile. This does include, amongst a lot of other code, a call to namestrcpy () which in turn calls StrNCpy (). Perhaps this is not a good candidate right now as a name string is only 64 bytes.

David

________________________________

From: pgsql-hackers-owner@postgresql.org on behalf of Tom Lane
Sent: Wed 9/27/2006 6:49 AM
To: Andrew Dunstan
Cc: josh@agliodbs.com; pgsql-hackers@postgresql.org; Neil Conway; Martijn van Oosterhout
Subject: Re: [HACKERS] Faster StrNCpy

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

What I'd like to do immediately is put in strlcpy() and hit the two or
three places I think are performance-relevant.

What are the other hotspots?

The ones I can think of offhand are set_ps_display and use of strncpy as
a HashCopyFunc.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

#14Mark Mielke
mark@mark.mielke.cc
In reply to: Strong, David (#13)
Re: Faster StrNCpy

On Wed, Sep 27, 2006 at 07:08:05AM -0700, Strong, David wrote:

We sometimes see TupleDescInitEntry () taking high CPU times via
OProfile. This does include, amongst a lot of other code, a call to
namestrcpy () which in turn calls StrNCpy (). Perhaps this is not a
good candidate right now as a name string is only 64 bytes.

Just wondering - are any of these cases where a memcpy() would work
just as well? Or are you not sure that the source string is at least
64 bytes in length?

memcpy(&target, &source, sizeof(target));
target[sizeof(target)-1] = '\0';

I imagine any extra checking causes processor stalls, or at least for
the branch prediction to fill up? Straight copies might allow for
maximum parallelism? If it's only 64 bytes, on processors such as
Pentium or Athlon, that's 2 or 4 cache lines, and writes are always
performed as cache lines.

I haven't seen the code that you and Tom are looking at to tell
whether it is safe to do this or not.

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

#15Adnan DURSUN
a_dursun@hotmail.com
In reply to: Tom Lane (#1)
Can i see server SQL commands ?

Hi all

I wanna know what is going on while a DML command works. For example
;
Which commands are executed by the core when we send an "UPDATE tab
SET col = val1..."
in case there is a foreing key or an unique constraint on table
"tab".

How can i see that ?

Best regards

Adnan DURSUN
ASRIN Bili�im Ltd.

#16tomas@tuxteam.de
tomas@tuxteam.de
In reply to: Adnan DURSUN (#15)
Re: Can i see server SQL commands ?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Sep 28, 2006 at 04:27:36AM +0300, Adnan DURSUN wrote:

Hi all

I wanna know what is going on while a DML command works. For example
;
Which commands are executed by the core when we send an "UPDATE tab
SET col = val1..."

Adnan,

this mailing list is not the right one for such questions. More
appropriate would be <pgsql-novice@postgresql.org> or maybe
<pgsql-general@postgresql.org>.

Having said that, you may set the log level of the server in the
configuration file (whose location depends on your OS and PostgreSQL
version. Look there for a line log_statements = XXX and set XXX to
'all'. Don't forget to restart your server afterwards.

HTH
- -- tomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFFG3QyBcgs9XrR2kYRAgdqAJ0VnUw5+Q79HiIwHocHIw4TWHePaQCffBBK
ASn3Z6XpKG91NTrmEaBtz08=
=Ibh3
-----END PGP SIGNATURE-----

#17Strong, David
david.strong@unisys.com
In reply to: Mark Mielke (#14)
Re: Faster StrNCpy

Mark,

In the specific case of the namestrcpy () function, it will copy a
maximum of 64 bytes, but the length of the source string is unknown. I
would have to think that memcpy () would certainly win if you knew the
source and destination sizes etc. Perhaps there are some places like
that in the code that don't use memcpy () currently?

David

-----Original Message-----
From: mark@mark.mielke.cc [mailto:mark@mark.mielke.cc]
Sent: Wednesday, September 27, 2006 4:27 PM
To: Strong, David
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Faster StrNCpy

On Wed, Sep 27, 2006 at 07:08:05AM -0700, Strong, David wrote:

We sometimes see TupleDescInitEntry () taking high CPU times via
OProfile. This does include, amongst a lot of other code, a call to
namestrcpy () which in turn calls StrNCpy (). Perhaps this is not a
good candidate right now as a name string is only 64 bytes.

Just wondering - are any of these cases where a memcpy() would work
just as well? Or are you not sure that the source string is at least
64 bytes in length?

memcpy(&target, &source, sizeof(target));
target[sizeof(target)-1] = '\0';

I imagine any extra checking causes processor stalls, or at least for
the branch prediction to fill up? Straight copies might allow for
maximum parallelism? If it's only 64 bytes, on processors such as
Pentium or Athlon, that's 2 or 4 cache lines, and writes are always
performed as cache lines.

I haven't seen the code that you and Tom are looking at to tell
whether it is safe to do this or not.

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com
__________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood
Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario,
Canada

One ring to rule them all, one ring to find them, one ring to bring
them all
and in the darkness bind them...

http://mark.mielke.cc/

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Strong, David (#17)
Re: Faster StrNCpy

"Strong, David" <david.strong@unisys.com> writes:

Just wondering - are any of these cases where a memcpy() would work
just as well? Or are you not sure that the source string is at least
64 bytes in length?

In most cases, we're pretty sure that it's *not* --- it'll just be a
palloc'd C string.

I'm disinclined to fool with the restriction that namestrcpy zero-pad
Name values, because they might end up on disk, and allowing random
memory contents to get written out is ungood from a security point of
view. However, it's entirely possible that it'd be a bit faster to do
a MemSet followed by strlcpy than to use strncpy for zero-padding.

regards, tom lane

#19Markus Schaber
schabi@logix-tt.com
In reply to: Tom Lane (#18)
Re: Faster StrNCpy

Hi, Tom,

Tom Lane wrote:

"Strong, David" <david.strong@unisys.com> writes:

Just wondering - are any of these cases where a memcpy() would work
just as well? Or are you not sure that the source string is at least
64 bytes in length?

In most cases, we're pretty sure that it's *not* --- it'll just be a
palloc'd C string.

I'm disinclined to fool with the restriction that namestrcpy zero-pad
Name values, because they might end up on disk, and allowing random
memory contents to get written out is ungood from a security point of
view.

There's another disadvantage of always copying 64 bytes:

It may be that the 64-byte range crosses a page boundary. Now guess what
happens when this next page is not mapped -> segfault.

Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Markus Schaber (#19)
Re: Faster StrNCpy

Markus Schaber <schabi@logix-tt.com> writes:

There's another disadvantage of always copying 64 bytes:
It may be that the 64-byte range crosses a page boundary. Now guess what
happens when this next page is not mapped -> segfault.

Irrelevant, because in all interesting cases the Name field is part of a
larger record that would stretch into that other page anyway.

regards, tom lane

#21Mark Mielke
mark@mark.mielke.cc
In reply to: Markus Schaber (#19)
#22Mark Mielke
mark@mark.mielke.cc
In reply to: Mark Mielke (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Mielke (#22)
#24Mark Mielke
mark@mark.mielke.cc
In reply to: Tom Lane (#23)
#25Strong, David
david.strong@unisys.com
In reply to: Strong, David (#17)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Mielke (#24)
#27Luke Lonergan
llonergan@greenplum.com
In reply to: Mark Mielke (#24)
#28Mark Mielke
mark@mark.mielke.cc
In reply to: Tom Lane (#26)
#29Sergey E. Koposov
math@sai.msu.ru
In reply to: Tom Lane (#26)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sergey E. Koposov (#29)
#31Sergey E. Koposov
math@sai.msu.ru
In reply to: Tom Lane (#30)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sergey E. Koposov (#31)
#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Strong, David (#25)
#34Strong, David
david.strong@unisys.com
In reply to: Strong, David (#17)
#35Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#7)
#36Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#6)
#37Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#32)
#38Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Tom Lane (#32)
#39Mark Mielke
mark@mark.mielke.cc
In reply to: Zeugswetter Andreas SB SD (#38)
#40Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Mark Mielke (#39)
#41Benny Amorsen
benny+usenet@amorsen.dk
In reply to: Mark Mielke (#39)
#42Mark Mielke
mark@mark.mielke.cc
In reply to: Zeugswetter Andreas SB SD (#40)
#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas SB SD (#38)
#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Mielke (#42)
#45Strong, David
david.strong@unisys.com
In reply to: Tom Lane (#43)
#46Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#32)
#47Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#46)
#48Bruce Momjian
bruce@momjian.us
In reply to: Strong, David (#45)
#49Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#48)