patch: Use pg_mbcliplen for truncation in text-to-name conversion

Started by Karl Schnaitter over 13 years ago, 2 messages
#1 Karl Schnaitter
karlsch@gmail.com
1 attachment(s)

The text_name function was truncating its input string to the first
NAMEDATALEN-1 bytes, which is wrong if the string has multi-byte
characters. This patch changes it to use pg_mbcliplen, following
the namein function.
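To illustrate why a plain byte cutoff is wrong, here is a minimal sketch of character-aware clipping. It is not the backend's pg_mbcliplen, which works with whatever the server encoding is; utf8_cliplen is a hypothetical stand-in that assumes valid UTF-8 input only.

```c
#include <stddef.h>

/*
 * Hypothetical, UTF-8-only stand-in for pg_mbcliplen (an assumption for
 * illustration; the real function handles every server encoding).
 *
 * Returns the largest byte count <= limit that does not split a
 * multi-byte character of the len-byte string s.  Assumes s is valid
 * UTF-8.
 */
static size_t
utf8_cliplen(const char *s, size_t len, size_t limit)
{
	size_t		clip;

	/* Nothing to truncate. */
	if (len <= limit)
		return len;

	clip = limit;

	/*
	 * s[clip] is the first byte that would be dropped.  If it is a
	 * continuation byte (10xxxxxx), the cut falls inside a character,
	 * so back up to that character's lead byte and drop it entirely.
	 */
	while (clip > 0 && ((unsigned char) s[clip] & 0xC0) == 0x80)
		clip--;

	return clip;
}
```

With 32 copies of U+0400 (bytes 0xD0 0x80 in UTF-8, so 64 bytes in total) and a limit of 63, this returns 62, that is 31 whole characters, which is the length the regression test in the patch expects. The old byte-based cutoff would keep 63 bytes and end the name on a dangling lead byte.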

Attachments:

text_to_name.patch (text/plain)
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index a5592d5..02fe9b4 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -2255,8 +2255,7 @@ text_name(PG_FUNCTION_ARGS)
 	len = VARSIZE_ANY_EXHDR(s);
 
 	/* Truncate oversize input */
-	if (len >= NAMEDATALEN)
-		len = NAMEDATALEN - 1;
+	len = pg_mbcliplen(VARDATA_ANY(s), len, NAMEDATALEN - 1);
 
 	result = (Name) palloc(NAMEDATALEN);
 	memcpy(NameStr(*result), VARDATA_ANY(s), len);
diff --git a/src/test/regress/expected/name.out b/src/test/regress/expected/name.out
index b359d52..f4b58f1 100644
--- a/src/test/regress/expected/name.out
+++ b/src/test/regress/expected/name.out
@@ -15,6 +15,19 @@ SELECT name 'name string' = name 'name string ' AS "False";
  f
 (1 row)
 
+-- name truncation with unicode characters
+SELECT length(repeat(U&'\0400', 32)::unknown::name::bytea, 'utf8') as length_from_unknown;
+ length_from_unknown 
+---------------------
+                  31
+(1 row)
+
+SELECT length(repeat(U&'\0400', 32)::text::name::bytea, 'utf8') as length_from_text;
+ length_from_text 
+------------------
+               31
+(1 row)
+
 --
 --
 --
diff --git a/src/test/regress/sql/name.sql b/src/test/regress/sql/name.sql
index 1c7a671..9f7a5f0 100644
--- a/src/test/regress/sql/name.sql
+++ b/src/test/regress/sql/name.sql
@@ -8,6 +8,10 @@ SELECT name 'name string' = name 'name string' AS "True";
 
 SELECT name 'name string' = name 'name string ' AS "False";
 
+-- name truncation with unicode characters
+SELECT length(repeat(U&'\0400', 32)::unknown::name::bytea, 'utf8') as length_from_unknown;
+SELECT length(repeat(U&'\0400', 32)::text::name::bytea, 'utf8') as length_from_text;
+
 --
 --
 --
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Karl Schnaitter (#1)
Re: patch: Use pg_mbcliplen for truncation in text-to-name conversion

Karl Schnaitter <karlsch@gmail.com> writes:

> The text_name function was truncating its input string to the first
> NAMEDATALEN-1 bytes, which is wrong if the string has multi-byte
> characters. This patch changes it to use pg_mbcliplen, following
> the namein function.

Good catch, but poking around I note that bpchar_name has the same
disease. Will fix, thanks for the report!

regards, tom lane