character type value is not padded with spaces

Started by Yoshiyuki Asaba over 20 years ago, 6 messages
#1 Yoshiyuki Asaba
y-asaba@sra.co.jp
1 attachment(s)

A character type (char(n)) value that contains multibyte characters is
not padded with spaces. The problem reproduces on 7.3.x, 7.4.x and 8.0.x.

create table t (a char(10));
insert into t values ('XXXXX'); -- X is 2byte character.

I expect 'XXXXX' padded with five trailing spaces to be inserted, but
'XXXXX' is inserted without any padding.

select a, octet_length(a) from t;

   a   | octet_length
-------+--------------
 XXXXX |           10

If the value were padded with spaces, octet_length(a) would be 15. The
problem occurs because exprTypmod() calculates the string length as a
byte length (VARSIZE) rather than as a number of characters.

I attach a patch for this problem.

Regards,

--
Yoshiyuki Asaba
y-asaba@sra.co.jp

Attachments:

exprTypmod.patch (text/plain; charset=us-ascii)
*** parse_expr.c.orig	2005-01-13 02:32:36.000000000 +0900
--- parse_expr.c	2005-05-22 17:12:37.000000000 +0900
***************
*** 18,23 ****
--- 18,24 ----
  #include "catalog/pg_operator.h"
  #include "catalog/pg_proc.h"
  #include "commands/dbcommands.h"
+ #include "mb/pg_wchar.h"
  #include "miscadmin.h"
  #include "nodes/makefuncs.h"
  #include "nodes/params.h"
***************
*** 34,40 ****
  #include "utils/lsyscache.h"
  #include "utils/syscache.h"
  
- 
  bool		Transform_null_equals = false;
  
  static Node *transformColumnRef(ParseState *pstate, ColumnRef *cref);
--- 35,40 ----
***************
*** 1491,1497 ****
  				{
  					case BPCHAROID:
  						if (!con->constisnull)
! 							return VARSIZE(DatumGetPointer(con->constvalue));
  						break;
  					default:
  						break;
--- 1491,1503 ----
  				{
  					case BPCHAROID:
  						if (!con->constisnull)
! 						{
! 							int32 len = VARSIZE(DatumGetPointer(con->constvalue)) - VARHDRSZ;
! 
! 							if (pg_database_encoding_max_length() > 1)
! 								len = pg_mbstrlen_with_len(VARDATA(DatumGetPointer(con->constvalue)), len);
! 							return len + VARHDRSZ;
! 						}
  						break;
  					default:
  						break;
#2 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Yoshiyuki Asaba (#1)
Re: [PATCHES] character type value is not padded with spaces

Hackers,

The problem he found is not limited to Japanese characters; it affects
any multibyte encoding, including UTF-8. The patch looks good to me, and
I will commit it to the 7.3, 7.4 and 8.0 stable branches and to current
if there are no objections.
--
Tatsuo Ishii

#3 John Hansen
john@geeknet.com.au
In reply to: Tatsuo Ishii (#2)
Re: [PATCHES] character type value is not padded with spaces

Ahemm,...

UNICODE DB:

create table t (a char(10));
set client_encoding = iso88591;
insert into t VALUES ('æøå');

select a, octet_length(a),length(a) from t;
     a      | octet_length | length
------------+--------------+--------
 æøå        |           13 |      3
(1 row)

This is with 8.0.2.

Just FYI.

... John

-----Original Message-----
From: pgsql-patches-owner@postgresql.org
[mailto:pgsql-patches-owner@postgresql.org] On Behalf Of Tatsuo Ishii
Sent: Tuesday, May 24, 2005 8:52 AM
To: y-asaba@sra.co.jp
Cc: pgsql-patches@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [PATCHES] character type value is not padded with spaces

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

#4 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: John Hansen (#3)
Re: [PATCHES] character type value is not padded with spaces

I think you need to test with 5 characters, not 3.
--
Tatsuo Ishii

#5 John Hansen
john@geeknet.com.au
In reply to: Tatsuo Ishii (#4)
Re: [PATCHES] character type value is not padded with spaces

Ahhh...

-----Original Message-----
From: Tatsuo Ishii [mailto:t-ishii@sra.co.jp]
Sent: Tuesday, May 24, 2005 9:26 AM
To: John Hansen
Cc: y-asaba@sra.co.jp; pgsql-patches@postgresql.org;
pgsql-hackers@postgresql.org
Subject: Re: [PATCHES] character type value is not padded with spaces

#6 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Yoshiyuki Asaba (#1)
Re: character type value is not padded with spaces

I see Tatsuo already applied this, which is great. I added a little
comment:

/* if multi-byte, take len and find # characters */

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073