Why format() adds double quote?

Started by Tatsuo Ishii about 10 years ago | 28 messages | hackers
#1Tatsuo Ishii
t-ishii@sra.co.jp

test=# select format('%I', t) from t1;
format
----------
aaa
"AAA"
"あいう"
(3 rows)

Why does the text value in the third row need to be double quoted?
(Note that it consists of multibyte characters.) The same can be said of
quote_ident().

We accept identifiers made of multibyte characters without double
quotes (non-delimited identifiers) in other places.

test=# create table t2(あいう text);
CREATE TABLE
test=# insert into t2 values('aaa');
INSERT 0 1
test=# select あいう from t2;
あいう
--------
aaa
(1 row)

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tatsuo Ishii (#1)
Re: Why format() adds double quote?

2016-01-20 3:47 GMT+01:00 Tatsuo Ishii <ishii@postgresql.org>:

test=# select format('%I', t) from t1;
format
----------
aaa
"AAA"
"あいう"
(3 rows)

Why is the text value of the third line needed to be double quoted?
(note that it is a multi byte character). Same thing can be said to
quote_ident().

We treat identifiers made of the multi byte characters without double
quotation (non delimited identifier) in other places.

test=# create table t2(あいう text);
CREATE TABLE
test=# insert into t2 values('aaa');
INSERT 0 1
test=# select あいう from t2;
あいう
--------
aaa
(1 row)

format() uses the same routine as quote_ident(), so quote_ident() should be
fixed first.

Regards

Pavel


#3Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Pavel Stehule (#2)
Re: Why format() adds double quote?

format uses same routine as quote_ident. So quote_ident should be fixed
first.

Yes, I had that in my mind too.

Attached is the proposed patch to fix the bug.
Regression tests passed.

Here is an example after the patch. Note that the third row is not
quoted any more.

test=# select format('%I', あいう) from t2;
format
--------
aaa
"AAA"
あああ
(3 rows)

Attachments:

ruleutils.c.diff (text/x-patch; charset=us-ascii), +3 -2

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tatsuo Ishii (#3)
Re: Why format() adds double quote?

Hi

2016-01-20 7:20 GMT+01:00 Tatsuo Ishii <ishii@postgresql.org>:

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 3783e97..b93fc27 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -9405,7 +9405,7 @@ quote_identifier(const char *ident)
     * would like to use <ctype.h> macros here, but they might yield unwanted
     * locale-specific results...
     */
-   safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_');
+   safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_' || IS_HIGHBIT_SET(ident[0]));

    for (ptr = ident; *ptr; ptr++)
    {
@@ -9413,7 +9413,8 @@ quote_identifier(const char *ident)

        if ((ch >= 'a' && ch <= 'z') ||
            (ch >= '0' && ch <= '9') ||
-           (ch == '_'))
+           (ch == '_') ||
+           (IS_HIGHBIT_SET(ch)))
        {
            /* okay */
        }

This patch is simple. I remember being surprised a few months ago that we
allow any multibyte character.

+1

Pavel
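
The character test touched by the patch can be looked at in isolation. What follows is a minimal sketch, not PostgreSQL's actual quote_identifier(): the helper name needs_quotes and its allow_highbit flag are hypothetical, the keyword lookup that the real function also performs is left out, and IS_HIGHBIT_SET is redefined locally on the assumption that it matches the macro of the same name in c.h.

```c
#include <stdbool.h>

/* Assumed to match PostgreSQL's c.h macro: nonzero for any byte >= 0x80. */
#define IS_HIGHBIT_SET(ch) ((unsigned char) (ch) & 0x80)

/*
 * Hypothetical helper: returns true when ident would need double quotes
 * under the character rules alone (no keyword check).
 * allow_highbit = false models the old rule, true models the patched rule.
 */
static bool
needs_quotes(const char *ident, bool allow_highbit)
{
    const char *ptr;
    bool        safe;

    /* First character: lowercase ASCII letter or underscore ... */
    safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_' ||
            /* ... or, with the patch, any byte with the high bit set. */
            (allow_highbit && IS_HIGHBIT_SET(ident[0])));

    for (ptr = ident; *ptr; ptr++)
    {
        char        ch = *ptr;

        if ((ch >= 'a' && ch <= 'z') ||
            (ch >= '0' && ch <= '9') ||
            (ch == '_') ||
            (allow_highbit && IS_HIGHBIT_SET(ch)))
            continue;           /* okay */

        safe = false;           /* uppercase, space, punctuation, ... */
    }

    return !safe;
}
```

Under these assumptions, an uppercase name such as AAA needs quotes with either rule, while a name made only of bytes >= 0x80 (for example あいう in UTF-8) needs quotes under the old rule and none under the patched rule.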

#5Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Pavel Stehule (#4)
Re: Why format() adds double quote?

If we go this way, the question is whether we should back-patch this,
since the patch apparently changes existing behavior. Comments? I would
think we should not.


#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tatsuo Ishii (#5)
Re: Why format() adds double quote?

2016-01-20 10:17 GMT+01:00 Tatsuo Ishii <ishii@postgresql.org>:

If we would go this way, question is if we should back patch this or
not since the patch apparently changes the existing
behaviors. Comments? I would think we should not.

I am sure that we should not backport this change. It can break customers'
regression tests. The current behavior isn't 100% correct, but it is safe.

Pavel


#7Robert Haas
robertmhaas@gmail.com
In reply to: Pavel Stehule (#6)
Re: Why format() adds double quote?

On Wed, Jan 20, 2016 at 4:20 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

If we would go this way, question is if we should back patch this or
not since the patch apparently changes the existing
behaviors. Comments? I would think we should not.

I am sure, so we should not backport this change. This can breaks customer
regress tests - and the current behave isn't 100% correct, but it is safe.

Quite. This is not a bug fix. It's a behavior change, perhaps for the better.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#8Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Robert Haas (#7)
Re: Why format() adds double quote?

Quite. This is not a bug fix. It's a behavior change, perhaps for the better.

Added to the commitfest 2016-03.


#9Dickson S. Guedes
listas@guedesoft.net
In reply to: Tatsuo Ishii (#8)
Re: Why format() adds double quote?

2016-01-24 8:04 GMT-02:00 Tatsuo Ishii <ishii@postgresql.org>:

Added to the commitfest 2016-03.

Hi,

I went ahead and tested this patch, and it works as proposed. I agree
that it's not a bug fix but a new behavior, so -1 for backport.

When applied against master
(1129c2b0ad2732f301f696ae2cf98fb063a4c1f8), the patch applies with offsets
in two hunks.

Since format() has regression tests, I suggest adding one to cover this.
It could also be worth adding the new behavior to the docs, since there is
no explicit example for %I.

I performed the following tests, which work as expected, using some Portuguese words:

postgres=# create table test (nome varchar, endereço text, "UF"
varchar(2), título varchar);
CREATE TABLE
Time: 80,769 ms
postgres=# select format('%I', attname) from pg_attribute join
pg_class on (attrelid = oid) where relname = 'test';
format
----------
"UF"
cmax
cmin
ctid
endereço
nome
tableoid
título
xmax
xmin
(10 rows)

Time: 1,728 ms
postgres=# select format('%I', 'endereco');
format
----------
endereco
(1 row)

Time: 0,098 ms
postgres=# select format('%I', 'endereço');
format
----------
endereço
(1 row)

Time: 0,088 ms
postgres=# select format('%I', 'あああ');
format
--------
あああ
(1 row)

Time: 0,072 ms
postgres=# select format('%I', 'título');
format
--------
título
(1 row)

Time: 0,051 ms
postgres=# select format('%I', 'título e');
format
------------
"título e"
(1 row)

Time: 0,051 ms
postgres=# select format('%I', 'título_e');
format
----------
título_e
(1 row)

Time: 0,051 ms
postgres=# select format('%I', '_título');
format
---------
_título
(1 row)

Time: 0,047 ms
postgres=# select format('%I', '1_título');
format
------------
"1_título"
(1 row)

Time: 0,046 ms

Thank you for this!

Best regards,
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://github.com/guedes - http://guedesoft.net
http://www.postgresql.org.br


#10Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Dickson S. Guedes (#9)
Re: Why format() adds double quote?

I gone ahead a little and tested this patch and it works like was
proposed, I agree that it's not a bug fix but a new behavior so -1 for
backport.

IMO, it's a bug or at least an inconsistency but I admit it's too late
to back patch to existing stable branches.

While applying patch against master
(1129c2b0ad2732f301f696ae2cf98fb063a4c1f8) it offsets two hunks.

Since format() has regression tests I suggest that one should be added
to cover this.

I don't think that's doable. The test would need to handle multiple
database encodings, but the regression test framework handles only one.
Adding it to the existing mb test is probably the easiest option.

It could worth to add the new behavior to the docs,
since there no explicit example for %I.

I performed the follow tests that works as expected using some Portuguese words:

I assume you used a UTF-8 encoded database.
Great.


#11Daniel Verite
daniel@manitou-mail.org
In reply to: Tatsuo Ishii (#10)
Re: Why format() adds double quote?

Tatsuo Ishii wrote:

IMO, it's a bug or at least an inconsistency

Personally I don't see this change being good for everything.

Let's play devil's advocate:

create table abc(U&"foo\2003" int);

U+2003 is 'EM SPACE', in Unicode's General Punctuation block.

With the current version, format('%I', attname) on this column is:
"foo "

With the patched version, it produces this:
foo 

So the visual hint that there are more characters at the end is lost.
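
A small sketch of why a byte-wise high-bit test lets this character through, assuming a UTF-8 database: U+2003 encodes to the three bytes E2 80 83, all of which have the high bit set, so the test treats it exactly like a letter such as あ (U+3042). The function names below are hypothetical and are not part of PostgreSQL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Encode one Unicode code point (up to U+10FFFF) as UTF-8; returns the byte count. */
static size_t
utf8_encode(unsigned int cp, unsigned char out[4])
{
    if (cp < 0x80)
    {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800)
    {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000)
    {
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char) (0xF0 | (cp >> 18));
    out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char) (0x80 | (cp & 0x3F));
    return 4;
}

/* True when every UTF-8 byte of cp has its high bit set, i.e. when a
 * per-byte high-bit test would accept the character without quoting. */
static bool
all_bytes_highbit(unsigned int cp)
{
    unsigned char buf[4];
    size_t      n = utf8_encode(cp, buf);
    size_t      i;

    for (i = 0; i < n; i++)
    {
        if ((buf[i] & 0x80) == 0)
            return false;
    }
    return true;
}
```

With this sketch, all_bytes_highbit(0x2003) and all_bytes_highbit(0x3042) are both true, while all_bytes_highbit(0x20) (the ASCII space) is false, which is why only the ASCII space still forces quoting.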

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


#12Dickson S. Guedes
listas@guedesoft.net
In reply to: Tatsuo Ishii (#10)
Re: Why format() adds double quote?

2016-01-26 5:29 GMT-02:00 Tatsuo Ishii <ishii@postgresql.org>:

I assume you used UTF-8 encoding database.

Yes, I do.


#13Dickson S. Guedes
listas@guedesoft.net
In reply to: Daniel Verite (#11)
Re: Why format() adds double quote?

2016-01-26 18:00 GMT-02:00 Daniel Verite <daniel@manitou-mail.org>:

...
create table abc(U&"foo\2003" int);

U+2003 is 'EM SPACE', in Unicode's General Punctuation block.

With the current version, format('%I', attname) on this column is:
"foo "

With the patched version, it produces this:
foo

So the visual hint that there are more characters at the end is lost.

Thanks for playing devil's advocate. I see here that it even produces that
output with simple spaces.

postgres=# create table x ("aí " text);
CREATE TABLE
postgres=# \d x
Tabela "public.x"
Coluna | Tipo | Modificadores
----------+------+---------------
aí | text |

This will break users' copy-and-paste actions and scripts that parse that output.

Maybe the patch should consider leading/trailing non-printable characters
when choosing whether or not to emit the double quotes?


#14Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Dickson S. Guedes (#13)
Re: Why format() adds double quote?

This is a totally different story from the topic discussed in this
thread. psql never adds double quotes to column names, even for
upper-case column names.

test=# create table t6("ABC" int);
CREATE TABLE
test=# \d t6
Table "public.t6"
Column | Type | Modifiers
--------+---------+-----------
ABC | integer |

If you want to change the existing psql's behavior, propose it
yourself.


#15Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Daniel Verite (#11)
Re: Why format() adds double quote?

IMO, it's a bug or at least an inconsistency

Personally I don't see this change being good for everything.

Let's play devil's advocate:

create table abc(U&"foo\2003" int);

U+2003 is 'EM SPACE', in Unicode's General Punctuation block.

With the current version, format('%I', attname) on this column is:
"foo "

With the patched version, it produces this:
foo 

So the visual hint that there are more characters at the end is lost.

What is the "visual hint"? If you are talking about psql's output, it
never adds a "visual hint" (double quotes).

If you are talking about string handling in a program, what kind
of program cares about the "visual" aspect?


#16Pavel Stehule
pavel.stehule@gmail.com
In reply to: Daniel Verite (#11)
Re: Why format() adds double quote?

2016-01-26 21:00 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:

So the visual hint that there are more characters at the end is lost.

I agree that the current behavior can be useful in some cases, but it is
still a bug (an inconsistency) between the PostgreSQL parser and the
PostgreSQL escaping functions.

Currently, any multibyte character can appear in an unquoted identifier
(only apostrophes are tested). We should test for whitespace characters too.

Regards

Pavel


#17Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Pavel Stehule (#16)
Re: Why format() adds double quote?

Currently, any multibyte char can be unescaped identifier (only apostrophes
are tested). We should to test white chars too.

Really? I thought we do that test.

test=# create table t6("あいう えお" int);
CREATE TABLE
test=# \d t6
Table "public.t6"
Column | Type | Modifiers
-------------+---------+-----------
あいう えお | integer |

#18Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tatsuo Ishii (#17)
Re: Why format() adds double quote?

2016-01-27 6:13 GMT+01:00 Tatsuo Ishii <ishii@postgresql.org>:

Really? I thought we do that test.

What are you expecting from this test? UTF single quotes are probably
tested only in the quote functions.

Pavel


#19Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Pavel Stehule (#18)
Re: Why format() adds double quote?

what you are expecting from this test? UTF single quotes are tested only in
quote functions probably.

I just wanted to demonstrate that a name mixing multibyte characters and
ASCII whitespace can be an identifier.

We should to test white chars too.

What exactly do you propose regarding whitespace characters and multibyte
characters here? Do you propose to take non-ASCII whitespace into account
(that is, to treat it like ASCII whitespace)?


#20Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tatsuo Ishii (#19)
Re: Why format() adds double quote?

2016-01-27 6:24 GMT+01:00 Tatsuo Ishii <ishii@postgresql.org>:

I just wanted to demonstrate multibyte chars including ASCII white
spaces can be an identifier.

I understand now.

We should to test white chars too.

What do you exactly propose regarding white chars and multibyte chars
here? Maybe you propose to consider non ASCII white spaces (treate
them as ASCII white spaces)?

I propose that UTF whitespace characters be handled the same way as ASCII
whitespace characters. The current design is too simple and can lead to
pretty bad issues. Daniel's example is a good one: it shows a big gap in
the design.
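
One way to picture this proposal is a check that decodes each character and forces quoting when it is whitespace. This is only a sketch under stated assumptions: it assumes the string is valid UTF-8 (a real server would have to deal with other encodings), the function names are hypothetical, and the code point list follows the Unicode White_Space property rather than anything decided in this thread.

```c
#include <stdbool.h>

/* Code points carrying the Unicode White_Space property. */
static bool
is_unicode_space(unsigned int cp)
{
    return (cp >= 0x0009 && cp <= 0x000D) || cp == 0x0020 ||
        cp == 0x0085 || cp == 0x00A0 || cp == 0x1680 ||
        (cp >= 0x2000 && cp <= 0x200A) ||
        cp == 0x2028 || cp == 0x2029 || cp == 0x202F ||
        cp == 0x205F || cp == 0x3000;
}

/*
 * Hypothetical helper: true when the UTF-8 string s contains any whitespace
 * character, ASCII or not. An identifier for which this returns true would
 * be double quoted under the proposal.
 */
static bool
contains_unicode_space(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;

    while (*p)
    {
        unsigned int cp;
        int         extra;

        /* The lead byte tells how many continuation bytes follow. */
        if (*p < 0x80)
        {
            cp = *p;
            extra = 0;
        }
        else if (*p < 0xE0)
        {
            cp = *p & 0x1F;
            extra = 1;
        }
        else if (*p < 0xF0)
        {
            cp = *p & 0x0F;
            extra = 2;
        }
        else
        {
            cp = *p & 0x07;
            extra = 3;
        }
        p++;

        /* Stop early on a truncated sequence; this also stops at the NUL. */
        while (extra-- > 0 && (*p & 0xC0) == 0x80)
            cp = (cp << 6) | (*p++ & 0x3F);

        if (is_unicode_space(cp))
            return true;
    }
    return false;
}
```

With such a check, Daniel's column name (foo followed by U+2003) and a name with a plain trailing space would both be reported as containing whitespace, while あいう would not.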

Regards

Pavel


#21Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Pavel Stehule (#20)
#22Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tatsuo Ishii (#21)
#23Daniel Verite
daniel@manitou-mail.org
In reply to: Tatsuo Ishii (#21)
#24Daniel Verite
daniel@manitou-mail.org
In reply to: Tatsuo Ishii (#15)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Verite (#24)
#26Dickson S. Guedes
listas@guedesoft.net
In reply to: Tatsuo Ishii (#14)
#27Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Daniel Verite (#23)
#28Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tom Lane (#25)