Why does format() add double quotes?
test=# select format('%I', t) from t1;
format
----------
aaa
"AAA"
"あいう"
(3 rows)
Why does the text value in the third row need to be double quoted?
(Note that it consists of multibyte characters.) The same can be said of
quote_ident().
Elsewhere, we accept identifiers made of multibyte characters without
double quotes (non-delimited identifiers).
test=# create table t2(あいう text);
CREATE TABLE
test=# insert into t2 values('aaa');
INSERT 0 1
test=# select あいう from t2;
あいう
--------
aaa
(1 row)
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
format() uses the same routine as quote_ident(), so quote_ident() should
be fixed first.
Regards
Pavel
Yes, I had that in mind too.
Attached is the proposed patch to fix the bug.
Regression tests passed.
Here is an example after the patch. Note that the third row is not
quoted any more.
test=# select format('%I', あいう) from t2;
format
--------
aaa
"AAA"
あああ
(3 rows)
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
ruleutils.c.diff (text/x-patch; charset=us-ascii), +3 -2
Hi
2016-01-20 7:20 GMT+01:00 Tatsuo Ishii <ishii@postgresql.org>:
> Attached is the proposed patch to fix the bug.
> Regression tests passed.
>
> diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
> index 3783e97..b93fc27 100644
> --- a/src/backend/utils/adt/ruleutils.c
> +++ b/src/backend/utils/adt/ruleutils.c
> @@ -9405,7 +9405,7 @@ quote_identifier(const char *ident)
>       * would like to use <ctype.h> macros here, but they might yield unwanted
>       * locale-specific results...
>       */
> -    safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_');
> +    safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_' || IS_HIGHBIT_SET(ident[0]));
>
>      for (ptr = ident; *ptr; ptr++)
>      {
> @@ -9413,7 +9413,8 @@ quote_identifier(const char *ident)
>
>          if ((ch >= 'a' && ch <= 'z') ||
>              (ch >= '0' && ch <= '9') ||
> -            (ch == '_'))
> +            (ch == '_') ||
> +            (IS_HIGHBIT_SET(ch)))
>          {
>              /* okay */
>          }
This patch is simple. I remember being surprised a few months ago that
we allow any multibyte character.
+1
Pavel
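For readers following the diff, here is a minimal standalone sketch of the character test before and after the patch. It is not the actual quote_identifier() code: needs_quotes(), its allow_highbit flag and the HIGHBIT_SET macro are names made up for this illustration (HIGHBIT_SET is assumed to behave like PostgreSQL's IS_HIGHBIT_SET), and the keyword check that the real function also performs is left out.

```c
#include <stdbool.h>

/* Stand-in for PostgreSQL's IS_HIGHBIT_SET (assumed equivalent). */
#define HIGHBIT_SET(ch) (((unsigned char) (ch)) & 0x80)

/*
 * Hypothetical helper: returns true if ident would be double quoted under
 * the character rules shown in the diff.  allow_highbit = false mimics the
 * old rule, allow_highbit = true mimics the patched rule.  The keyword
 * check done by the real function is omitted here.
 */
static bool
needs_quotes(const char *ident, bool allow_highbit)
{
    const char *ptr;
    bool        safe;

    /* First byte: lowercase ASCII letter or underscore (or a high-bit byte). */
    safe = ((ident[0] >= 'a' && ident[0] <= 'z') ||
            ident[0] == '_' ||
            (allow_highbit && HIGHBIT_SET(ident[0])));

    for (ptr = ident; *ptr; ptr++)
    {
        char        ch = *ptr;

        if ((ch >= 'a' && ch <= 'z') ||
            (ch >= '0' && ch <= '9') ||
            (ch == '_') ||
            (allow_highbit && HIGHBIT_SET(ch)))
        {
            /* okay */
        }
        else
            safe = false;       /* any other byte forces quoting */
    }

    return !safe;
}
```

Called as needs_quotes("\xe3\x81\x82\xe3\x81\x84\xe3\x81\x86", false), i.e. with the UTF-8 bytes of あいう, the old rule asks for quotes; with true as the second argument the patched rule does not. AAA and 1_título need quotes under both rules.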
If we go this way, the question is whether we should back-patch this,
since the patch clearly changes existing behavior.
Comments? I think we should not.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
I am sure that we should not backport this change. It can break customers'
regression tests, and although the current behavior isn't 100% correct, it is safe.
Pavel
Quite. This is not a bug fix. It's a behavior change, perhaps for the better.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Added to the commitfest 2016-03.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hi,
I went ahead and tested this patch, and it works as proposed. I agree
that it's not a bug fix but a new behavior, so -1 for backport.
The patch applies against master
(1129c2b0ad2732f301f696ae2cf98fb063a4c1f8) with offsets in two hunks.
Since format() has regression tests, I suggest adding one to cover this.
It could also be worth adding the new behavior to the docs, since there
is no explicit example for %I.
I ran the following tests, using some Portuguese words, and they work as expected:
postgres=# create table test (nome varchar, endereço text, "UF"
varchar(2), título varchar);
CREATE TABLE
Time: 80,769 ms
postgres=# select format('%I', attname) from pg_attribute join
pg_class on (attrelid = oid) where relname = 'test';
format
----------
"UF"
cmax
cmin
ctid
endereço
nome
tableoid
título
xmax
xmin
(10 rows)
Time: 1,728 ms
postgres=# select format('%I', 'endereco');
format
----------
endereco
(1 row)
Time: 0,098 ms
postgres=# select format('%I', 'endereço');
format
----------
endereço
(1 row)
Time: 0,088 ms
postgres=# select format('%I', 'あああ');
format
--------
あああ
(1 row)
Time: 0,072 ms
postgres=# select format('%I', 'título');
format
--------
título
(1 row)
Time: 0,051 ms
postgres=# select format('%I', 'título e');
format
------------
"título e"
(1 row)
Time: 0,051 ms
postgres=# select format('%I', 'título_e');
format
----------
título_e
(1 row)
Time: 0,051 ms
postgres=# select format('%I', '_título');
format
---------
_título
(1 row)
Time: 0,047 ms
postgres=# select format('%I', '1_título');
format
------------
"1_título"
(1 row)
Time: 0,046 ms
Thank you for this!
Best regards,
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://github.com/guedes - http://guedesoft.net
http://www.postgresql.org.br
> I agree that it's not a bug fix but a new behavior, so -1 for
> backport.
IMO, it's a bug or at least an inconsistency, but I admit it's too late
to back-patch to existing stable branches.
> Since format() has regression tests, I suggest adding one to cover this.
I don't think it's doable. The test would have to handle multiple
database encodings, but the regression test framework handles only one.
Adding it to the existing mb test is probably the easiest.
> It could also be worth adding the new behavior to the docs, since there
> is no explicit example for %I.
> I ran the following tests, using some Portuguese words, and they work as expected:
I assume you used a UTF-8 encoded database.
Great.
Tatsuo Ishii wrote:
> IMO, it's a bug or at least an inconsistency
Personally I don't see this change being good for everything.
Let's play devil's advocate:
create table abc(U&"foo\2003" int);
U+2003 is 'EM SPACE', in Unicode's General Punctuation block.
With the current version, format('%I', attname) on this column is:
"foo "
With the patched version, it produces this:
foo
So the visual hint that there are more characters at the end is lost.
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
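To see why the patched test lets the trailing EM SPACE through, it may help to look at the bytes. Assuming a UTF-8 database, U+2003 is encoded as the three bytes E2 80 83, and each of them has the high bit set, so a byte-level high-bit test cannot tell it apart from a letter such as あ. The helper below, count_highbit_bytes(), is a hypothetical name used only for this sketch.

```c
#include <stddef.h>

/*
 * Hypothetical helper: counts the bytes of s that have the high bit set,
 * i.e. the bytes a high-bit test in the style of the patch would accept
 * without quoting.
 */
static size_t
count_highbit_bytes(const char *s)
{
    size_t      n = 0;

    for (; *s; s++)
    {
        if (((unsigned char) *s) & 0x80)
            n++;
    }
    return n;
}
```

For the identifier in the example above, written as the C literal "foo\xe2\x80\x83", the function counts three high-bit bytes (the whole EM SPACE), whereas "foo " with an ASCII space has none, and the space itself is what forces the quotes.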
2016-01-26 5:29 GMT-02:00 Tatsuo Ishii <ishii@postgresql.org>:
> I assume you used a UTF-8 encoded database.
Yes, I do.
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://github.com/guedes - http://guedesoft.net
http://www.postgresql.org.br
2016-01-26 18:00 GMT-02:00 Daniel Verite <daniel@manitou-mail.org>:
> ...
> So the visual hint that there are more characters at the end is lost.
Thanks for playing devil's advocate. I see here that the same kind of
output is produced even with plain ASCII spaces.
postgres=# create table x ("aí " text);
CREATE TABLE
postgres=# \d x
Tabela "public.x"
Coluna | Tipo | Modificadores
----------+------+---------------
aí | text |
This will break copy-and-paste by users and scripts that parse that output.
Maybe the patch should consider leading/trailing non-printable characters
when choosing whether or not to emit the double quotes?
[]s
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://github.com/guedes - http://guedesoft.net
http://www.postgresql.org.br
This is a totally different story from the topic discussed in this
thread. psql never adds double quotes to column names, even
upper-case ones.
test=# create table t6("ABC" int);
CREATE TABLE
test=# \d t6
Table "public.t6"
Column | Type | Modifiers
--------+---------+-----------
ABC | integer |
If you want to change psql's existing behavior, propose it
yourself.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
> So the visual hint that there are more characters at the end is lost.
What is the "visual hint"? If you are talking about psql's output, it
never adds such a "visual hint" (double quotes).
If you are talking about string handling in a program, what kind
of program cares about the "visual"?
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
2016-01-26 21:00 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:
> So the visual hint that there are more characters at the end is lost.
I agree that the current behavior can be useful in some cases, but it is
still a bug (an inconsistency) between the PostgreSQL parser and the
PostgreSQL escaping functions.
Currently, any multibyte character can appear in an unquoted identifier
(only apostrophes are tested). We should test for whitespace characters too.
Regards
Pavel
Really? I thought we do that test.
test=# create table t6("あいう えお" int);
CREATE TABLE
test=# \d t6
Table "public.t6"
Column | Type | Modifiers
-------------+---------+-----------
あいう えお | integer |
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
What are you expecting from this test? UTF single quotes are probably
tested only in the quote functions.
Pavel
I just wanted to demonstrate that a string of multibyte characters that
includes ASCII whitespace can be an identifier.
> We should test for whitespace characters too.
What exactly do you propose regarding whitespace and multibyte characters
here? Do you propose to recognize non-ASCII whitespace characters (and
treat them like ASCII whitespace)?
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
2016-01-27 6:24 GMT+01:00 Tatsuo Ishii <ishii@postgresql.org>:
> I just wanted to demonstrate that a string of multibyte characters that
> includes ASCII whitespace can be an identifier.
I understand now.
> What exactly do you propose regarding whitespace and multibyte characters
> here? Do you propose to recognize non-ASCII whitespace characters (and
> treat them like ASCII whitespace)?
I propose that UTF whitespace characters be handled the same way as ASCII
whitespace characters. The current design is too simple and can lead to
pretty bad issues. Daniel's example is good: there is a big gap in the design.
Regards
Pavel
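One possible reading of this proposal, sketched under the assumption of a UTF-8 encoded identifier: decode the code points and force quoting whenever a non-ASCII whitespace character is found. Nothing like this is in the posted patch; is_unicode_space() and contains_nonascii_space() are hypothetical helpers, and the code point list follows the Unicode White_Space property.

```c
#include <stdbool.h>
#include <stdint.h>

/* Non-ASCII code points that carry the Unicode White_Space property. */
static bool
is_unicode_space(uint32_t cp)
{
    return cp == 0x0085 || cp == 0x00A0 || cp == 0x1680 ||
        (cp >= 0x2000 && cp <= 0x200A) ||
        cp == 0x2028 || cp == 0x2029 || cp == 0x202F ||
        cp == 0x205F || cp == 0x3000;
}

/*
 * Hypothetical helper: returns true if the UTF-8 string s contains a
 * non-ASCII whitespace character.  Invalid or truncated sequences make
 * the function give up and return false; a real implementation would
 * have to decide how to treat them.
 */
static bool
contains_nonascii_space(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;

    while (*p)
    {
        uint32_t    cp;
        int         len;

        /* Work out the sequence length from the lead byte. */
        if (*p < 0x80)
        {
            cp = *p;
            len = 1;
        }
        else if ((*p & 0xE0) == 0xC0)
        {
            cp = *p & 0x1F;
            len = 2;
        }
        else if ((*p & 0xF0) == 0xE0)
        {
            cp = *p & 0x0F;
            len = 3;
        }
        else if ((*p & 0xF8) == 0xF0)
        {
            cp = *p & 0x07;
            len = 4;
        }
        else
            return false;       /* invalid lead byte */

        /* Fold in the continuation bytes (10xxxxxx). */
        for (int i = 1; i < len; i++)
        {
            if ((p[i] & 0xC0) != 0x80)
                return false;   /* truncated or invalid sequence */
            cp = (cp << 6) | (p[i] & 0x3F);
        }

        if (is_unicode_space(cp))
            return true;
        p += len;
    }
    return false;
}
```

A quoting routine could then treat an identifier as unsafe when this returns true, so that Daniel's "foo" followed by U+2003 would keep its double quotes while あいう and título would not gain any. In a non-UTF-8 database the decoding step would have to go through the server's encoding routines instead.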