Re: Why format() adds double quote? - Mailing list pgsql-hackers
From | Tatsuo Ishii |
---|---|
Subject | Re: Why format() adds double quote? |
Date | |
Msg-id | 20160120.181715.761036523837813422.t-ishii@sraoss.co.jp Whole thread Raw |
In response to | Re: Why format() adds double quote? (Pavel Stehule <pavel.stehule@gmail.com>) |
Responses |
Re: Why format() adds double quote?
(Pavel Stehule <pavel.stehule@gmail.com>)
|
List | pgsql-hackers |
> Hi > > 2016-01-20 7:20 GMT+01:00 Tatsuo Ishii <ishii@postgresql.org>: > >> > 2016-01-20 3:47 GMT+01:00 Tatsuo Ishii <ishii@postgresql.org>: >> > >> >> test=# select format('%I', t) from t1; >> >> format >> >> ---------- >> >> aaa >> >> "AAA" >> >> "あいう" >> >> (3 rows) >> >> >> >> Why is the text value of the third line needed to be double quoted? >> >> (note that it is a multi byte character). Same thing can be said to >> >> quote_ident(). >> >> >> >> We treat identifiers made of the multi byte characters without double >> >> quotation (non delimited identifier) in other places. >> >> >> >> test=# create table t2(あいう text); >> >> CREATE TABLE >> >> test=# insert into t2 values('aaa'); >> >> INSERT 0 1 >> >> test=# select あいう from t2; >> >> あいう >> >> -------- >> >> aaa >> >> (1 row) >> > >> > format uses same routine as quote_ident. So quote_ident should be fixed >> > first. >> >> Yes, I had that in my mind too. >> >> Attached is the proposed patch to fix the bug. >> Regression tests passed. >> >> Here is an example after the patch. Note that the third row is not >> quoted any more. >> >> test=# select format('%I', あいう) from t2; >> format >> -------- >> aaa >> "AAA" >> あああ >> (3 rows) >> >> Best regards, >> -- >> Tatsuo Ishii >> SRA OSS, Inc. Japan >> English: http://www.sraoss.co.jp/index_en.php >> Japanese:http://www.sraoss.co.jp >> >> diff --git a/src/backend/utils/adt/ruleutils.c >> b/src/backend/utils/adt/ruleutils.c >> index 3783e97..b93fc27 100644 >> --- a/src/backend/utils/adt/ruleutils.c >> +++ b/src/backend/utils/adt/ruleutils.c >> @@ -9405,7 +9405,7 @@ quote_identifier(const char *ident) >> * would like to use <ctype.h> macros here, but they might yield >> unwanted >> * locale-specific results... >> */ >> - safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_'); >> + safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_' || >> IS_HIGHBIT_SET(ident[0])); >> >> for (ptr = ident; *ptr; ptr++) >> { >> @@ -9413,7 +9413,8 @@ quote_identifier(const char *ident) >> >> if ((ch >= 'a' && ch <= 'z') || >> (ch >= '0' && ch <= '9') || >> - (ch == '_')) >> + (ch == '_') || >> + (IS_HIGHBIT_SET(ch))) >> { >> /* okay */ >> } >> >> > This patch ls simply - I remember I was surprised, so we allow any > multibyte char few months ago. > > +1 If we would go this way, question is if we should back patch this or not since the patch apparently changes the existing behaviors. Comments? I would think we should not. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
pgsql-hackers by date: