Re: Why format() adds double quote? - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Why format() adds double quote?
Date
Msg-id 20160120.152015.2253940486230409723.t-ishii@sraoss.co.jp
In response to Re: Why format() adds double quote?  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Why format() adds double quote?  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
> 2016-01-20 3:47 GMT+01:00 Tatsuo Ishii <ishii@postgresql.org>:
> 
>> test=# select format('%I', t) from t1;
>>   format
>> ----------
>>  aaa
>>  "AAA"
>>  "あいう"
>> (3 rows)
>>
>> Why does the text value in the third row need to be double quoted?
>> (Note that it consists of multibyte characters.) The same can be said of
>> quote_ident().
>>
>> We accept identifiers made of multibyte characters without double
>> quotes (non-delimited identifiers) in other places.
>>
>> test=# create table t2(あいう text);
>> CREATE TABLE
>> test=# insert into t2 values('aaa');
>> INSERT 0 1
>> test=# select あいう from t2;
>>  あいう
>> --------
>>  aaa
>> (1 row)
> 
> format() uses the same routine as quote_ident(), so quote_ident() should be
> fixed first.

Yes, I had that in mind too.

Attached is the proposed patch to fix the bug.
Regression tests passed.

Here is an example after the patch. Note that the third row is not
quoted any more.

test=#  select format('%I', あいう) from t2;
 format
--------
 aaa
 "AAA"
 あああ
(3 rows)

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 3783e97..b93fc27 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -9405,7 +9405,7 @@ quote_identifier(const char *ident)
 	 * would like to use <ctype.h> macros here, but they might yield unwanted
 	 * locale-specific results...
 	 */
-	safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_');
+	safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_' || IS_HIGHBIT_SET(ident[0]));
 
 	for (ptr = ident; *ptr; ptr++)
 	{
@@ -9413,7 +9413,8 @@ quote_identifier(const char *ident)
 
 		if ((ch >= 'a' && ch <= 'z') ||
 			(ch >= '0' && ch <= '9') ||
-			(ch == '_'))
+			(ch == '_') ||
+			(IS_HIGHBIT_SET(ch)))
 		{
 			/* okay */
 		}