Thread: string function - "format" function proposal

string function - "format" function proposal

From
Pavel Stehule
Date:
Hello

I am returning back to string functions. For me, the most important
function isn't commited still. There was discussion about "format" or
"sprintf" fuction. So I'll do a small resume.

goal: to get function that helps with formatting a message texts and
helps with building a SQL commands (used as dynamic SQL)

propsals:

* "format" function - uses same formatting as PL/pgSQL RAISE statement
* "sprintf" function

Itagaki objectives to "format" function:
* there are not possibility put two parameters without a space between
* it is too simple

My objectives to sprintf function:
* it is designed to different environment than SQL - missing support
NULL, missing support for date, timestamp, boolean, ...
* it is too complex, some parameters has different meaning for different tags
* we have a "to_char" function for complex formatting now.

Now I propose a compromise - "format" function with only three tags:

%s .. some string
%i  .. SQL identifier
%l  .. string literal

using a NULL:

for %s NULL is transformed to empty string - like "concat"
for %i NULL raises an exception
for %l NULL is transformed to ' NULL '  string.

This system is still simple and enough.

Implemented sprintf function can be moved to "sprintf" contrib module.

comments

Regards

Pavel Stehule


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Mon, Aug 30, 2010 at 7:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> propsals:
> * "format" function - uses same formatting as PL/pgSQL RAISE statement
> * "sprintf" function
>
> Now I propose a compromise - "format" function with only three tags:
> %s .. some string
> %i  .. SQL identifier
> %l  .. string literal

These are just ideas:

* Use $n, as like as PREPARE command. It allows for us to swap arguments in any order. SELECT format('$2 before $1',
'aaa','bbb') 

* Call to_char() functions for each placeholder. For example,   format('=={YYYY-MM-DD}==', tm::timestamp) is equivalent
to  '==' || to_char(tm, 'YYYY-MM-DD') || '==' '{}' prints the input with the default format. 

New languages' libraries might be of some help. LLs, C#, etc.

--
Itagaki Takahiro


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/8/30 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> On Mon, Aug 30, 2010 at 7:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> propsals:
>> * "format" function - uses same formatting as PL/pgSQL RAISE statement
>> * "sprintf" function
>>
>> Now I propose a compromise - "format" function with only three tags:
>> %s .. some string
>> %i  .. SQL identifier
>> %l  .. string literal
>
> These are just ideas:
>
> * Use $n, as like as PREPARE command.
>  It allows for us to swap arguments in any order.
>  SELECT format('$2 before $1', 'aaa', 'bbb')
>

what is use case for this feature? I don't see it.

> * Call to_char() functions for each placeholder.
>  For example,
>    format('=={YYYY-MM-DD}==', tm::timestamp)
>  is equivalent to
>    '==' || to_char(tm, 'YYYY-MM-DD') || '=='
>  '{}' prints the input with the default format.
>
> New languages' libraries might be of some help. LLs, C#, etc.

I though about integration with to_char function too. There are not
technical barrier. And I can live with just {to_char_format} too. It
can be or cannot be mixed with basic tags together - there is
specified a NULL value behave. If we allow {format} syntax, then we
have to specify a escape syntax for { and }. Do you have a some idea?

Regards

Pavel



>
> --
> Itagaki Takahiro
>


Re: string function - "format" function proposal

From
Alvaro Herrera
Date:
Excerpts from Pavel Stehule's message of lun ago 30 07:51:55 -0400 2010:
> 2010/8/30 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> > On Mon, Aug 30, 2010 at 7:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> >> propsals:
> >> * "format" function - uses same formatting as PL/pgSQL RAISE statement
> >> * "sprintf" function
> >>
> >> Now I propose a compromise - "format" function with only three tags:
> >> %s .. some string
> >> %i  .. SQL identifier
> >> %l  .. string literal
> >
> > These are just ideas:
> >
> > * Use $n, as like as PREPARE command.
> >  It allows for us to swap arguments in any order.
> >  SELECT format('$2 before $1', 'aaa', 'bbb')
> 
> what is use case for this feature? I don't see it.

Translations :-)  I haven't had a use for that but I've heard people
implements gettext of sorts in database tables.  Maybe that kind of
thing would be of use here.

> > * Call to_char() functions for each placeholder.
> >  For example,
> >    format('=={YYYY-MM-DD}==', tm::timestamp)
> >  is equivalent to
> >    '==' || to_char(tm, 'YYYY-MM-DD') || '=='
> >  '{}' prints the input with the default format.
> >
> > New languages' libraries might be of some help. LLs, C#, etc.
> 
> I though about integration with to_char function too. There are not
> technical barrier. And I can live with just {to_char_format} too. It
> can be or cannot be mixed with basic tags together - there is
> specified a NULL value behave. If we allow {format} syntax, then we
> have to specify a escape syntax for { and }. Do you have a some idea?

What about %{sth}?  That way you don't need to escape {.  The closing } would
need escaping only inside the %{} specifier, so {%{YYYY{\}MM}} prints
{2010{}08}  So the above example is:

format('==%{YYYY-MM-DD}==', tm::timestamp);

Not sure about this to_char stuff though, seems too cute.  You can do
the case above like this:

format('==%s==', to_char(tm::timestamp, 'YYYY-MM-DD'))

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/8/30 Alvaro Herrera <alvherre@commandprompt.com>:
> Excerpts from Pavel Stehule's message of lun ago 30 07:51:55 -0400 2010:
>> 2010/8/30 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
>> > On Mon, Aug 30, 2010 at 7:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> >> propsals:
>> >> * "format" function - uses same formatting as PL/pgSQL RAISE statement
>> >> * "sprintf" function
>> >>
>> >> Now I propose a compromise - "format" function with only three tags:
>> >> %s .. some string
>> >> %i  .. SQL identifier
>> >> %l  .. string literal
>> >
>> > These are just ideas:
>> >
>> > * Use $n, as like as PREPARE command.
>> >  It allows for us to swap arguments in any order.
>> >  SELECT format('$2 before $1', 'aaa', 'bbb')
>>
>> what is use case for this feature? I don't see it.
>
> Translations :-)  I haven't had a use for that but I've heard people
> implements gettext of sorts in database tables.  Maybe that kind of
> thing would be of use here.
>
>> > * Call to_char() functions for each placeholder.
>> >  For example,
>> >    format('=={YYYY-MM-DD}==', tm::timestamp)
>> >  is equivalent to
>> >    '==' || to_char(tm, 'YYYY-MM-DD') || '=='
>> >  '{}' prints the input with the default format.
>> >
>> > New languages' libraries might be of some help. LLs, C#, etc.
>>
>> I though about integration with to_char function too. There are not
>> technical barrier. And I can live with just {to_char_format} too. It
>> can be or cannot be mixed with basic tags together - there is
>> specified a NULL value behave. If we allow {format} syntax, then we
>> have to specify a escape syntax for { and }. Do you have a some idea?
>
> What about %{sth}?  That way you don't need to escape {.  The closing } would
> need escaping only inside the %{} specifier, so {%{YYYY{\}MM}} prints
> {2010{}08}  So the above example is:

then you need escaping too :)

>
> format('==%{YYYY-MM-DD}==', tm::timestamp);

I am not sure if this is correct -but why not

so there are possible combinations

%s   .. no quoting, NULL is ''
%{}  .. no quoting, NULL is NULL .. like output from to_char
%{}s .. no quoting with formatting, NULL is ''

now I have not idea about nice syntax for positional parameters - maybe
%{...}$1s or we can use a two variants for tags - not positional '%'
and positional '%', so
$1{...}s, %{...}s, $1, %s, $1s, $1{...}, %{...} can be valid tags

Regards

Pavel Stehule

>
> Not sure about this to_char stuff though, seems too cute.  You can do
> the case above like this:
>
> format('==%s==', to_char(tm::timestamp, 'YYYY-MM-DD'))
>

I like an using a format like tag - there are not technical problem -
format can be taken from string and data type parameter can be known
too. But this feature can be some enhancing. The basic features are
NULL handling and right quoting.



> --
> Álvaro Herrera <alvherre@commandprompt.com>
> The PostgreSQL Company - Command Prompt, Inc.
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
Hello

attached WIP patch.

I implement only basic format's tags related to SQL: string, value,
literal, sql identifier. These tags are basic, but there are not any
break to implement any other formats or enhance a syntax. The mix with
to_char function is more complex then I expected - so I don't thinking
about it for now (there are more then one to_char function).

I don't found a nice mix for placeholders and positional placeholders
- so I propose a new special function "substitute" (in contrib) where
placeholders are positional. More - we check in function "format" if
all parameters are used - this check isn't related to positional
placeholders, this is reason for separate implementation too:

so some examples:
postgres=# select substitute('second parameter is "$2" and first
parameter is "$1"', 'first parameter', 'second parameter');                                  substitute
─────────────────────────────────────────────────────────────────────────────────second parameter is "second parameter"
andfirst parameter is "first parameter" 
(1 row)

postgres=# select format('INSERT INTO %i (c1, c2, c3, c4) VALUES
(%v,%v,%v,%v)', 'my tab',1, NULL, true, 'hello');                            format
────────────────────────────────────────────────────────────────INSERT INTO "my tab" (c1, c2, c3, c4) VALUES
(1,NULL,t,'hello')
(1 row)

postgres=# select format('SQL identifier %i cannot be a NULL', NULL);
ERROR:  SQL identifier cannot be a NULL

postgres=# select format('NULL is %v or empty string "%s"', NULL, NULL);            format
─────────────────────────────────NULL is NULL or empty string ""
(1 row)

%i ... sql identifier
%v ... sql value
%s ... string --- the most used tag I expect
%l ... literal

I hope so this system is clean, simple, readable and extensible

Regards

Pavel

2010/8/30 Pavel Stehule <pavel.stehule@gmail.com>:
> 2010/8/30 Alvaro Herrera <alvherre@commandprompt.com>:
>> Excerpts from Pavel Stehule's message of lun ago 30 07:51:55 -0400 2010:
>>> 2010/8/30 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
>>> > On Mon, Aug 30, 2010 at 7:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> >> propsals:
>>> >> * "format" function - uses same formatting as PL/pgSQL RAISE statement
>>> >> * "sprintf" function
>>> >>
>>> >> Now I propose a compromise - "format" function with only three tags:
>>> >> %s .. some string
>>> >> %i  .. SQL identifier
>>> >> %l  .. string literal
>>> >
>>> > These are just ideas:
>>> >
>>> > * Use $n, as like as PREPARE command.
>>> >  It allows for us to swap arguments in any order.
>>> >  SELECT format('$2 before $1', 'aaa', 'bbb')
>>>
>>> what is use case for this feature? I don't see it.
>>
>> Translations :-)  I haven't had a use for that but I've heard people
>> implements gettext of sorts in database tables.  Maybe that kind of
>> thing would be of use here.
>>
>>> > * Call to_char() functions for each placeholder.
>>> >  For example,
>>> >    format('=={YYYY-MM-DD}==', tm::timestamp)
>>> >  is equivalent to
>>> >    '==' || to_char(tm, 'YYYY-MM-DD') || '=='
>>> >  '{}' prints the input with the default format.
>>> >
>>> > New languages' libraries might be of some help. LLs, C#, etc.
>>>
>>> I though about integration with to_char function too. There are not
>>> technical barrier. And I can live with just {to_char_format} too. It
>>> can be or cannot be mixed with basic tags together - there is
>>> specified a NULL value behave. If we allow {format} syntax, then we
>>> have to specify a escape syntax for { and }. Do you have a some idea?
>>
>> What about %{sth}?  That way you don't need to escape {.  The closing } would
>> need escaping only inside the %{} specifier, so {%{YYYY{\}MM}} prints
>> {2010{}08}  So the above example is:
>
> then you need escaping too :)
>
>>
>> format('==%{YYYY-MM-DD}==', tm::timestamp);
>
> I am not sure if this is correct -but why not
>
> so there are possible combinations
>
> %s   .. no quoting, NULL is ''
> %{}  .. no quoting, NULL is NULL .. like output from to_char
> %{}s .. no quoting with formatting, NULL is ''
>
> now I have not idea about nice syntax for positional parameters - maybe
> %{...}$1s or we can use a two variants for tags - not positional '%'
> and positional '%', so
> $1{...}s, %{...}s, $1, %s, $1s, $1{...}, %{...} can be valid tags
>
> Regards
>
> Pavel Stehule
>
>>
>> Not sure about this to_char stuff though, seems too cute.  You can do
>> the case above like this:
>>
>> format('==%s==', to_char(tm::timestamp, 'YYYY-MM-DD'))
>>
>
> I like an using a format like tag - there are not technical problem -
> format can be taken from string and data type parameter can be known
> too. But this feature can be some enhancing. The basic features are
> NULL handling and right quoting.
>
>
>
>> --
>> Álvaro Herrera <alvherre@commandprompt.com>
>> The PostgreSQL Company - Command Prompt, Inc.
>> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>>
>


Re: string function - "format" function proposal

From
"A.M."
Date:
On Aug 31, 2010, at 5:07 PM, Pavel Stehule wrote:

> Hello
>
> attached WIP patch.
>
> I implement only basic format's tags related to SQL: string, value,
> literal, sql identifier. These tags are basic, but there are not any
> break to implement any other formats or enhance a syntax. The mix with
> to_char function is more complex then I expected - so I don't thinking
> about it for now (there are more then one to_char function).
>

<snip>

It would be pretty handy if plpgsql EXECUTE could operate like this with USING to support identifiers.

Cheers,
M

Re: string function - "format" function proposal

From
David Fetter
Date:
On Tue, Aug 31, 2010 at 11:07:40PM +0200, Pavel Stehule wrote:
> Hello
> 
> attached WIP patch.

I don't see it attached.  Is it just me?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Wed, Sep 1, 2010 at 6:07 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I don't found a nice mix for placeholders and positional placeholders

How about %pos$format, used in C-printf()? It might be
only in Linux's libc.

printf("<%2$s> <%1$d>\n", 123, "abc");
=> <abc> <123>
http://linux.die.net/man/3/printf

> %i ... sql identifier
> %v ... sql value
> %s ... string --- the most used tag I expect
> %l ... literal

Looks good designed.  I have a couple of comments and questions:

* There is no examples for %l. What's the difference from %v and %s? If it always quotes, how does it work? Like as
quote_literal()or quote_nullable()?
 

* %v quotes text values (and maybe all non-numeric values) with single quotes, but doesn't numeric values. How do we
determinethe difference? By type oid?
 

* %v also doesn't quote boolean values, but t and f are not valid. You should use true and false (or 't' and 'f') for
thecases. (So, your "INSERT INTO" example is broken.)
 

-- 
Itagaki Takahiro


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/9/1 David Fetter <david@fetter.org>:
> On Tue, Aug 31, 2010 at 11:07:40PM +0200, Pavel Stehule wrote:
>> Hello
>>
>> attached WIP patch.
>
> I don't see it attached.  Is it just me?

sorry, it was at 1 ofter midnight

Regards

Pavel

>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>

Attachment

Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/9/1 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> On Wed, Sep 1, 2010 at 6:07 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> I don't found a nice mix for placeholders and positional placeholders
>
> How about %pos$format, used in C-printf()? It might be
> only in Linux's libc.
>
> printf("<%2$s> <%1$d>\n", 123, "abc");
> => <abc> <123>
> http://linux.die.net/man/3/printf

same syntax I designed and didn't implement, because it isn't too
readable (my opinion). But I am not against - just thinking so
separate function can be better.
>
>> %i ... sql identifier
>> %v ... sql value
>> %s ... string --- the most used tag I expect
>> %l ... literal
>
> Looks good designed.  I have a couple of comments and questions:
>
> * There is no examples for %l. What's the difference from %v and %s?
>  If it always quotes, how does it work? Like as quote_literal()
>  or quote_nullable()?
>

%l is always quoted

it is designed for syntax: SELECT integer '10';

%s versus %v ... %s is never quoted X %v is quoted when it is
necessary, NULL is showed as empty string for %s and as "NULL" for %v.
%s is used for messages (the behave is same like "concat"), %v is used
for SQL statement building

> * %v quotes text values (and maybe all non-numeric values) with
>  single quotes, but doesn't numeric values. How do we determine
>  the difference? By type oid?
>

every datatype has typecategory attribute
   typname     │ typcategory
─────────────────┼─────────────int8            │ Nint2            │ Nint4            │ Nregproc         │ Noid
  │ Nfloat4          │ Nfloat8          │ Nmoney           │ Nnumeric         │ Nregprocedure    │ Nregoper         │
Nregoperator    │ Nregclass        │ Nregtype         │ Nregconfig       │ Nregdictionary   │ Ncardinal_number │ N 

so these types are unquoted


> * %v also doesn't quote boolean values, but t and f are not valid.
>  You should use true and false (or 't' and 'f') for the cases.
>  (So, your "INSERT INTO" example is broken.)
>

you have a true - it should be fixed

Regards

Pavel

> --
> Itagaki Takahiro
>


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Wed, Sep 1, 2010 at 1:29 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> * %v also doesn't quote boolean values, but t and f are not valid.
>>  You should use true and false (or 't' and 'f') for the cases.
>
> you have a true - it should be fixed

I found quote_literal() prints boolean values as 'true' or 'false'.
It uses casting to text type rather than calling output function.
OTOH, format functions (and concat funcs) use output functions.

Which should we use for such purposes? Consistent behavior is
obviously preferred. Boolean type might be the only type that
is converted to different representation in typoutput or cast-to-test,
but we should consider to have boolean-specific hardwired code,
or cast all types to text instead of output functions.

--
Itagaki Takahiro


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> On Wed, Sep 1, 2010 at 1:29 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> * %v also doesn't quote boolean values, but t and f are not valid.
>>>  You should use true and false (or 't' and 'f') for the cases.
>>
>> you have a true - it should be fixed
>
> I found quote_literal() prints boolean values as 'true' or 'false'.
> It uses casting to text type rather than calling output function.
> OTOH, format functions (and concat funcs) use output functions.
>
> Which should we use for such purposes? Consistent behavior is
> obviously preferred. Boolean type might be the only type that
> is converted to different representation in typoutput or cast-to-test,
> but we should consider to have boolean-specific hardwired code,
> or cast all types to text instead of output functions.

Personally I prefer casting to text - it allows some later
customizations. And it's more consistent with || operator. So the
functions concat* should to be fixed.

Regards

Pavel

>
> --
> Itagaki Takahiro
>


Re: string function - "format" function proposal

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
>> Which should we use for such purposes? Consistent behavior is
>> obviously preferred. Boolean type might be the only type that
>> is converted to different representation in typoutput or cast-to-test,
>> but we should consider to have boolean-specific hardwired code,
>> or cast all types to text instead of output functions.

> Personally I prefer casting to text -

No, you need to use the I/O functions.  Not every type is guaranteed to
have a cast to text.

> iit allows some later
> customizations. And it's more consistent with || operator.

I don't buy either of those arguments.
        regards, tom lane


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Mon, Sep 6, 2010 at 10:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> No, you need to use the I/O functions.  Not every type is guaranteed to
> have a cast to text.

One issue is that Pavel want to generate valid SQL statement using
%v format. Boolean values are printed as t or f, so the unquoted
values are not valid syntax.

If we only use output functions, boolean values should be written as
't' or 'f' (single-quoted), Only numeric values can be unquoted on %v.

--
Itagaki Takahiro


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/9/6 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> 2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
>>> Which should we use for such purposes? Consistent behavior is
>>> obviously preferred. Boolean type might be the only type that
>>> is converted to different representation in typoutput or cast-to-test,
>>> but we should consider to have boolean-specific hardwired code,
>>> or cast all types to text instead of output functions.
>
>> Personally I prefer casting to text -
>
> No, you need to use the I/O functions.  Not every type is guaranteed to
> have a cast to text.
>
>> iit allows some later
>> customizations. And it's more consistent with || operator.
>
> I don't buy either of those arguments.

can we use a both? like plpgsql? First check cast to text, and second
use a IO functions?

Why I think so this is useful - sometimes people asked some GUC for
formatting date, boolean and other. If these functions try to use a
cast to text first, then there is some space for customization via
custom cast functions.

Regards

Pavel Stehule

>
>                        regards, tom lane
>


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> On Mon, Sep 6, 2010 at 10:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> No, you need to use the I/O functions.  Not every type is guaranteed to
>> have a cast to text.
>
> One issue is that Pavel want to generate valid SQL statement using
> %v format. Boolean values are printed as t or f, so the unquoted
> values are not valid syntax.

we can format some types directly - but I like idea of casting to text
because there is space for users.

Pavel

>
> If we only use output functions, boolean values should be written as
> 't' or 'f' (single-quoted), Only numeric values can be unquoted on %v.
>
> --
> Itagaki Takahiro
>


Re: string function - "format" function proposal

From
Tom Lane
Date:
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> On Mon, Sep 6, 2010 at 10:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> No, you need to use the I/O functions.  Not every type is guaranteed to
>> have a cast to text.

> One issue is that Pavel want to generate valid SQL statement using
> %v format. Boolean values are printed as t or f, so the unquoted
> values are not valid syntax.

So?  You'd need to quote the values anyway, in general.  If you want
something that will be valid SQL you'd better include the functionality
of quote_literal() in it.

> If we only use output functions, boolean values should be written as
> 't' or 'f' (single-quoted), Only numeric values can be unquoted on %v.

I'm not sure that it's a good idea to have any type-specific special
cases.  Failing to quote numeric values will bring in the whole set of
issues about how the parser initially types numeric constants, as in the
other thread over the weekend.
        regards, tom lane


Re: string function - "format" function proposal

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Why I think so this is useful - sometimes people asked some GUC for
> formatting date, boolean and other. If these functions try to use a
> cast to text first, then there is some space for customization via
> custom cast functions.

This is basically nonsense.  If you don't control a type's output
function, you don't control its cast to text either.  Nor do I think
it's a good idea to encourage people to make their casts to text operate
differently from their output functions.  We have that one wart in
boolean casting because the SQL standard specifies the result of cast to
text and it's different from our historical practice in the bool output
function --- but it is a wart, not something we should encourage people
to emulate.
        regards, tom lane


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Mon, Sep 6, 2010 at 11:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So?  You'd need to quote the values anyway, in general.  If you want
> something that will be valid SQL you'd better include the functionality
> of quote_literal() in it.
>
> I'm not sure that it's a good idea to have any type-specific special
> cases.

As I remember, the original motivation of %v formatter is
some DBMSes don't like quoted numeric literals. However,
Postgres accepts quoted numerics, and we're developing Postgres.

So, our consensus would be %v formatter should be removed
completely from the format function.

--
Itagaki Takahiro


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> On Mon, Sep 6, 2010 at 11:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So?  You'd need to quote the values anyway, in general.  If you want
>> something that will be valid SQL you'd better include the functionality
>> of quote_literal() in it.
>>
>> I'm not sure that it's a good idea to have any type-specific special
>> cases.
>
> As I remember, the original motivation of %v formatter is
> some DBMSes don't like quoted numeric literals. However,
> Postgres accepts quoted numerics, and we're developing Postgres.
>
> So, our consensus would be %v formatter should be removed
> completely from the format function.
>

I think so tag that quotes all without numbers can be very useful, but
it isn't too much important for me. I can live without them.

Regards

Pavel

> --
> Itagaki Takahiro
>


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
Hello

I am sending a updated version.

changes:

* tag %v removed from format function,
* proprietary tags %lq a iq removed from sprintf
* code cleaned

patch divided to two parts - format function and stringfunc (contains
sprintf function and substitute function)

Regards

Pavel Stehule



2010/9/6 Pavel Stehule <pavel.stehule@gmail.com>:
> 2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
>> On Mon, Sep 6, 2010 at 11:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> So?  You'd need to quote the values anyway, in general.  If you want
>>> something that will be valid SQL you'd better include the functionality
>>> of quote_literal() in it.
>>>
>>> I'm not sure that it's a good idea to have any type-specific special
>>> cases.
>>
>> As I remember, the original motivation of %v formatter is
>> some DBMSes don't like quoted numeric literals. However,
>> Postgres accepts quoted numerics, and we're developing Postgres.
>>
>> So, our consensus would be %v formatter should be removed
>> completely from the format function.
>>
>
> I think so tag that quotes all without numbers can be very useful, but
> it isn't too much important for me. I can live without them.
>
> Regards
>
> Pavel
>
>> --
>> Itagaki Takahiro
>>
>

Attachment

Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Thu, Sep 9, 2010 at 8:57 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I am sending a updated version.
>
> changes:
> * tag %v removed from format function,
> * proprietary tags %lq a iq removed from sprintf
> * code cleaned
>
> patch divided to two parts - format function and stringfunc (contains
> sprintf function and substitute function)

=== Discussions about the spec ===
Two patches add format() into the core, and substitute() and sprintf() into
stringfunc contrib module. But will we have 3 versions of string formatters?

IMHO, substitute() is the best choice that we will have in the core because
functionalities in format() and sprintf() can be achieved by combination of
substitute() and quote_nullable(), quote_ident(), or to_char(). I think the
core will provide only simple and non-overlapped features. Users can write
wrapper functions by themselves if they think the description is redundant.

=== format.diff ===
* It has a reject in doc, but the hunk can be fixed easily.   1 out of 2 hunks FAILED -- saving rejects to file
doc/src/sgml/func.sgml.rejCOMMENT: We have the function list in alphabetical order, so format() should be inserted
afterencode().
 
* It can be built without compile warnings.
* Enough documentation and regression tests are included.

=== stringfunc.diff ===
* It can be applied cleanly and built without compile warnings.
* Documentation is included, but not enough. COMMENT: According to existing docs, function list are described with
<variablelist>or <table>.
 
* Enough regression tests are included.
* COMMENT: stringfunc directory should be added to contrib/Makefile.

* BUG: stringfunc_substitute_nv() calls text_format(). I think we don't need stringfunc_substitute_nv at all. It can be
replacedby stringfunc_substitute(). _nv version is only required if it is in the core because of sanity regression
test.

* BUG?: The doc says sprintf() doesn't support length modifiers, but it is actually supported in broken state:
postgres=# SELECT sprintf('%*s', 2, 'ABC');sprintf
---------ABC      <= should be ERROR if unsupported, or AB if supported.
(1 row)

* BUG?: ereport(ERROR,        (errcode(ERRCODE_INVALID_PARAMETER_VALUE),         errmsg("unsupported tag \"%%%c\"",
tag)));Isthe code ok if the tag (a char) is a partial byte of multi-byte character?My machine prints ? in the case, but
itmight be platform-dependent.
 

=== Both patches ===
* Performance: I don't think those functions are not performance-critical, but we could cache typoutput functions in
fn_extraif needed. record_out would be a reference.
 

* Coding: Whitespace and tabs are mixed in some places. They are not so important because we will run pgindent, but
carefulchoice will be preferred even of a patch.
 

-- 
Itagaki Takahiro


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
Hello

2010/9/29 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> On Thu, Sep 9, 2010 at 8:57 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> I am sending a updated version.
>>
>> changes:
>> * tag %v removed from format function,
>> * proprietary tags %lq a iq removed from sprintf
>> * code cleaned
>>
>> patch divided to two parts - format function and stringfunc (contains
>> sprintf function and substitute function)
>
> === Discussions about the spec ===
> Two patches add format() into the core, and substitute() and sprintf() into
> stringfunc contrib module. But will we have 3 versions of string formatters?
>
> IMHO, substitute() is the best choice that we will have in the core because
> functionalities in format() and sprintf() can be achieved by combination of
> substitute() and quote_nullable(), quote_ident(), or to_char(). I think the
> core will provide only simple and non-overlapped features. Users can write
> wrapper functions by themselves if they think the description is redundant.

I think we need a three variants of formating functions - "format" in
core, fo simply creating and building a messages, a SQL strings,
"sprintf" for traditionalist in contrib - this functions isn't well
joined to SQL environment and it's too heavy - more it overwrite a
some functionality of "to_char" function. "substitute" function
provide just positional unformatted parameters - that isn't typical
ucase - so must not be in core too.

>
> === format.diff ===
> * It has a reject in doc, but the hunk can be fixed easily.
>    1 out of 2 hunks FAILED -- saving rejects to file doc/src/sgml/func.sgml.rej
>  COMMENT: We have the function list in alphabetical order,

fixed

>  so format() should be inserted after encode().
> * It can be built without compile warnings.
> * Enough documentation and regression tests are included.
>
> === stringfunc.diff ===
> * It can be applied cleanly and built without compile warnings.
> * Documentation is included, but not enough.
>  COMMENT: According to existing docs, function list are described with
>  <variablelist> or <table>.

fixed

> * Enough regression tests are included.
> * COMMENT: stringfunc directory should be added to contrib/Makefile.
>
> * BUG: stringfunc_substitute_nv() calls text_format().
>  I think we don't need stringfunc_substitute_nv at all.
>  It can be replaced by stringfunc_substitute(). _nv version is only
>  required if it is in the core because of sanity regression test.

you have a true - but I am not sure about coding patters for contribs,
so I designed it with respect to core sanity check.

>
> * BUG?: The doc says sprintf() doesn't support length modifiers,
>  but it is actually supported in broken state:

I was wrong in documentation - length modifiers are supported -
positional modifiers are not supported. fixed.

> postgres=# SELECT sprintf('%*s', 2, 'ABC');
>  sprintf
> ---------
>  ABC      <= should be ERROR if unsupported, or AB if supported.
> (1 row)

it works well - "with" modifier doesn't reduce string. String is
stripped by "precision" modifiers.

SELECT sprintf('%*.s', 2, ABC) --> AB

checked via gcc

please, try
printf(">>%s<<\n", "12345678");
printf(">>%3s<<\n", "12345678");
printf(">>%.3s<<\n", "12345678");
printf(">>%10.3s<<\n", "12345678");

do you understand me, why I "dislike" "printf"? How much people knows
well these formatting rules?

>
> * BUG?: ereport(ERROR,
>         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>          errmsg("unsupported tag \"%%%c\"", tag)));
>  Is the code ok if the tag (a char) is a partial byte of multi-byte character?

it's bug - the supported tags are only single byte, but unsupported
tag can be multibyte character and must by showed correctly - fixed.

>  My machine prints ? in the case, but it might be platform-dependent.
>
> === Both patches ===
> * Performance: I don't think those functions are not performance-critical,
>  but we could cache typoutput functions in fn_extra if needed.
>  record_out would be a reference.

I though about it too and I checked it now - there is 0.4% performance
on 10000000 rows on my PC (format function) - so  I don't do any
changes - caching of oids means a few lines more - but here isn't
expected effect.

>
> * Coding: Whitespace and tabs are mixed in some places. They are not so
>  important because we will run pgindent, but careful choice will be
>  preferred even of a patch.
>

checked, fixed

Thank you very much for review

regards

Pavel Stehule

> --
> Itagaki Takahiro
>

Attachment

Re: string function - "format" function proposal

From
Alvaro Herrera
Date:
Excerpts from Itagaki Takahiro's message of mié oct 13 23:03:16 -0300 2010:
> On Thu, Oct 14, 2010 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Wed, Sep 29, 2010 at 3:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> >> [ updated patch, in response to a review from Itagaki Takahiro ]
> >
> > This patch appears to be waiting for a second round of review.
> > Itagaki-san, are you planning to do that?
> 
> I can, but I was waiting for other people's comments about the design:
>   - format() in core, that implements %s, %i, and %l.
>   - substitute() for $n format and sprintf() that partially implements
>     the same function in C in contrib/stringfunc.
> 
> I don't like having three similar functions for the same purpose,
> but Pavel said they are the best solutions. What will be our consensus?

I don't have much love for moving the position stuff ($n) out of the main
function either.  I've been meaning to take a look at how hard it would
be to integrate that into format() in core -- no luck :-(

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
Hello

2010/10/14 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
>> Let's make format support %s, %i, and %l, as well as
>> allowing things like %$3l (meaning, escape the third argument as a
>> literal and interpolate here), and call it good.
>
> Your idea is:
>  % [ $ pos ] format  -- ex. %$3l , %l
>  Escapes: %% => %
>
> Just for information, $ and pos are reversed in C sprintf.
>  % [ pos $ ] format  -- ex. %3$l , %l
>  Escapes: %% => %

ook - +1 for %3$l

>
> IMHO, I like {} syntax as like as C# because the format strings are extensible.
>  { pos [ : format ] } -- ex {3:l}, {3} (, and {l} could be also supported)
>  Escapes: {{ => {, }} => }
>

I dislike it. The target usage for this function is plpgsql code. I
prefer a simply design - second sprintf is useles. More - {} can be
used in  messages relative often and with your proposal, you have to
intensivelly use a escaping.

Regards

Pavel

> --
> Itagaki Takahiro
>


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
> Let's make format support %s, %i, and %l, as well as
> allowing things like %$3l (meaning, escape the third argument as a
> literal and interpolate here), and call it good.

Your idea is: % [ $ pos ] format  -- ex. %$3l , %l Escapes: %% => %

Just for information, $ and pos are reversed in C sprintf. % [ pos $ ] format  -- ex. %3$l , %l Escapes: %% => %

IMHO, I like {} syntax as like as C# because the format strings are extensible. { pos [ : format ] } -- ex {3:l}, {3}
(,and {l} could be also supported) Escapes: {{ => {, }} => }
 

-- 
Itagaki Takahiro


Re: string function - "format" function proposal

From
Robert Haas
Date:
On Wed, Oct 13, 2010 at 10:03 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> On Thu, Oct 14, 2010 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Sep 29, 2010 at 3:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> [ updated patch, in response to a review from Itagaki Takahiro ]
>>
>> This patch appears to be waiting for a second round of review.
>> Itagaki-san, are you planning to do that?
>
> I can, but I was waiting for other people's comments about the design:
>  - format() in core, that implements %s, %i, and %l.
>  - substitute() for $n format and sprintf() that partially implements
>    the same function in C in contrib/stringfunc.
>
> I don't like having three similar functions for the same purpose,
> but Pavel said they are the best solutions. What will be our consensus?

<rereads thread>

I agree with you.  I think we should pick one implementation and just
go with it.  There's nothing to say that Pavel can't distribute his
own code however he likes, but I don't think there's any compelling
reason for us to carry all that code in the main tree, even in
/contrib.  Let's make format support %s, %i, and %l, as well as
allowing things like %$3l (meaning, escape the third argument as a
literal and interpolate here), and call it good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/10/14 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Oct 13, 2010 at 10:03 PM, Itagaki Takahiro
> <itagaki.takahiro@gmail.com> wrote:
>> On Thu, Oct 14, 2010 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Wed, Sep 29, 2010 at 3:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>> [ updated patch, in response to a review from Itagaki Takahiro ]
>>>
>>> This patch appears to be waiting for a second round of review.
>>> Itagaki-san, are you planning to do that?
>>
>> I can, but I was waiting for other people's comments about the design:
>>  - format() in core, that implements %s, %i, and %l.
>>  - substitute() for $n format and sprintf() that partially implements
>>    the same function in C in contrib/stringfunc.
>>
>> I don't like having three similar functions for the same purpose,
>> but Pavel said they are the best solutions. What will be our consensus?
>
> <rereads thread>
>
> I agree with you.  I think we should pick one implementation and just
> go with it.  There's nothing to say that Pavel can't distribute his
> own code however he likes, but I don't think there's any compelling
> reason for us to carry all that code in the main tree, even in
> /contrib.  Let's make format support %s, %i, and %l, as well as
> allowing things like %$3l (meaning, escape the third argument as a
> literal and interpolate here), and call it good.
>

my objection to put all to one functions was a format complexity and
little bit less readability - like %$3. But I am not strong it this.

Regards

Pavel Stehule



> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: string function - "format" function proposal

From
Robert Haas
Date:
On Wed, Sep 29, 2010 at 3:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> [ updated patch, in response to a review from Itagaki Takahiro ]

This patch appears to be waiting for a second round of review.
Itagaki-san, are you planning to do that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Thu, Oct 14, 2010 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Sep 29, 2010 at 3:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> [ updated patch, in response to a review from Itagaki Takahiro ]
>
> This patch appears to be waiting for a second round of review.
> Itagaki-san, are you planning to do that?

I can, but I was waiting for other people's comments about the design: - format() in core, that implements %s, %i, and
%l.- substitute() for $n format and sprintf() that partially implements   the same function in C in
contrib/stringfunc.

I don't like having three similar functions for the same purpose,
but Pavel said they are the best solutions. What will be our consensus?

-- 
Itagaki Takahiro


Re: string function - "format" function proposal

From
Robert Haas
Date:
On Thu, Oct 14, 2010 at 2:25 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
>> Let's make format support %s, %i, and %l, as well as
>> allowing things like %$3l (meaning, escape the third argument as a
>> literal and interpolate here), and call it good.
>
> Your idea is:
>  % [ $ pos ] format  -- ex. %$3l , %l
>  Escapes: %% => %
>
> Just for information, $ and pos are reversed in C sprintf.
>  % [ pos $ ] format  -- ex. %3$l , %l
>  Escapes: %% => %

Oh, woops.  I intended to copy the way C works.

> IMHO, I like {} syntax as like as C# because the format strings are extensible.
>  { pos [ : format ] } -- ex {3:l}, {3} (, and {l} could be also supported)
>  Escapes: {{ => {, }} => }

My personal preference (and Pavel's, I guess) is to use the C-like
syntax.  But I wouldn't be upset if consensus congealed around some
other option.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: string function - "format" function proposal

From
Andrew Dunstan
Date:

On 10/14/2010 08:25 AM, Robert Haas wrote:
>
>> IMHO, I like {} syntax as like as C# because the format strings are extensible.
>>   { pos [ : format ] } -- ex {3:l}, {3} (, and {l} could be also supported)
>>   Escapes: {{ =>  {, }} =>  }
> My personal preference (and Pavel's, I guess) is to use the C-like
> syntax.  But I wouldn't be upset if consensus congealed around some
> other option.

They're both somewhat arcane. But I think the C syntax is likely to be 
more familiar to a wider group of users (including, for example, perl 
hackers) than the C# syntax, and is to be preferred on those grounds alone.

cheers

andrew


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Thu, Oct 14, 2010 at 9:50 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> They're both somewhat arcane. But I think the C syntax is likely to be more
> familiar to a wider group of users (including, for example, perl hackers)
> than the C# syntax, and is to be preferred on those grounds alone.

OK, probably C syntax is the best design.
Then, let's merge format() and  substitute() in the latest patch.

I have a comment about %i for identifier format. %i is also used in
printf(), so it would be better to choose another character, like %I.
(%l is ok, but would be %L if we choose %I.)
Implementation for sprintf() in strincfunc might not be used now,
but it will be a conflict when we also merge it to format() in the future.

-- 
Itagaki Takahiro


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/10/15 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> On Thu, Oct 14, 2010 at 9:50 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> They're both somewhat arcane. But I think the C syntax is likely to be more
>> familiar to a wider group of users (including, for example, perl hackers)
>> than the C# syntax, and is to be preferred on those grounds alone.
>
> OK, probably C syntax is the best design.
> Then, let's merge format() and  substitute() in the latest patch.
>
> I have a comment about %i for identifier format. %i is also used in
> printf(), so it would be better to choose another character, like %I.
> (%l is ok, but would be %L if we choose %I.)
> Implementation for sprintf() in strincfunc might not be used now,
> but it will be a conflict when we also merge it to format() in the future.
>

I like to see only lower case chars for tags. Using upper chars isn't
a good idea. The system in tags must not be same as I designed, but
there should be clean relation between tag and semantic. The current
system was simple %s string, %i identifier %l literal - if you don't
like %l or %i, then maybe %ls or %is - like "literal string" or "ident
string".

I don't think so merging sprintf and format can be good. Sprintf is
too complex - so long years users don't know specification well and
creating some like sprintf function can be messy for users. I like to
see accurate sprintf function in contrib - and some else in core.

Regards

Pavel Stehule
> --
> Itagaki Takahiro
>


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Fri, Oct 15, 2010 at 12:59 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> then maybe %ls or %is - like "literal string" or "ident string".

Yeah, good idea!

> I don't think so merging sprintf and format can be good. Sprintf is
> too complex - so long years users don't know specification well and
> creating some like sprintf function can be messy for users. I like to
> see accurate sprintf function in contrib - and some else in core.

I agree that full-spec sprintf is too complex, but precision and
zero-full for numeric types are commonly used. I think someone
will ask us "Why don't have numeric formats though we have %s?".

-- 
Itagaki Takahiro


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/10/15 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> On Fri, Oct 15, 2010 at 12:59 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> then maybe %ls or %is - like "literal string" or "ident string".
>
> Yeah, good idea!
>
>> I don't think so merging sprintf and format can be good. Sprintf is
>> too complex - so long years users don't know specification well and
>> creating some like sprintf function can be messy for users. I like to
>> see accurate sprintf function in contrib - and some else in core.
>
> I agree that full-spec sprintf is too complex, but precision and
> zero-full for numeric types are commonly used. I think someone
> will ask us "Why don't have numeric formats though we have %s?".

And the reply is - we have function to_char. I don't see any reason
why we have to have two independent formatting systems.

Pavel


>
> --
> Itagaki Takahiro
>


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Fri, Oct 15, 2010 at 8:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> And the reply is - we have function to_char. I don't see any reason
> why we have to have two independent formatting systems.

The formats for literal and identifier can be replaced to quote_nullable() and
quote_ident(), too. Features to write simple queries are constantly in demand.

-- 
Itagaki Takahiro


Re: string function - "format" function proposal

From
Robert Haas
Date:
On Fri, Oct 15, 2010 at 12:55 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> On Fri, Oct 15, 2010 at 12:59 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> then maybe %ls or %is - like "literal string" or "ident string".
>
> Yeah, good idea!

-1 from me.  What does this do except make it more long-winded?

>> I don't think so merging sprintf and format can be good. Sprintf is
>> too complex - so long years users don't know specification well and
>> creating some like sprintf function can be messy for users. I like to
>> see accurate sprintf function in contrib - and some else in core.
>
> I agree that full-spec sprintf is too complex, but precision and
> zero-full for numeric types are commonly used. I think someone
> will ask us "Why don't have numeric formats though we have %s?".

I think someone might also ask - why are you bothering to create this
at all?  The amount of work that has been put into this is, IMHO, far
out of proportion to the value of the feature.  As Pavel points out,
we already have perfectly good mechanisms for converting our various
data types to text.  We do not need to invent new ones.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: string function - "format" function proposal

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Oct 15, 2010 at 12:55 AM, Itagaki Takahiro
> <itagaki.takahiro@gmail.com> wrote:
>> I agree that full-spec sprintf is too complex, but precision and
>> zero-full for numeric types are commonly used. I think someone
>> will ask us "Why don't have numeric formats though we have %s?".

> I think someone might also ask - why are you bothering to create this
> at all?  The amount of work that has been put into this is, IMHO, far
> out of proportion to the value of the feature.  As Pavel points out,
> we already have perfectly good mechanisms for converting our various
> data types to text.  We do not need to invent new ones.

I beg to differ.  IMO to_char is a lot closer to the "sucks big-time"
end of the spectrum than the "perfectly good" end of the spectrum:
it's a bad implementation of a crummy design.  I think a lot of people
would like to have something closer to sprintf-style formatting.

I think we should go into this with the idea that it might only do 10%
of what sprintf can do initially, but there will be pressure to cover a
lot of the other 90% eventually.  So it would be a good idea to ensure
that we don't make any choices that are gratuitously incompatible with
standard sprintf format codes.  In particular, I agree with the idea of
using %I not %i for identifiers --- in fact I'd go so far as to suggest
that all specifiers we invent, rather than borrowing from sprintf, be
upper-case.
        regards, tom lane


Re: string function - "format" function proposal

From
Robert Haas
Date:
On Fri, Oct 15, 2010 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Oct 15, 2010 at 12:55 AM, Itagaki Takahiro
>> <itagaki.takahiro@gmail.com> wrote:
>>> I agree that full-spec sprintf is too complex, but precision and
>>> zero-full for numeric types are commonly used. I think someone
>>> will ask us "Why don't have numeric formats though we have %s?".
>
>> I think someone might also ask - why are you bothering to create this
>> at all?  The amount of work that has been put into this is, IMHO, far
>> out of proportion to the value of the feature.  As Pavel points out,
>> we already have perfectly good mechanisms for converting our various
>> data types to text.  We do not need to invent new ones.
>
> I beg to differ.  IMO to_char is a lot closer to the "sucks big-time"
> end of the spectrum than the "perfectly good" end of the spectrum:
> it's a bad implementation of a crummy design.  I think a lot of people
> would like to have something closer to sprintf-style formatting.
>
> I think we should go into this with the idea that it might only do 10%
> of what sprintf can do initially, but there will be pressure to cover a
> lot of the other 90% eventually.  So it would be a good idea to ensure
> that we don't make any choices that are gratuitously incompatible with
> standard sprintf format codes.  In particular, I agree with the idea of
> using %I not %i for identifiers --- in fact I'd go so far as to suggest
> that all specifiers we invent, rather than borrowing from sprintf, be
> upper-case.

Hmm.  I have a feeling that's going to be a rathole.  Among other
problems, what about types other than strings and numbers?  The thing
I most often need to format is a date, and the second most common one
is timestamp with time zone.  The specification for sprintf is
ridiculously complicated with just the things C has as built-in types,
never mind SQL.

Then again, if I'm not the one who has to spend time in the rathole...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: string function - "format" function proposal

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Oct 15, 2010 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think we should go into this with the idea that it might only do 10%
>> of what sprintf can do initially, but there will be pressure to cover a
>> lot of the other 90% eventually.

> Hmm.  I have a feeling that's going to be a rathole.  Among other
> problems, what about types other than strings and numbers?

I think the general solution is to split those off as subproblems.
If you've got a type that has special formatting requirements,
you can do

sprintf('foo is %s', format_my_type(value, other-arguments))

where format_my_type returns text.  (So, in particular, you could use
to_char for this if it solved the particular need.)

Having said that, it might make sense to provide special case handling
of dates and timestamps, since that's definitely the most common case
where you might not be satisfied with the default conversion to text.

> The specification for sprintf is
> ridiculously complicated with just the things C has as built-in types,
> never mind SQL.

Sure, but an awful lot of those bells and whistles turn out to be handy.
Personally I think the field width control options are the main thing
that sprintf has got over to_char, so I think we're going to want those
sooner rather than later.

> Then again, if I'm not the one who has to spend time in the rathole...

Yeah, I'm not in a hurry to spend time on it either.  I just foresee
that somebody will want to, and so I don't want a dead-end definition.
        regards, tom lane


Re: string function - "format" function proposal

From
Robert Haas
Date:
On Fri, Oct 15, 2010 at 5:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Oct 15, 2010 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I think we should go into this with the idea that it might only do 10%
>>> of what sprintf can do initially, but there will be pressure to cover a
>>> lot of the other 90% eventually.
>
>> Hmm.  I have a feeling that's going to be a rathole.  Among other
>> problems, what about types other than strings and numbers?
>
> I think the general solution is to split those off as subproblems.
> If you've got a type that has special formatting requirements,
> you can do
>
> sprintf('foo is %s', format_my_type(value, other-arguments))
>
> where format_my_type returns text.  (So, in particular, you could use
> to_char for this if it solved the particular need.)
>
> Having said that, it might make sense to provide special case handling
> of dates and timestamps, since that's definitely the most common case
> where you might not be satisfied with the default conversion to text.
>
>> The specification for sprintf is
>> ridiculously complicated with just the things C has as built-in types,
>> never mind SQL.
>
> Sure, but an awful lot of those bells and whistles turn out to be handy.

No doubt.  The problem is that we're going to end up with those bells
and whistles in two places: in to_char or other type-specific
formatting functions, and again in format.

> Personally I think the field width control options are the main thing
> that sprintf has got over to_char, so I think we're going to want those
> sooner rather than later.
>
>> Then again, if I'm not the one who has to spend time in the rathole...
>
> Yeah, I'm not in a hurry to spend time on it either.  I just foresee
> that somebody will want to, and so I don't want a dead-end definition.

Perhaps we could design a family of (heh, heh, undocumented) functions
called _format_helper(type, text), or something like that.  format()
would call the format helper for the appropriate type with the datum
to be formatted and the sprintf-like format string as arguments.  Or
maybe that isn't feasible, I'm just brainstorming.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: string function - "format" function proposal

From
Itagaki Takahiro
Date:
On Sat, Oct 16, 2010 at 7:29 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> No doubt.  The problem is that we're going to end up with those bells
> and whistles in two places: in to_char or other type-specific
> formatting functions, and again in format.

If we decide to use C-like sprintf(), I think the only thing we can do
is to implement C-syntax as much as possible. Users will expect the
function behaves as sprintf, because it has the similar syntax.
It's not an item for now, but someone would request it at a future date.


BTW, the interoperability is why I proposed {} syntax. For example,
{1:YYYY-MM-DD} for date is expanded to to_char($1, 'YYYY-MM-DD').
(Maybe it's not so easy; It requires function lookups depending on types.)

--
Itagaki Takahiro


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/10/18 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> On Sat, Oct 16, 2010 at 7:29 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> No doubt.  The problem is that we're going to end up with those bells
>> and whistles in two places: in to_char or other type-specific
>> formatting functions, and again in format.
>
> If we decide to use C-like sprintf(), I think the only thing we can do
> is to implement C-syntax as much as possible. Users will expect the
> function behaves as sprintf, because it has the similar syntax.
> It's not an item for now, but someone would request it at a future date.
>

yes, it is reason why I wrote two functions - sprintf and format.

>
> BTW, the interoperability is why I proposed {} syntax. For example,
> {1:YYYY-MM-DD} for date is expanded to to_char($1, 'YYYY-MM-DD').
> (Maybe it's not so easy; It requires function lookups depending on types.)

why this shorcut is necessary?

Regards

Pavel Stehule

>
> --
> Itagaki Takahiro
>


Re: string function - "format" function proposal

From
Robert Haas
Date:
On Mon, Oct 18, 2010 at 7:37 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> On Sat, Oct 16, 2010 at 7:29 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> No doubt.  The problem is that we're going to end up with those bells
>> and whistles in two places: in to_char or other type-specific
>> formatting functions, and again in format.
>
> If we decide to use C-like sprintf(), I think the only thing we can do
> is to implement C-syntax as much as possible. Users will expect the
> function behaves as sprintf, because it has the similar syntax.
> It's not an item for now, but someone would request it at a future date.
>
>
> BTW, the interoperability is why I proposed {} syntax. For example,
> {1:YYYY-MM-DD} for date is expanded to to_char($1, 'YYYY-MM-DD').
> (Maybe it's not so easy; It requires function lookups depending on types.)

There's no particular reason why we couldn't make this work with
sprintf-type syntax; for example, you could allow %{XYZ} to mean
to_char(value, 'XYZ').  But it seems to me that we have agreement that
this should start with just %s, %I, %L and allow 3$ or similar in the
middle to specify which argument it is.  We can then argue about how
many more bells and whistles to add later.

I would like to bounce this back for rework along the lines described
above and ask for a resubmit to the next CF.  We are out of time to
consider this further for this CF, and clearly it's not ready to go
ATM.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: string function - "format" function proposal

From
Pavel Stehule
Date:
2010/10/18, Robert Haas <robertmhaas@gmail.com>:
> On Mon, Oct 18, 2010 at 7:37 AM, Itagaki Takahiro
> <itagaki.takahiro@gmail.com> wrote:
>> On Sat, Oct 16, 2010 at 7:29 AM, Robert Haas <robertmhaas@gmail.com>
>> wrote:
>>> No doubt.  The problem is that we're going to end up with those bells
>>> and whistles in two places: in to_char or other type-specific
>>> formatting functions, and again in format.
>>
>> If we decide to use C-like sprintf(), I think the only thing we can do
>> is to implement C-syntax as much as possible. Users will expect the
>> function behaves as sprintf, because it has the similar syntax.
>> It's not an item for now, but someone would request it at a future date.
>>
>>
>> BTW, the interoperability is why I proposed {} syntax. For example,
>> {1:YYYY-MM-DD} for date is expanded to to_char($1, 'YYYY-MM-DD').
>> (Maybe it's not so easy; It requires function lookups depending on types.)
>
> There's no particular reason why we couldn't make this work with
> sprintf-type syntax; for example, you could allow %{XYZ} to mean
> to_char(value, 'XYZ').  But it seems to me that we have agreement that
> this should start with just %s, %I, %L and allow 3$ or similar in the
> middle to specify which argument it is.  We can then argue about how
> many more bells and whistles to add later.

so, yes. Can we finish this discus with this result? I'll prepare
patch for next commit fest. Next question - what about sprintf
function in core? Is living this idea still?

Regards

Pavel


>
> I would like to bounce this back for rework along the lines described
> above and ask for a resubmit to the next CF.  We are out of time to
> consider this further for this CF, and clearly it's not ready to go
> ATM.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: string function - "format" function proposal

From
Robert Haas
Date:
On Mon, Oct 18, 2010 at 1:07 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2010/10/18, Robert Haas <robertmhaas@gmail.com>:
>> On Mon, Oct 18, 2010 at 7:37 AM, Itagaki Takahiro
>> <itagaki.takahiro@gmail.com> wrote:
>>> On Sat, Oct 16, 2010 at 7:29 AM, Robert Haas <robertmhaas@gmail.com>
>>> wrote:
>>>> No doubt.  The problem is that we're going to end up with those bells
>>>> and whistles in two places: in to_char or other type-specific
>>>> formatting functions, and again in format.
>>>
>>> If we decide to use C-like sprintf(), I think the only thing we can do
>>> is to implement C-syntax as much as possible. Users will expect the
>>> function behaves as sprintf, because it has the similar syntax.
>>> It's not an item for now, but someone would request it at a future date.
>>>
>>>
>>> BTW, the interoperability is why I proposed {} syntax. For example,
>>> {1:YYYY-MM-DD} for date is expanded to to_char($1, 'YYYY-MM-DD').
>>> (Maybe it's not so easy; It requires function lookups depending on types.)
>>
>> There's no particular reason why we couldn't make this work with
>> sprintf-type syntax; for example, you could allow %{XYZ} to mean
>> to_char(value, 'XYZ').  But it seems to me that we have agreement that
>> this should start with just %s, %I, %L and allow 3$ or similar in the
>> middle to specify which argument it is.  We can then argue about how
>> many more bells and whistles to add later.
>
> so, yes. Can we finish this discus with this result? I'll prepare
> patch for next commit fest. Next question - what about sprintf
> function in core? Is living this idea still?

I'm indifferent about whether we put it in core or contrib.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company