Thread: string function - "format" function proposal
Hello I am returning back to string functions. For me, the most important function isn't commited still. There was discussion about "format" or "sprintf" fuction. So I'll do a small resume. goal: to get function that helps with formatting a message texts and helps with building a SQL commands (used as dynamic SQL) propsals: * "format" function - uses same formatting as PL/pgSQL RAISE statement * "sprintf" function Itagaki objectives to "format" function: * there are not possibility put two parameters without a space between * it is too simple My objectives to sprintf function: * it is designed to different environment than SQL - missing support NULL, missing support for date, timestamp, boolean, ... * it is too complex, some parameters has different meaning for different tags * we have a "to_char" function for complex formatting now. Now I propose a compromise - "format" function with only three tags: %s .. some string %i .. SQL identifier %l .. string literal using a NULL: for %s NULL is transformed to empty string - like "concat" for %i NULL raises an exception for %l NULL is transformed to ' NULL ' string. This system is still simple and enough. Implemented sprintf function can be moved to "sprintf" contrib module. comments Regards Pavel Stehule
On Mon, Aug 30, 2010 at 7:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > propsals: > * "format" function - uses same formatting as PL/pgSQL RAISE statement > * "sprintf" function > > Now I propose a compromise - "format" function with only three tags: > %s .. some string > %i .. SQL identifier > %l .. string literal These are just ideas: * Use $n, as like as PREPARE command. It allows for us to swap arguments in any order. SELECT format('$2 before $1', 'aaa','bbb') * Call to_char() functions for each placeholder. For example, format('=={YYYY-MM-DD}==', tm::timestamp) is equivalent to '==' || to_char(tm, 'YYYY-MM-DD') || '==' '{}' prints the input with the default format. New languages' libraries might be of some help. LLs, C#, etc. -- Itagaki Takahiro
2010/8/30 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > On Mon, Aug 30, 2010 at 7:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> propsals: >> * "format" function - uses same formatting as PL/pgSQL RAISE statement >> * "sprintf" function >> >> Now I propose a compromise - "format" function with only three tags: >> %s .. some string >> %i .. SQL identifier >> %l .. string literal > > These are just ideas: > > * Use $n, as like as PREPARE command. > It allows for us to swap arguments in any order. > SELECT format('$2 before $1', 'aaa', 'bbb') > what is use case for this feature? I don't see it. > * Call to_char() functions for each placeholder. > For example, > format('=={YYYY-MM-DD}==', tm::timestamp) > is equivalent to > '==' || to_char(tm, 'YYYY-MM-DD') || '==' > '{}' prints the input with the default format. > > New languages' libraries might be of some help. LLs, C#, etc. I though about integration with to_char function too. There are not technical barrier. And I can live with just {to_char_format} too. It can be or cannot be mixed with basic tags together - there is specified a NULL value behave. If we allow {format} syntax, then we have to specify a escape syntax for { and }. Do you have a some idea? Regards Pavel > > -- > Itagaki Takahiro >
Excerpts from Pavel Stehule's message of lun ago 30 07:51:55 -0400 2010: > 2010/8/30 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > > On Mon, Aug 30, 2010 at 7:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > >> propsals: > >> * "format" function - uses same formatting as PL/pgSQL RAISE statement > >> * "sprintf" function > >> > >> Now I propose a compromise - "format" function with only three tags: > >> %s .. some string > >> %i .. SQL identifier > >> %l .. string literal > > > > These are just ideas: > > > > * Use $n, as like as PREPARE command. > > It allows for us to swap arguments in any order. > > SELECT format('$2 before $1', 'aaa', 'bbb') > > what is use case for this feature? I don't see it. Translations :-) I haven't had a use for that but I've heard people implements gettext of sorts in database tables. Maybe that kind of thing would be of use here. > > * Call to_char() functions for each placeholder. > > For example, > > format('=={YYYY-MM-DD}==', tm::timestamp) > > is equivalent to > > '==' || to_char(tm, 'YYYY-MM-DD') || '==' > > '{}' prints the input with the default format. > > > > New languages' libraries might be of some help. LLs, C#, etc. > > I though about integration with to_char function too. There are not > technical barrier. And I can live with just {to_char_format} too. It > can be or cannot be mixed with basic tags together - there is > specified a NULL value behave. If we allow {format} syntax, then we > have to specify a escape syntax for { and }. Do you have a some idea? What about %{sth}? That way you don't need to escape {. The closing } would need escaping only inside the %{} specifier, so {%{YYYY{\}MM}} prints {2010{}08} So the above example is: format('==%{YYYY-MM-DD}==', tm::timestamp); Not sure about this to_char stuff though, seems too cute. You can do the case above like this: format('==%s==', to_char(tm::timestamp, 'YYYY-MM-DD')) -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
2010/8/30 Alvaro Herrera <alvherre@commandprompt.com>: > Excerpts from Pavel Stehule's message of lun ago 30 07:51:55 -0400 2010: >> 2010/8/30 Itagaki Takahiro <itagaki.takahiro@gmail.com>: >> > On Mon, Aug 30, 2010 at 7:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> >> propsals: >> >> * "format" function - uses same formatting as PL/pgSQL RAISE statement >> >> * "sprintf" function >> >> >> >> Now I propose a compromise - "format" function with only three tags: >> >> %s .. some string >> >> %i .. SQL identifier >> >> %l .. string literal >> > >> > These are just ideas: >> > >> > * Use $n, as like as PREPARE command. >> > It allows for us to swap arguments in any order. >> > SELECT format('$2 before $1', 'aaa', 'bbb') >> >> what is use case for this feature? I don't see it. > > Translations :-) I haven't had a use for that but I've heard people > implements gettext of sorts in database tables. Maybe that kind of > thing would be of use here. > >> > * Call to_char() functions for each placeholder. >> > For example, >> > format('=={YYYY-MM-DD}==', tm::timestamp) >> > is equivalent to >> > '==' || to_char(tm, 'YYYY-MM-DD') || '==' >> > '{}' prints the input with the default format. >> > >> > New languages' libraries might be of some help. LLs, C#, etc. >> >> I though about integration with to_char function too. There are not >> technical barrier. And I can live with just {to_char_format} too. It >> can be or cannot be mixed with basic tags together - there is >> specified a NULL value behave. If we allow {format} syntax, then we >> have to specify a escape syntax for { and }. Do you have a some idea? > > What about %{sth}? That way you don't need to escape {. The closing } would > need escaping only inside the %{} specifier, so {%{YYYY{\}MM}} prints > {2010{}08} So the above example is: then you need escaping too :) > > format('==%{YYYY-MM-DD}==', tm::timestamp); I am not sure if this is correct -but why not so there are possible combinations %s .. no quoting, NULL is '' %{} .. no quoting, NULL is NULL .. like output from to_char %{}s .. no quoting with formatting, NULL is '' now I have not idea about nice syntax for positional parameters - maybe %{...}$1s or we can use a two variants for tags - not positional '%' and positional '%', so $1{...}s, %{...}s, $1, %s, $1s, $1{...}, %{...} can be valid tags Regards Pavel Stehule > > Not sure about this to_char stuff though, seems too cute. You can do > the case above like this: > > format('==%s==', to_char(tm::timestamp, 'YYYY-MM-DD')) > I like an using a format like tag - there are not technical problem - format can be taken from string and data type parameter can be known too. But this feature can be some enhancing. The basic features are NULL handling and right quoting. > -- > Álvaro Herrera <alvherre@commandprompt.com> > The PostgreSQL Company - Command Prompt, Inc. > PostgreSQL Replication, Consulting, Custom Development, 24x7 support >
Hello attached WIP patch. I implement only basic format's tags related to SQL: string, value, literal, sql identifier. These tags are basic, but there are not any break to implement any other formats or enhance a syntax. The mix with to_char function is more complex then I expected - so I don't thinking about it for now (there are more then one to_char function). I don't found a nice mix for placeholders and positional placeholders - so I propose a new special function "substitute" (in contrib) where placeholders are positional. More - we check in function "format" if all parameters are used - this check isn't related to positional placeholders, this is reason for separate implementation too: so some examples: postgres=# select substitute('second parameter is "$2" and first parameter is "$1"', 'first parameter', 'second parameter'); substitute ─────────────────────────────────────────────────────────────────────────────────second parameter is "second parameter" andfirst parameter is "first parameter" (1 row) postgres=# select format('INSERT INTO %i (c1, c2, c3, c4) VALUES (%v,%v,%v,%v)', 'my tab',1, NULL, true, 'hello'); format ────────────────────────────────────────────────────────────────INSERT INTO "my tab" (c1, c2, c3, c4) VALUES (1,NULL,t,'hello') (1 row) postgres=# select format('SQL identifier %i cannot be a NULL', NULL); ERROR: SQL identifier cannot be a NULL postgres=# select format('NULL is %v or empty string "%s"', NULL, NULL); format ─────────────────────────────────NULL is NULL or empty string "" (1 row) %i ... sql identifier %v ... sql value %s ... string --- the most used tag I expect %l ... literal I hope so this system is clean, simple, readable and extensible Regards Pavel 2010/8/30 Pavel Stehule <pavel.stehule@gmail.com>: > 2010/8/30 Alvaro Herrera <alvherre@commandprompt.com>: >> Excerpts from Pavel Stehule's message of lun ago 30 07:51:55 -0400 2010: >>> 2010/8/30 Itagaki Takahiro <itagaki.takahiro@gmail.com>: >>> > On Mon, Aug 30, 2010 at 7:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> >> propsals: >>> >> * "format" function - uses same formatting as PL/pgSQL RAISE statement >>> >> * "sprintf" function >>> >> >>> >> Now I propose a compromise - "format" function with only three tags: >>> >> %s .. some string >>> >> %i .. SQL identifier >>> >> %l .. string literal >>> > >>> > These are just ideas: >>> > >>> > * Use $n, as like as PREPARE command. >>> > It allows for us to swap arguments in any order. >>> > SELECT format('$2 before $1', 'aaa', 'bbb') >>> >>> what is use case for this feature? I don't see it. >> >> Translations :-) I haven't had a use for that but I've heard people >> implements gettext of sorts in database tables. Maybe that kind of >> thing would be of use here. >> >>> > * Call to_char() functions for each placeholder. >>> > For example, >>> > format('=={YYYY-MM-DD}==', tm::timestamp) >>> > is equivalent to >>> > '==' || to_char(tm, 'YYYY-MM-DD') || '==' >>> > '{}' prints the input with the default format. >>> > >>> > New languages' libraries might be of some help. LLs, C#, etc. >>> >>> I though about integration with to_char function too. There are not >>> technical barrier. And I can live with just {to_char_format} too. It >>> can be or cannot be mixed with basic tags together - there is >>> specified a NULL value behave. If we allow {format} syntax, then we >>> have to specify a escape syntax for { and }. Do you have a some idea? >> >> What about %{sth}? That way you don't need to escape {. The closing } would >> need escaping only inside the %{} specifier, so {%{YYYY{\}MM}} prints >> {2010{}08} So the above example is: > > then you need escaping too :) > >> >> format('==%{YYYY-MM-DD}==', tm::timestamp); > > I am not sure if this is correct -but why not > > so there are possible combinations > > %s .. no quoting, NULL is '' > %{} .. no quoting, NULL is NULL .. like output from to_char > %{}s .. no quoting with formatting, NULL is '' > > now I have not idea about nice syntax for positional parameters - maybe > %{...}$1s or we can use a two variants for tags - not positional '%' > and positional '%', so > $1{...}s, %{...}s, $1, %s, $1s, $1{...}, %{...} can be valid tags > > Regards > > Pavel Stehule > >> >> Not sure about this to_char stuff though, seems too cute. You can do >> the case above like this: >> >> format('==%s==', to_char(tm::timestamp, 'YYYY-MM-DD')) >> > > I like an using a format like tag - there are not technical problem - > format can be taken from string and data type parameter can be known > too. But this feature can be some enhancing. The basic features are > NULL handling and right quoting. > > > >> -- >> Álvaro Herrera <alvherre@commandprompt.com> >> The PostgreSQL Company - Command Prompt, Inc. >> PostgreSQL Replication, Consulting, Custom Development, 24x7 support >> >
On Aug 31, 2010, at 5:07 PM, Pavel Stehule wrote: > Hello > > attached WIP patch. > > I implement only basic format's tags related to SQL: string, value, > literal, sql identifier. These tags are basic, but there are not any > break to implement any other formats or enhance a syntax. The mix with > to_char function is more complex then I expected - so I don't thinking > about it for now (there are more then one to_char function). > <snip> It would be pretty handy if plpgsql EXECUTE could operate like this with USING to support identifiers. Cheers, M
On Tue, Aug 31, 2010 at 11:07:40PM +0200, Pavel Stehule wrote: > Hello > > attached WIP patch. I don't see it attached. Is it just me? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Wed, Sep 1, 2010 at 6:07 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > I don't found a nice mix for placeholders and positional placeholders How about %pos$format, used in C-printf()? It might be only in Linux's libc. printf("<%2$s> <%1$d>\n", 123, "abc"); => <abc> <123> http://linux.die.net/man/3/printf > %i ... sql identifier > %v ... sql value > %s ... string --- the most used tag I expect > %l ... literal Looks good designed. I have a couple of comments and questions: * There is no examples for %l. What's the difference from %v and %s? If it always quotes, how does it work? Like as quote_literal()or quote_nullable()? * %v quotes text values (and maybe all non-numeric values) with single quotes, but doesn't numeric values. How do we determinethe difference? By type oid? * %v also doesn't quote boolean values, but t and f are not valid. You should use true and false (or 't' and 'f') for thecases. (So, your "INSERT INTO" example is broken.) -- Itagaki Takahiro
2010/9/1 David Fetter <david@fetter.org>: > On Tue, Aug 31, 2010 at 11:07:40PM +0200, Pavel Stehule wrote: >> Hello >> >> attached WIP patch. > > I don't see it attached. Is it just me? sorry, it was at 1 ofter midnight Regards Pavel > > Cheers, > David. > -- > David Fetter <david@fetter.org> http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david.fetter@gmail.com > iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate >
Attachment
2010/9/1 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > On Wed, Sep 1, 2010 at 6:07 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> I don't found a nice mix for placeholders and positional placeholders > > How about %pos$format, used in C-printf()? It might be > only in Linux's libc. > > printf("<%2$s> <%1$d>\n", 123, "abc"); > => <abc> <123> > http://linux.die.net/man/3/printf same syntax I designed and didn't implement, because it isn't too readable (my opinion). But I am not against - just thinking so separate function can be better. > >> %i ... sql identifier >> %v ... sql value >> %s ... string --- the most used tag I expect >> %l ... literal > > Looks good designed. I have a couple of comments and questions: > > * There is no examples for %l. What's the difference from %v and %s? > If it always quotes, how does it work? Like as quote_literal() > or quote_nullable()? > %l is always quoted it is designed for syntax: SELECT integer '10'; %s versus %v ... %s is never quoted X %v is quoted when it is necessary, NULL is showed as empty string for %s and as "NULL" for %v. %s is used for messages (the behave is same like "concat"), %v is used for SQL statement building > * %v quotes text values (and maybe all non-numeric values) with > single quotes, but doesn't numeric values. How do we determine > the difference? By type oid? > every datatype has typecategory attribute typname │ typcategory ─────────────────┼─────────────int8 │ Nint2 │ Nint4 │ Nregproc │ Noid │ Nfloat4 │ Nfloat8 │ Nmoney │ Nnumeric │ Nregprocedure │ Nregoper │ Nregoperator │ Nregclass │ Nregtype │ Nregconfig │ Nregdictionary │ Ncardinal_number │ N so these types are unquoted > * %v also doesn't quote boolean values, but t and f are not valid. > You should use true and false (or 't' and 'f') for the cases. > (So, your "INSERT INTO" example is broken.) > you have a true - it should be fixed Regards Pavel > -- > Itagaki Takahiro >
On Wed, Sep 1, 2010 at 1:29 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> * %v also doesn't quote boolean values, but t and f are not valid. >> You should use true and false (or 't' and 'f') for the cases. > > you have a true - it should be fixed I found quote_literal() prints boolean values as 'true' or 'false'. It uses casting to text type rather than calling output function. OTOH, format functions (and concat funcs) use output functions. Which should we use for such purposes? Consistent behavior is obviously preferred. Boolean type might be the only type that is converted to different representation in typoutput or cast-to-test, but we should consider to have boolean-specific hardwired code, or cast all types to text instead of output functions. -- Itagaki Takahiro
2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > On Wed, Sep 1, 2010 at 1:29 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> * %v also doesn't quote boolean values, but t and f are not valid. >>> You should use true and false (or 't' and 'f') for the cases. >> >> you have a true - it should be fixed > > I found quote_literal() prints boolean values as 'true' or 'false'. > It uses casting to text type rather than calling output function. > OTOH, format functions (and concat funcs) use output functions. > > Which should we use for such purposes? Consistent behavior is > obviously preferred. Boolean type might be the only type that > is converted to different representation in typoutput or cast-to-test, > but we should consider to have boolean-specific hardwired code, > or cast all types to text instead of output functions. Personally I prefer casting to text - it allows some later customizations. And it's more consistent with || operator. So the functions concat* should to be fixed. Regards Pavel > > -- > Itagaki Takahiro >
Pavel Stehule <pavel.stehule@gmail.com> writes: > 2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>: >> Which should we use for such purposes? Consistent behavior is >> obviously preferred. Boolean type might be the only type that >> is converted to different representation in typoutput or cast-to-test, >> but we should consider to have boolean-specific hardwired code, >> or cast all types to text instead of output functions. > Personally I prefer casting to text - No, you need to use the I/O functions. Not every type is guaranteed to have a cast to text. > iit allows some later > customizations. And it's more consistent with || operator. I don't buy either of those arguments. regards, tom lane
On Mon, Sep 6, 2010 at 10:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > No, you need to use the I/O functions. Not every type is guaranteed to > have a cast to text. One issue is that Pavel want to generate valid SQL statement using %v format. Boolean values are printed as t or f, so the unquoted values are not valid syntax. If we only use output functions, boolean values should be written as 't' or 'f' (single-quoted), Only numeric values can be unquoted on %v. -- Itagaki Takahiro
2010/9/6 Tom Lane <tgl@sss.pgh.pa.us>: > Pavel Stehule <pavel.stehule@gmail.com> writes: >> 2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>: >>> Which should we use for such purposes? Consistent behavior is >>> obviously preferred. Boolean type might be the only type that >>> is converted to different representation in typoutput or cast-to-test, >>> but we should consider to have boolean-specific hardwired code, >>> or cast all types to text instead of output functions. > >> Personally I prefer casting to text - > > No, you need to use the I/O functions. Not every type is guaranteed to > have a cast to text. > >> iit allows some later >> customizations. And it's more consistent with || operator. > > I don't buy either of those arguments. can we use a both? like plpgsql? First check cast to text, and second use a IO functions? Why I think so this is useful - sometimes people asked some GUC for formatting date, boolean and other. If these functions try to use a cast to text first, then there is some space for customization via custom cast functions. Regards Pavel Stehule > > regards, tom lane >
2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > On Mon, Sep 6, 2010 at 10:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> No, you need to use the I/O functions. Not every type is guaranteed to >> have a cast to text. > > One issue is that Pavel want to generate valid SQL statement using > %v format. Boolean values are printed as t or f, so the unquoted > values are not valid syntax. we can format some types directly - but I like idea of casting to text because there is space for users. Pavel > > If we only use output functions, boolean values should be written as > 't' or 'f' (single-quoted), Only numeric values can be unquoted on %v. > > -- > Itagaki Takahiro >
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes: > On Mon, Sep 6, 2010 at 10:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> No, you need to use the I/O functions. Not every type is guaranteed to >> have a cast to text. > One issue is that Pavel want to generate valid SQL statement using > %v format. Boolean values are printed as t or f, so the unquoted > values are not valid syntax. So? You'd need to quote the values anyway, in general. If you want something that will be valid SQL you'd better include the functionality of quote_literal() in it. > If we only use output functions, boolean values should be written as > 't' or 'f' (single-quoted), Only numeric values can be unquoted on %v. I'm not sure that it's a good idea to have any type-specific special cases. Failing to quote numeric values will bring in the whole set of issues about how the parser initially types numeric constants, as in the other thread over the weekend. regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes: > Why I think so this is useful - sometimes people asked some GUC for > formatting date, boolean and other. If these functions try to use a > cast to text first, then there is some space for customization via > custom cast functions. This is basically nonsense. If you don't control a type's output function, you don't control its cast to text either. Nor do I think it's a good idea to encourage people to make their casts to text operate differently from their output functions. We have that one wart in boolean casting because the SQL standard specifies the result of cast to text and it's different from our historical practice in the bool output function --- but it is a wart, not something we should encourage people to emulate. regards, tom lane
On Mon, Sep 6, 2010 at 11:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > So? You'd need to quote the values anyway, in general. If you want > something that will be valid SQL you'd better include the functionality > of quote_literal() in it. > > I'm not sure that it's a good idea to have any type-specific special > cases. As I remember, the original motivation of %v formatter is some DBMSes don't like quoted numeric literals. However, Postgres accepts quoted numerics, and we're developing Postgres. So, our consensus would be %v formatter should be removed completely from the format function. -- Itagaki Takahiro
2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > On Mon, Sep 6, 2010 at 11:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> So? You'd need to quote the values anyway, in general. If you want >> something that will be valid SQL you'd better include the functionality >> of quote_literal() in it. >> >> I'm not sure that it's a good idea to have any type-specific special >> cases. > > As I remember, the original motivation of %v formatter is > some DBMSes don't like quoted numeric literals. However, > Postgres accepts quoted numerics, and we're developing Postgres. > > So, our consensus would be %v formatter should be removed > completely from the format function. > I think so tag that quotes all without numbers can be very useful, but it isn't too much important for me. I can live without them. Regards Pavel > -- > Itagaki Takahiro >
Hello I am sending a updated version. changes: * tag %v removed from format function, * proprietary tags %lq a iq removed from sprintf * code cleaned patch divided to two parts - format function and stringfunc (contains sprintf function and substitute function) Regards Pavel Stehule 2010/9/6 Pavel Stehule <pavel.stehule@gmail.com>: > 2010/9/6 Itagaki Takahiro <itagaki.takahiro@gmail.com>: >> On Mon, Sep 6, 2010 at 11:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> So? You'd need to quote the values anyway, in general. If you want >>> something that will be valid SQL you'd better include the functionality >>> of quote_literal() in it. >>> >>> I'm not sure that it's a good idea to have any type-specific special >>> cases. >> >> As I remember, the original motivation of %v formatter is >> some DBMSes don't like quoted numeric literals. However, >> Postgres accepts quoted numerics, and we're developing Postgres. >> >> So, our consensus would be %v formatter should be removed >> completely from the format function. >> > > I think so tag that quotes all without numbers can be very useful, but > it isn't too much important for me. I can live without them. > > Regards > > Pavel > >> -- >> Itagaki Takahiro >> >
Attachment
On Thu, Sep 9, 2010 at 8:57 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > I am sending a updated version. > > changes: > * tag %v removed from format function, > * proprietary tags %lq a iq removed from sprintf > * code cleaned > > patch divided to two parts - format function and stringfunc (contains > sprintf function and substitute function) === Discussions about the spec === Two patches add format() into the core, and substitute() and sprintf() into stringfunc contrib module. But will we have 3 versions of string formatters? IMHO, substitute() is the best choice that we will have in the core because functionalities in format() and sprintf() can be achieved by combination of substitute() and quote_nullable(), quote_ident(), or to_char(). I think the core will provide only simple and non-overlapped features. Users can write wrapper functions by themselves if they think the description is redundant. === format.diff === * It has a reject in doc, but the hunk can be fixed easily. 1 out of 2 hunks FAILED -- saving rejects to file doc/src/sgml/func.sgml.rejCOMMENT: We have the function list in alphabetical order, so format() should be inserted afterencode(). * It can be built without compile warnings. * Enough documentation and regression tests are included. === stringfunc.diff === * It can be applied cleanly and built without compile warnings. * Documentation is included, but not enough. COMMENT: According to existing docs, function list are described with <variablelist>or <table>. * Enough regression tests are included. * COMMENT: stringfunc directory should be added to contrib/Makefile. * BUG: stringfunc_substitute_nv() calls text_format(). I think we don't need stringfunc_substitute_nv at all. It can be replacedby stringfunc_substitute(). _nv version is only required if it is in the core because of sanity regression test. * BUG?: The doc says sprintf() doesn't support length modifiers, but it is actually supported in broken state: postgres=# SELECT sprintf('%*s', 2, 'ABC');sprintf ---------ABC <= should be ERROR if unsupported, or AB if supported. (1 row) * BUG?: ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("unsupported tag \"%%%c\"", tag)));Isthe code ok if the tag (a char) is a partial byte of multi-byte character?My machine prints ? in the case, but itmight be platform-dependent. === Both patches === * Performance: I don't think those functions are not performance-critical, but we could cache typoutput functions in fn_extraif needed. record_out would be a reference. * Coding: Whitespace and tabs are mixed in some places. They are not so important because we will run pgindent, but carefulchoice will be preferred even of a patch. -- Itagaki Takahiro
Hello 2010/9/29 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > On Thu, Sep 9, 2010 at 8:57 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> I am sending a updated version. >> >> changes: >> * tag %v removed from format function, >> * proprietary tags %lq a iq removed from sprintf >> * code cleaned >> >> patch divided to two parts - format function and stringfunc (contains >> sprintf function and substitute function) > > === Discussions about the spec === > Two patches add format() into the core, and substitute() and sprintf() into > stringfunc contrib module. But will we have 3 versions of string formatters? > > IMHO, substitute() is the best choice that we will have in the core because > functionalities in format() and sprintf() can be achieved by combination of > substitute() and quote_nullable(), quote_ident(), or to_char(). I think the > core will provide only simple and non-overlapped features. Users can write > wrapper functions by themselves if they think the description is redundant. I think we need a three variants of formating functions - "format" in core, fo simply creating and building a messages, a SQL strings, "sprintf" for traditionalist in contrib - this functions isn't well joined to SQL environment and it's too heavy - more it overwrite a some functionality of "to_char" function. "substitute" function provide just positional unformatted parameters - that isn't typical ucase - so must not be in core too. > > === format.diff === > * It has a reject in doc, but the hunk can be fixed easily. > 1 out of 2 hunks FAILED -- saving rejects to file doc/src/sgml/func.sgml.rej > COMMENT: We have the function list in alphabetical order, fixed > so format() should be inserted after encode(). > * It can be built without compile warnings. > * Enough documentation and regression tests are included. > > === stringfunc.diff === > * It can be applied cleanly and built without compile warnings. > * Documentation is included, but not enough. > COMMENT: According to existing docs, function list are described with > <variablelist> or <table>. fixed > * Enough regression tests are included. > * COMMENT: stringfunc directory should be added to contrib/Makefile. > > * BUG: stringfunc_substitute_nv() calls text_format(). > I think we don't need stringfunc_substitute_nv at all. > It can be replaced by stringfunc_substitute(). _nv version is only > required if it is in the core because of sanity regression test. you have a true - but I am not sure about coding patters for contribs, so I designed it with respect to core sanity check. > > * BUG?: The doc says sprintf() doesn't support length modifiers, > but it is actually supported in broken state: I was wrong in documentation - length modifiers are supported - positional modifiers are not supported. fixed. > postgres=# SELECT sprintf('%*s', 2, 'ABC'); > sprintf > --------- > ABC <= should be ERROR if unsupported, or AB if supported. > (1 row) it works well - "with" modifier doesn't reduce string. String is stripped by "precision" modifiers. SELECT sprintf('%*.s', 2, ABC) --> AB checked via gcc please, try printf(">>%s<<\n", "12345678"); printf(">>%3s<<\n", "12345678"); printf(">>%.3s<<\n", "12345678"); printf(">>%10.3s<<\n", "12345678"); do you understand me, why I "dislike" "printf"? How much people knows well these formatting rules? > > * BUG?: ereport(ERROR, > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > errmsg("unsupported tag \"%%%c\"", tag))); > Is the code ok if the tag (a char) is a partial byte of multi-byte character? it's bug - the supported tags are only single byte, but unsupported tag can be multibyte character and must by showed correctly - fixed. > My machine prints ? in the case, but it might be platform-dependent. > > === Both patches === > * Performance: I don't think those functions are not performance-critical, > but we could cache typoutput functions in fn_extra if needed. > record_out would be a reference. I though about it too and I checked it now - there is 0.4% performance on 10000000 rows on my PC (format function) - so I don't do any changes - caching of oids means a few lines more - but here isn't expected effect. > > * Coding: Whitespace and tabs are mixed in some places. They are not so > important because we will run pgindent, but careful choice will be > preferred even of a patch. > checked, fixed Thank you very much for review regards Pavel Stehule > -- > Itagaki Takahiro >
Attachment
Excerpts from Itagaki Takahiro's message of mié oct 13 23:03:16 -0300 2010: > On Thu, Oct 14, 2010 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Sep 29, 2010 at 3:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > >> [ updated patch, in response to a review from Itagaki Takahiro ] > > > > This patch appears to be waiting for a second round of review. > > Itagaki-san, are you planning to do that? > > I can, but I was waiting for other people's comments about the design: > - format() in core, that implements %s, %i, and %l. > - substitute() for $n format and sprintf() that partially implements > the same function in C in contrib/stringfunc. > > I don't like having three similar functions for the same purpose, > but Pavel said they are the best solutions. What will be our consensus? I don't have much love for moving the position stuff ($n) out of the main function either. I've been meaning to take a look at how hard it would be to integrate that into format() in core -- no luck :-( -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Hello 2010/10/14 Itagaki Takahiro <itagaki.takahiro@gmail.com>: >> Let's make format support %s, %i, and %l, as well as >> allowing things like %$3l (meaning, escape the third argument as a >> literal and interpolate here), and call it good. > > Your idea is: > % [ $ pos ] format -- ex. %$3l , %l > Escapes: %% => % > > Just for information, $ and pos are reversed in C sprintf. > % [ pos $ ] format -- ex. %3$l , %l > Escapes: %% => % ook - +1 for %3$l > > IMHO, I like {} syntax as like as C# because the format strings are extensible. > { pos [ : format ] } -- ex {3:l}, {3} (, and {l} could be also supported) > Escapes: {{ => {, }} => } > I dislike it. The target usage for this function is plpgsql code. I prefer a simply design - second sprintf is useles. More - {} can be used in messages relative often and with your proposal, you have to intensivelly use a escaping. Regards Pavel > -- > Itagaki Takahiro >
> Let's make format support %s, %i, and %l, as well as > allowing things like %$3l (meaning, escape the third argument as a > literal and interpolate here), and call it good. Your idea is: % [ $ pos ] format -- ex. %$3l , %l Escapes: %% => % Just for information, $ and pos are reversed in C sprintf. % [ pos $ ] format -- ex. %3$l , %l Escapes: %% => % IMHO, I like {} syntax as like as C# because the format strings are extensible. { pos [ : format ] } -- ex {3:l}, {3} (,and {l} could be also supported) Escapes: {{ => {, }} => } -- Itagaki Takahiro
On Wed, Oct 13, 2010 at 10:03 PM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote: > On Thu, Oct 14, 2010 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Sep 29, 2010 at 3:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> [ updated patch, in response to a review from Itagaki Takahiro ] >> >> This patch appears to be waiting for a second round of review. >> Itagaki-san, are you planning to do that? > > I can, but I was waiting for other people's comments about the design: > - format() in core, that implements %s, %i, and %l. > - substitute() for $n format and sprintf() that partially implements > the same function in C in contrib/stringfunc. > > I don't like having three similar functions for the same purpose, > but Pavel said they are the best solutions. What will be our consensus? <rereads thread> I agree with you. I think we should pick one implementation and just go with it. There's nothing to say that Pavel can't distribute his own code however he likes, but I don't think there's any compelling reason for us to carry all that code in the main tree, even in /contrib. Let's make format support %s, %i, and %l, as well as allowing things like %$3l (meaning, escape the third argument as a literal and interpolate here), and call it good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2010/10/14 Robert Haas <robertmhaas@gmail.com>: > On Wed, Oct 13, 2010 at 10:03 PM, Itagaki Takahiro > <itagaki.takahiro@gmail.com> wrote: >> On Thu, Oct 14, 2010 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Wed, Sep 29, 2010 at 3:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>>> [ updated patch, in response to a review from Itagaki Takahiro ] >>> >>> This patch appears to be waiting for a second round of review. >>> Itagaki-san, are you planning to do that? >> >> I can, but I was waiting for other people's comments about the design: >> - format() in core, that implements %s, %i, and %l. >> - substitute() for $n format and sprintf() that partially implements >> the same function in C in contrib/stringfunc. >> >> I don't like having three similar functions for the same purpose, >> but Pavel said they are the best solutions. What will be our consensus? > > <rereads thread> > > I agree with you. I think we should pick one implementation and just > go with it. There's nothing to say that Pavel can't distribute his > own code however he likes, but I don't think there's any compelling > reason for us to carry all that code in the main tree, even in > /contrib. Let's make format support %s, %i, and %l, as well as > allowing things like %$3l (meaning, escape the third argument as a > literal and interpolate here), and call it good. > my objection to put all to one functions was a format complexity and little bit less readability - like %$3. But I am not strong it this. Regards Pavel Stehule > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
On Wed, Sep 29, 2010 at 3:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > [ updated patch, in response to a review from Itagaki Takahiro ] This patch appears to be waiting for a second round of review. Itagaki-san, are you planning to do that? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 14, 2010 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Sep 29, 2010 at 3:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> [ updated patch, in response to a review from Itagaki Takahiro ] > > This patch appears to be waiting for a second round of review. > Itagaki-san, are you planning to do that? I can, but I was waiting for other people's comments about the design: - format() in core, that implements %s, %i, and %l.- substitute() for $n format and sprintf() that partially implements the same function in C in contrib/stringfunc. I don't like having three similar functions for the same purpose, but Pavel said they are the best solutions. What will be our consensus? -- Itagaki Takahiro
On Thu, Oct 14, 2010 at 2:25 AM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote: >> Let's make format support %s, %i, and %l, as well as >> allowing things like %$3l (meaning, escape the third argument as a >> literal and interpolate here), and call it good. > > Your idea is: > % [ $ pos ] format -- ex. %$3l , %l > Escapes: %% => % > > Just for information, $ and pos are reversed in C sprintf. > % [ pos $ ] format -- ex. %3$l , %l > Escapes: %% => % Oh, woops. I intended to copy the way C works. > IMHO, I like {} syntax as like as C# because the format strings are extensible. > { pos [ : format ] } -- ex {3:l}, {3} (, and {l} could be also supported) > Escapes: {{ => {, }} => } My personal preference (and Pavel's, I guess) is to use the C-like syntax. But I wouldn't be upset if consensus congealed around some other option. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 10/14/2010 08:25 AM, Robert Haas wrote: > >> IMHO, I like {} syntax as like as C# because the format strings are extensible. >> { pos [ : format ] } -- ex {3:l}, {3} (, and {l} could be also supported) >> Escapes: {{ => {, }} => } > My personal preference (and Pavel's, I guess) is to use the C-like > syntax. But I wouldn't be upset if consensus congealed around some > other option. They're both somewhat arcane. But I think the C syntax is likely to be more familiar to a wider group of users (including, for example, perl hackers) than the C# syntax, and is to be preferred on those grounds alone. cheers andrew
On Thu, Oct 14, 2010 at 9:50 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > They're both somewhat arcane. But I think the C syntax is likely to be more > familiar to a wider group of users (including, for example, perl hackers) > than the C# syntax, and is to be preferred on those grounds alone. OK, probably C syntax is the best design. Then, let's merge format() and substitute() in the latest patch. I have a comment about %i for identifier format. %i is also used in printf(), so it would be better to choose another character, like %I. (%l is ok, but would be %L if we choose %I.) Implementation for sprintf() in strincfunc might not be used now, but it will be a conflict when we also merge it to format() in the future. -- Itagaki Takahiro
2010/10/15 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > On Thu, Oct 14, 2010 at 9:50 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> They're both somewhat arcane. But I think the C syntax is likely to be more >> familiar to a wider group of users (including, for example, perl hackers) >> than the C# syntax, and is to be preferred on those grounds alone. > > OK, probably C syntax is the best design. > Then, let's merge format() and substitute() in the latest patch. > > I have a comment about %i for identifier format. %i is also used in > printf(), so it would be better to choose another character, like %I. > (%l is ok, but would be %L if we choose %I.) > Implementation for sprintf() in strincfunc might not be used now, > but it will be a conflict when we also merge it to format() in the future. > I like to see only lower case chars for tags. Using upper chars isn't a good idea. The system in tags must not be same as I designed, but there should be clean relation between tag and semantic. The current system was simple %s string, %i identifier %l literal - if you don't like %l or %i, then maybe %ls or %is - like "literal string" or "ident string". I don't think so merging sprintf and format can be good. Sprintf is too complex - so long years users don't know specification well and creating some like sprintf function can be messy for users. I like to see accurate sprintf function in contrib - and some else in core. Regards Pavel Stehule > -- > Itagaki Takahiro >
On Fri, Oct 15, 2010 at 12:59 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > then maybe %ls or %is - like "literal string" or "ident string". Yeah, good idea! > I don't think so merging sprintf and format can be good. Sprintf is > too complex - so long years users don't know specification well and > creating some like sprintf function can be messy for users. I like to > see accurate sprintf function in contrib - and some else in core. I agree that full-spec sprintf is too complex, but precision and zero-full for numeric types are commonly used. I think someone will ask us "Why don't have numeric formats though we have %s?". -- Itagaki Takahiro
2010/10/15 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > On Fri, Oct 15, 2010 at 12:59 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> then maybe %ls or %is - like "literal string" or "ident string". > > Yeah, good idea! > >> I don't think so merging sprintf and format can be good. Sprintf is >> too complex - so long years users don't know specification well and >> creating some like sprintf function can be messy for users. I like to >> see accurate sprintf function in contrib - and some else in core. > > I agree that full-spec sprintf is too complex, but precision and > zero-full for numeric types are commonly used. I think someone > will ask us "Why don't have numeric formats though we have %s?". And the reply is - we have function to_char. I don't see any reason why we have to have two independent formatting systems. Pavel > > -- > Itagaki Takahiro >
On Fri, Oct 15, 2010 at 8:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > And the reply is - we have function to_char. I don't see any reason > why we have to have two independent formatting systems. The formats for literal and identifier can be replaced to quote_nullable() and quote_ident(), too. Features to write simple queries are constantly in demand. -- Itagaki Takahiro
On Fri, Oct 15, 2010 at 12:55 AM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote: > On Fri, Oct 15, 2010 at 12:59 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> then maybe %ls or %is - like "literal string" or "ident string". > > Yeah, good idea! -1 from me. What does this do except make it more long-winded? >> I don't think so merging sprintf and format can be good. Sprintf is >> too complex - so long years users don't know specification well and >> creating some like sprintf function can be messy for users. I like to >> see accurate sprintf function in contrib - and some else in core. > > I agree that full-spec sprintf is too complex, but precision and > zero-full for numeric types are commonly used. I think someone > will ask us "Why don't have numeric formats though we have %s?". I think someone might also ask - why are you bothering to create this at all? The amount of work that has been put into this is, IMHO, far out of proportion to the value of the feature. As Pavel points out, we already have perfectly good mechanisms for converting our various data types to text. We do not need to invent new ones. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Oct 15, 2010 at 12:55 AM, Itagaki Takahiro > <itagaki.takahiro@gmail.com> wrote: >> I agree that full-spec sprintf is too complex, but precision and >> zero-full for numeric types are commonly used. I think someone >> will ask us "Why don't have numeric formats though we have %s?". > I think someone might also ask - why are you bothering to create this > at all? The amount of work that has been put into this is, IMHO, far > out of proportion to the value of the feature. As Pavel points out, > we already have perfectly good mechanisms for converting our various > data types to text. We do not need to invent new ones. I beg to differ. IMO to_char is a lot closer to the "sucks big-time" end of the spectrum than the "perfectly good" end of the spectrum: it's a bad implementation of a crummy design. I think a lot of people would like to have something closer to sprintf-style formatting. I think we should go into this with the idea that it might only do 10% of what sprintf can do initially, but there will be pressure to cover a lot of the other 90% eventually. So it would be a good idea to ensure that we don't make any choices that are gratuitously incompatible with standard sprintf format codes. In particular, I agree with the idea of using %I not %i for identifiers --- in fact I'd go so far as to suggest that all specifiers we invent, rather than borrowing from sprintf, be upper-case. regards, tom lane
On Fri, Oct 15, 2010 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Oct 15, 2010 at 12:55 AM, Itagaki Takahiro >> <itagaki.takahiro@gmail.com> wrote: >>> I agree that full-spec sprintf is too complex, but precision and >>> zero-full for numeric types are commonly used. I think someone >>> will ask us "Why don't have numeric formats though we have %s?". > >> I think someone might also ask - why are you bothering to create this >> at all? The amount of work that has been put into this is, IMHO, far >> out of proportion to the value of the feature. As Pavel points out, >> we already have perfectly good mechanisms for converting our various >> data types to text. We do not need to invent new ones. > > I beg to differ. IMO to_char is a lot closer to the "sucks big-time" > end of the spectrum than the "perfectly good" end of the spectrum: > it's a bad implementation of a crummy design. I think a lot of people > would like to have something closer to sprintf-style formatting. > > I think we should go into this with the idea that it might only do 10% > of what sprintf can do initially, but there will be pressure to cover a > lot of the other 90% eventually. So it would be a good idea to ensure > that we don't make any choices that are gratuitously incompatible with > standard sprintf format codes. In particular, I agree with the idea of > using %I not %i for identifiers --- in fact I'd go so far as to suggest > that all specifiers we invent, rather than borrowing from sprintf, be > upper-case. Hmm. I have a feeling that's going to be a rathole. Among other problems, what about types other than strings and numbers? The thing I most often need to format is a date, and the second most common one is timestamp with time zone. The specification for sprintf is ridiculously complicated with just the things C has as built-in types, never mind SQL. Then again, if I'm not the one who has to spend time in the rathole... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Oct 15, 2010 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think we should go into this with the idea that it might only do 10% >> of what sprintf can do initially, but there will be pressure to cover a >> lot of the other 90% eventually. > Hmm. I have a feeling that's going to be a rathole. Among other > problems, what about types other than strings and numbers? I think the general solution is to split those off as subproblems. If you've got a type that has special formatting requirements, you can do sprintf('foo is %s', format_my_type(value, other-arguments)) where format_my_type returns text. (So, in particular, you could use to_char for this if it solved the particular need.) Having said that, it might make sense to provide special case handling of dates and timestamps, since that's definitely the most common case where you might not be satisfied with the default conversion to text. > The specification for sprintf is > ridiculously complicated with just the things C has as built-in types, > never mind SQL. Sure, but an awful lot of those bells and whistles turn out to be handy. Personally I think the field width control options are the main thing that sprintf has got over to_char, so I think we're going to want those sooner rather than later. > Then again, if I'm not the one who has to spend time in the rathole... Yeah, I'm not in a hurry to spend time on it either. I just foresee that somebody will want to, and so I don't want a dead-end definition. regards, tom lane
On Fri, Oct 15, 2010 at 5:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Oct 15, 2010 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I think we should go into this with the idea that it might only do 10% >>> of what sprintf can do initially, but there will be pressure to cover a >>> lot of the other 90% eventually. > >> Hmm. I have a feeling that's going to be a rathole. Among other >> problems, what about types other than strings and numbers? > > I think the general solution is to split those off as subproblems. > If you've got a type that has special formatting requirements, > you can do > > sprintf('foo is %s', format_my_type(value, other-arguments)) > > where format_my_type returns text. (So, in particular, you could use > to_char for this if it solved the particular need.) > > Having said that, it might make sense to provide special case handling > of dates and timestamps, since that's definitely the most common case > where you might not be satisfied with the default conversion to text. > >> The specification for sprintf is >> ridiculously complicated with just the things C has as built-in types, >> never mind SQL. > > Sure, but an awful lot of those bells and whistles turn out to be handy. No doubt. The problem is that we're going to end up with those bells and whistles in two places: in to_char or other type-specific formatting functions, and again in format. > Personally I think the field width control options are the main thing > that sprintf has got over to_char, so I think we're going to want those > sooner rather than later. > >> Then again, if I'm not the one who has to spend time in the rathole... > > Yeah, I'm not in a hurry to spend time on it either. I just foresee > that somebody will want to, and so I don't want a dead-end definition. Perhaps we could design a family of (heh, heh, undocumented) functions called _format_helper(type, text), or something like that. format() would call the format helper for the appropriate type with the datum to be formatted and the sprintf-like format string as arguments. Or maybe that isn't feasible, I'm just brainstorming. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Oct 16, 2010 at 7:29 AM, Robert Haas <robertmhaas@gmail.com> wrote: > No doubt. The problem is that we're going to end up with those bells > and whistles in two places: in to_char or other type-specific > formatting functions, and again in format. If we decide to use C-like sprintf(), I think the only thing we can do is to implement C-syntax as much as possible. Users will expect the function behaves as sprintf, because it has the similar syntax. It's not an item for now, but someone would request it at a future date. BTW, the interoperability is why I proposed {} syntax. For example, {1:YYYY-MM-DD} for date is expanded to to_char($1, 'YYYY-MM-DD'). (Maybe it's not so easy; It requires function lookups depending on types.) -- Itagaki Takahiro
2010/10/18 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > On Sat, Oct 16, 2010 at 7:29 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> No doubt. The problem is that we're going to end up with those bells >> and whistles in two places: in to_char or other type-specific >> formatting functions, and again in format. > > If we decide to use C-like sprintf(), I think the only thing we can do > is to implement C-syntax as much as possible. Users will expect the > function behaves as sprintf, because it has the similar syntax. > It's not an item for now, but someone would request it at a future date. > yes, it is reason why I wrote two functions - sprintf and format. > > BTW, the interoperability is why I proposed {} syntax. For example, > {1:YYYY-MM-DD} for date is expanded to to_char($1, 'YYYY-MM-DD'). > (Maybe it's not so easy; It requires function lookups depending on types.) why this shorcut is necessary? Regards Pavel Stehule > > -- > Itagaki Takahiro >
On Mon, Oct 18, 2010 at 7:37 AM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote: > On Sat, Oct 16, 2010 at 7:29 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> No doubt. The problem is that we're going to end up with those bells >> and whistles in two places: in to_char or other type-specific >> formatting functions, and again in format. > > If we decide to use C-like sprintf(), I think the only thing we can do > is to implement C-syntax as much as possible. Users will expect the > function behaves as sprintf, because it has the similar syntax. > It's not an item for now, but someone would request it at a future date. > > > BTW, the interoperability is why I proposed {} syntax. For example, > {1:YYYY-MM-DD} for date is expanded to to_char($1, 'YYYY-MM-DD'). > (Maybe it's not so easy; It requires function lookups depending on types.) There's no particular reason why we couldn't make this work with sprintf-type syntax; for example, you could allow %{XYZ} to mean to_char(value, 'XYZ'). But it seems to me that we have agreement that this should start with just %s, %I, %L and allow 3$ or similar in the middle to specify which argument it is. We can then argue about how many more bells and whistles to add later. I would like to bounce this back for rework along the lines described above and ask for a resubmit to the next CF. We are out of time to consider this further for this CF, and clearly it's not ready to go ATM. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2010/10/18, Robert Haas <robertmhaas@gmail.com>: > On Mon, Oct 18, 2010 at 7:37 AM, Itagaki Takahiro > <itagaki.takahiro@gmail.com> wrote: >> On Sat, Oct 16, 2010 at 7:29 AM, Robert Haas <robertmhaas@gmail.com> >> wrote: >>> No doubt. The problem is that we're going to end up with those bells >>> and whistles in two places: in to_char or other type-specific >>> formatting functions, and again in format. >> >> If we decide to use C-like sprintf(), I think the only thing we can do >> is to implement C-syntax as much as possible. Users will expect the >> function behaves as sprintf, because it has the similar syntax. >> It's not an item for now, but someone would request it at a future date. >> >> >> BTW, the interoperability is why I proposed {} syntax. For example, >> {1:YYYY-MM-DD} for date is expanded to to_char($1, 'YYYY-MM-DD'). >> (Maybe it's not so easy; It requires function lookups depending on types.) > > There's no particular reason why we couldn't make this work with > sprintf-type syntax; for example, you could allow %{XYZ} to mean > to_char(value, 'XYZ'). But it seems to me that we have agreement that > this should start with just %s, %I, %L and allow 3$ or similar in the > middle to specify which argument it is. We can then argue about how > many more bells and whistles to add later. so, yes. Can we finish this discus with this result? I'll prepare patch for next commit fest. Next question - what about sprintf function in core? Is living this idea still? Regards Pavel > > I would like to bounce this back for rework along the lines described > above and ask for a resubmit to the next CF. We are out of time to > consider this further for this CF, and clearly it's not ready to go > ATM. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
On Mon, Oct 18, 2010 at 1:07 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > 2010/10/18, Robert Haas <robertmhaas@gmail.com>: >> On Mon, Oct 18, 2010 at 7:37 AM, Itagaki Takahiro >> <itagaki.takahiro@gmail.com> wrote: >>> On Sat, Oct 16, 2010 at 7:29 AM, Robert Haas <robertmhaas@gmail.com> >>> wrote: >>>> No doubt. The problem is that we're going to end up with those bells >>>> and whistles in two places: in to_char or other type-specific >>>> formatting functions, and again in format. >>> >>> If we decide to use C-like sprintf(), I think the only thing we can do >>> is to implement C-syntax as much as possible. Users will expect the >>> function behaves as sprintf, because it has the similar syntax. >>> It's not an item for now, but someone would request it at a future date. >>> >>> >>> BTW, the interoperability is why I proposed {} syntax. For example, >>> {1:YYYY-MM-DD} for date is expanded to to_char($1, 'YYYY-MM-DD'). >>> (Maybe it's not so easy; It requires function lookups depending on types.) >> >> There's no particular reason why we couldn't make this work with >> sprintf-type syntax; for example, you could allow %{XYZ} to mean >> to_char(value, 'XYZ'). But it seems to me that we have agreement that >> this should start with just %s, %I, %L and allow 3$ or similar in the >> middle to specify which argument it is. We can then argue about how >> many more bells and whistles to add later. > > so, yes. Can we finish this discus with this result? I'll prepare > patch for next commit fest. Next question - what about sprintf > function in core? Is living this idea still? I'm indifferent about whether we put it in core or contrib. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company