Thread: Questions about parsing boolean and casting to anyelement
The pg_autovacuum system catalog will be deprecated in 8.4, but my customers use them to control autovacuum to emulate maintenance window. So, I'm trying to re-implement the catalog using a VIEW and RULEs in 8.4. The attached is a WIP script, but I have some questions around it: (XXX: I don't mean to propose the script in the core.) - Postgres interprets 'on' as true and 'off' as false in configuration parameters, but they are not accepted in sql-boolean. Is it a design? or should we add a parser for 'on' and 'off' ? I'd like to allow 'on' and 'off' in sql-boolean, too. - The input strings are stored as-is in pg_class.reloptions. So, mixed values could be shown in reloptions. For example autovacuum_enabled=0/1/on/off/true/false . Should we canonicalize them? However, I think the current behavior is not so bad because it can preserve user inputs. - Are there any limitations in casting to anyelement? I got an error when I define the 3rd argument of array_find() as anyelement: ERROR: UNION types text and integer cannot be matched Even if I use casts, it seems to be ignored. CREATE FUNCTION array_find(text[], text, anyelement) RETURNS anyelement AS $$ SELECT substring(i from E'\\W*=(.*)')::anyelement FROM unnest($1) AS t(i) WHERE i LIKE $2 || '=%' UNION ALL SELECT $3 LIMIT 1 $$ LANGUAGE sql IMMUTABLE STRICT; Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
Attachment
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: > - Are there any limitations in casting to anyelement? It's a no-op ... probably we shouldn't even let you do it, if the lack of an error leaves room for such misinterpretation as this. anyelement and friends are placeholders for use in function declarations, not real types that it makes sense to cast to. > CREATE FUNCTION array_find(text[], text, anyelement) > RETURNS anyelement AS > $$ > SELECT substring(i from E'\\W*=(.*)')::anyelement > FROM unnest($1) AS t(i) WHERE i LIKE $2 || '=%' > UNION ALL SELECT $3 LIMIT 1 > $$ > LANGUAGE sql IMMUTABLE STRICT; The substring() necessarily produces type text, so I dunno why you think $3 needs to be anything but text. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: > > - Are there any limitations in casting to anyelement? > > It's a no-op ... probably we shouldn't even let you do it, if the > lack of an error leaves room for such misinterpretation as this. > anyelement and friends are placeholders for use in function > declarations, not real types that it makes sense to cast to. I hope anyelement could be used in cast because casts are supported by almost programming languages where template or generics are available. Moreover, we can cast to anyelement if we use C functions; using oid of anyelement in runtime and querying an associated cast function from system catalog. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: >>> - Are there any limitations in casting to anyelement? >> >> It's a no-op ... probably we shouldn't even let you do it, if the >> lack of an error leaves room for such misinterpretation as this. >> anyelement and friends are placeholders for use in function >> declarations, not real types that it makes sense to cast to. > I hope anyelement could be used in cast because casts are supported by > almost programming languages where template or generics are available. I think what you're suggesting is that inside a polymorphic function, anyelement would somehow be a macro for the type that the function's current anyelement parameter(s) have. It's an interesting idea but it's just fantasy at the moment; I don't even have an idea of how we might implement that. In the meantime I'm more convinced than ever that we should throw an error for attempting such a cast. If people are imagining that it will do something like that, we need to disillusion them. regards, tom lane
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote: > - Postgres interprets 'on' as true and 'off' as false in configuration > parameters, but they are not accepted in sql-boolean. > Is it a design? or should we add a parser for 'on' and 'off' ? > I'd like to allow 'on' and 'off' in sql-boolean, too. Here is a patch to allow 'on' and 'off' as input texts for boolean. Duplicated boolean parsers in parse_bool() and boolin() are merged into a new parse_bool_with_len(). I think the change is useful when we treat reloptions in programs. Since human-readable texts are not suitable for programs, we would need conversions from text to boolean. Then the shared parser works well. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
Attachment
On Mon, Feb 16, 2009 at 08:03:33PM -0500, Tom Lane wrote: > ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: > > I hope anyelement could be used in cast because casts are supported by > > almost programming languages where template or generics are available. Programming languages with "generics" (aka, parametric polymorphism in literature) should mean that you need *less* casts because the type system is expressive enough that you don't need to "escape" through a cast. > I think what you're suggesting is that inside a polymorphic function, > anyelement would somehow be a macro for the type that the function's > current anyelement parameter(s) have. It's an interesting idea but > it's just fantasy at the moment; I don't even have an idea of how we > might implement that. A couple of solutions would immediately present themselves; making functions first class objects and introducing something called "type classes" (please note these bear little resemblance to "classes" in object orientated programming). If functions were first class objects; you could pass in the "input" function (i.e. boolin, or numeric_in) to the "array_find" function directly call it in place of the "magic" cast syntax (magic because it has to figure out the type of the LHS, whereas if it was a function with known type then it wouldn't need to infer the source type). Type classes[1][2] are a general mechanism for making the "magic" above tractable and sound. The cast above would be exactly analogous to the "read" function in Haskell, and is used very regularly in most code. > In the meantime I'm more convinced than ever that we should throw an > error for attempting such a cast. If people are imagining that it will > do something like that, we need to disillusion them. Yes, sounds sensible at the moment. -- Sam http://samason.me.uk/[1] http://portal.acm.org/citation.cfm?id=75277.75283 is the original paper[2] http://portal.acm.org/citation.cfm?id=141536 extends them to have multiple type parameters, not for PG but nice to knowit's been done before and isn't new ground
I wrote: > ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: >> I hope anyelement could be used in cast because casts are supported by >> almost programming languages where template or generics are available. > I think what you're suggesting is that inside a polymorphic function, > anyelement would somehow be a macro for the type that the function's > current anyelement parameter(s) have. It's an interesting idea but > it's just fantasy at the moment; I don't even have an idea of how we > might implement that. After thinking about it for awhile, I don't like the notation anyway --- it's not immediately obvious that a cast to anyelement should mean something like that. What seems more sensible to me is to introduce a function to get the type of an expression, so that you could write something like cast(expression as typeof(expression)) This special function would act like C's sizeof and similar constructs in that its argument would never be evaluated, only inspected at parse time to determine its type. (There are already precedents for this in SQL; see the IS OF construct.) So the original requirement would be met with something like "expression::typeof($1)". A small disadvantage of this approach is that it's notationally a bit uglier for anyelement/anyarray pairs. For example, consider a function "foo(anyelement) returns anyarray". To get at the element type you just say typeof($1), but if you have to name the array type you need a hack like typeof(array[$1]). In the other direction (name the element type of a parameter array) something like typeof($1[1]) would work. The countervailing advantage is that this solves a lot of problems that overloading anyelement wouldn't ever solve, since you can get at the type of any expression not just a bare parameter. Also I think it'd be relatively easy to stick into the parser; it wouldn't require introduction of any new parse-time context information. Anyway, none of this is material for 8.4, just a possible TODO item. regards, tom lane
On Wed, Feb 18, 2009 at 2:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > After thinking about it for awhile, I don't like the notation anyway > --- it's not immediately obvious that a cast to anyelement should mean > something like that. What seems more sensible to me is to introduce > a function to get the type of an expression, so that you could write > something like We already have such a function, pg_typeof(). I submitted a patch for it in the November commitfest, and you committed it. [1] Or is that not the sort of function you were thinking of? Cheers, BJ [1] http://git.postgresql.org/?p=postgresql.git;a=commit;h=1a850edf036a1c7dbb9f4fcfeae1e5f2c68cf049
Brendan Jurd <direvus@gmail.com> writes: > On Wed, Feb 18, 2009 at 2:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> After thinking about it for awhile, I don't like the notation anyway >> --- it's not immediately obvious that a cast to anyelement should mean >> something like that. What seems more sensible to me is to introduce >> a function to get the type of an expression, so that you could write >> something like > We already have such a function, pg_typeof(). No, pg_typeof is a more-or-less ordinary function that delivers an OID at runtime. What we need here is something that will work as a CAST target, ie, it has to be treated as a type name at parse time. regards, tom lane
2009/2/17 Tom Lane <tgl@sss.pgh.pa.us>: > I wrote: >> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: >>> I hope anyelement could be used in cast because casts are supported by >>> almost programming languages where template or generics are available. > >> I think what you're suggesting is that inside a polymorphic function, >> anyelement would somehow be a macro for the type that the function's >> current anyelement parameter(s) have. It's an interesting idea but >> it's just fantasy at the moment; I don't even have an idea of how we >> might implement that. > > After thinking about it for awhile, I don't like the notation anyway > --- it's not immediately obvious that a cast to anyelement should mean > something like that. What seems more sensible to me is to introduce > a function to get the type of an expression, so that you could write > something like > > cast(expression as typeof(expression)) > > This special function would act like C's sizeof and similar constructs > in that its argument would never be evaluated, only inspected at parse > time to determine its type. (There are already precedents for this in > SQL; see the IS OF construct.) So the original requirement would be > met with something like "expression::typeof($1)". > > A small disadvantage of this approach is that it's notationally a bit > uglier for anyelement/anyarray pairs. For example, consider a function > "foo(anyelement) returns anyarray". To get at the element type you just > say typeof($1), but if you have to name the array type you need a hack > like typeof(array[$1]). In the other direction (name the element type > of a parameter array) something like typeof($1[1]) would work. > > The countervailing advantage is that this solves a lot of problems that > overloading anyelement wouldn't ever solve, since you can get at the > type of any expression not just a bare parameter. > > Also I think it'd be relatively easy to stick into the parser; it > wouldn't require introduction of any new parse-time context information. > > Anyway, none of this is material for 8.4, just a possible TODO item. it's look like good idea regards Pavel Stehule > > regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
I wrote: > In the meantime I'm more convinced than ever that we should throw an > error for attempting such a cast. If people are imagining that it will > do something like that, we need to disillusion them. BTW, I wrote up what I thought was a trivial patch to make this happen, and promptly got a regression test failure: CREATE TABLE enumtest_child (parent rainbow REFERENCES enumtest_parent); INSERT INTO enumtest_parent VALUES ('red'); INSERTINTO enumtest_child VALUES ('red'); + ERROR: casting to a polymorphic type such as anyenum is meaningless + LINE 1: ... FROM ONLY "public"."enumtest_parent" x WHERE "id"::pg_catal... + ^ + QUERY: SELECT 1 FROM ONLY "public"."enumtest_parent" x WHERE "id"::pg_catalog.anyenum OPERATOR(pg_catalog.=) $1::pg_catalog.anyenumFOR SHARE OF x INSERT INTO enumtest_child VALUES ('blue'); -- fail What is happening is that the code to generate RI check queries is blindly casting to the declared input type of the operator it's selected, which here is "anyenum = anyenum". We could easily prevent it from doing that for polymorphic input types; but since I tripped over this case almost immediately, I'm wondering what other cases might be out there that would get broken by throwing this error. Seeing that this type of confusion hasn't come up before, I think it might be better to leave things alone here. regards, tom lane
ITAGAKI Takahiro wrote: > Here is a patch to allow 'on' and 'off' as input texts for boolean. > Duplicated boolean parsers in parse_bool() and boolin() are merged > into a new parse_bool_with_len(). Regarding your FIXME comment, I think parse_bool* should be in bool.c and declared in builtins.h, which guc.c already includes. (Conceptually, the valid format of a bool should be drived by the boolean type, not the GUC system, I think.)
Peter Eisentraut <peter_e@gmx.net> wrote: > ITAGAKI Takahiro wrote: > > Here is a patch to allow 'on' and 'off' as input texts for boolean. > > Regarding your FIXME comment, I think parse_bool* should be in bool.c > and declared in builtins.h, which guc.c already includes. > (Conceptually, the valid format of a bool should be drived by the > boolean type, not the GUC system, I think.) Here is an updated patch to move parse_bool* into bool.c. I also added tests of on/off values to the regression test. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
Attachment
ITAGAKI Takahiro wrote: > Peter Eisentraut <peter_e@gmx.net> wrote: > >> ITAGAKI Takahiro wrote: >>> Here is a patch to allow 'on' and 'off' as input texts for boolean. >> Regarding your FIXME comment, I think parse_bool* should be in bool.c >> and declared in builtins.h, which guc.c already includes. >> (Conceptually, the valid format of a bool should be drived by the >> boolean type, not the GUC system, I think.) > > Here is an updated patch to move parse_bool* into bool.c. > I also added tests of on/off values to the regression test. applied