Thread: Questions about parsing boolean and casting to anyelement

Questions about parsing boolean and casting to anyelement

From
ITAGAKI Takahiro
Date:
The pg_autovacuum system catalog will be deprecated in 8.4,
but my customers use them to control autovacuum to emulate
maintenance window. So, I'm trying to re-implement the catalog
using a VIEW and RULEs in 8.4.

The attached is a WIP script, but I have some questions around it:
(XXX: I don't mean to propose the script in the core.)

  - Postgres interprets 'on' as true and 'off' as false in configuration
    parameters, but they are not accepted in sql-boolean.
    Is it a design? or should we add a parser for 'on' and 'off' ?
    I'd like to allow 'on' and 'off' in sql-boolean, too.

  - The input strings are stored as-is in pg_class.reloptions.
    So, mixed values could be shown in reloptions. For example
    autovacuum_enabled=0/1/on/off/true/false .
    Should we canonicalize them? However, I think the current behavior
    is not so bad because it can preserve user inputs.

  - Are there any limitations in casting to anyelement?
    I got an error when I define the 3rd argument of array_find()
    as anyelement:
        ERROR:  UNION types text and integer cannot be matched
    Even if I use casts, it seems to be ignored.

    CREATE FUNCTION array_find(text[], text, anyelement)
    RETURNS anyelement AS
    $$
        SELECT substring(i from E'\\W*=(.*)')::anyelement
          FROM unnest($1) AS t(i) WHERE i LIKE $2 || '=%'
         UNION ALL SELECT $3 LIMIT 1
    $$
    LANGUAGE sql IMMUTABLE STRICT;


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment

Re: Questions about parsing boolean and casting to anyelement

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
>   - Are there any limitations in casting to anyelement?

It's a no-op ... probably we shouldn't even let you do it, if the
lack of an error leaves room for such misinterpretation as this.
anyelement and friends are placeholders for use in function
declarations, not real types that it makes sense to cast to.

>     CREATE FUNCTION array_find(text[], text, anyelement)
>     RETURNS anyelement AS
>     $$
>         SELECT substring(i from E'\\W*=(.*)')::anyelement
>           FROM unnest($1) AS t(i) WHERE i LIKE $2 || '=%'
>          UNION ALL SELECT $3 LIMIT 1
>     $$
>     LANGUAGE sql IMMUTABLE STRICT;

The substring() necessarily produces type text, so I dunno why you
think $3 needs to be anything but text.
        regards, tom lane


Re: Questions about parsing boolean and casting to anyelement

From
ITAGAKI Takahiro
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> >   - Are there any limitations in casting to anyelement?
> 
> It's a no-op ... probably we shouldn't even let you do it, if the
> lack of an error leaves room for such misinterpretation as this.
> anyelement and friends are placeholders for use in function
> declarations, not real types that it makes sense to cast to.

I hope anyelement could be used in cast because casts are supported by
almost programming languages where template or generics are available.
Moreover, we can cast to anyelement if we use C functions; using oid of
anyelement in runtime and querying an associated cast function from
system catalog.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: Questions about parsing boolean and casting to anyelement

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
>>> - Are there any limitations in casting to anyelement?
>> 
>> It's a no-op ... probably we shouldn't even let you do it, if the
>> lack of an error leaves room for such misinterpretation as this.
>> anyelement and friends are placeholders for use in function
>> declarations, not real types that it makes sense to cast to.

> I hope anyelement could be used in cast because casts are supported by
> almost programming languages where template or generics are available.

I think what you're suggesting is that inside a polymorphic function,
anyelement would somehow be a macro for the type that the function's
current anyelement parameter(s) have.  It's an interesting idea but
it's just fantasy at the moment; I don't even have an idea of how we
might implement that.

In the meantime I'm more convinced than ever that we should throw an
error for attempting such a cast.  If people are imagining that it will
do something like that, we need to disillusion them.
        regards, tom lane


Allow on/off as input texts for boolean.

From
ITAGAKI Takahiro
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote:
>   - Postgres interprets 'on' as true and 'off' as false in configuration
>     parameters, but they are not accepted in sql-boolean.
>     Is it a design? or should we add a parser for 'on' and 'off' ?
>     I'd like to allow 'on' and 'off' in sql-boolean, too.

Here is a patch to allow 'on' and 'off' as input texts for boolean.
Duplicated boolean parsers in parse_bool() and boolin() are merged
into a new parse_bool_with_len().

I think the change is useful when we treat reloptions in programs.
Since human-readable texts are not suitable for programs, we would need
conversions from text to boolean. Then the shared parser works well.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: Questions about parsing boolean and casting to anyelement

From
Sam Mason
Date:
On Mon, Feb 16, 2009 at 08:03:33PM -0500, Tom Lane wrote:
> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> > I hope anyelement could be used in cast because casts are supported by
> > almost programming languages where template or generics are available.

Programming languages with "generics" (aka, parametric polymorphism in
literature) should mean that you need *less* casts because the type
system is expressive enough that you don't need to "escape" through a
cast.

> I think what you're suggesting is that inside a polymorphic function,
> anyelement would somehow be a macro for the type that the function's
> current anyelement parameter(s) have.  It's an interesting idea but
> it's just fantasy at the moment; I don't even have an idea of how we
> might implement that.

A couple of solutions would immediately present themselves; making
functions first class objects and introducing something called "type
classes" (please note these bear little resemblance to "classes" in
object orientated programming).

If functions were first class objects; you could pass in the "input"
function (i.e. boolin, or numeric_in) to the "array_find" function
directly call it in place of the "magic" cast syntax (magic because it
has to figure out the type of the LHS, whereas if it was a function with
known type then it wouldn't need to infer the source type).

Type classes[1][2] are a general mechanism for making the "magic" above
tractable and sound.  The cast above would be exactly analogous to the
"read" function in Haskell, and is used very regularly in most code.

> In the meantime I'm more convinced than ever that we should throw an
> error for attempting such a cast.  If people are imagining that it will
> do something like that, we need to disillusion them.

Yes, sounds sensible at the moment.

--  Sam  http://samason.me.uk/[1] http://portal.acm.org/citation.cfm?id=75277.75283    is the original paper[2]
http://portal.acm.org/citation.cfm?id=141536   extends them to have multiple type parameters, not for PG but nice    to
knowit's been done before and isn't new ground
 


Re: Questions about parsing boolean and casting to anyelement

From
Tom Lane
Date:
I wrote:
> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
>> I hope anyelement could be used in cast because casts are supported by
>> almost programming languages where template or generics are available.

> I think what you're suggesting is that inside a polymorphic function,
> anyelement would somehow be a macro for the type that the function's
> current anyelement parameter(s) have.  It's an interesting idea but
> it's just fantasy at the moment; I don't even have an idea of how we
> might implement that.

After thinking about it for awhile, I don't like the notation anyway
--- it's not immediately obvious that a cast to anyelement should mean
something like that.  What seems more sensible to me is to introduce
a function to get the type of an expression, so that you could write
something like
cast(expression as typeof(expression))

This special function would act like C's sizeof and similar constructs
in that its argument would never be evaluated, only inspected at parse
time to determine its type.  (There are already precedents for this in
SQL; see the IS OF construct.)  So the original requirement would be
met with something like "expression::typeof($1)".

A small disadvantage of this approach is that it's notationally a bit
uglier for anyelement/anyarray pairs.  For example, consider a function
"foo(anyelement) returns anyarray".  To get at the element type you just
say typeof($1), but if you have to name the array type you need a hack
like typeof(array[$1]).  In the other direction (name the element type
of a parameter array) something like typeof($1[1]) would work.

The countervailing advantage is that this solves a lot of problems that
overloading anyelement wouldn't ever solve, since you can get at the
type of any expression not just a bare parameter.

Also I think it'd be relatively easy to stick into the parser; it
wouldn't require introduction of any new parse-time context information.

Anyway, none of this is material for 8.4, just a possible TODO item.
        regards, tom lane


Re: Questions about parsing boolean and casting to anyelement

From
Brendan Jurd
Date:
On Wed, Feb 18, 2009 at 2:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> After thinking about it for awhile, I don't like the notation anyway
> --- it's not immediately obvious that a cast to anyelement should mean
> something like that.  What seems more sensible to me is to introduce
> a function to get the type of an expression, so that you could write
> something like

We already have such a function, pg_typeof().  I submitted a patch for
it in the November commitfest, and you committed it. [1]

Or is that not the sort of function you were thinking of?

Cheers,
BJ

[1] http://git.postgresql.org/?p=postgresql.git;a=commit;h=1a850edf036a1c7dbb9f4fcfeae1e5f2c68cf049


Re: Questions about parsing boolean and casting to anyelement

From
Tom Lane
Date:
Brendan Jurd <direvus@gmail.com> writes:
> On Wed, Feb 18, 2009 at 2:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> After thinking about it for awhile, I don't like the notation anyway
>> --- it's not immediately obvious that a cast to anyelement should mean
>> something like that.  What seems more sensible to me is to introduce
>> a function to get the type of an expression, so that you could write
>> something like

> We already have such a function, pg_typeof().

No, pg_typeof is a more-or-less ordinary function that delivers an OID
at runtime.  What we need here is something that will work as a CAST
target, ie, it has to be treated as a type name at parse time.
        regards, tom lane


Re: Questions about parsing boolean and casting to anyelement

From
Pavel Stehule
Date:
2009/2/17 Tom Lane <tgl@sss.pgh.pa.us>:
> I wrote:
>> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
>>> I hope anyelement could be used in cast because casts are supported by
>>> almost programming languages where template or generics are available.
>
>> I think what you're suggesting is that inside a polymorphic function,
>> anyelement would somehow be a macro for the type that the function's
>> current anyelement parameter(s) have.  It's an interesting idea but
>> it's just fantasy at the moment; I don't even have an idea of how we
>> might implement that.
>
> After thinking about it for awhile, I don't like the notation anyway
> --- it's not immediately obvious that a cast to anyelement should mean
> something like that.  What seems more sensible to me is to introduce
> a function to get the type of an expression, so that you could write
> something like
>
>        cast(expression as typeof(expression))
>
> This special function would act like C's sizeof and similar constructs
> in that its argument would never be evaluated, only inspected at parse
> time to determine its type.  (There are already precedents for this in
> SQL; see the IS OF construct.)  So the original requirement would be
> met with something like "expression::typeof($1)".
>
> A small disadvantage of this approach is that it's notationally a bit
> uglier for anyelement/anyarray pairs.  For example, consider a function
> "foo(anyelement) returns anyarray".  To get at the element type you just
> say typeof($1), but if you have to name the array type you need a hack
> like typeof(array[$1]).  In the other direction (name the element type
> of a parameter array) something like typeof($1[1]) would work.
>
> The countervailing advantage is that this solves a lot of problems that
> overloading anyelement wouldn't ever solve, since you can get at the
> type of any expression not just a bare parameter.
>
> Also I think it'd be relatively easy to stick into the parser; it
> wouldn't require introduction of any new parse-time context information.
>
> Anyway, none of this is material for 8.4, just a possible TODO item.

it's look like good idea

regards
Pavel Stehule

>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: Questions about parsing boolean and casting to anyelement

From
Tom Lane
Date:
I wrote:
> In the meantime I'm more convinced than ever that we should throw an
> error for attempting such a cast.  If people are imagining that it will
> do something like that, we need to disillusion them.

BTW, I wrote up what I thought was a trivial patch to make this happen,
and promptly got a regression test failure:
 CREATE TABLE enumtest_child (parent rainbow REFERENCES enumtest_parent); INSERT INTO enumtest_parent VALUES ('red');
INSERTINTO enumtest_child VALUES ('red');
 
+ ERROR:  casting to a polymorphic type such as anyenum is meaningless
+ LINE 1: ... FROM ONLY "public"."enumtest_parent" x WHERE "id"::pg_catal...
+                                                              ^
+ QUERY:  SELECT 1 FROM ONLY "public"."enumtest_parent" x WHERE "id"::pg_catalog.anyenum OPERATOR(pg_catalog.=)
$1::pg_catalog.anyenumFOR SHARE OF x INSERT INTO enumtest_child VALUES ('blue');  -- fail
 

What is happening is that the code to generate RI check queries is
blindly casting to the declared input type of the operator it's
selected, which here is "anyenum = anyenum".  We could easily prevent
it from doing that for polymorphic input types; but since I tripped over
this case almost immediately, I'm wondering what other cases might be
out there that would get broken by throwing this error.

Seeing that this type of confusion hasn't come up before, I think it
might be better to leave things alone here.
        regards, tom lane


Re: Allow on/off as input texts for boolean.

From
Peter Eisentraut
Date:
ITAGAKI Takahiro wrote:
> Here is a patch to allow 'on' and 'off' as input texts for boolean.
> Duplicated boolean parsers in parse_bool() and boolin() are merged
> into a new parse_bool_with_len().

Regarding your FIXME comment, I think parse_bool* should be in bool.c 
and declared in builtins.h, which guc.c already includes. 
(Conceptually, the valid format of a bool should be drived by the 
boolean type, not the GUC system, I think.)


Re: Allow on/off as input texts for boolean.

From
ITAGAKI Takahiro
Date:
Peter Eisentraut <peter_e@gmx.net> wrote:

> ITAGAKI Takahiro wrote:
> > Here is a patch to allow 'on' and 'off' as input texts for boolean.
>
> Regarding your FIXME comment, I think parse_bool* should be in bool.c
> and declared in builtins.h, which guc.c already includes.
> (Conceptually, the valid format of a bool should be drived by the
> boolean type, not the GUC system, I think.)

Here is an updated patch to move parse_bool* into bool.c.
I also added tests of on/off values to the regression test.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: Allow on/off as input texts for boolean.

From
Peter Eisentraut
Date:
ITAGAKI Takahiro wrote:
> Peter Eisentraut <peter_e@gmx.net> wrote:
> 
>> ITAGAKI Takahiro wrote:
>>> Here is a patch to allow 'on' and 'off' as input texts for boolean.
>> Regarding your FIXME comment, I think parse_bool* should be in bool.c 
>> and declared in builtins.h, which guc.c already includes. 
>> (Conceptually, the valid format of a bool should be drived by the 
>> boolean type, not the GUC system, I think.)
> 
> Here is an updated patch to move parse_bool* into bool.c.
> I also added tests of on/off values to the regression test.

applied