Thread: Re: contribute pg_get_viewdef2 et al

Re: contribute pg_get_viewdef2 et al

From
"Dave Page"
Date:

> -----Original Message-----
> From: Andreas Pflug [mailto:Andreas.Pflug@web.de]
> Sent: 07 May 2003 12:54
> To: Dave Page; pgadmin-hackers@postgresql.org; Tom Lane
> Subject: contribute pg_get_viewdef2 et al
>
>
> Dave Page wrote:
>
> I can imagine... That's why I didn't rely on the existence of the
> function, but have a fully functional fallback solution.

Which is good, but does make the code more complex. More importantly,
consider what could happen if the user did a database upgrade to a
version with different catalogues or whatever, but didn't upgrade the
extra functions to new versions, just left the old libs there and
reloaded their database. Or if we used a view or a plpgsql function that
had the same problem.

I'm not trying to be a pain in the *&% but I've been here before and
don't particularly want to go through the ensuing pain again.

> >
> Well that doesn't make sense at all. Indentation and line
> formatting can
> be done quite well on the client side, as it is implemented now.

I never really go a chance to look at the backend code, but I kinda
figured it builds the SQL in a vaguely similar way to what we do (though
obviously getting the info from the internal representation of the
object rather than the system catalogues), and that the new functions
could do the same just with appropriate \n's and spaces added for
formatting.

Or does pg_get_xxxdef2 just reformat pg_getxxxdef's output?

<snip views>

> The second version will create exactly the same plan as the
> first, and
> this view isn't the largest view I have to deal with...
> As you can see, there are still some parentheses which could
> be omitted
> (e.g. with multiple bool expressions), but the defensive
> strategy says
> "if not sure, put them in".

I agree your example is, umm, icky, but can you prove that your patch
will not misintepret anything and produce bad output? Once again, isn't
this a case of playing it safe?

> It would even be desirable to reduce the
> number of casts, but unfortunately implicit and explicit
> casts cannot be
> distinguished so that would be really unsafe.

Yes, I do find them annoying sometimes.

> Still, the latter is the second best version: retrieving the original
> source including formatting and comments is what I really want.

Jean-Michel did something like that in an unreleased version of pgAdmin
I. He stored object definitions in the database, and implemented some
dependency tracking to allow much more flexible editting of database
objects. As I recall it worked quite well, but could be made to fall
over if you really tried.

In that case storing the code worked because there were very few ALTER
TABLE options in those days, and no CREATE OR REPLACES, and the code
generally worked by completely recreating the modified object each time
- which the user would always do through pgAdmin. Storing the original
SQL in the catalogues however won't work because you have to let the
user use whatever interface they like, and give them access to the ALTER
statements. This of course means that your SQL is no longer correct as
soon as someone renames a column or table, or performs any such
modification of a named object.

Regards, Dave.


Re: contribute pg_get_viewdef2 et al

From
Andreas Pflug
Date:
Dave Page wrote:

>
>
>>-----Original Message-----
>>From: Andreas Pflug [mailto:Andreas.Pflug@web.de]
>>Sent: 07 May 2003 12:54
>>To: Dave Page; pgadmin-hackers@postgresql.org; Tom Lane
>>Subject: contribute pg_get_viewdef2 et al
>>
>>
>>Dave Page wrote:
>>
>>I can imagine... That's why I didn't rely on the existence of the
>>function, but have a fully functional fallback solution.
>>
>>
>
>Which is good, but does make the code more complex. More importantly,
>consider what could happen if the user did a database upgrade to a
>version with different catalogues or whatever, but didn't upgrade the
>extra functions to new versions, just left the old libs there and
>reloaded their database. Or if we used a view or a plpgsql function that
>had the same problem.
>
>I'm not trying to be a pain in the *&% but I've been here before and
>don't particularly want to go through the ensuing pain again.
>
>
>
>>Well that doesn't make sense at all. Indentation and line
>>formatting can
>>be done quite well on the client side, as it is implemented now.
>>
>>
>
>I never really go a chance to look at the backend code, but I kinda
>figured it builds the SQL in a vaguely similar way to what we do (though
>obviously getting the info from the internal representation of the
>object rather than the system catalogues), and that the new functions
>could do the same just with appropriate \n's and spaces added for
>formatting.
>
>Or does pg_get_xxxdef2 just reformat pg_getxxxdef's output?
>
pg_get_xxxdef2 is a copy of pg_get_xxxdef, with some logic to detect
needed parentheses. The way to create the query from the internal
representation isn't too tricky, just recursively traverse a tree of
expression nodes. The pg_get_xxxdef versions won't try to find out if
the parentheses are really needed, they are added always. I'm using my
function isSimpleNode instead, that's all.

static bool isSimpleNode(Node *node, NodeTag parentNodeType)
{
    if (!node)
    return true;

    switch (nodeTag(node))
    {
    // single words: always simple
    case T_Var:
    case T_Const:
    case T_Param:

    // function-like: name(..) or name[..]
    case T_ArrayRef:
    case T_FuncExpr:
    case T_ArrayExpr:
    case T_CoalesceExpr:
    case T_NullIfExpr:
    case T_Aggref:

        // CASE keywords act as parentheses
    case T_CaseExpr:
        return true;

        // appears simple since . has top precedence, unless parent is T_FieldSelect itself!
    case T_FieldSelect:
        return (parentNodeType == T_FieldSelect ? false : true);


        // maybe simple, check args
    case T_CoerceToDomain:
        return isSimpleNode((Node*) ((CoerceToDomain*)node)->arg, T_CoerceToDomain);
    case T_RelabelType:
        return isSimpleNode((Node*) ((RelabelType*)node)->arg, T_RelabelType);


    // depends on parent node type; needs further checking
    case T_SubLink:
    case T_NullTest:
    case T_BooleanTest:
    case T_OpExpr:
    case T_DistinctExpr:
        if (parentNodeType == T_BoolExpr)
        return true;
        // else check the same as for T_BoolExpr; no break here!
    case T_BoolExpr:
        switch (parentNodeType)
        {
        case T_ArrayRef:
        case T_FuncExpr:
        case T_ArrayExpr:
        case T_CoalesceExpr:
        case T_NullIfExpr:
        case T_Aggref:
        case T_CaseExpr:
            return true;
        default:
            break;
        }
        return false;

        // these are not completely implemented; so far, they're simple
    case T_SubPlan:
    case T_CoerceToDomainValue:
        return true;

    default:
        break;
    }
    // those we don't know: in dubio complexo
    return false;
}

>I agree your example is, umm, icky, but can you prove that your patch
>will not misintepret anything and produce bad output? Once again, isn't
>this a case of playing it safe?
>
How should I do this? How to PROVE software? The old problem.
You can have a look at the code, and say if there's a case when the
isSimpleNode function will falsely return true; in this case a wrong
query might be created. This function is intentionally made VERY simple,
so it shouldn't be too much work. I've sent this Tom, but he didn't
manage to have a look at it, I think.

>..casts..
>Yes, I do find them annoying sometimes.
>
>
Most of them can be omitted, but I haven't found a way to find out those
2 % which where coded explicitely :-(  Both are stored as T_RelabelType.

>In that case storing the code worked because there were very few ALTER
>TABLE options in those days, and no CREATE OR REPLACES, and the code
>generally worked by completely recreating the modified object each time
>- which the user would always do through pgAdmin. Storing the original
>SQL in the catalogues however won't work because you have to let the
>user use whatever interface they like, and give them access to the ALTER
>statements. This of course means that your SQL is no longer correct as
>soon as someone renames a column or table, or performs any such
>modification of a named object.
>
Right, that's why this only works if  the backend stores the query at
the moment it creates the plan. It's absolutely no option to stick users
to a single creation interface to have the query saved as side-effect;
this must be implemented integrally in the backend.

Regards,
Andreas


Re: contribute pg_get_viewdef2 et al

From
"Dave Page"
Date:

> -----Original Message-----
> From: Andreas Pflug [mailto:Andreas.Pflug@web.de]
> Sent: 07 May 2003 14:07
> To: Dave Page; pgadmin-hackers@postgresql.org
> Subject: Re: contribute pg_get_viewdef2 et al
>
>
> Dave Page wrote:
> >I agree your example is, umm, icky, but can you prove that
> your patch
> >will not misintepret anything and produce bad output? Once
> again, isn't
> >this a case of playing it safe?
> >
> How should I do this? How to PROVE software? The old problem.

Well, yes :-)

> You can have a look at the code, and say if there's a case when the
> isSimpleNode function will falsely return true; in this case a wrong
> query might be created. This function is intentionally made
> VERY simple,
> so it shouldn't be too much work. I've sent this Tom, but he didn't
> manage to have a look at it, I think.

I agree it looks pretty straightforward and simple. Try sending it to
the hackers list though - I'll bet Tom gets loads of personally
addressed queries and email everyday, and if he's like me gives
preference to those that used the lists.

> Right, that's why this only works if  the backend stores the query at
> the moment it creates the plan.

Yeah, but it doesn't work. Consider:

CREATE TABLE foo (bar int4);
CREATE VIEW foo_view AS
  SELECT
    bar
  FROM
    foo
  WHERE
    bar > 100;

<store view def>

ALTER TABLE foo ALTER COLUMN bar RENAME TO sheep;

At this point the stored view definition is no longer valid.


> It's absolutely no option to
> stick users
> to a single creation interface to have the query saved as
> side-effect;
> this must be implemented integrally in the backend.

Absolutely. Of course, Jean-Michel's code had very different intentions
for which such a mechanism was more appropriate.

Regards, Dave


Re: contribute pg_get_viewdef2 et al

From
Andreas Pflug
Date:
Hi Dave,

>I agree it looks pretty straightforward and simple. Try sending it to
>the hackers list though - I'll bet Tom gets loads of personally
>addressed queries and email everyday, and if he's like me gives
>preference to those that used the lists.
>
I did so, but had no response. Will try again soon.

>Yeah, but it doesn't work. Consider:
>
>CREATE TABLE foo (bar int4);
>CREATE VIEW foo_view AS
>  SELECT
>    bar
>  FROM
>    foo
>  WHERE
>    bar > 100;
>
><store view def>
>
>ALTER TABLE foo ALTER COLUMN bar RENAME TO sheep;
>
>At this point the stored view definition is no longer valid.
>
>
That's really a matter of philosophy. Taking MSSQL as an example, the
view wouldn't be runnable any more, if tables or columns are renamed. On
the other hand, tables can be dropped and recreated, and the view will
still be runnable because the saved plan is dropped and will be created
from source the first time it is used again.

As a solution in pgsql, there are two ways (combinable)
- Preventing table and column rename, if referenced by rules or views
(ALTER TABLE xx RENAME TO xx2 RESTRICT), just as DROP does
- invalidating the source, so only the reverse-engineered node
representation is available (ALTER TABLE xx RENAME TO xx2 CASCADE)

Regards,

Andreas


Re: contribute pg_get_viewdef2 et al

From
"Dave Page"
Date:

> -----Original Message-----
> From: Andreas Pflug [mailto:Andreas.Pflug@web.de]
> Sent: 07 May 2003 14:57
> To: Dave Page; pgadmin-hackers@postgresql.org
> Subject: Re: contribute pg_get_viewdef2 et al
>
>
> That's really a matter of philosophy. Taking MSSQL as an example, the
> view wouldn't be runnable any more, if tables or columns are
> renamed. On
> the other hand, tables can be dropped and recreated, and the
> view will
> still be runnable because the saved plan is dropped and will
> be created
> from source the first time it is used again.

I can't see many people voting for a change to that behaviour. It's too
big, with too little gain.

> As a solution in pgsql, there are two ways (combinable)
> - Preventing table and column rename, if referenced by rules or views
> (ALTER TABLE xx RENAME TO xx2 RESTRICT), just as DROP does
> - invalidating the source, so only the reverse-engineered node
> representation is available (ALTER TABLE xx RENAME TO xx2 CASCADE)

Possibly. You'd have to raise it on the hackers list, and see what the
response is.

Regards, Dave.