Re: Adds the parsing of a CREATE SCHEMA statement - Mailing list pgsql-patches

From Fernando Nasser
Subject Re: Adds the parsing of a CREATE SCHEMA statement
Msg-id 3C8102EF.AA8E4715@redhat.com
In response to Adds the parsing of a CREATE SCHEMA statement  (Fernando Nasser <fnasser@redhat.com>)
List pgsql-patches

Tom Lane wrote:
>
> This self-modification of the CreateSchema statement makes my head hurt
> ... isn't there a cleaner way?
>
> I would really like to see us move towards a processing pipeline in
> which parse analysis, rewrite, planning, and execution steps each take
> their input data structures as *read only*.  I know it's not like that
> today, but it needs to be so.  If I were to enumerate the bugs we've had
> in the past because of violations of that rule, I'd still be composing
> this message at dinnertime.  (And I still have a very long to-fix list
> of kluges, workarounds, and memory leaks that are traceable to the lack
> of read-only data structures.)  I really really don't want to see any
> new work introducing new violations of the rule.
>

OK, I will tell you what the problem is and maybe you can help find
a solution.  Here it is:

The CREATE SCHEMA statement is actually a collection of object creation
and privilege statements.  The creation of some objects, or the
assignment of privileges, may depend on the existence of earlier ones.
Following your suggestion, I grouped them into classes and ordered the
classes so that most of the dependencies are resolved.  So far so
good.
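
A minimal sketch of the class-based ordering described above.  The
class names, their relative order (tables, then views, then grants),
and the SchemaElem type are assumptions made for illustration; they are
not taken from the patch.  Within a class the original statement order
is preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element classes; lower values are executed first. */
typedef enum { ELEM_TABLE = 0, ELEM_VIEW = 1, ELEM_GRANT = 2 } ElemClass;

typedef struct
{
    ElemClass   cls;    /* class of the schema element */
    int         seq;    /* position in the original CREATE SCHEMA */
    const char *name;   /* label, for illustration only */
} SchemaElem;

/* Order by class first, then by original position within a class. */
static int
elem_cmp(const void *a, const void *b)
{
    const SchemaElem *x = (const SchemaElem *) a;
    const SchemaElem *y = (const SchemaElem *) b;

    if (x->cls != y->cls)
        return (x->cls < y->cls) ? -1 : 1;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Reorder the elements in place so that earlier classes come first. */
void
order_schema_elems(SchemaElem *elems, size_t n)
{
    assert(elems != NULL || n == 0);
    if (n > 1)
        qsort(elems, n, sizeof(SchemaElem), elem_cmp);
}
```

Ordering by class alone cannot separate two elements of the same class,
which is exactly the case the views below run into.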

But then VIEWs came into play.  We can have:

  CREATE SCHEMA
  AUTHORIZATION HU

  CREATE TABLE STAFF
   (EMPNUM   CHAR(3) NOT NULL UNIQUE,
    EMPNAME  CHAR(20),
    GRADE    DECIMAL(4),
    CITY     CHAR(15))

  CREATE VIEW STAFFV1
           AS SELECT * FROM STAFF
              WHERE  GRADE >= 12

  CREATE VIEW STAFFV2
           AS SELECT * FROM STAFFV1
              WHERE  GRADE >= 12


The creation of the STAFFV1 view requires that the table STAFF has
been created already, i.e., that the CREATE TABLE has been executed
before it can be analyzed (I will explain why below, and that is where
you may be able to help).  Furthermore, the view STAFFV2 requires
that the CREATE VIEW STAFFV1 has been executed already.
That is why I had to iterate in the analyze-execute cycle.

Here is where you can help:  the reason the tables (or views) that a
view is built on must be created before I can analyze the new view is
that the view contains a SELECT statement, and the analysis of a SELECT
statement _opens_ the relation to do some checking on columns etc.
If we could avoid that, I could just analyze the whole CREATE SCHEMA
and then execute the (reordered) resulting commands as a whole.

Here is where it happens:

transformStmt() when called with T_ViewStmt calls itself for the
SELECT statement that defines the view.

transformSelectStmt() is then called, which calls transformFromClause().

transformFromClause() loops calling transformFromClauseItem(), which in
turn calls transformTableEntry().

transformTableEntry() calls addRangeTableEntry(), which is where
the problem is.

addRangeTableEntry() calls heap_open() on the relation we have not
created yet (if we don't iterate as I did) and... BOOM!!!
"relation xxxx does not exist", where xxxx is the relation
previously appearing in the CREATE SCHEMA statement.
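
To make the failure concrete, here is a toy model of the two strategies.
It is only a sketch: Catalog, Elem, analyze() and execute() are
hypothetical stand-ins, not PostgreSQL code.  analyze() merely checks
that the source relation already exists, which stands in for the
heap_open() call made from addRangeTableEntry().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_RELS 16

/* Toy stand-in for the catalog: names of relations that exist. */
typedef struct
{
    const char *names[MAX_RELS];
    int         count;
} Catalog;

/* Toy schema element: relation to create and, for a view, its source. */
typedef struct
{
    const char *name;
    const char *from;   /* NULL for a plain table */
} Elem;

static bool
catalog_has(const Catalog *cat, const char *name)
{
    for (int i = 0; i < cat->count; i++)
        if (strcmp(cat->names[i], name) == 0)
            return true;
    return false;
}

/* Analysis needs the source relation to exist already. */
static bool
analyze(const Catalog *cat, const Elem *e)
{
    return e->from == NULL || catalog_has(cat, e->from);
}

/* Execution registers the new relation in the catalog. */
static void
execute(Catalog *cat, const Elem *e)
{
    assert(cat->count < MAX_RELS);
    cat->names[cat->count++] = e->name;
}

/* Analyze everything, then execute.  Returns the index of the first
 * element whose analysis fails, or -1 on success. */
int
analyze_all_then_execute(Catalog *cat, const Elem *elems, int n)
{
    for (int i = 0; i < n; i++)
        if (!analyze(cat, &elems[i]))
            return i;
    for (int i = 0; i < n; i++)
        execute(cat, &elems[i]);
    return -1;
}

/* Analyze and execute one element at a time.  Same return convention. */
int
analyze_execute_interleaved(Catalog *cat, const Elem *elems, int n)
{
    for (int i = 0; i < n; i++)
    {
        if (!analyze(cat, &elems[i]))
            return i;
        execute(cat, &elems[i]);
    }
    return -1;
}
```

With the elements STAFF, STAFFV1 (on STAFF) and STAFFV2 (on STAFFV1)
and an empty catalog, the first function stops at STAFFV1, while the
interleaved one creates all three.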


addRangeTableEntry() has the following comment at the top:

/*
 * Add an entry for a relation to the pstate's range table (p_rtable).
 *
 * If pstate is NULL, we just build an RTE and return it without adding it
 * to an rtable list.
 *
 * Note: formerly this checked for refname conflicts, but that's wrong.
 * Caller is responsible for checking for conflicts in the appropriate scope.
 */

Is there any way we can parse views without doing this?  If we can, we
can avoid the iteration between analyze and execute.


Another thing: can this happen again with other objects when we add them
to the list of things that can go into a CREATE SCHEMA?  I.e., can the
analyze phase of one object depend on another object from the same
CREATE SCHEMA statement having been created (its statement executed)
already?


I felt sort of funny about keeping the state on the CreateSchemaStmt.
But this statement is a really weird one and the other possibilities
seemed worse.  Everything considered, it was still the most logical
place.
I guess the only way to avoid this is to get rid of all heap_open()
calls in the analyze phase, at least for statements that can appear in a
create schema statement.
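
Purely as an illustration of what "no heap_open() during analysis" could
look like, the sketch below resolves a relation name against two lists:
relations that already exist, and relations that the enclosing CREATE
SCHEMA is going to create.  Everything here (RelStatus, resolve_rel(),
the two name lists) is hypothetical.  It covers only the existence
check; the column checking that the analysis also does would still need
the pending relation's definition from somewhere.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result of resolving a relation name during analysis. */
typedef enum { REL_EXISTS, REL_PENDING, REL_MISSING } RelStatus;

/* Return 1 if name appears in the list of n names, else 0. */
static int
name_in(const char *const *names, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(names[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Resolve a name without opening the relation: existing relations win,
 * then relations the same CREATE SCHEMA will create, else missing.
 */
RelStatus
resolve_rel(const char *const *catalog, size_t ncatalog,
            const char *const *pending, size_t npending,
            const char *name)
{
    if (name_in(catalog, ncatalog, name))
        return REL_EXISTS;
    if (name_in(pending, npending, name))
        return REL_PENDING;
    return REL_MISSING;
}
```

A REL_PENDING result would mean the reference has to be checked again at
execution time, once the earlier statements have actually run.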


Regards,
Fernando

--
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9
