Re: Adds the parsing of a CREATE SCHEMA statement - Mailing list pgsql-patches
From | Fernando Nasser |
---|---|
Subject | Re: Adds the parsing of a CREATE SCHEMA statement |
Date | |
Msg-id | 3C8102EF.AA8E4715@redhat.com |
In response to | Adds the parsing of a CREATE SCHEMA statement (Fernando Nasser <fnasser@redhat.com>) |
List | pgsql-patches |
Tom Lane wrote:
>
> This self-modification of the CreateSchema statement makes my head hurt
> ... isn't there a cleaner way?
>
> I would really like to see us move towards a processing pipeline in
> which parse analysis, rewrite, planning, and execution steps each take
> their input data structures as *read only*. I know it's not like that
> today, but it needs to be so. If I were to enumerate the bugs we've had
> in the past because of violations of that rule, I'd still be composing
> this message at dinnertime. (And I still have a very long to-fix list
> of kluges, workarounds, and memory leaks that are traceable to the lack
> of read-only data structures.) I really really don't want to see any
> new work introducing new violations of the rule.
>

OK, I will tell you what the problem is and maybe you can help
finding a solution.

Here it is:

The CREATE SCHEMA statement is actually a collection of object creation
and privilege statements. What happens is that the creation of some
objects, or the assignment of privileges, may depend on the existence
of previous ones. Following your suggestion, I grouped them in classes
and ordered them so most of the dependencies are solved. So far so good.

But then VIEWs came into play. We can have:

CREATE SCHEMA AUTHORIZATION HU
  CREATE TABLE STAFF (EMPNUM CHAR(3) NOT NULL UNIQUE,
                      EMPNAME CHAR(20), GRADE DECIMAL(4), CITY CHAR(15))
  CREATE VIEW STAFFV1
    AS SELECT * FROM STAFF WHERE GRADE >= 12
  CREATE VIEW STAFFV2
    AS SELECT * FROM STAFF WHERE GRADE >= 12

The creation of the STAFFV1 view requires that the table STAFF has been
created already, i.e., that the CREATE TABLE has been executed before
it can be analyzed (I will explain why below, and that is where you may
be able to help). Furthermore, the view STAFFV2 requires that the
CREATE VIEW STAFFV1 has been executed already.

That is why I had to iterate in the analyze-execute cycle.
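The analyze-execute iteration described above can be sketched with a toy model. This is only an illustration under stated assumptions, not the patch's code: the SchemaElement struct, relation_exists() and run_schema_elements() are hypothetical names, each element is assumed to read from at most one relation, and STAFFV2 is assumed to be defined over STAFFV1, as the text describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model: one element of a CREATE SCHEMA statement. */
typedef struct SchemaElement
{
    const char *name;       /* relation this element creates */
    const char *depends_on; /* relation it reads from, or NULL */
    int         executed;   /* set once the element has been "created" */
} SchemaElement;

/* Toy stand-in for a catalog lookup: has this relation been created yet? */
int
relation_exists(const SchemaElement *elems, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (elems[i].executed && strcmp(elems[i].name, name) == 0)
            return 1;
    return 0;
}

/*
 * Repeat analyze-then-execute passes until a pass creates nothing new.
 * The creation order is recorded in order[]; the return value is the
 * number of elements that could be executed.
 */
size_t
run_schema_elements(SchemaElement *elems, size_t n, const char **order)
{
    size_t done = 0;
    int    progress = 1;

    while (progress)
    {
        progress = 0;
        for (size_t i = 0; i < n; i++)
        {
            if (elems[i].executed)
                continue;

            /* "Analyze": not possible until the referenced relation exists. */
            if (elems[i].depends_on != NULL &&
                !relation_exists(elems, n, elems[i].depends_on))
                continue;

            /* "Execute": the relation becomes visible to later analysis. */
            elems[i].executed = 1;
            order[done++] = elems[i].name;
            progress = 1;
        }
    }
    return done;
}
```

With the elements listed as STAFFV2, STAFFV1, STAFF, the loop needs three passes and creates them in the order STAFF, STAFFV1, STAFFV2. An element whose dependency never appears is simply left unexecuted, which a caller can detect from the return value.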
Here is where you can help: the reason I need the tables (or views)
which are used to create a view to be created before I can analyze a
new view is that it contains a SELECT statement and the analyze of a
SELECT statement _opens_ the relation to do some checking on columns
etc. If we could avoid that, I could just analyze the whole CREATE
SCHEMA and then execute the (reordered) resulting commands as a whole.

Here is where it happens:

transformStmt() when called with T_ViewStmt calls itself for the SELECT
statement that defines the view. transformSelectStmt() is then called,
which calls transformFromClause(). transformFromClause() loops calling
transformFromClauseItem() and it calls transformTableEntry().
transformTableEntry() calls addRangeTableEntry(), which is where the
problem is.

addRangeTableEntry() calls heap_open() on the relation we have not
created yet (if we don't iterate as I did) and... BOOM!!!
"relation xxxx does not exist", where xxxx is the relation previously
appearing in the CREATE SCHEMA statement.

addRangeTableEntry() has the following comment at the top:

/*
 * Add an entry for a relation to the pstate's range table (p_rtable).
 *
 * If pstate is NULL, we just build an RTE and return it without adding it
 * to an rtable list.
 *
 * Note: formerly this checked for refname conflicts, but that's wrong.
 * Caller is responsible for checking for conflicts in the appropriate scope.
 */

Is there any way we can parse views without doing this? If we can,
we can avoid the iteration between analyze-execute.

Another thing: can this happen again with other objects when we add
them to the list of things that can go into a create schema? I.e., can
the analyze phase of an object depend on another from the same create
schema statement to be created (statement executed) already?

I felt sort of funny about keeping the state on the CreateSchemaStmt.
But this statement is a really weird one and the other possibilities
seemed worse.
Everything considered, it was still the most logical place.

I guess the only way to avoid this is to get rid of all heap_open()
calls in the analyze phase, at least for statements that can appear in
a create schema statement.

Regards,
Fernando

--
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9
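The failure mode in the message, where analysis tries to open a relation that an earlier element of the same statement has not created yet, can be modelled roughly as follows. This is a minimal sketch and not PostgreSQL code: ToyElement, ToyCatalog, toy_open_relation() and toy_analyze_all() are hypothetical names, and toy_open_relation() only stands in for the heap_open() step reached through addRangeTableEntry().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of a parsed CREATE TABLE / CREATE VIEW element. */
typedef struct ToyElement
{
    const char *name;     /* relation the element would create */
    const char *from_rel; /* relation named in its FROM clause, or NULL */
} ToyElement;

/* Toy catalog: names of the relations that already exist. */
typedef struct ToyCatalog
{
    const char *names[8];
    size_t      count;
} ToyCatalog;

/* Stand-in for the open step: succeeds only for existing relations. */
int
toy_open_relation(const ToyCatalog *cat, const char *name)
{
    for (size_t i = 0; i < cat->count; i++)
        if (strcmp(cat->names[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Analyze every element before executing any of them.  Returns NULL on
 * success, or the name of the first relation that could not be opened
 * (the "relation xxxx does not exist" case).
 */
const char *
toy_analyze_all(const ToyCatalog *cat, const ToyElement *elems, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (elems[i].from_rel != NULL &&
            !toy_open_relation(cat, elems[i].from_rel))
            return elems[i].from_rel;
    }
    return NULL;
}
```

Against an empty catalog, analyzing STAFF followed by STAFFV1 reports STAFF as missing, because nothing has been executed yet. The same call succeeds once STAFF is already in the catalog, which is the state the analyze-execute iteration produces.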