Thread: Re: RE: [PATCHES] relation filename patch

Re: RE: [PATCHES] relation filename patch

From
"Ross J. Reedstrom"
Date:
On Sun, Apr 30, 2000 at 10:18:56PM -0400, Bruce Momjian wrote:
> > > If I recognize correctly,RelationGetRelationName() macro returns the
> > > relation name local to the backend and RelationGetPhysicalRelation-
> > > Name() macro returns pg_class entry name currently. Those names
> > > are same unless the relation is a temporary relation. They may be
> > > able to have separate entries in pg_class. I don't know why they
> > > don't currently.
> > 
> > Different backends can have the same temp file names at the same time,
> > so you would have to have a pg_class tuple that is visible only to the
> > current transactions, and allow multiple duplicate ones.  Easier not to
> > have it in pg_class and just hack the syscache code.
> > 

I didn't go to great trouble to keep the existing hack to the syscache
code working for temp tables, because I thought that the primary purpose
of that hack was to make sure the physical relation name, i.e. the file
name, would not collide for multiple backends. Tacking the OID onto
the filename fixes that.  Getting the right table for the user specific
namespace sounds like ... schemas, so I figured I could fix that in the
schema implementation.

However, I see now that the hack to the syscache actually implements
temp table override of an existing persistent relation, as well. Hmm,
is this override SQL92, or an extension? It seems to me that TEMPORARY
tables are not completely defined in the draft I have.  For example,
there's syntax defined for ON COMMIT PRESERVE ROWS, but the only semantics
seem to be some restrictions on check constraint definitions, not how
the temp table rows are supposed to be preserved.

Hmm. Further perusal leads me to believe that this override is a
standards extension, as the clause about the tablename being unique in
the current namespace does not have an exception for temporary tables.
Nothing wrong with that, just making it clear. What's the use/case for
this feature? Does it come from some other DMBS?


> > Second, I trust the patch keeps the on-disk file names similar to the
> > table names.  Doing some all-numeric file names just to work around NT
> > broken-ness is not going to fly.

The patch uses Tom Lane's makeObjectName code, to tack the oid on
the end of the relname, but keep it under NAMEDATALEN by trimming the
relname part.

> 
> I should add I have not seen this patch personally.  If it was sent to a
> mailing list, I somehow missed it.

It went to the patches list, although I'm still not getting mail from
there.  BTW, it's clearly marked 'experimental, not to include in CVS'
I'm not suggesting that this is the way to go: just testing how much
the assumption relname == filename has crept up through the code. Not
too bad, actually.

The bigger problem, for schema implementation, is that the assumption
'relname+dbname' is sufficent to uniquely identify a relation is more
pervasive: it's the basis of the relcache hash, isn't it? That's why I
changed the key for that hash. I'm not sure I got all the places that
make that assumption: in fact, the extra errors I get from the temptable
regresion tests strongly imply that I  missed a few (attempt to delete
non existent relcache entry.) In addition, doing it the way I did does
require that all storage managers assure uniqueness of the relphysname
across the entire DB. The current patch does that: the oid is unique.

Another way to go is to revert that change, but adding the schema to
the relcache key.

Ross
-- 
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> 
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St.,  Houston, TX 77005




Re: RE: [PATCHES] relation filename patch

From
Bruce Momjian
Date:
> Hmm. Further perusal leads me to believe that this override is a
> standards extension, as the clause about the tablename being unique in
> the current namespace does not have an exception for temporary tables.
> Nothing wrong with that, just making it clear. What's the use/case for
> this feature? Does it come from some other DMBS?

I am quite surprised people don't like that feature, or at least one
person doesn't.  If someone else creates a table, it should not prevent
me from creating a temporary table of the same name.

I know Informix doesn't implement it that way, and they complained
because a program started not working.  Research showed that someone had
created a real table with the same name as the temp table.

Our code even masks a real table created in the same session.  Once the
temp table is dropped, the real table becomes visible again.  See the
regression tests for an example of this.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RE: [PATCHES] relation filename patch

From
Bruce Momjian
Date:
> > Hmm. Further perusal leads me to believe that this override is a
> > standards extension, as the clause about the tablename being unique in
> > the current namespace does not have an exception for temporary tables.
> > Nothing wrong with that, just making it clear. What's the use/case for
> > this feature? Does it come from some other DMBS?
> 
> I am quite surprised people don't like that feature, or at least one
> person doesn't.  If someone else creates a table, it should not prevent
> me from creating a temporary table of the same name.
> 
> I know Informix doesn't implement it that way, and they complained

Sorry, I should have said "One of my users complained..."

> because a program started not working.  Research showed that someone had
> created a real table with the same name as the temp table.


--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RE: [PATCHES] relation filename patch

From
Tom Lane
Date:
"Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu> writes:
> I didn't go to great trouble to keep the existing hack to the syscache
> code working for temp tables, because I thought that the primary purpose
> of that hack was to make sure the physical relation name, i.e. the file
> name, would not collide for multiple backends. Tacking the OID onto
> the filename fixes that.

Not good enough --- the logical names of the temp tables (ie, the names
they are given in pg_class) can't collide either, or we'll get failures
from the unique index on pg_class relname.  It could be that schemas
will provide a better answer to that point though.

I concur with your observation that the semantics we give temp tables
are more than the standard requires --- but I also concur with Bruce
that it's an exceedingly useful extension.  ISTM if an application has
to worry about whether its temp table name will collide with something
else, then much of the value of the feature is lost.  So masking
permanent table names is the right thing, just as local vars in
subroutines mask global vars of the same name.  Again, maybe we can
define schemas in such a way that the same effect can be gotten...

> The bigger problem, for schema implementation, is that the assumption
> 'relname+dbname' is sufficent to uniquely identify a relation is more
> pervasive: it's the basis of the relcache hash, isn't it? That's why I
> changed the key for that hash.

I didn't see what you did here, but I doubt that it was the right thing.
There are two things going on: rel name and rel OID must both be unique
within a database, but there can be duplicates across databases within
an installation.  So, most of the backend assumes that rel name or
rel OID is alone sufficient to identify a relation, and there are just
a couple of places that interface to installation-wide structures
(eg, the buffer manager and sinval-queue manager) that know they must
attach the current database's name or OID to make the identifier
globally unique.  This is one reason why a backend can't reconnect to
another database on the fly; we've got no way to find all those
database-dependent cache entries...

It's going to be *extremely* painful if more than one name/OID is needed
to refer to a relation in many places in the backend.  I think we want
to avoid that if possible.

It seems to me that what we want is to have schemas control the mapping
from rel name to rel OID, but to keep rel OID unique within a database.
So schemas are more closely related to the "temp table hack" than you
think.
        regards, tom lane


Re: RE: [PATCHES] relation filename patch

From
Brian E Gallew
Date:
Then <pgman@candle.pha.pa.us> spoke up and said:
> > Hmm. Further perusal leads me to believe that this override is a
> > standards extension, as the clause about the tablename being unique in
> > the current namespace does not have an exception for temporary tables.
> > Nothing wrong with that, just making it clear. What's the use/case for
> > this feature? Does it come from some other DMBS?
> 
> I know Informix doesn't implement it that way, and they complained
> because a program started not working.  Research showed that someone had
> created a real table with the same name as the temp table.
> 
> Our code even masks a real table created in the same session.  Once the
> temp table is dropped, the real table becomes visible again.  See the
> regression tests for an example of this.

Personally, I also like the Ingres table override feature, where if I
reference table "foo", Ingres first looks for a table "foo" owned by
me, and then one owned by the database owner.  I've not explored what
happens if neither I nor the DBA owns a "foo".  It's also unclear what
would happen in that case where multiple other had tables named "foo"
and sufficient permits on them to permit my access.

-- 
=====================================================================
| JAVA must have been developed in the wilds of West Virginia.      |
| After all, why else would it support only single inheritance??    |
=====================================================================
| Finger geek@cmu.edu for my public key.                            |
=====================================================================

Re: RE: [PATCHES] relation filename patch

From
"Ross J. Reedstrom"
Date:
On Mon, May 01, 2000 at 03:12:44PM -0400, Tom Lane wrote:
> "Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu> writes:
> > I didn't go to great trouble to keep the existing hack to the syscache
> > code working for temp tables, because I thought that the primary purpose
> > of that hack was to make sure the physical relation name, i.e. the file
> > name, would not collide for multiple backends. Tacking the OID onto
> > the filename fixes that.
> 
> Not good enough --- the logical names of the temp tables (ie, the names
> they are given in pg_class) can't collide either, or we'll get failures
> from the unique index on pg_class relname.  It could be that schemas
> will provide a better answer to that point though.

Well, that unique index will have to go away, one way or another, since
schema's require multiple persistent tables with the same undecorated
relname in the same db. Adding the schema to the pg_class table is
probably where this has to go, then making the index over both fields.
Then, temp tables can be implemented as a session specific temp schema.

> 
> I concur with your observation that the semantics we give temp tables
> are more than the standard requires --- but I also concur with Bruce
> that it's an exceedingly useful extension.  ISTM if an application has
> to worry about whether its temp table name will collide with something
> else, then much of the value of the feature is lost.  So masking
> permanent table names is the right thing, just as local vars in
> subroutines mask global vars of the same name.  Again, maybe we can
> define schemas in such a way that the same effect can be gotten...

Agreed. The masking does require more than just having a special schema
for temp tables, though. SO, some version of Bruce's macro hack is still
required.

> 
> I didn't see what you did here, but I doubt that it was the right thing.

Probably not: this is my first extensive digging into the backend code.
And the whole reason this is a 'lets try and implement somne of this,
and go back to the discussion' patch, instead of a proposed addition.

In defense of what I _did_: The temp table relname hacking is still
in place, and seems to work, and could be left in place. However, I
knew that relname would not stay unique, once schema are implemented,
but physrelname would (since the smgr needs it). 

> There are two things going on: rel name and rel OID must both be unique
> within a database, but there can be duplicates across databases within
> an installation.  So, most of the backend assumes that rel name or
> rel OID is alone sufficient to identify a relation, and there are just
> a couple of places that interface to installation-wide structures
> (eg, the buffer manager and sinval-queue manager) that know they must
> attach the current database's name or OID to make the identifier
> globally unique.  This is one reason why a backend can't reconnect to
> another database on the fly; we've got no way to find all those
> database-dependent cache entries...
> 

Right. But it seems to me that everywhere the code uses just a relname, it
assumes the currently connected DB, when eventually gets to the reference
by the code that interfaces to the installation-wide structures, no? Once
schema are in place, this can no longer work. One hack would be to have a
'current schema', like current db, but that won't work, one query will
need to refery to schema other than the default for table references
(otherwise, they're not very useful, are they?)

> It's going to be *extremely* painful if more than one name/OID is needed
> to refer to a relation in many places in the backend.  I think we want
> to avoid that if possible.
> 
> It seems to me that what we want is to have schemas control the mapping
> from rel name to rel OID, but to keep rel OID unique within a database.
> So schemas are more closely related to the "temp table hack" than you
> think.
> 

So, relname cannot be enough. Isn't OID already sufficent though? I
thought oids are unique across the entire installation, not just a
particular db. In any case, the solution may be to convert relname
(+default or user supplied schema) to rel oid, as early as possible,
then indexing (and caching) on that. If oids are db specific, the code
that needs system wide uniqueness needs to add the db name/OID, just
like it does for relname.

Actually, I know that schemas are _closely_ related to the temp table hack
(which was Bruce's own word for it, BTW, not mine;-), hence my hijacking
of his RelationGetRelationName and RelationGetPhysicalRelationName
macros. I was hoping to get temp tables back (mostly) for free once
schemas are in, but I think the extended semantics will require leaving
something like the existing code in place.

Ross

-- 
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> 
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St.,  Houston, TX 77005


Re: RE: [PATCHES] relation filename patch

From
Tom Lane
Date:
"Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu> writes:
> Well, that unique index will have to go away, one way or another, since
> schema's require multiple persistent tables with the same undecorated
> relname in the same db. Adding the schema to the pg_class table is
> probably where this has to go, then making the index over both fields.
> Then, temp tables can be implemented as a session specific temp schema.

Right, there would probably be a unique index on schema name + relname.

> Right. But it seems to me that everywhere the code uses just a relname, it
> assumes the currently connected DB, when eventually gets to the reference
> by the code that interfaces to the installation-wide structures, no? Once
> schema are in place, this can no longer work. One hack would be to have a
> 'current schema', like current db, but that won't work, one query will
> need to refery to schema other than the default for table references
> (otherwise, they're not very useful, are they?)

I was sort of envisioning a search path of schema names.  Temp table
masking could be implemented by pushing the session-local schema onto
the front of the search path.  Not sure how that relates to SQL3's ideas
about schemas, however.

> So, relname cannot be enough. Isn't OID already sufficent though? I
> thought oids are unique across the entire installation, not just a
> particular db.

Er, well, no.  Consider pg_class, which has both the same name and the
same OID in every DB in the installation --- but we have to treat it
as separately instantiated in each DB.  Most of the system tables work
that way.  OTOH we have a couple of top-level tables like pg_shadow,
which are the same table for all DBs in the installation.

It could be that we ought to eliminate the whole notion of separate
databases within an installation, or more accurately merge it with the
concept of schemas.  Really, the existing database mechanism is sort
of a limited implementation of schemas.

> In any case, the solution may be to convert relname
> (+default or user supplied schema) to rel oid, as early as possible,
> then indexing (and caching) on that.

Right, but there wouldn't be only one default schema, there'd be some
kind of search path in which an unqualified relname would be sought.
        regards, tom lane


Re: RE: [PATCHES] relation filename patch

From
"Ross J. Reedstrom"
Date:
On Mon, May 01, 2000 at 05:27:04PM -0400, Tom Lane wrote:
> 
> I was sort of envisioning a search path of schema names.  Temp table
> masking could be implemented by pushing the session-local schema onto
> the front of the search path.  Not sure how that relates to SQL3's ideas
> about schemas, however.
> 
> > So, relname cannot be enough. Isn't OID already sufficent though? I
> > thought oids are unique across the entire installation, not just a
> > particular db.
> 
> Er, well, no.  Consider pg_class, which has both the same name and the
> same OID in every DB in the installation --- but we have to treat it
> as separately instantiated in each DB.  Most of the system tables work
> that way.  OTOH we have a couple of top-level tables like pg_shadow,
> which are the same table for all DBs in the installation.
> 

Well, in schema-land this taken care of by the fact that all those system
tables live in a schema named information_schema, which is defined as
view on tables in the schema definition_schema. To some extent, our use
of pg_ for all the system tables simulates this.

> It could be that we ought to eliminate the whole notion of separate
> databases within an installation, or more accurately merge it with the
> concept of schemas.  Really, the existing database mechanism is sort
> of a limited implementation of schemas.
> 

See the discussion about this between Peter and I (and Jan?) last time
schemas came up. We agreed that pg's databases map to SQL92 Catalogs
rather nicely, with the whole installation being a 'cluster of catalogs'.
Now, If some one can explain to me what a 'module' is ...

> > In any case, the solution may be to convert relname
> > (+default or user supplied schema) to rel oid, as early as possible,
> > then indexing (and caching) on that.
> 
> Right, but there wouldn't be only one default schema, there'd be some
> kind of search path in which an unqualified relname would be sought.
> 

Perhaps, but that's an extension. SQL92 defines a default SCHEMA
for a session, which is set via the SET SCHEMA statement, strangely
enough. Having nesting SCHEMA's might be useful, but I'm not not sure
how. Getting any at all in there would help a lot. I'd suggest the
default be configurable on a per user basis. That'd allow some nifty
access controls, with just the existing VIEW code.

Turns out I needed these kind of schema's last week: I wanted to create
filtered access to a set of tables. Simple, right? Add booleans, create
views that test the boolean, remove SELECT privilege on the tables.
Only problem, is now I had to go and edit all 200+ SELECT statements
in the application code, to point at the views instead of the tables,
or rename every table, and edit the 200+ SELECTs in the other apps,
that do data entry and maintenance. If I had schema, I'd have changed
the default schema for the 'web' login, and created views that had the
same name as the tables, selecting back into the real tables in their own
schema. 10 mins, and done, don't touch the tested app. code at all! That's
what got me a round-toit, getting this patch off to be discussed.

Ross
-- 
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> 
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St.,  Houston, TX 77005


RE: RE: [PATCHES] relation filename patch

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Ross J. Reedstrom [mailto:reedstrm@wallace.ece.rice.edu]
> >
> > I didn't see what you did here, but I doubt that it was the right thing.
>
> Probably not: this is my first extensive digging into the backend code.
> And the whole reason this is a 'lets try and implement somne of this,
> and go back to the discussion' patch, instead of a proposed addition.
>
> In defense of what I _did_: The temp table relname hacking is still
> in place, and seems to work, and could be left in place.

Yes,pararell regression tests all pass here if relacache hashes
on pg_class entry name.

> However, I
> knew that relname would not stay unique, once schema are implemented,
> but physrelname would (since the smgr needs it).
>

It is dangerous to combine logical and physical concepts.
So it seems difficult to use physrelname both as a storage location
and as an unique relation name.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: RE: [PATCHES] relation filename patch

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> > Hmm. Further perusal leads me to believe that this override is a
> > standards extension,

> I am quite surprised people don't like that feature, or at least one
> person doesn't.

FWIW, I always considered that behaviour kind of suspicious. I don't think
it's a big problem in practice and could be useful in certain situations,
but personally I would have chosen not to do it this way.

> If someone else creates a table, it should not prevent me from
> creating a temporary table of the same name.

The proper solution to this is that "somebody" should not be allowed to
create tables wildly. Until we can properly do that it's not worth arguing
this out.

> Our code even masks a real table created in the same session.  Once
> the temp table is dropped, the real table becomes visible again.  See
> the regression tests for an example of this.

The real problem here is that there's no way of finding out whether you
just dropped the temporary table or the "real" one or whether a table is
temporary at all. Sure you can perhaps look into pg_class but tell that to
users.

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: RE: [PATCHES] relation filename patch

From
Bruce Momjian
Date:
> > Our code even masks a real table created in the same session.  Once
> > the temp table is dropped, the real table becomes visible again.  See
> > the regression tests for an example of this.
> 
> The real problem here is that there's no way of finding out whether you
> just dropped the temporary table or the "real" one or whether a table is
> temporary at all. Sure you can perhaps look into pg_class but tell that to
> users.

Do a \d on the table.  If it doesn't show up, it is temporary.  ;-)

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: RE: [PATCHES] relation filename patch

From
Peter Eisentraut
Date:
On Tue, 2 May 2000, Bruce Momjian wrote:

> Do a \d on the table.  If it doesn't show up, it is temporary.  ;-)

But if the temporary is shadowing a permanent table then I know less than
nothing.

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden