Thread: Everything is now "required by the database system"

Everything is now "required by the database system"

From
Peter Eisentraut
Date:
With the new dependency system we have the entire system catalog content
pinned down and unchangeable.  This is a tiny dent in the nice extensible
nature of the system.

Would it be feasible to identify the non-essential parts of the built-in
objects (say, inet type, numeric type, associated functions, etc.) and
declare those with regular SQL commands in initdb?  In the end, the system
catalog contents in include/catalog/ would only contain the "bootstrap"
content.  For example, the pg_proc content could be made more manageable
that way.

Not sure if this is worth considering for this release, but it might be a
medium-term project.

Comments?

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Everything is now "required by the database system"

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> With the new dependency system we have the entire system catalog content
> pinned down and unchangeable.  This is a tiny dent in the nice extensible
> nature of the system.
> 
> Would it be feasible to identify the non-essential parts of the built-in
> objects (say, inet type, numeric type, associated functions, etc.) and
> declare those with regular SQL commands in initdb?  In the end, the system
> catalog contents in include/catalog/ would only contain the "bootstrap"
> content.  For example, the pg_proc content could be made more manageable
> that way.
> 
> Not sure if this is worth considering for this release, but it might be a
> medium-term project.

Uh, some tools rely on those oids being fixed values, don't they?  For
example, I see ecpg using NUMERICOID.
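
To illustrate the concern, here is a minimal sketch of how a client-side tool might hard-code type OIDs. The three OID values are the ones these built-in types carry in pg_type (1700 is the value behind NUMERICOID); the DECODERS table and decode_column helper are hypothetical and are not taken from ecpg.

```python
from decimal import Decimal

# OIDs that the bootstrap catalog data assigns to these built-in types.
INT4OID = 23
TEXTOID = 25
NUMERICOID = 1700

# Hypothetical client-side dispatch table keyed by fixed type OID.
DECODERS = {INT4OID: int, TEXTOID: str, NUMERICOID: Decimal}


def decode_column(type_oid, raw_text):
    """Convert a text-format value with the decoder chosen by its type OID."""
    decoder = DECODERS.get(type_oid)
    if decoder is None:
        # An OID the client does not know: hand back the raw text unchanged.
        return raw_text
    return decoder(raw_text)
```

If numeric were created by an ordinary SQL command in initdb and ended up with some other OID, a client written like this would silently fall through to the raw-text branch.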

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 


Re: Everything is now "required by the database system"

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> With the new dependency system we have the entire system catalog content
> pinned down and unchangeable.  This is a tiny dent in the nice extensible
> nature of the system.

It's still "extensible", it's just not so easily "contractible"...

I'm not sure that this matters, as I've never heard of anyone actually
troubling to remove unused datatypes etc.

> Would it be feasible to identify the non-essential parts of the built-in
> objects (say, inet type, numeric type, associated functions, etc.) and
> declare those with regular SQL commands in initdb?  In the end, the system
> catalog contents in include/catalog/ would only contain the "bootstrap"
> content.  For example, the pg_proc content could be made more manageable
> that way.

No, it would become a lot less manageable because we'd have a harder
time controlling OIDs for builtin types and functions.  We'd end up
having to push everything we deemed inessential out to non-builtin
status (compare the contrib items that create new types).  While there's
some stuff like money and the geometric types that maybe deserve such
demotion, there's not enough to get me excited about trimming it.

While reviewing the pg_depend patch I was hoping that we could pin just
a subset of the initial catalog contents, but eventually decided it was
(a) tricky and (b) not worth the trouble.
        regards, tom lane


Re: Everything is now "required by the database system"

From
Hannu Krosing
Date:
On Tue, 2002-08-13 at 22:38, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > With the new dependency system we have the entire system catalog content
> > pinned down and unchangeable.  This is a tiny dent in the nice extensible
> > nature of the system.
> 
> It's still "extensible", it's just not so easily "contractible"...
> 
> I'm not sure that this matters, as I've never heard of anyone actually
> troubling to remove unused datatypes etc.

It could become an issue if PostgreSQL became popular in embedded
systems, but then it can of course be done in include/catalog/.

> > Would it be feasible to identify the non-essential parts of the built-in
> > objects (say, inet type, numeric type, associated functions, etc.) and
> > declare those with regular SQL commands in initdb?  In the end, the system
> > catalog contents in include/catalog/ would only contain the "bootstrap"
> > content.  For example, the pg_proc content could be made more manageable
> > that way.
> 
> No, it would become a lot less manageable because we'd have a harder
> time controlling OIDs for builtin types and functions.

We have had COPY ... WITH OIDS for some time already.

Maybe we should also allow setting the OID in INSERT and UPDATE?

It could be a good idea to give out OID ranges for contrib modules so
that frontends would not need to worry about changing binary formats for
the same types.
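
A minimal sketch of what handing out OID ranges could look like, assuming each module simply gets the next fixed-size block. The OidRangeRegistry class, the starting OID and the block size are all hypothetical; nothing like this exists in the tree.

```python
class OidRangeRegistry:
    """Hands out disjoint, fixed-size OID blocks to named modules."""

    def __init__(self, first_oid, block_size):
        self._next_start = first_oid
        self._block_size = block_size
        self._ranges = {}

    def reserve(self, module):
        # A module always gets the same block back, so its OIDs stay stable.
        if module not in self._ranges:
            start = self._next_start
            self._ranges[module] = range(start, start + self._block_size)
            self._next_start = start + self._block_size
        return self._ranges[module]

    def owner_of(self, oid):
        # Find the module whose block contains this OID, if any.
        for module, oids in self._ranges.items():
            if oid in oids:
                return module
        return None
```

With fixed blocks, a frontend could recognise a contrib type by its OID in the same way it recognises a built-in one.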

That could also suggest that the new int8-based datetime type should
have a separate OID from the old one.

> We'd end up
> having to push everything we deemed inessential out to non-builtin
> status (compare the contrib items that create new types).  While there's
> some stuff like money and the geometric types

It would be nice if for example GEOMETRY could be a separate installable
package (a datablade in Illustra parlance).

The IP types (cidr, macaddr) are also good candidates for non-builtin types.

The money type could be a package of its own ;)

> that maybe deserve such
> demotion, there's not enough to get me excited about trimming it.
> While reviewing the pg_depend patch I was hoping that we could pin just
> a subset of the initial catalog contents, but eventually decided it was
> (a) tricky 

True

> (b) not worth the trouble.

But it could still be something to look into doing in the future.

Of course we will then have package dependency issues, but most likely
at least the GEOMETRY, IP and MONEY packages don't need each other.

There are also two kinds of builtins: things that are almost
exclusively used by the system (smgr, oidvector, int2vector, tid, xid, cid,
regproc, refcursor, aclitem, name) and basic types of general utility
(int, date, text, ...)

Probably every type not used in system tables themselves could be made
loadable after initdb.
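
A minimal sketch of that rule: a type is a candidate for loading after initdb if no system catalog column uses it. The column list below is a small hand-picked sample rather than the full catalog contents, and loadable_candidates is a hypothetical helper.

```python
# Sample of system catalog columns and the types they use (not exhaustive).
CATALOG_COLUMN_TYPES = {
    "pg_type.typname": "name",
    "pg_type.typlen": "int2",
    "pg_proc.proargtypes": "oidvector",
    "pg_index.indkey": "int2vector",
    "pg_class.relacl": "aclitem",  # array element type
}

# Sample of built-in types to classify (not exhaustive).
BUILTIN_TYPES = [
    "name", "int2", "oidvector", "int2vector", "aclitem",
    "numeric", "inet", "money", "point",
]


def loadable_candidates(builtin_types, catalog_column_types):
    """Return the types that no system catalog column depends on."""
    used_by_catalogs = set(catalog_column_types.values())
    return [t for t in builtin_types if t not in used_by_catalogs]
```

On this sample the helper returns numeric, inet, money and point, which matches the kinds of types being proposed for demotion.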

-----------------
Hannu




Re: Everything is now "required by the database system"

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
> On Tue, 2002-08-13 at 22:38, Tom Lane wrote:
>> It's still "extensible", it's just not so easily "contractible"...
>> 
>> I'm not sure that this matters, as I've never heard of anyone actually
>> troubling to remove unused datatypes etc.

> It could become an issue if PostgreSQL became popular in embedded
> systems, but then it can of course be done in include/catalog/.

For an embedded system I'd think you'd want to strip out the support
code for the unwanted types (ie, the utils/adt/ file(s)), not only the
catalog entries.  So it's source code changes in any case.  The catalog
entries alone occupy so little space that it's not even worth anyone's
trouble to remove them, AFAICS.

> Probably every type not used in system tables themselves could be made
> loadable after initdb.

It certainly *could* be done.  Whether it's worth the trouble is highly
doubtful.  I'd also be concerned about the performance hit (loadable
functions are noticeably slower than built-ins).

Again, when was the last time you heard of anyone actually bothering to
remove built-in entries from pg_proc or pg_type?  I can't see expending
a considerable amount of work on a "feature" that no one will use.
        regards, tom lane


Re: Everything is now "required by the database system"

From
Hannu Krosing
Date:
On Wed, 2002-08-14 at 00:38, Tom Lane wrote:
> Hannu Krosing <hannu@tm.ee> writes:
> > On Tue, 2002-08-13 at 22:38, Tom Lane wrote:
> >> It's still "extensible", it's just not so easily "contractible"...
> >> 
> >> I'm not sure that this matters, as I've never heard of anyone actually
> >> troubling to remove unused datatypes etc.
> 
> > It could become an issue if PostgreSQL became popular in embedded
> > systems, but then it can of course be done in include/catalog/.
> 
> For an embedded system I'd think you'd want to strip out the support
> code for the unwanted types (ie, the utils/adt/ file(s)), not only the
> catalog entries.  So it's source code changes in any case. The catalog
> entries alone occupy so little space that it's not even worth anyone's
> trouble to remove them, AFAICS.

But if the types themselves were installable, then it would also mean
that unneeded utils/adt/ code would not be installed without need.

> > Probably every type not used in system tables themselves could be made
> > loadable after initdb.
> 
> It certainly *could* be done.  Whether it's worth the trouble is highly
> doubtful.  I'd also be concerned about the performance hit (loadable
> functions are noticeably slower than built-ins).

Really ?

How much is the performance hit ?

Is it unavoidable ?

Is it the same on all systems ?

Is it the same for both new and old style C functions ?

Is the performance hit only the first time (when function is loaded) or
every time ?

> Again, when was the last time you heard of anyone actually bothering to
> remove built-in entries from pg_proc or pg_type?

I have sometimes removed _my_own_ unused types/functions before shipping
a product ;)

> I can't see expending a considerable amount of work on a "feature" that
> no one will use.

Sure.

-----------
Hannu





Re: Everything is now "required by the database system"

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
> On Wed, 2002-08-14 at 00:38, Tom Lane wrote:
>> For an embedded system I'd think you'd want to strip out the support
>> code for the unwanted types (ie, the utils/adt/ file(s)), not only the
>> catalog entries.

> But if the types themselves were installable, then it would also mean
> that unneeded utils/adt/ code would not be installed without need.

Only if someone went to the trouble of repackaging each such datatype
as a separate shared library.  That's a lot more work than what I
understood Peter to be suggesting (namely, install the catalog entries
a little later during initdb).

>> I'd also be concerned about the performance hit (loadable
>> functions are noticeably slower than built-ins).

> Really ?

Yup.

> How much is the performance hit ?

I'm not sure, but it's nontrivial.  I recall Oleg and Teodor moaning
about the poor performance of GiST-index contrib modules awhile back,
at a time when we did fmgr_info for index access functions once per
call instead of caching the results in the index's relcache entry.
It didn't hurt for built-in access functions but you sure noticed it
for loadable ones.

It might be feasible to persuade fmgr.c or dfmgr.c to short-circuit
lookups of dynamic functions after the first time, by keeping an
internal lookup table comparable to the compile-time-constant table
we have for builtins.  This would not make the problem go away, but
it would lessen the performance hit...
        regards, tom lane
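
A minimal sketch of the short-circuit idea, assuming lookups are keyed by function OID. This is a Python model of the control flow, not code from fmgr.c or dfmgr.c; the FunctionLookup class and the resolve_dynamic callback (which stands in for the expensive catalog and shared-library lookup) are hypothetical.

```python
class FunctionLookup:
    """Resolves function OIDs, remembering dynamic lookups after the first."""

    def __init__(self, builtins, resolve_dynamic):
        # Stands in for the compile-time-constant table of built-ins.
        self._builtins = dict(builtins)
        # Expensive resolver for loadable functions.
        self._resolve_dynamic = resolve_dynamic
        # Internal table filled in as dynamic functions are first used.
        self._dynamic_cache = {}
        self.slow_lookups = 0

    def lookup(self, func_oid):
        fn = self._builtins.get(func_oid)
        if fn is not None:
            return fn  # built-in: cheap table hit, no resolver call
        fn = self._dynamic_cache.get(func_oid)
        if fn is None:
            # First use of this loadable function: pay the full cost once.
            self.slow_lookups += 1
            fn = self._resolve_dynamic(func_oid)
            self._dynamic_cache[func_oid] = fn
        return fn
```

Repeated calls for the same loadable function then cost one extra table probe compared with a built-in, so the gap shrinks but does not disappear.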