Thread: Catalogs design question

Catalogs design question

From
"Steve Howe"
Date:
Hello all!!

       I'm developer of a interface for PostgreSQL for the Borland Kylix
and Delphi tools (http://www.vitavoom.com). I've run into the following
problems with catalogs:
       - pg_group: the grolist field is an array. How can I make a query
that tell me the usernames of a group ?       - pg_proc: the proargtypes field is an array. How can I make a query
that will link those types to the pg_types catalog ???
       This catalog design seems a very crude hack to make the things
working for me. Can't those relations be separated in another table ? Or
maybe a function that can search for a value in array, and make a wroking
reference for an array
element in a relation (something like "select typname from pg_type, pg_group
where oid
in grolist").       I also quote the PotgreSQL user manual
(http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/arrays.html):

"Tip: Arrays are not lists; using arrays in the manner described in the
previous paragraph is often a sign of database misdesign. The array field
should generally be split off into a separate table. Tables can obviously be
searched easily."

Best Regards,
Steve Howe








Re: Catalogs design question

From
Bruce Momjian
Date:
Yes, we inherited these arrays from Berkeley and haven't had any need to
remove them.  Are you trying to do things that the other interfaces like
ODBC and JDBC don't handle?

The group array is a hack but the pg_proc array would be hard to replace
becauseit acts as part of the unique key used for cache lookups.

---------------------------------------------------------------------------

> Hello all!!
> 
> 
>         I'm developer of a interface for PostgreSQL for the Borland Kylix
> and Delphi tools (http://www.vitavoom.com). I've run into the following
> problems with catalogs:
> 
>         - pg_group: the grolist field is an array. How can I make a query
> that tell me the usernames of a group ?
>         - pg_proc: the proargtypes field is an array. How can I make a query
> that will link those types to the pg_types catalog ???
> 
>         This catalog design seems a very crude hack to make the things
> working for me. Can't those relations be separated in another table ? Or
> maybe a function that can search for a value in array, and make a wroking
> reference for an array
> element in a relation (something like "select typname from pg_type, pg_group
> where oid
> in grolist").
>         I also quote the PotgreSQL user manual
> (http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/arrays.html):
> 
> "Tip: Arrays are not lists; using arrays in the manner described in the
> previous paragraph is often a sign of database misdesign. The array field
> should generally be split off into a separate table. Tables can obviously be
> searched easily."
> 
> Best Regards,
> Steve Howe
> 
> 
> 
> 
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Catalogs design question

From
"Steve Howe"
Date:
Hello Bruce!

> Yes, we inherited these arrays from Berkeley and haven't had any need to
> remove them.  Are you trying to do things that the other interfaces like
> ODBC and JDBC don't handle?
About the groups: I just want to write a function that will return the users
names belonged by a given group. I understand I can load the arrays in
memory, then sequentially compare the members from pg_shadow, but doing it
goes against the database priciple after all.
About the procs: the Borland's dbExpress specification demands a
input/output list of parameters for stored procedures, and I'm going to use
functions as stored procedures. But I need to make a types list to be able
list what are those params.

> The group array is a hack but the pg_proc array would be hard to replace
> becauseit acts as part of the unique key used for cache lookups.
This design itself bothers me.
We have no other option left  ? Like arrays being referenced in relations ?
That's far from perfect, but at least would solve those issues and others
which might appear in other catalogs...

Best Regards,
Steve Howe



Re: Catalogs design question

From
"Steve Howe"
Date:
> >         I also quote the PotgreSQL user manual
> >
(http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/arrays.html):
>
> In the contrib/ directory are procedures to search arrays for values.
> This may help.


Thanks for the tip, but in fact I've seen them (and they're listed on the
same document I pointed on the original message).
These are sequential (slow) searches, and can't be indexed. in resume:
nothing but another crude hack :). I could even use it, but I can';t tell my
users "oh this feature works but you must compile this contrib code inyo
your servers". Many users can't do it, and many don't even know how to do it
:(

Best Regards,
Steve Howe



Re: Catalogs design question

From
Oleg Bartunov
Date:
Hi,

I think Bruce meant contrib/intarray which provides incredibly fast
indexed access to arrays of integers, which is your case.
We use it a lot, particularly in our full text search engine (OpenFTS).
regards,
Oleg
On Sat, 20 Oct 2001, Steve Howe wrote:

> > >         I also quote the PotgreSQL user manual
> > >
> (http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/arrays.html):
> >
> > In the contrib/ directory are procedures to search arrays for values.
> > This may help.
>
>
> Thanks for the tip, but in fact I've seen them (and they're listed on the
> same document I pointed on the original message).
> These are sequential (slow) searches, and can't be indexed. in resume:
> nothing but another crude hack :). I could even use it, but I can';t tell my
> users "oh this feature works but you must compile this contrib code inyo
> your servers". Many users can't do it, and many don't even know how to do it
> :(
>
> Best Regards,
> Steve Howe
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
Regards,    Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83



Re: Catalogs design question

From
Peter Eisentraut
Date:
Steve Howe writes:

> > The group array is a hack but the pg_proc array would be hard to replace
> > becauseit acts as part of the unique key used for cache lookups.
> This design itself bothers me.
> We have no other option left  ? Like arrays being referenced in relations ?
> That's far from perfect, but at least would solve those issues and others
> which might appear in other catalogs...

In general, the system catalogs are far from a perfect example (or even an
example at all) for pure, normalized relational database design.  A more
important concern in processing efficiency.  For instance, currently the
execution of a procedure takes one catalog lookup versus (1 + nargs) in a
more normalized design.  (This is an oversimplification, but you get the
idea.)

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter



Re: Catalogs design question

From
Tom Lane
Date:
"Steve Howe" <howe@carcass.dhs.org> writes:
>> The group array is a hack but the pg_proc array would be hard to replace
>> becauseit acts as part of the unique key used for cache lookups.

> This design itself bothers me.
> We have no other option left  ? Like arrays being referenced in relations ?

Sure, it *could* be done another way.  As far as pg_proc goes, I agree
with Bruce: there are far too many places that know the existing
representation for us to consider changing it.  The pain involved would
vastly outweigh any possible benefit.

The representation of groups is not so widely known, however.  We could
probably get away with changing it, if someone wanted to propose a
better catalog schema and do the legwork to make it happen.
        regards, tom lane


Re: Catalogs design question

From
Haller Christoph
Date:
Hi Steve, 

Your question about  - pg_proc 
select t.typname from pg_type t , pg_proc p
where p.proname = '<your_stored_procedure>' and p.proargtypes[0] = t.oid ;
select t.typname from pg_type t , pg_proc p
where p.proname = '<your_stored_procedure>' and p.proargtypes[1] = t.oid ;
...
select t.typname from pg_type t , pg_proc p
where p.proname = '<your_stored_procedure>' and p.proargtypes[7] = t.oid ;

As far as I understand the proargtypes entries 0 means no further parameter. 
This oidvector type of proargtypes seems to have a start index of 0. 
As long as there are at maximum 8 parameters allowed, this looks practicable. 


Your question about  - pg_group 
The pg_group column is more bulky, because the int4[] type does not have 
an upper limit. 
So, the only solution I can see is 
get the number of array elements of the group you want to query 
select array_dims(grolist) from pg_group where groname = '<your_group>';

and then generate automatically a query like 

select u.usename from pg_user u , pg_group g where g.grolist[1] = u.usesysid and g.groname='<your_group>' 
union
select u.usename from pg_user u , pg_group g where g.grolist[2] = u.usesysid and g.groname='<your_group>' 
union
...
select u.usename from pg_user u , pg_group g where g.grolist[n] = u.usesysid and g.groname='<your_group>' ;

This looks very much like another crude hack you've already 
complained about. Sorry, but I can't help. 

Two more items I do not understand:
You said, the procedures to search arrays in contrib/ are slow. 
Maybe that's true, but usually you do not have thousands of users 
in a group, don't you. 
You said, many users cannot compile this contrib code. Yes, and they 
are not supposed to do so, because it's up to a system admin to do. 
What do I miss here? 

Regards, Christoph 


Re: Catalogs design question

From
Joel Burton
Date:
On Sat, 20 Oct 2001, Steve Howe wrote:

> Hello all!!
>
>
>         I'm developer of a interface for PostgreSQL for the Borland Kylix
> and Delphi tools (http://www.vitavoom.com). I've run into the following
> problems with catalogs:
>
>         - pg_group: the grolist field is an array. How can I make a query
> that tell me the usernames of a group ?
>         - pg_proc: the proargtypes field is an array. How can I make a query
> that will link those types to the pg_types catalog ???
>
>         This catalog design seems a very crude hack to make the things
> working for me. Can't those relations be separated in another table ? Or
> maybe a function that can search for a value in array, and make a wroking
> reference for an array
> element in a relation (something like "select typname from pg_type, pg_group
> where oid
> in grolist").
>         I also quote the PotgreSQL user manual
> (http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/arrays.html):

In the contrib/ directory are procedures to search arrays for values.
This may help.

-- 

Joel BURTON  |  joel@joelburton.com  |  joelburton.com  |  aim: wjoelburton
Independent Knowledge Management Consultant



Re: Catalogs design question

From
"Steve Howe"
Date:
Hello Haller!!
>
> Your question about  - pg_proc
> select t.typname from pg_type t , pg_proc p
> where p.proname = '<your_stored_procedure>' and p.proargtypes[0] = t.oid ;
> select t.typname from pg_type t , pg_proc p
> where p.proname = '<your_stored_procedure>' and p.proargtypes[1] = t.oid ;
> ...
> select t.typname from pg_type t , pg_proc p
> where p.proname = '<your_stored_procedure>' and p.proargtypes[7] = t.oid ;
>
> As far as I understand the proargtypes entries 0 means no further
parameter.
> This oidvector type of proargtypes seems to have a start index of 0.
> As long as there are at maximum 8 parameters allowed, this looks
practicable.
There is such a limit ? I didn't know. This makes your code a working way.
I'll look further on this later... and even if it's not a query that I would
say it's beautiful, it's a way, thanks :).

> Your question about  - pg_group
> The pg_group column is more bulky, because the int4[] type does not have
> an upper limit.
> So, the only solution I can see is
> get the number of array elements of the group you want to query
> select array_dims(grolist) from pg_group where groname = '<your_group>';
>
> and then generate automatically a query like
>
> select u.usename from pg_user u , pg_group g where
>  g.grolist[1] = u.usesysid and g.groname='<your_group>'
> union
> select u.usename from pg_user u , pg_group g where
>  g.grolist[2] = u.usesysid and g.groname='<your_group>'
> union
> ...
> select u.usename from pg_user u , pg_group g where
>  g.grolist[n] = u.usesysid and g.groname='<your_group>' ;
>
> This looks very much like another crude hack you've already
> complained about. Sorry, but I can't help.
Yes, it's ugly code. I would rather write a function, but again I can't
assume the user has pl/perl or pl/pgsql (or any other).

> Two more items I do not understand:
> You said, the procedures to search arrays in contrib/ are slow.
> Maybe that's true, but usually you do not have thousands of users
> in a group, don't you.
Yes. I would use it if I can.
> You said, many users cannot compile this contrib code. Yes, and they
> are not supposed to do so, because it's up to a system admin to do.
> What do I miss here?
Oh, I develop an interface for PostgreSQL called
pgExpress(http://www.vitavoom.com) - it's like an ODBC driver or such. I
must provide the functionality I described for the driver users; it's not
for me. I would of course have compiled and used the contrib code. But the
driver must work "out-of-the-box", and requiring a recompile (where many
times is impossible to users) is not a solution...
Right now, I'm hardcoding that relation inside the driver, what's also not
what I dreamed about, but I seem to have no other choice.

Thanks for the ideas btw :)

Best Regards,
Steve Howe



Re: Catalogs design question

From
"Steve Howe"
Date:
Hello Haller!!!
> Your question about  - pg_proc
> select t.typname from pg_type t , pg_proc p
> where p.proname = '<your_stored_procedure>' and p.proargtypes[0] = t.oid ;
> select t.typname from pg_type t , pg_proc p
> where p.proname = '<your_stored_procedure>' and p.proargtypes[1] = t.oid ;
> ...
> select t.typname from pg_type t , pg_proc p
> where p.proname = '<your_stored_procedure>' and p.proargtypes[7] = t.oid ;
>
> As far as I understand the proargtypes entries 0 means no further
parameter.
> This oidvector type of proargtypes seems to have a start index of 0.
> As long as there are at maximum 8 parameters allowed, this looks
practicable.
There is no limit on the number of arguments. An user could create a weird
function like this:

howe=# CREATE FUNCTION test2(int2, int2, int2, int2, int2, int2, int2, int2,
int2, int2, int2, int2, int2) RETURNS int4AS 'SELECT 1 AS RESULT' LANGUAGE 'sql';
CREATE

and it would be allowed...

howe=# select proargtypes from pg_proc where proname='test';             proargtypes
----------------------------------------21 21 21 21 21 21 21 21 21 21 21 21 21
(1 row)

Again, the problem is that I can't predict (nor limit) what users will try
to do...


Best Regards,
Steve Howe



Re: Catalogs design question

From
Tom Lane
Date:
"Steve Howe" <howe@carcass.dhs.org> writes:
>> As long as there are at maximum 8 parameters allowed, this looks
>> practicable.

> There is no limit on the number of arguments.

You're both wrong: the limit is FUNC_MAX_ARGS, which hasn't been 8 in
quite some time.  It's presently 16 by default, and can be configured
higher at build time.

For the purposes of a frontend application, I think it's best to assume
that the specific limit is unknown --- ie, you should be able to
interoperate with a backend regardless of the FUNC_MAX_ARGS value it
was built with.
        regards, tom lane