Thread: What to expect when mixing inherited tables and different schemas while dealing with functions and views?

Hi!


As I said in other messages, I'm planning with a partner of mine to use
inheritance and schemas to separate access and data from several sources
(companies) for a client we have (accounting office).

The idea is:

    - 1 base schema with all table definitions plus functions plus views
    - 1 schema per client he (our client) have where all tables would be
      inherited from the base schema
    - Using views and functions declared on base schema to also manipulate
      and retrieve data from each schema


(Before starting, I've made a dump available at
http://godoy.homeip.net/~godoy/hydrogen.sql containing basic data for
testing, if you're interested you'd only need to change some usernames for
permissions in there.)


But we're facing some problems and before doing anything more serious we'd
like to know what to expect when:

a) We are with some restricted search_path set (e.g. after "SET search_path TO
   schema_1") and we make a "SELECT * FROM base.view".  What we're seeing is
   that views are tied to the schema where they were created, no matter if
   they are or not fully qualified in their definition.  Is this correct?  I'd
   expect views to respect the search_path if they aren't fully qualified
   (i.e. if I created them as "SELECT something FROM table" instead of "SELECT
   something FROM schema.table").

b) We are seeing a really weird behaviour when we use functions.  It appears
   that it disregards the search_path from the second run and on.  If I SELECT
   from a function with the search_path set to, e.g., schema_1, then when I
   set it to schema_2 then I'll still see data from schema_1.  Note, here,
   that even the function being created on the base schema results were
   correctly retrieved at first execution.  (You can repeat that use the above
   dump by connecting, setting the search path to any of three schemas,
   selecting from the function, changing to other schema and then selecting
   again from the same function -- you'll see the same result --; then, if you
   reconnect and do a first select in another schema and change your
   search_path you'll see a different result from the previous connection but
   it will be the same result for both search_paths.)


Are these expected behaviors?  Is there any way to make both views and
functions work with search_path?  It would be really great to be able to
define them only once and use them on all "1k" schemas unchanged.

I believe that the behavior with functions is "a bit" dangerous since it
allows leaking information from one schema to another after a context change.
(These tests aren't taking permissions into account yet, so after applying
those this problem could be reduced but...)



TIA,
--
Jorge Godoy      <jgodoy@gmail.com>


On May 23, 2006, at 9:20 PM, Jorge Godoy wrote:
> a) We are with some restricted search_path set (e.g. after "SET
> search_path TO
>    schema_1") and we make a "SELECT * FROM base.view".  What we're
> seeing is
>    that views are tied to the schema where they were created, no
> matter if
>    they are or not fully qualified in their definition.  Is this
> correct?  I'd
>    expect views to respect the search_path if they aren't fully
> qualified
>    (i.e. if I created them as "SELECT something FROM table" instead
> of "SELECT
>    something FROM schema.table").

Yes. Views essentially end up with schemas hard-coded into them. If
that doesn't work you should be able to create views on set returning
functions, though that's obviously more work.

I don't know how hard it would be to allow views to become
search_path aware on execution, or if such a change would be accepted.

Ultimately though, why is this a problem? Aren't you defining all the
views in their appropriate schema?

> b) We are seeing a really weird behaviour when we use functions.
> It appears
>    that it disregards the search_path from the second run and on.
> If I SELECT
>    from a function with the search_path set to, e.g., schema_1,
> then when I
>    set it to schema_2 then I'll still see data from schema_1.
> Note, here,
>    that even the function being created on the base schema results
> were
>    correctly retrieved at first execution.  (You can repeat that
> use the above
>    dump by connecting, setting the search path to any of three
> schemas,
>    selecting from the function, changing to other schema and then
> selecting
>    again from the same function -- you'll see the same result --;
> then, if you
>    reconnect and do a first select in another schema and change your
>    search_path you'll see a different result from the previous
> connection but
>    it will be the same result for both search_paths.)

This is due to query plan caching. If you grab a new connection every
time you switch companies it won't be an issue. There's also been
talk of adding the ability to 'reset' a connection, but I don't
remember the status of that or if it would reset the query plan cache.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461



Em Quinta 25 Maio 2006 19:27, Jim Nasby escreveu:
>
> Yes. Views essentially end up with schemas hard-coded into them. If
> that doesn't work you should be able to create views on set returning
> functions, though that's obviously more work.
>
> I don't know how hard it would be to allow views to become
> search_path aware on execution, or if such a change would be accepted.
>
> Ultimately though, why is this a problem? Aren't you defining all the
> views in their appropriate schema?

Yes, this is a problem because I have to create the views on each schema.  If
I could just use search_path, then I would end up with only one instance of
each view and function and they'd do the right thing using data from that
particular schema.

As they don't respect the search_path, then I have to create n+1 copies of
each view/function, one for each schema and one for the base schema.  This
looks like inneficient because if I need to change the view, I'll have to
change n+1 views instead of just one.  The same applies to functions :-(

I confess that I expected it to respect the search_path.

> This is due to query plan caching. If you grab a new connection every
> time you switch companies it won't be an issue. There's also been
> talk of adding the ability to 'reset' a connection, but I don't
> remember the status of that or if it would reset the query plan cache.

Making them respect the search_path would be also nice.  I thought that a
VOLATILE functions had no cache, even for the query plan.

One last try...  Languages other than plpgsql doesn't reuse / save the query
plan by default, right?  So if I switch to, e.g., plpythonu I wouldn't,
theoretically, have this problem when running functions, right?


It turns out that this won't work in an easy way in a standard installation of
PostgreSQL... :-(  Unfortunately.  It would save a lot of code, a lot of
redundant definitions and would make life a lot easier to manage the
database.

If we can solve the problem with functions by using a language other than
plpgsql (and of course sql), then we'd need to profile and try using another
language to write our functions and replace views with set returning
functions...  This isn't all that pretty, but instead of changing 1000
instances of each view and each function that needs some modification I
prefer changing one function that doesn't save the query plan (if possible at
all, of course).


Thanks again,
--
Jorge Godoy           <jgodoy@gmail.com>