Unions, schemas, and design questions... - Mailing list pgsql-general
From | Net Virtual Mailing Lists |
---|---|
Subject | Unions, schemas, and design questions... |
Date | |
Msg-id | 20041122023836.9910@mail.net-virtual.com |
List | pgsql-general |
I've spent the last few days converting many databases into schemas within a single database and have completed the process, but now I'm at somewhat of an impasse as to the best way to proceed. It is important to explain that each of these databases has a rather different structure. Going forward I'm using more of an inheritance model for each new schema, but that simply was not possible back in the day. I hope to make the switch completely one day, but it cannot be finished in time for the next thing I need to get done. So with that in mind, let me explain.

Each of these schemas has two tables (a "users" table and a "proposals" table; there are actually other tables, but I think these are sufficient for this discussion), but the structure of these two tables differs greatly from schema to schema. For each schema there is also a class, written in some language, which defines the set of functions needed to manipulate records in each of these tables (each class implements a basic core set of methods which can be called). Tied into all of this is a user interface which lets users log in, search through the data, and so on. I have finally gotten the database to a performance level that I am very happy with, and I am concerned about the implications of what I now have to implement.

It is also probably important to note that each of these tables, within each schema, has its own private sequence, so IDs are not unique across schemas. I guess the only way to resolve this is to use a single sequence for all the tables, but for some reason that just doesn't sit right with me, because it seems to make all these schemas dependent on each other.

Now I need to add a rather large repository of data which must be accessible from each of these schemas (let's say 300,000 rows). The concept is that for each schema I need to be able to tell it which selection of records from this "global pool" it should query.
The best way I can think of doing this is with some sort of UNION query: first query the schema's own table, then union in the global data, and as part of that query do whatever is necessary to massage the data into the schema's format. I might point out that "massaging the data" will in and of itself be a rather complex task, because it would amount to an almost on-the-fly data conversion for things like which category the proposal is in (since each schema defines these differently), but I don't want to think too much about those specifics right now.

The data file which feeds this "global pool" gets updated on a daily basis. The thought of pre-processing the data and inserting the appropriate records into each schema is not very appealing, both for the disk space it would use and for the time it would take to process the file for 50+ different schemas (which is what this will likely grow to).

Oh, and to add to it, each schema needs to be able to make a copy of a global record in its local space, where it can be modified by a user, at which point the local version of that record needs to override the global one.

I am not asking for a solution to all of this, just some thoughts on possible strategies one might use to cope with this sort of thing and retain the performance.

Thanks!

- Greg
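
One possible shape for the UNION-plus-override idea described above, as a minimal sketch rather than a recommendation. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and every table and column name here (`global_proposals`, `site_proposals`, `site_global_selection`, `site_global_overrides`, `site_category_map`) is hypothetical; the `site_` prefix stands in for one schema. An `origin` column distinguishes local rows from global ones, so the per-schema sequences can stay private even when IDs overlap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared pool, loaded once from the daily file (hypothetical layout).
CREATE TABLE global_proposals (id INTEGER PRIMARY KEY, title TEXT, category TEXT);
-- Tables belonging to one schema ("site_" prefix stands in for the schema name).
CREATE TABLE site_proposals (id INTEGER PRIMARY KEY, title TEXT, category TEXT);
CREATE TABLE site_global_selection (global_id INTEGER PRIMARY KEY);
CREATE TABLE site_global_overrides (global_id INTEGER PRIMARY KEY, title TEXT, category TEXT);
CREATE TABLE site_category_map (global_category TEXT PRIMARY KEY, local_category TEXT);

INSERT INTO global_proposals VALUES (1, 'Grant A', 'sci'), (2, 'Grant B', 'art'), (3, 'Grant C', 'sci');
INSERT INTO site_proposals VALUES (1, 'Local proposal', 'research');
INSERT INTO site_global_selection VALUES (1), (2);
INSERT INTO site_global_overrides VALUES (2, 'Grant B (edited)', 'culture');
INSERT INTO site_category_map VALUES ('sci', 'research'), ('art', 'culture');
""")

QUERY = """
SELECT 'local' AS origin, id, title, category
FROM site_proposals
UNION ALL
-- Locally edited copies of global rows.
SELECT 'override', o.global_id, o.title, o.category
FROM site_global_overrides o
UNION ALL
-- Selected global rows, converted on the fly, unless a local copy exists.
SELECT 'global', g.id, g.title, COALESCE(m.local_category, g.category)
FROM global_proposals g
JOIN site_global_selection s ON s.global_id = g.id
LEFT JOIN site_category_map m ON m.global_category = g.category
WHERE NOT EXISTS (
    SELECT 1 FROM site_global_overrides o WHERE o.global_id = g.id
)
ORDER BY 1, 2
"""


def fetch_proposals(connection):
    """Return local rows plus the selected global rows, with overrides applied."""
    return connection.execute(QUERY).fetchall()


rows = fetch_proposals(conn)
for row in rows:
    print(row)
```

In this sketch the global data is stored only once; each schema keeps just a small selection table, a category mapping, and the rows a user has actually modified. Whether the same query performs acceptably against 300,000 rows in PostgreSQL would depend on indexing and on how complex the real conversion turns out to be.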