Thread: Foreign key to a view (UNION of two or more tables), any alternative?
Foreign key to a view (UNION of two or more tables), any alternative?
From
Jose Gonzalez Gomez
Date:
Hi there, I have the following problem: I'm developing some forms for data entry for some clinical studies, with data stored on a PostgreSQL database. The study is based in a questionnaire, with stablished options for almost every question. They want answers to be stored as numeric codes so they can later use the information for statistical analysis. The current approach is to have some tables with valid answers and codes, and then a table with questionnaire data, including foreign keys to the corresponding table. Now imagine you have a question that may be answered with yes (1) or no (2). I would have something like the following: CREATE TABLE yes_no ( code integer NOT NULL, description varchar(255), CONSTRAINT "yes_no_PK" PRIMARY KEY (code) ) INSERT INTO yes_no (code,description) VALUES (1,'yes'); INSERT INTO yes_no (code,description) VALUES (2,'no'); CREATE TABLE questionnaire ( ....... someQuestion integer, ...... CONSTRAINT "someQuestion_FK" FOREIGN KEY (someQuestion) REFERENCES yes_no (code) ) The problem comes when you have questions that may be not applicable (8), or optional (doesn't know, doesn't answer) (9). The easy solution would be to have four tables: yes_no yes_no_not_applicable yes_no_optional yes_no_not_applicable_optional and then stablish foreign keys to the corresponding tables. This is quite cumbersome, as the example I have used here just have two possible values, but there are some of this codifications that include hundred of options (for example, cities): I would have four copies of the same information with all the related problems. My first try to solve this was to create a "data" table, a optional table, a not applicable table and then create views as needed: Tables: yes_no not_applicable optional Views (when needed) yes_no_not_applicable (SELECT * FROM yes_no UNION SELECT * FROM not_applicable) yes_no_optional (same idea) yes_no_not_applicable_optional (same idea) But this can't be done in PostgreSQL, as you can't create a foreign key referencing a view. So the question is: can this be solved in an easy portable way? maybe I should rethink the design so special values (not applicable / optional) aren't represented the same way or in the same column that real data? Any thoughts are really appreciated, best regards Jose
Re: Foreign key to a view (UNION of two or more tables), any alternative?
From
Bruno Wolff III
Date:
On Fri, Jun 17, 2005 at 14:35:01 +0200, Jose Gonzalez Gomez <jgonzalez.openinput@gmail.com> wrote: > > The problem comes when you have questions that may be not applicable > (8), or optional (doesn't know, doesn't answer) (9). The easy solution > would be to have four tables: > > yes_no > yes_no_not_applicable > yes_no_optional > yes_no_not_applicable_optional How about having a table with the valid codes for each question? This should be relatively easy maintain and you can easily set up a foreign key reference to this table to enforce integrity.
Re: Foreign key to a view (UNION of two or more tables), any alternative?
From
Jose Gonzalez Gomez
Date:
On 6/17/05, Bruno Wolff III <bruno@wolff.to> wrote: > On Fri, Jun 17, 2005 at 14:35:01 +0200, > Jose Gonzalez Gomez <jgonzalez.openinput@gmail.com> wrote: > > > > The problem comes when you have questions that may be not applicable > > (8), or optional (doesn't know, doesn't answer) (9). The easy solution > > would be to have four tables: > > > > yes_no > > yes_no_not_applicable > > yes_no_optional > > yes_no_not_applicable_optional > > How about having a table with the valid codes for each question? > This should be relatively easy maintain and you can easily set up > a foreign key reference to this table to enforce integrity. > There would be no problem in doing so with such an easy case, but think about having a table with cities (hundred, thousands?) and then have four copies for each of the above posibilities with its related maintenance nightmare.
On 06/19/2005 11:16:34 AM, Jose Gonzalez Gomez wrote: > On 6/17/05, Bruno Wolff III <bruno@wolff.to> wrote: > > On Fri, Jun 17, 2005 at 14:35:01 +0200, > > Jose Gonzalez Gomez <jgonzalez.openinput@gmail.com> wrote: > > > > > > The problem comes when you have questions that may be not > applicable > > > (8), or optional (doesn't know, doesn't answer) (9). The easy > solution > > > would be to have four tables: > > > > > > yes_no > > > yes_no_not_applicable > > > yes_no_optional > > > yes_no_not_applicable_optional > > > > How about having a table with the valid codes for each question? > > This should be relatively easy maintain and you can easily set up > > a foreign key reference to this table to enforce integrity. > > > There would be no problem in doing so with such an easy case, but > think about having a table with cities (hundred, thousands?) and then > have four copies for each of the above posibilities with its related > maintenance nightmare. So the problem then is that there are codes (e.g. cities) that are used by multiple questions, sometimes optional or N/A is allowed and sometimes not. Don't use constraints, use triggers instead and have them check that the data is on the appropriate table. You then have two approaches. The first is completely dynamic. You have a "control" table with a row for every column (question). In the row you store whether or not n/a is allowed, whether or not optional is allowed, and what table to use for validation otherwise. The trigger reads the control table for each column/question and validates. The trigger uses plpgsql EXECUTE (or equivalent) to dynamically look up the data value in the appropriate table. The second approach is to hardcode the trigger. I'd use m4 as a pre-processor as your code will be very repetitious. Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
Re: Foreign key to a view (UNION of two or more tables), any alternative?
From
Jose Gonzalez Gomez
Date:
On 6/19/05, Karl O. Pinc <kop@meme.com> wrote: > > On 06/19/2005 11:16:34 AM, Jose Gonzalez Gomez wrote: > > On 6/17/05, Bruno Wolff III <bruno@wolff.to> wrote: > > > On Fri, Jun 17, 2005 at 14:35:01 +0200, > > > Jose Gonzalez Gomez <jgonzalez.openinput@gmail.com> wrote: > > > > > > > > The problem comes when you have questions that may be not > > applicable > > > > (8), or optional (doesn't know, doesn't answer) (9). The easy > > solution > > > > would be to have four tables: > > > > > > > > yes_no > > > > yes_no_not_applicable > > > > yes_no_optional > > > > yes_no_not_applicable_optional > > > > > > How about having a table with the valid codes for each question? > > > This should be relatively easy maintain and you can easily set up > > > a foreign key reference to this table to enforce integrity. > > > > > There would be no problem in doing so with such an easy case, but > > think about having a table with cities (hundred, thousands?) and then > > have four copies for each of the above posibilities with its related > > maintenance nightmare. > > So the problem then is that there are codes (e.g. cities) that are > used by multiple questions, sometimes optional or N/A is allowed > and sometimes not. > > Don't use constraints, use triggers instead and have them check that > the data is on the appropriate table. You then have two approaches. > The first is completely dynamic. You have a "control" table > with a row for every column (question). In the row > you store whether or not n/a is allowed, whether or not optional > is allowed, and what table to use for validation otherwise. > The trigger reads the control table for each column/question > and validates. The trigger uses plpgsql EXECUTE (or equivalent) > to dynamically look up the data value in the appropriate table. > > The second approach is to hardcode the trigger. I'd use m4 > as a pre-processor as your code will be very repetitious. > Thanks a lot for your suggestion. I like its dynamic nature, and I had planned to have something like that raising a bit the level of abstraction, but for a future project (they do a lot of questionnaires here :o) ) I've been thinking about the problem and I finally opted for another solution: I'm going to separate this data in two columns: answer and reason for unavailable answer. This would double the number of columns for questions where optional/ N/A allowed, but this way I may keep data integrity using a combination of referential integrity constraints (valid codes), and check constraints (only one of those two columns with a value other than null); this way I also avoid writing triggers that should be translated in the case of an hypotetical DBMS change. And I also have the feeling that this is a more correct design from a conceptual point of view. I'll solve the issue of having directly usable data for statistics (just one column per question) using a view. If anyone thinks there's a better approach, I'd be glad to hear... Thanks again, best regards Jose
Re: Foreign key to a view (UNION of two or more tables), any alternative?
From
Bruno Wolff III
Date:
On Sun, Jun 19, 2005 at 17:16:34 +0100, Jose Gonzalez Gomez <jgonzalez.openinput@gmail.com> wrote: > > > There would be no problem in doing so with such an easy case, but > think about having a table with cities (hundred, thousands?) and then > have four copies for each of the above posibilities with its related > maintenance nightmare. You still have to maintain the data somehow. I think it would be easier with more cities to have a table holding the results rather than try to hard code something into the table definition.
* Karl O. Pinc <kop@meme.com> wrote: <snip> > So the problem then is that there are codes (e.g. cities) that are > used by multiple questions, sometimes optional or N/A is allowed > and sometimes not. For such cases you could introduce another layer, like a datatype. Each question can be answered with some datatype, and may optionally be empty - just like rows in a database :). CREATE TABLE q_types ( id oid default nextval('q_type_id'), description text, .. ); CREATE TABLE q_type_values ( type_id oid, answer_id oid, title text, ..., PRIMARY KEY(type_id, answer_id) ); CREATE TABLE q_question ( id oid default nextval, qtype oid references q_types (id), question text ); CREATE TABLE q_answer ( user_id oid references q_user(id), question_id oid references q_question(id), value oid ); ... Maintaining the integrity of q_answer.value is a little bit more complicated. I don't know if its possible with an foreign key, since it spans over multiple tables ( question_id->qtype + value ) You probably need an hand-written trigger. Unanswered questions (or selected n/a) could be marked by simply setting value to NULL. You also could introduce a separate flag in q_answer for that. BTW: its probably not such a bad idea to present the whole stuff as one writable view to the frontend and let the database do the logic of mapping answer texts <-> ids. cu -- --------------------------------------------------------------------- Enrico Weigelt == metux IT service phone: +49 36207 519931 www: http://www.metux.de/ fax: +49 36207 519932 email: contact@metux.de --------------------------------------------------------------------- Realtime Forex/Stock Exchange trading powered by postgresSQL :)) http://www.fxignal.net/ ---------------------------------------------------------------------