Thread: Foreign key to a view (UNION of two or more tables), any alternative?

Foreign key to a view (UNION of two or more tables), any alternative?

From
Jose Gonzalez Gomez
Date:
Hi there,

I have the following problem: I'm developing some forms for data entry
for some clinical studies, with data stored on a PostgreSQL database.
The study is based in a questionnaire, with stablished options for
almost every question. They want answers to be stored as numeric codes
so they can later use the information for statistical analysis. The
current approach is to have some tables with valid answers and codes,
and then a table with questionnaire data, including foreign keys to
the corresponding table.

Now imagine you have a question that may be answered with yes (1) or
no (2). I would have something like the following:

CREATE TABLE yes_no
  (
  code integer NOT NULL,
  description varchar(255),
  CONSTRAINT "yes_no_PK" PRIMARY KEY (code)
  )

INSERT INTO yes_no (code,description) VALUES (1,'yes');
INSERT INTO yes_no (code,description) VALUES (2,'no');

CREATE TABLE questionnaire
  (
  .......
  someQuestion integer,
  ......
  CONSTRAINT "someQuestion_FK" FOREIGN KEY (someQuestion)
      REFERENCES yes_no (code)
  )

The problem comes when you have questions that may be not applicable
(8), or optional (doesn't know, doesn't answer) (9). The easy solution
would be to have four tables:

yes_no
yes_no_not_applicable
yes_no_optional
yes_no_not_applicable_optional

and then stablish foreign keys to the corresponding tables. This is
quite cumbersome, as the example I have used here just have two
possible values, but there are some of this codifications that include
hundred of options (for example, cities): I would have four copies of
the same information with all the related problems.

My first try to solve this was to create a "data" table, a optional
table, a not applicable table and then create views as needed:

Tables:
yes_no
not_applicable
optional

Views (when needed)
yes_no_not_applicable (SELECT * FROM yes_no UNION SELECT * FROM not_applicable)
yes_no_optional (same idea)
yes_no_not_applicable_optional (same idea)

But this can't be done in PostgreSQL, as you can't create a foreign
key referencing a view.

So the question is: can this be solved in an easy portable way? maybe
I should rethink the design so special values (not applicable /
optional) aren't represented the same way or in the same column that
real data?

Any thoughts are really appreciated, best regards
Jose

Re: Foreign key to a view (UNION of two or more tables), any alternative?

From
Bruno Wolff III
Date:
On Fri, Jun 17, 2005 at 14:35:01 +0200,
  Jose Gonzalez Gomez <jgonzalez.openinput@gmail.com> wrote:
>
> The problem comes when you have questions that may be not applicable
> (8), or optional (doesn't know, doesn't answer) (9). The easy solution
> would be to have four tables:
>
> yes_no
> yes_no_not_applicable
> yes_no_optional
> yes_no_not_applicable_optional

How about having a table with the valid codes for each question?
This should be relatively easy maintain and you can easily set up
a foreign key reference to this table to enforce integrity.

Re: Foreign key to a view (UNION of two or more tables), any alternative?

From
Jose Gonzalez Gomez
Date:
On 6/17/05, Bruno Wolff III <bruno@wolff.to> wrote:
> On Fri, Jun 17, 2005 at 14:35:01 +0200,
>   Jose Gonzalez Gomez <jgonzalez.openinput@gmail.com> wrote:
> >
> > The problem comes when you have questions that may be not applicable
> > (8), or optional (doesn't know, doesn't answer) (9). The easy solution
> > would be to have four tables:
> >
> > yes_no
> > yes_no_not_applicable
> > yes_no_optional
> > yes_no_not_applicable_optional
>
> How about having a table with the valid codes for each question?
> This should be relatively easy maintain and you can easily set up
> a foreign key reference to this table to enforce integrity.
>
There would be no problem in doing so with such an easy case, but
think about having a table with cities (hundred, thousands?) and then
have four copies for each of the above posibilities with its related
maintenance nightmare.

Re: Foreign key to a view (UNION of two or more tables),

From
"Karl O. Pinc"
Date:
On 06/19/2005 11:16:34 AM, Jose Gonzalez Gomez wrote:
> On 6/17/05, Bruno Wolff III <bruno@wolff.to> wrote:
> > On Fri, Jun 17, 2005 at 14:35:01 +0200,
> >   Jose Gonzalez Gomez <jgonzalez.openinput@gmail.com> wrote:
> > >
> > > The problem comes when you have questions that may be not
> applicable
> > > (8), or optional (doesn't know, doesn't answer) (9). The easy
> solution
> > > would be to have four tables:
> > >
> > > yes_no
> > > yes_no_not_applicable
> > > yes_no_optional
> > > yes_no_not_applicable_optional
> >
> > How about having a table with the valid codes for each question?
> > This should be relatively easy maintain and you can easily set up
> > a foreign key reference to this table to enforce integrity.
> >
> There would be no problem in doing so with such an easy case, but
> think about having a table with cities (hundred, thousands?) and then
> have four copies for each of the above posibilities with its related
> maintenance nightmare.

So the problem then is that there are codes (e.g. cities) that are
used by multiple questions, sometimes optional or N/A is allowed
and sometimes not.

Don't use constraints, use triggers instead and have them check that
the data is on the appropriate table.  You then have two approaches.
The first is completely dynamic.  You have a "control" table
with a row for every column (question).  In the row
you store whether or not n/a is allowed, whether or not optional
is allowed, and what table to use for validation otherwise.
The trigger reads the control table for each column/question
and validates.  The trigger uses plpgsql EXECUTE (or equivalent)
to dynamically look up the data value in the appropriate table.

The second approach is to hardcode the trigger.  I'd use m4
as a pre-processor as your code will be very repetitious.


Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                  -- Robert A. Heinlein


Re: Foreign key to a view (UNION of two or more tables), any alternative?

From
Jose Gonzalez Gomez
Date:
On 6/19/05, Karl O. Pinc <kop@meme.com> wrote:
>
> On 06/19/2005 11:16:34 AM, Jose Gonzalez Gomez wrote:
> > On 6/17/05, Bruno Wolff III <bruno@wolff.to> wrote:
> > > On Fri, Jun 17, 2005 at 14:35:01 +0200,
> > >   Jose Gonzalez Gomez <jgonzalez.openinput@gmail.com> wrote:
> > > >
> > > > The problem comes when you have questions that may be not
> > applicable
> > > > (8), or optional (doesn't know, doesn't answer) (9). The easy
> > solution
> > > > would be to have four tables:
> > > >
> > > > yes_no
> > > > yes_no_not_applicable
> > > > yes_no_optional
> > > > yes_no_not_applicable_optional
> > >
> > > How about having a table with the valid codes for each question?
> > > This should be relatively easy maintain and you can easily set up
> > > a foreign key reference to this table to enforce integrity.
> > >
> > There would be no problem in doing so with such an easy case, but
> > think about having a table with cities (hundred, thousands?) and then
> > have four copies for each of the above posibilities with its related
> > maintenance nightmare.
>
> So the problem then is that there are codes (e.g. cities) that are
> used by multiple questions, sometimes optional or N/A is allowed
> and sometimes not.
>
> Don't use constraints, use triggers instead and have them check that
> the data is on the appropriate table.  You then have two approaches.
> The first is completely dynamic.  You have a "control" table
> with a row for every column (question).  In the row
> you store whether or not n/a is allowed, whether or not optional
> is allowed, and what table to use for validation otherwise.
> The trigger reads the control table for each column/question
> and validates.  The trigger uses plpgsql EXECUTE (or equivalent)
> to dynamically look up the data value in the appropriate table.
>
> The second approach is to hardcode the trigger.  I'd use m4
> as a pre-processor as your code will be very repetitious.
>

Thanks a lot for your suggestion. I like its dynamic nature, and I had
planned to have something like that raising a bit the level of
abstraction, but for a future project (they do a lot of questionnaires
here :o) )

I've been thinking about the problem and I finally opted for another
solution: I'm going to separate this data in two columns: answer and
reason for unavailable answer. This would double the number of columns
for questions where optional/ N/A allowed, but this way I may keep
data integrity using a combination of referential integrity
constraints (valid codes), and check constraints (only one of those
two columns with a value other than null); this way I also avoid
writing triggers that should be translated in the case of an
hypotetical DBMS change. And I also have the feeling that this is a
more correct design from a conceptual point of view. I'll solve the
issue of having directly usable data for statistics (just one column
per question) using a view.

If anyone thinks there's a better approach, I'd be glad to hear...

Thanks again, best regards
Jose

Re: Foreign key to a view (UNION of two or more tables), any alternative?

From
Bruno Wolff III
Date:
On Sun, Jun 19, 2005 at 17:16:34 +0100,
  Jose Gonzalez Gomez <jgonzalez.openinput@gmail.com> wrote:
> >
> There would be no problem in doing so with such an easy case, but
> think about having a table with cities (hundred, thousands?) and then
> have four copies for each of the above posibilities with its related
> maintenance nightmare.

You still have to maintain the data somehow. I think it would be
easier with more cities to have a table holding the results rather
than try to hard code something into the table definition.

Re: Foreign key to a view (UNION of two or more tables),

From
Enrico Weigelt
Date:
* Karl O. Pinc <kop@meme.com> wrote:

<snip>
> So the problem then is that there are codes (e.g. cities) that are
> used by multiple questions, sometimes optional or N/A is allowed
> and sometimes not.


For such cases you could introduce another layer, like a datatype.
Each question can be answered with some datatype, and may optionally
be empty - just like rows in a database :).

CREATE TABLE q_types
(
    id        oid    default nextval('q_type_id'),
    description    text,
    ..
);

CREATE TABLE q_type_values
(
    type_id    oid,
    answer_id    oid,
    title    text,
    ...,
    PRIMARY KEY(type_id, answer_id)
);

CREATE TABLE q_question
(
    id        oid    default nextval,
    qtype    oid    references q_types (id),
    question    text
);

CREATE TABLE q_answer
(
    user_id    oid    references q_user(id),
    question_id oid    references q_question(id),
    value       oid
);

...

Maintaining the integrity of q_answer.value is a little bit more
complicated. I don't know if its possible with an foreign key,
since it spans over multiple tables ( question_id->qtype + value )
You probably need an hand-written trigger.

Unanswered questions (or selected n/a) could be marked by simply
setting value to NULL. You also could introduce a separate flag
in q_answer for that.


BTW: its probably not such a bad idea to present the whole
stuff as one writable view to the frontend and let the database
do the logic of mapping answer texts <-> ids.


cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service
  phone:     +49 36207 519931         www:       http://www.metux.de/
  fax:       +49 36207 519932         email:     contact@metux.de
---------------------------------------------------------------------
  Realtime Forex/Stock Exchange trading powered by postgresSQL :))
                                            http://www.fxignal.net/
---------------------------------------------------------------------