Re: Storing questionnaire data - Mailing list pgsql-general

From Jeff Davis
Subject Re: Storing questionnaire data
Date
Msg-id 1224804665.25425.77.camel@dell.linuxdev.us.dell.com
Whole thread Raw
In response to Re: Storing questionnaire data  ("Thom Brown" <thombrown@gmail.com>)
Responses Re: Storing questionnaire data  ("Thom Brown" <thombrown@gmail.com>)
List pgsql-general
On Thu, 2008-10-23 at 21:38 +0100, Thom Brown wrote:
> I'm afraid such a rigid structure is completely tailored for a
> specific questionnaire.  What if I, or even a client, wanted to
> generate different questionnaires?  I would like the data to indicate
> the flow of questions and answers rather than just use the database as
> pure storage for a completely coded solution.

If you construct a schema in such a way that there's really no
constraint on the data at all, then the user of that permissive schema
is effectively designing the database.

This is not necessarily a bad thing (or rather, it may not be
avoidable). For instance, if you have no idea what kind of questions
might be asked by the questionnaire, nor any idea what kind of questions
might be asked about the responses they receive to the questionnaire,
there's not much you can do. Pretty much anything is going to look a lot
like EAV.

The cost of this, however, is that you (as the designer of the
permissive schema) can no longer see any meaning in the data at all. You
may be able to dig around manually a bit and find out a few specific
things, but you can't do it in any automated way. This is because you
aren't the real designer of the database, you've passed that job along
to your users. Only they (hopefully) have any idea what it might mean.
The users might not be good database designers, in which case they'll
end up with a mess, and you won't be able to help them.

Also, as a performance matter, the optimizer also has no idea what your
data means, and so it can't take any useful shortcuts. So, it will
probably be slow.

The best you can really do is try to find whatever basic meaning you
can. Usually there is something there: there are basic data types people
will want (e.g. string, numeric, timestamp). There are questions,
perhaps groups of questions, order in which the questions should be
asked, order in which the questions are answered, time the question was
answered, and respondents. There is one (or fewer) answer per question
per respondent. Try to piece this stuff together in some way as to
provide maximum meaning to you (and to PostgreSQL) without destroying
the usefulness to your customers.

I think the article David mentioned:
http://www.varlena.com/GeneralBits/110.php
Is a pretty reasonable compromise for many use-cases. Perhaps more can
be done, but usually questionnaires are either too unimportant to really
dig in, or so important that designing a database around it is the
obvious thing to do.

Regards,
    Jeff Davis


pgsql-general by date:

Previous
From: Guy Rouillier
Date:
Subject: Re: Annoying Reply-To
Next
From: "Pavel Stehule"
Date:
Subject: Re: overhead of plpgsql functions over simple select