Re: Storing questionnaire data - Mailing list pgsql-general

From Thom Brown
Subject Re: Storing questionnaire data
Date
Msg-id bddc86150810240134j74bc116bn628fe160dbe83f86@mail.gmail.com
Whole thread Raw
In response to Re: Storing questionnaire data  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Storing questionnaire data  (David Fetter <david@fetter.org>)
List pgsql-general
Thanks David and Jeff.

I can see your point.  The provided link might actually be useful,
although I think I'd make some changes to it.

I wouldn't have trouble data-mining such a structure for individual
questionnaire results.  The planner will be shrugging its shoulders,
but I haven't actually tested that solution with many massive
questionnaires for its query performance.

I pretty much have my answer.  Thanks for your input guys.

Thom

On Fri, Oct 24, 2008 at 12:31 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Thu, 2008-10-23 at 21:38 +0100, Thom Brown wrote:
>> I'm afraid such a rigid structure is completely tailored for a
>> specific questionnaire.  What if I, or even a client, wanted to
>> generate different questionnaires?  I would like the data to indicate
>> the flow of questions and answers rather than just use the database as
>> pure storage for a completely coded solution.
>
> If you construct a schema in such a way that there's really no
> constraint on the data at all, then the user of that permissive schema
> is effectively designing the database.
>
> This is not necessarily a bad thing (or rather, it may not be
> avoidable). For instance, if you have no idea what kind of questions
> might be asked by the questionnaire, nor any idea what kind of questions
> might be asked about the responses they receive to the questionnaire,
> there's not much you can do. Pretty much anything is going to look a lot
> like EAV.
>
> The cost of this, however, is that you (as the designer of the
> permissive schema) can no longer see any meaning in the data at all. You
> may be able to dig around manually a bit and find out a few specific
> things, but you can't do it in any automated way. This is because you
> aren't the real designer of the database, you've passed that job along
> to your users. Only they (hopefully) have any idea what it might mean.
> The users might not be good database designers, in which case they'll
> end up with a mess, and you won't be able to help them.
>
> Also, as a performance matter, the optimizer also has no idea what your
> data means, and so it can't take any useful shortcuts. So, it will
> probably be slow.
>
> The best you can really do is try to find whatever basic meaning you
> can. Usually there is something there: there are basic data types people
> will want (e.g. string, numeric, timestamp). There are questions,
> perhaps groups of questions, order in which the questions should be
> asked, order in which the questions are answered, time the question was
> answered, and respondents. There is one (or fewer) answer per question
> per respondent. Try to piece this stuff together in some way as to
> provide maximum meaning to you (and to PostgreSQL) without destroying
> the usefulness to your customers.
>
> I think the article David mentioned:
> http://www.varlena.com/GeneralBits/110.php
> Is a pretty reasonable compromise for many use-cases. Perhaps more can
> be done, but usually questionnaires are either too unimportant to really
> dig in, or so important that designing a database around it is the
> obvious thing to do.
>
> Regards,
>        Jeff Davis
>
>

pgsql-general by date:

Previous
From: "Dot Yet"
Date:
Subject: PostgreSQL 8.3.4 Solaris x86 compilation issues
Next
From: "Sergey Levchenko"
Date:
Subject: partitioning question. need current month and archive partitions.