Thread: How do i store arbitrary questions and answers in SQL?

How do i store arbitrary questions and answers in SQL?

From
"lifeisgood"
Date:
Dear group,

I have been bugged by this problem on and off for years, and would like
to put it out to general discussion.  I suspect it is a common SQL
problem but I have never found a satisfactory answer - maybe there is
not one.

The problem : to store, in SQL/RDBMS, an arbitrary set of questions and
their answers, where the questions are text (obviously) but the answers
can be dates, text, integers, money etc.
think of it as a big questionnaire where at design time we have no idea
what the questions will be.

My usual solution to this problem is to store everything in varchar and
flag the type, converting
as I extract data.  It is not a desirable solution.

i.e.
CREATE TABLE Qu (ID INT, Question VARCHAR(64))

CREATE TABLE Answer(ID INT, questionID INT, answer_text VARCHAR(255),
datatype INT)



Are there any other solutions out there?


cheers



PS
I think I am essentially looking for a database that stores a datatype
of VARIANT (for those of MS VB background). However my (patchy)
knowledge of RDBMS internals says this is essentially breaks all the
rules for optimisation and storage so is never offered.

I can think of several ideas but they all fall short in some way

0. (current one)  I have to cast any search term to string first but
after that the SQL performs as one expects.

1. store the serialised object in binary form. (how does one search
this? Totally dependant on choice of middleware language)

2. Store different types in diff columns
    table answer (questionID, ans_text VARCHAR, ans_money MONEY,
ans_int INT ....
    But this makes searching through SQL even harder than casting, as
in each query i must what answer to expect.

3. Different answer tables each with answer types - same problem as 2
but even harder.

4. I suspect pl/python might be useful, but I cannot see how at the
moment...


Re: How do i store arbitrary questions and answers in SQL?

From
Enver ALTIN
Date:
Hi,

On Wed, Aug 23, 2006 at 02:40:29AM -0700, lifeisgood wrote:
> CREATE TABLE Answer(ID INT, questionID INT, answer_text VARCHAR(255),
> datatype INT)

You can actually store answer_text as a BLOB in the database that is
packed in any format your application can handle. You can serialize a
class that represents the answer you need and store it, for example.

-HTH
--
Enver

Attachment

Re: How do i store arbitrary questions and answers in SQL?

From
Karsten Hilbert
Date:
On Wed, Aug 23, 2006 at 04:13:24PM +0300, Enver ALTIN wrote:

> > CREATE TABLE Answer(ID INT, questionID INT, answer_text VARCHAR(255),
> > datatype INT)
>
> You can actually store answer_text as a BLOB in the database that is
> packed in any format your application can handle. You can serialize a
> class that represents the answer you need and store it, for example.

There was an article on that on Elein's General Bits not too
long ago.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: How do i store arbitrary questions and answers in SQL?

From
Tim Allen
Date:
lifeisgood wrote:
> The problem : to store, in SQL/RDBMS, an arbitrary set of questions and
> their answers, where the questions are text (obviously) but the answers
> can be dates, text, integers, money etc.
> think of it as a big questionnaire where at design time we have no idea
> what the questions will be.
>
> My usual solution to this problem is to store everything in varchar and
> flag the type, converting
> as I extract data.  It is not a desirable solution.
>
> i.e.
> CREATE TABLE Qu (ID INT, Question VARCHAR(64))
>
> CREATE TABLE Answer(ID INT, questionID INT, answer_text VARCHAR(255),
> datatype INT)

Use text, not varchar - varchar is just a waste of time and space. This
approach isn't all that bad, since if you're using libpq or similar to
read and write the values (without using a binary cursor), you're using
a text representation anyway. Just use the same text representation of
your data that the db interface is going to use.

> Are there any other solutions out there?

The other solution is to partition your table, make some number of
tables, one for each data type, with the value column using the correct
native type. Your code that reads and writes values then needs to be
smart enough to use the correct table depending on the data type.

We've used both approaches, and they both work fine for us. The text for
everything approach has the advantage of simplicity of interface code,
whereas partitioning on data type gives you better query planning and
better performance when you have a lot (ie millions of rows plus) of data.

> I can think of several ideas but they all fall short in some way
>
> 0. (current one)  I have to cast any search term to string first but
> after that the SQL performs as one expects.

No, you don't have to "cast" things to text - quite the reverse; if you
are querying on the contents of your value (answer) column and your
criterion depends on the correct type (eg find questions with integer
answers greater than 42) then you have to cast the text to integer in
the query.

> 1. store the serialised object in binary form. (how does one search
> this? Totally dependant on choice of middleware language)

I'd avoid this one - for the reason you've mentioned, among others. Not
sure what the middleware language has to do with it, though - if your
choice of middleware makes things harder then it's the wrong choice. If
middleware doesn't make things easier, then what use is it?

> 2. Store different types in diff columns
>     table answer (questionID, ans_text VARCHAR, ans_money MONEY,
> ans_int INT ....
>     But this makes searching through SQL even harder than casting, as
> in each query i must what answer to expect.

Definitely avoid this one. Lots of wasted space, as well as extra
software complexity, with little payoff.

> 3. Different answer tables each with answer types - same problem as 2
> but even harder.

This is the other option I mentioned above. It's not hard at all.

> 4. I suspect pl/python might be useful, but I cannot see how at the
> moment...

I can't see it either :). Decide what you want to do first, _then_ work
out how to implement it.

Tim

--
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/

Re: How do i store arbitrary questions and answers in SQL?

From
"macgillivary"
Date:
I agree with Tim, that your option 3 is really not all that hard, and I
suggest would provide you with the best solution.  I'm assuming,
however, that your application asks the question and checks the user
supplied answer with the answer in the db (and not Jeopardy style).  I
might add a 'join' or 'answertype' field in the questions (qu) table to
assist when drawing the application and to immediately know what table
to verify the supplied answer against.  Otherwise, I think you might
need a query to check the type on the answer the user supplies and make
a possible few passes (through the numeric and integer tables for
example when the user's response is 42) - or some combination of outers
to find the non null value field.

For what it's worth, I'd probably also include a 'type' table to hold
my application specific mask for the user's reply, and perhaps a
attribute to hold the joining table name (I'm thinking about the future
requirement to add a new type without much coding changes on the
application side).

So, now I'm interested in this, and I could see a use for something
similar in the near future, I'll put the following out there for
comment.  Obviously I haven't put it into practice but perhaps it could
work.  One of differences from your original post is the absense of a
serial field on the responses.  I've only put a primary key which would
link back to the the questions.qid field (so maybe it should be a fk?
see my note about my pg newness in a moment).  I'm assuming this is not
a multiple choice type of situation.  Each question has one answer
(although not currently enforced in the db layer here since there is
nothing stopping you from placing an answer in more than one of the
response tables - I'm relatively new to pg and not sure of anyway to
deal with this).  It's early, haven't had my first cup yet, but I would
start with something like the following:

CREATE TABLE questions  -- holds the questions or challenges
(
  qid serial NOT NULL,
  qchallenge text,
  qtype int2,   -- linking to types.tid
  CONSTRAINT questions_pkey PRIMARY KEY (qid)
)
WITHOUT OIDS;

CREATE TABLE types  -- mostly to assist application development
(
  tid serial NOT NULL,
  tdescription varchar(25),
  tmask varchar(25),   -- just a thought, could be useful when building
a web app
  tjoin varchar,   -- again, just thinking about ease of new additions
  CONSTRAINT types_pkey PRIMARY KEY (tid)
)
WITHOUT OIDS;

CREATE TABLE response_numeric
(
  rnqid int2 NOT NULL,   -- linking to questions.qid
  rnvalue numeric,
  CONSTRAINT response_numeric_pkey PRIMARY KEY (rnqid)
)
WITHOUT OIDS;

CREATE TABLE response_integer
(
  riqid int2 NOT NULL,
  rivalue int4,
  CONSTRAINT response_integer_pkey PRIMARY KEY (riqid)
)
WITHOUT OIDS;

CREATE TABLE response_text
(
  rtqid int2 NOT NULL,
  rtvalue text,
  CONSTRAINT response_text_pkey PRIMARY KEY (rtqid)
)
WITHOUT OIDS;

CREATE TABLE response_date
(
  rdqid int2 NOT NULL,
  rdvalue date,
  CONSTRAINT response_date_pkey PRIMARY KEY (rdqid)
)
WITHOUT OIDS;



Tim Allen wrote:
> > 3. Different answer tables each with answer types - same problem as 2
> > but even harder.
>
> This is the other option I mentioned above. It's not hard at all.


Re: How do i store arbitrary questions and answers in SQL?

From
"macgillivary"
Date:
Dang.  I re-read your post and see that you are not checking for the
correct answer, you simply want to store the responses like a
questionare.  Well I still think a different table for each type of
response would be beneficial, obviously, adding a serial field like you
have done and the foreign key linking to the qid.

I had a completely different problem on my mind (testing user's
knowledge).  Sorry about that.


macgillivary wrote:
> I agree with Tim, that your option 3 is really not all that hard, and I
> suggest would provide you with the best solution.  I'm assuming,
> however, that your application asks the question and checks the user
> supplied answer with the answer in the db (and not Jeopardy style).  I
> might add a 'join' or 'answertype' field in the questions (qu) table to
> assist when drawing the application and to immediately know what table
> to verify the supplied answer against.  Otherwise, I think you might
> need a query to check the type on the answer the user supplies and make
> a possible few passes (through the numeric and integer tables for
> example when the user's response is 42) - or some combination of outers
> to find the non null value field.
>
> For what it's worth, I'd probably also include a 'type' table to hold
> my application specific mask for the user's reply, and perhaps a
> attribute to hold the joining table name (I'm thinking about the future
> requirement to add a new type without much coding changes on the
> application side).
>
> So, now I'm interested in this, and I could see a use for something
> similar in the near future, I'll put the following out there for
> comment.  Obviously I haven't put it into practice but perhaps it could
> work.  One of differences from your original post is the absense of a
> serial field on the responses.  I've only put a primary key which would
> link back to the the questions.qid field (so maybe it should be a fk?
> see my note about my pg newness in a moment).  I'm assuming this is not
> a multiple choice type of situation.  Each question has one answer
> (although not currently enforced in the db layer here since there is
> nothing stopping you from placing an answer in more than one of the
> response tables - I'm relatively new to pg and not sure of anyway to
> deal with this).  It's early, haven't had my first cup yet, but I would
> start with something like the following:
>
> CREATE TABLE questions  -- holds the questions or challenges
> (
>   qid serial NOT NULL,
>   qchallenge text,
>   qtype int2,   -- linking to types.tid
>   CONSTRAINT questions_pkey PRIMARY KEY (qid)
> )
> WITHOUT OIDS;
>
> CREATE TABLE types  -- mostly to assist application development
> (
>   tid serial NOT NULL,
>   tdescription varchar(25),
>   tmask varchar(25),   -- just a thought, could be useful when building
> a web app
>   tjoin varchar,   -- again, just thinking about ease of new additions
>   CONSTRAINT types_pkey PRIMARY KEY (tid)
> )
> WITHOUT OIDS;
>
> CREATE TABLE response_numeric
> (
>   rnqid int2 NOT NULL,   -- linking to questions.qid
>   rnvalue numeric,
>   CONSTRAINT response_numeric_pkey PRIMARY KEY (rnqid)
> )
> WITHOUT OIDS;
>
> CREATE TABLE response_integer
> (
>   riqid int2 NOT NULL,
>   rivalue int4,
>   CONSTRAINT response_integer_pkey PRIMARY KEY (riqid)
> )
> WITHOUT OIDS;
>
> CREATE TABLE response_text
> (
>   rtqid int2 NOT NULL,
>   rtvalue text,
>   CONSTRAINT response_text_pkey PRIMARY KEY (rtqid)
> )
> WITHOUT OIDS;
>
> CREATE TABLE response_date
> (
>   rdqid int2 NOT NULL,
>   rdvalue date,
>   CONSTRAINT response_date_pkey PRIMARY KEY (rdqid)
> )
> WITHOUT OIDS;
>
>
>
> Tim Allen wrote:
> > > 3. Different answer tables each with answer types - same problem as 2
> > > but even harder.
> >
> > This is the other option I mentioned above. It's not hard at all.


Re: How do i store arbitrary questions and answers in SQL?

From
"macgillivary"
Date:
I agree that option 3 is the way to go.  There is a little reading at:
http://www.varlena.com/varlena/GeneralBits/110.php
which may be of interest.

Tim Allen wrote:
> lifeisgood wrote:
> > 3. Different answer tables each with answer types - same problem as 2
> > but even harder.
>
> This is the other option I mentioned above. It's not hard at all.