Re: How do i store arbitrary questions and answers in SQL? - Mailing list pgsql-general

From Tim Allen
Subject Re: How do i store arbitrary questions and answers in SQL?
Date
Msg-id 44ECEBC9.30404@proximity.com.au
Whole thread Raw
In response to How do i store arbitrary questions and answers in SQL?  ("lifeisgood" <paul1brian@gmail.com>)
Responses Re: How do i store arbitrary questions and answers in SQL?  ("macgillivary" <macgillivary@gmail.com>)
List pgsql-general
lifeisgood wrote:
> The problem : to store, in SQL/RDBMS, an arbitrary set of questions and
> their answers, where the questions are text (obviously) but the answers
> can be dates, text, integers, money etc.
> think of it as a big questionnaire where at design time we have no idea
> what the questions will be.
>
> My usual solution to this problem is to store everything in varchar and
> flag the type, converting
> as I extract data.  It is not a desirable solution.
>
> i.e.
> CREATE TABLE Qu (ID INT, Question VARCHAR(64))
>
> CREATE TABLE Answer(ID INT, questionID INT, answer_text VARCHAR(255),
> datatype INT)

Use text, not varchar - varchar is just a waste of time and space. This
approach isn't all that bad, since if you're using libpq or similar to
read and write the values (without using a binary cursor), you're using
a text representation anyway. Just use the same text representation of
your data that the db interface is going to use.

> Are there any other solutions out there?

The other solution is to partition your table, make some number of
tables, one for each data type, with the value column using the correct
native type. Your code that reads and writes values then needs to be
smart enough to use the correct table depending on the data type.

We've used both approaches, and they both work fine for us. The text for
everything approach has the advantage of simplicity of interface code,
whereas partitioning on data type gives you better query planning and
better performance when you have a lot (ie millions of rows plus) of data.

> I can think of several ideas but they all fall short in some way
>
> 0. (current one)  I have to cast any search term to string first but
> after that the SQL performs as one expects.

No, you don't have to "cast" things to text - quite the reverse; if you
are querying on the contents of your value (answer) column and your
criterion depends on the correct type (eg find questions with integer
answers greater than 42) then you have to cast the text to integer in
the query.

> 1. store the serialised object in binary form. (how does one search
> this? Totally dependant on choice of middleware language)

I'd avoid this one - for the reason you've mentioned, among others. Not
sure what the middleware language has to do with it, though - if your
choice of middleware makes things harder then it's the wrong choice. If
middleware doesn't make things easier, then what use is it?

> 2. Store different types in diff columns
>     table answer (questionID, ans_text VARCHAR, ans_money MONEY,
> ans_int INT ....
>     But this makes searching through SQL even harder than casting, as
> in each query i must what answer to expect.

Definitely avoid this one. Lots of wasted space, as well as extra
software complexity, with little payoff.

> 3. Different answer tables each with answer types - same problem as 2
> but even harder.

This is the other option I mentioned above. It's not hard at all.

> 4. I suspect pl/python might be useful, but I cannot see how at the
> moment...

I can't see it either :). Decide what you want to do first, _then_ work
out how to implement it.

Tim

--
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Inserting Data
Next
From: Jeff Davis
Date:
Subject: Re: XPath and XML support