Re: dynamic crosstab - Mailing list pgsql-general
From | Erik Jones |
---|---|
Subject | Re: dynamic crosstab |
Date | |
Msg-id | DD75275B-A579-4D5F-92FF-7BD961FC6A9A@myemma.com Whole thread Raw |
In response to | Re: dynamic crosstab ("Klein Balazs" <Balazs.Klein@t-online.hu>) |
Responses |
Re: dynamic crosstab
Re: dynamic crosstab |
List | pgsql-general |
On Feb 14, 2008, at 3:49 PM, Klein Balazs wrote: > My point was to get rid of the the EAV setup. Something like: > > CREATE TABLE questions ( > question_id serial primary key, > question text not null > ); > > CREATE TABLE people ( > person_id serial primary key, > .... > ); > > CREATE TABLE answers ( > person_id integer references people, > answers text[] > ); > > where the indexes into answers are ids from questions. You don't get > any easy foreign keys for those indexes into the questions table, > which you definitely don't have with the EAV setup anyway, but with > this you don't need any kind of pivot/crosstab functionality. > >> I can't imagine how I could store data directly that way (beside >> the usual >> thing that whenever I can I aim to store scalar value in a column). >> >> To do what you suggest I could have this: >> 1 (aaa,bbb,ccc) >> 2 (ddd,NULL,eee) >> but for this I would need to store a NULL for a person for all the >> questions >> he/she didn't answer. Now answers may come from all sorts of >> questionnaires >> so most people will only have responses on a subset, so this does >> not seem >> feasible. >> >> Or this: >> 1 (aaa,bbb,ccc) >> 2 (ddd,eee) >> but this would be loosing the purpose - there is no longer a >> meaningful way >> to compare the same info at different people. >> >> So directly storing the info in this structure does not seem to be >> the way >> for me. On the other hand a query may be able to generate the >> proper array >> without the usual problem of outputting unknown number of columns. First, please stop top-posting. It makes it difficult for both me and others to know to whom/what you are replying. Now on to the meat of the topic! When using arrays you do not need to manually store NULLS -- they are implied by gaps in array indices. Observe: CREATE TABLE questions ( question_id INTEGER PRIMARY KEY, question_text TEXT NOT NULL ); CREATE TABLE people ( person_id SERIAL PRIMARY KEY, answers TEXT[] ); INSERT INTO questions (question_id, question_text) VALUES (2, 'Will arrays work?'), (5, 'Can pigs fly?'); INSERT INTO people (person_id) VALUES (1), (2); UPDATE people SET answers[2] = 'yep!', answers[5] = 'nope!', answers[7] = 'this shouldn''t be here!' where person_id=1; UPDATE people SET answers[5]='if only they had wings' where person_id=2; SELECT * FROM people; person_id | answers --------------- +------------------------------------------------------------- 1 | [2:7]={yep!,NULL,NULL,nope!,NULL,"this shouldn't be here!"} 2 | [5:5]={"if only they had wings"} See how postgres handles filling the NULLs for you? What you'd really want to do with this would be to define some functions for setting and getting a person's answers to a given question or set of questions so that you could implement some kind of data integrity with regards to question ids and indices into the answers arrays such as in the example above you'd want to prevent an entry at index 7 when there is no entry in the questions table for question_id=7. This whole thing is still wide open for adding extra layers such as question groupings for separate questionnaires, etc. Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
pgsql-general by date: