Thread: dynamic crosstab
Hi, I found this to create dynamic crosstabs (where the resulting columns are not known beforehand): http://www.ledscripts.com/tech/article/view/5.html (Thanks for Denis Bitouzé on http://www.postgresonline.com/journal/index.php?/archives/14-CrossTab-Queries-in-PostgreSQL-using-tablefunc-contrib.html for pointing it out.). This is basically dynamically generating an SQL string with CASE ... WHEN that will create a view. This could work although for hundreds of columns it looks a bit scary for me. Isn't there a more elegant way to achieve this with tablefunc crosstab and if there isn't don't you think it could/should be there? There is a syntax where you could specify the columns with a SELECT DISTINCT statement - couldn't it also generate the enumeration string eg. presuming that all returning colums are stored as text? Or if that is not possible instead of the enumeration part wouldn't it be better to put a name of the view that could be created/recreated? I know that most db people don't care much about pivot/crosstab in the db but imagine this situation: I am storing questionnaire results on people. Since the questionnaires are created by users I have no other way than using an EAV model like personID, questionID, responseValue to store responses. Now this table gets long 300 question per questionnaire, 3000 people and we have 1m row. Now whenever I need to download this data in my case 2/3rd of it would be redundant if I could pivot it first - and in a 20MB csv its significant (I know its a tradeoff between processing and storage). Moreover my users can't do anything with this dataformat - they need to pivot it offline anyway, which is not easy (Excel cant do it, Access cant do it, numberGo cant do it for different reasons). Although the application could do it I think this is a generic functionality that the database is more suited for. Please let me know if you know of a good db based way to create a dynamic crosstab in Postgres - or why there shouldn't be one. Thanks and regards. SWK
> > I found this to create dynamic crosstabs (where the resulting > columns are not known beforehand): > http://www.ledscripts.com/tech/article/view/5.html > (Thanks for Denis Bitouzé on > http://www.postgresonline.com/journal/index.php?/archives/14-C > rossTab-Queries-in-PostgreSQL-using-tablefunc-contrib.html > for pointing it out.). > This is basically dynamically generating an SQL string with CASE ... > WHEN that will create a view. > This could work although for hundreds of columns it looks a > bit scary for me. > Isn't there a more elegant way to achieve this with tablefunc > crosstab and if there isn't don't you think it could/should be there? > There is a syntax where you could specify the columns with a > SELECT DISTINCT statement - couldn't it also generate the > enumeration string eg. presuming that all returning colums > are stored as text? > Or if that is not possible instead of the enumeration part > wouldn't it be better to put a name of the view that could be > created/recreated? > > I know that most db people don't care much about > pivot/crosstab in the db but imagine this situation: > I am storing questionnaire results on people. Since the > questionnaires are created by users I have no other way than > using an EAV model like personID, questionID, responseValue > to store responses. Now this table gets long 300 question per > questionnaire, 3000 people and we have 1m row. Now whenever I > need to download this data in my case 2/3rd of it would be > redundant if I could pivot it first - and in a 20MB csv its > significant (I know its a tradeoff between processing and storage). > Moreover my users can't do anything with this dataformat - > they need to pivot it offline anyway, which is not easy > (Excel cant do it, Access cant do it, numberGo cant do it for > different reasons). > Although the application could do it I think this is a > generic functionality that the database is more suited for. > > Please let me know if you know of a good db based way to > create a dynamic crosstab in Postgres - or why there shouldn't be one. > Thanks and regards. > SWK Have you tried this crosstab? http://www.postgresql.org/docs/8.3/interactive/tablefunc.html
Hi, SunWuKung wrote: > Hi, > > I found this to create dynamic crosstabs (where the resulting columns ... > This could work although for hundreds of columns it looks a bit scary > for me. Well I'd say hundreds of columns are always scary, no matter how you do it :-) ... > I know that most db people don't care much about pivot/crosstab in the > db but imagine this situation: > I am storing questionnaire results on people. Since the questionnaires > are created by users I have no other way than using an EAV model like > personID, questionID, responseValue to store responses. Now this table > gets long 300 question per questionnaire, 3000 people and we have 1m > row. Now whenever I need to download this data in my case 2/3rd of it > would be redundant if I could pivot it first - and in a 20MB csv its > significant (I know its a tradeoff between processing and storage). > Moreover my users can't do anything with this dataformat - they need > to pivot it offline anyway, which is not easy (Excel cant do it, > Access cant do it, numberGo cant do it for different reasons). What about not pivoting it? You can run your analysis directly against your database. > Although the application could do it I think this is a generic > functionality that the database is more suited for. Well after all you want a CSV not a table. You could shortcut this with a generic query which creates array out of your "columns" and join them to a CSV line. This would just be outputted as one single column from database. > Please let me know if you know of a good db based way to create a > dynamic crosstab in Postgres - or why there shouldn't be one. See above :-) Regards Tino
On Wed, 2008-02-13 at 14:04 +0100, Tino Wildenhain wrote: > Well after all you want a CSV not a table. You could shortcut this > with a generic query which creates array out of your "columns" > and join them to a CSV line. This would just be outputted as > one single column from database. Depending on your use case, this may be a better way: In psql: => \copy (select col1,col2,col3 from data) TO data.csv CSV HEADER or on the command line: $ psql -c '\copy (select col1,col2,col3 from data) TO data.csv CSV HEADER' Strictly speaking, the CSV formatting isn't being done in the database but rather by psql. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Yes, thanks. The problem with those function is that they all have an AS (columname type, ...) part or equivalent. -----Original Message----- From: Masse Jacques [mailto:jacques.masse@bordeaux.cemagref.fr] Sent: Wednesday, February 13, 2008 10:20 AM To: SunWuKung; pgsql-general@postgresql.org Subject: RE: [GENERAL] dynamic crosstab > > I found this to create dynamic crosstabs (where the resulting > columns are not known beforehand): > http://www.ledscripts.com/tech/article/view/5.html > (Thanks for Denis Bitouzé on > http://www.postgresonline.com/journal/index.php?/archives/14-C > rossTab-Queries-in-PostgreSQL-using-tablefunc-contrib.html > for pointing it out.). > This is basically dynamically generating an SQL string with CASE ... > WHEN that will create a view. > This could work although for hundreds of columns it looks a > bit scary for me. > Isn't there a more elegant way to achieve this with tablefunc > crosstab and if there isn't don't you think it could/should be there? > There is a syntax where you could specify the columns with a > SELECT DISTINCT statement - couldn't it also generate the > enumeration string eg. presuming that all returning colums > are stored as text? > Or if that is not possible instead of the enumeration part > wouldn't it be better to put a name of the view that could be > created/recreated? > > I know that most db people don't care much about > pivot/crosstab in the db but imagine this situation: > I am storing questionnaire results on people. Since the > questionnaires are created by users I have no other way than > using an EAV model like personID, questionID, responseValue > to store responses. Now this table gets long 300 question per > questionnaire, 3000 people and we have 1m row. Now whenever I > need to download this data in my case 2/3rd of it would be > redundant if I could pivot it first - and in a 20MB csv its > significant (I know its a tradeoff between processing and storage). > Moreover my users can't do anything with this dataformat - > they need to pivot it offline anyway, which is not easy > (Excel cant do it, Access cant do it, numberGo cant do it for > different reasons). > Although the application could do it I think this is a > generic functionality that the database is more suited for. > > Please let me know if you know of a good db based way to > create a dynamic crosstab in Postgres - or why there shouldn't be one. > Thanks and regards. > SWK Have you tried this crosstab? http://www.postgresql.org/docs/8.3/interactive/tablefunc.html
Hi, ye, hundreds of columns - but there is no helping it, that’s the way many questionnaire are and the representation of theresponses (when not in a database) is always one person per row. I would need this for exporting, but also to show resultsonline. Although it’s a good idea I am afraid that an array could only help me when the info I store about all the persons in thequery are exactly the same (there wouldn’t be empty cells in a crosstab) - it’s very useful for some cases but in generalthat sounds like a dangerous presumption for me. I think this is a generic shortcoming of Postgres - whenever you are forced to create an EAV (Entity-Attribute-Value) modelyou have no generic or way of going back to the usual one entity per row model. This is something that Access has beenable to do (up to 255 columns) as far as I can remember. When I google about this topic I find that the majority of peopleare still referring to that solution as the easiest for this purpose. Tablefunc crosstab is so close to a good solutionfor this with the syntax where you could specify the columns with a query - the only shortcoming is that you stillhave to enumerate the columns and their datatype. I always hope that somebody might have something similar but generic- eg. create those columns automatically and just treat them all as text. Regards, SWK -----Original Message----- From: Tino Wildenhain [mailto:tino@wildenhain.de] Sent: Wednesday, February 13, 2008 2:05 PM To: SunWuKung Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] dynamic crosstab Hi, SunWuKung wrote: > Hi, > > I found this to create dynamic crosstabs (where the resulting columns ... > This could work although for hundreds of columns it looks a bit scary > for me. Well I'd say hundreds of columns are always scary, no matter how you do it :-) ... > I know that most db people don't care much about pivot/crosstab in the > db but imagine this situation: > I am storing questionnaire results on people. Since the questionnaires > are created by users I have no other way than using an EAV model like > personID, questionID, responseValue to store responses. Now this table > gets long 300 question per questionnaire, 3000 people and we have 1m > row. Now whenever I need to download this data in my case 2/3rd of it > would be redundant if I could pivot it first - and in a 20MB csv its > significant (I know its a tradeoff between processing and storage). > Moreover my users can't do anything with this dataformat - they need > to pivot it offline anyway, which is not easy (Excel cant do it, > Access cant do it, numberGo cant do it for different reasons). What about not pivoting it? You can run your analysis directly against your database. > Although the application could do it I think this is a generic > functionality that the database is more suited for. Well after all you want a CSV not a table. You could shortcut this with a generic query which creates array out of your "columns" and join them to a CSV line. This would just be outputted as one single column from database. > Please let me know if you know of a good db based way to create a > dynamic crosstab in Postgres - or why there shouldn't be one. See above :-) Regards Tino Internal Virus Database is out-of-date. Checked by AVG Free Edition. Version: 7.5.503 / Virus Database: 269.16.0/1137 - Release Date: 11/18/2007 5:15 PM
Hi, Yes I know that SPSS can do this - in fact that is the only way I could solve this so far, but that is a very expensive workaroundfor anybody not currently owning SPSS. Thanks. SWK -----Original Message----- From: jr [mailto:jorg.raskowski@tailorware.org.uk] Sent: Wednesday, February 13, 2008 1:31 PM To: SunWuKung Subject: Re: dynamic crosstab hi SWK SunWuKung wrote: > I know that most db people don't care much about pivot/crosstab in the > db but imagine this situation: > I am storing questionnaire results on people. Since the questionnaires > are created by users I have no other way than using an EAV model like are you using the right tool for this task? > Moreover my users can't do anything with this dataformat - they need > to pivot it offline anyway, which is not easy (Excel cant do it, > Access cant do it, numberGo cant do it for different reasons). back at college we used SPSS - the Statistical Package for Social Sciences. > Please let me know if you know of a good db based way to create a > dynamic crosstab in Postgres - or why there shouldn't be one. to be honest I don't; I think that a specialised product (such as SPSS) will solve both problems in one stroke. -- regards, jr. (jr@tailorware.org.uk)
Yes, once I have the select outputting it to CSV is not a problem. As you say PG handles that nicely. Thx SWK -----Original Message----- From: Reece Hart [mailto:reece@harts.net] Sent: Wednesday, February 13, 2008 9:39 PM To: Tino Wildenhain Cc: SunWuKung; pgsql-general@postgresql.org Subject: Re: [GENERAL] dynamic crosstab On Wed, 2008-02-13 at 14:04 +0100, Tino Wildenhain wrote: > Well after all you want a CSV not a table. You could shortcut this > with a generic query which creates array out of your "columns" > and join them to a CSV line. This would just be outputted as > one single column from database. Depending on your use case, this may be a better way: In psql: => \copy (select col1,col2,col3 from data) TO data.csv CSV HEADER or on the command line: $ psql -c '\copy (select col1,col2,col3 from data) TO data.csv CSV HEADER' Strictly speaking, the CSV formatting isn't being done in the database but rather by psql. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Balázs Klein wrote: > > Hi, > Yes I know that SPSS can do this - in fact that is the only way I > could solve this so far, but that is a very expensive workaround for > anybody not currently owning SPSS. Huh, perhaps you could try with PSPP ... (I don't know if it can do it, but I know it is supposed to be a replacement to SPSS). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Feb 14, 2008, at 2:04 AM, Balázs Klein wrote: > Hi, > ye, hundreds of columns - but there is no helping it, that’s the > way many questionnaire are and the representation of the responses > (when not in a database) is always one person per row. I would need > this for exporting, but also to show results online. > > Although it’s a good idea I am afraid that an array could only help > me when the info I store about all the persons in the query are > exactly the same (there wouldn’t be empty cells in a crosstab) - > it’s very useful for some cases but in general that sounds like a > dangerous presumption for me. As of versions >= 8.2 you can store NULL values in arrays. Perhaps you could have a Question -> Index table and then use an array per person for their answers. > > I think this is a generic shortcoming of Postgres - whenever you > are forced to create an EAV (Entity-Attribute-Value) model you have > no generic or way of going back to the usual one entity per row > model. This is something that Access has been able to do (up to 255 > columns) as far as I can remember. When I google about this topic I > find that the majority of people are still referring to that > solution as the easiest for this purpose. Tablefunc crosstab is so > close to a good solution for this with the syntax where you could > specify the columns with a query - the only shortcoming is that you > still have to enumerate the columns and their datatype. I always > hope that somebody might have something similar but generic - eg. > create those columns automatically and just treat them all as text. Have a look at http://www.varlena.com/varlena/GeneralBits/110.php for a totally different approach to questionnaires. Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
Hi, the part that I don't know is how to put those NULLs in. It could well be doable I just can't do it myself. How does the query look like that produces from this input: PersonID AttributeID Value 1 1 aaa 1 2 bbb 1 3 ccc 2 1 ddd 2 3 eee this output, without manually enumerating the attributeids: 1 (aaa,bbb,ccc) 2 (ddd,NULL,eee) Thx. B. -----Original Message----- From: Erik Jones [mailto:erik@myemma.com] Sent: Thursday, February 14, 2008 5:15 PM To: Balázs Klein Cc: 'Tino Wildenhain'; 'SunWuKung'; pgsql-general@postgresql.org Subject: Re: [GENERAL] dynamic crosstab On Feb 14, 2008, at 2:04 AM, Balázs Klein wrote: > Hi, > ye, hundreds of columns - but there is no helping it, that’s the > way many questionnaire are and the representation of the responses > (when not in a database) is always one person per row. I would need > this for exporting, but also to show results online. > > Although it’s a good idea I am afraid that an array could only help > me when the info I store about all the persons in the query are > exactly the same (there wouldn’t be empty cells in a crosstab) - > it’s very useful for some cases but in general that sounds like a > dangerous presumption for me. As of versions >= 8.2 you can store NULL values in arrays. Perhaps you could have a Question -> Index table and then use an array per person for their answers. > > I think this is a generic shortcoming of Postgres - whenever you > are forced to create an EAV (Entity-Attribute-Value) model you have > no generic or way of going back to the usual one entity per row > model. This is something that Access has been able to do (up to 255 > columns) as far as I can remember. When I google about this topic I > find that the majority of people are still referring to that > solution as the easiest for this purpose. Tablefunc crosstab is so > close to a good solution for this with the syntax where you could > specify the columns with a query - the only shortcoming is that you > still have to enumerate the columns and their datatype. I always > hope that somebody might have something similar but generic - eg. > create those columns automatically and just treat them all as text. Have a look at http://www.varlena.com/varlena/GeneralBits/110.php for a totally different approach to questionnaires. Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
On Feb 14, 2008, at 10:56 AM, Balázs Klein wrote: > -----Original Message----- > From: Erik Jones [mailto:erik@myemma.com] > Sent: Thursday, February 14, 2008 5:15 PM > To: Balázs Klein > Cc: 'Tino Wildenhain'; 'SunWuKung'; pgsql-general@postgresql.org > Subject: Re: [GENERAL] dynamic crosstab > > > On Feb 14, 2008, at 2:04 AM, Balázs Klein wrote: > >> Hi, >> ye, hundreds of columns - but there is no helping it, that’s the >> way many questionnaire are and the representation of the responses >> (when not in a database) is always one person per row. I would need >> this for exporting, but also to show results online. >> >> Although it’s a good idea I am afraid that an array could only help >> me when the info I store about all the persons in the query are >> exactly the same (there wouldn’t be empty cells in a crosstab) - >> it’s very useful for some cases but in general that sounds like a >> dangerous presumption for me. > > As of versions >= 8.2 you can store NULL values in arrays. Perhaps > you could have a Question -> Index table and then use an array per > person for their answers. > >> >> I think this is a generic shortcoming of Postgres - whenever you >> are forced to create an EAV (Entity-Attribute-Value) model you have >> no generic or way of going back to the usual one entity per row >> model. This is something that Access has been able to do (up to 255 >> columns) as far as I can remember. When I google about this topic I >> find that the majority of people are still referring to that >> solution as the easiest for this purpose. Tablefunc crosstab is so >> close to a good solution for this with the syntax where you could >> specify the columns with a query - the only shortcoming is that you >> still have to enumerate the columns and their datatype. I always >> hope that somebody might have something similar but generic - eg. >> create those columns automatically and just treat them all as text. > > Have a look at http://www.varlena.com/varlena/GeneralBits/110.php for > a totally different approach to questionnaires. > > Erik Jones > Hi, > the part that I don't know is how to put those NULLs in. > It could well be doable I just can't do it myself. > > How does the query look like that produces from this input: > PersonID AttributeID Value > 1 1 aaa > 1 2 bbb > 1 3 ccc > 2 1 ddd > 2 3 eee > > this output, without manually enumerating the attributeids: > 1 (aaa,bbb,ccc) > 2 (ddd,NULL,eee) > > Thx. > B. My point was to get rid of the the EAV setup. Something like: CREATE TABLE questions ( question_id serial primary key, question text not null ); CREATE TABLE people ( person_id serial primary key, .... ); CREATE TABLE answers ( person_id integer references people, answers text[] ); where the indexes into answers are ids from questions. You don't get any easy foreign keys for those indexes into the questions table, which you definitely don't have with the EAV setup anyway, but with this you don't need any kind of pivot/crosstab functionality. Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
I can't imagine how I could store data directly that way (beside the usual thing that whenever I can I aim to store scalar value in a column). To do what you suggest I could have this: 1 (aaa,bbb,ccc) 2 (ddd,NULL,eee) but for this I would need to store a NULL for a person for all the questions he/she didn't answer. Now answers may come from all sorts of questionnaires so most people will only have responses on a subset, so this does not seem feasible. Or this: 1 (aaa,bbb,ccc) 2 (ddd,eee) but this would be loosing the purpose - there is no longer a meaningful way to compare the same info at different people. So directly storing the info in this structure does not seem to be the way for me. On the other hand a query may be able to generate the proper array without the usual problem of outputting unknown number of columns. thx B. -----Original Message----- From: Erik Jones [mailto:erik@myemma.com] Sent: Thursday, February 14, 2008 6:14 PM To: Balázs Klein Cc: 'Tino Wildenhain'; pgsql-general@postgresql.org Subject: Re: [GENERAL] dynamic crosstab On Feb 14, 2008, at 10:56 AM, Balázs Klein wrote: > -----Original Message----- > From: Erik Jones [mailto:erik@myemma.com] > Sent: Thursday, February 14, 2008 5:15 PM > To: Balázs Klein > Cc: 'Tino Wildenhain'; 'SunWuKung'; pgsql-general@postgresql.org > Subject: Re: [GENERAL] dynamic crosstab > > > On Feb 14, 2008, at 2:04 AM, Balázs Klein wrote: > >> Hi, >> ye, hundreds of columns - but there is no helping it, thats the >> way many questionnaire are and the representation of the responses >> (when not in a database) is always one person per row. I would need >> this for exporting, but also to show results online. >> >> Although its a good idea I am afraid that an array could only help >> me when the info I store about all the persons in the query are >> exactly the same (there wouldnt be empty cells in a crosstab) - >> its very useful for some cases but in general that sounds like a >> dangerous presumption for me. > > As of versions >= 8.2 you can store NULL values in arrays. Perhaps > you could have a Question -> Index table and then use an array per > person for their answers. > >> >> I think this is a generic shortcoming of Postgres - whenever you >> are forced to create an EAV (Entity-Attribute-Value) model you have >> no generic or way of going back to the usual one entity per row >> model. This is something that Access has been able to do (up to 255 >> columns) as far as I can remember. When I google about this topic I >> find that the majority of people are still referring to that >> solution as the easiest for this purpose. Tablefunc crosstab is so >> close to a good solution for this with the syntax where you could >> specify the columns with a query - the only shortcoming is that you >> still have to enumerate the columns and their datatype. I always >> hope that somebody might have something similar but generic - eg. >> create those columns automatically and just treat them all as text. > > Have a look at http://www.varlena.com/varlena/GeneralBits/110.php for > a totally different approach to questionnaires. > > Erik Jones > Hi, > the part that I don't know is how to put those NULLs in. > It could well be doable I just can't do it myself. > > How does the query look like that produces from this input: > PersonID AttributeID Value > 1 1 aaa > 1 2 bbb > 1 3 ccc > 2 1 ddd > 2 3 eee > > this output, without manually enumerating the attributeids: > 1 (aaa,bbb,ccc) > 2 (ddd,NULL,eee) > > Thx. > B. My point was to get rid of the the EAV setup. Something like: CREATE TABLE questions ( question_id serial primary key, question text not null ); CREATE TABLE people ( person_id serial primary key, .... ); CREATE TABLE answers ( person_id integer references people, answers text[] ); where the indexes into answers are ids from questions. You don't get any easy foreign keys for those indexes into the questions table, which you definitely don't have with the EAV setup anyway, but with this you don't need any kind of pivot/crosstab functionality. Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
On Feb 14, 2008, at 3:49 PM, Klein Balazs wrote: > My point was to get rid of the the EAV setup. Something like: > > CREATE TABLE questions ( > question_id serial primary key, > question text not null > ); > > CREATE TABLE people ( > person_id serial primary key, > .... > ); > > CREATE TABLE answers ( > person_id integer references people, > answers text[] > ); > > where the indexes into answers are ids from questions. You don't get > any easy foreign keys for those indexes into the questions table, > which you definitely don't have with the EAV setup anyway, but with > this you don't need any kind of pivot/crosstab functionality. > >> I can't imagine how I could store data directly that way (beside >> the usual >> thing that whenever I can I aim to store scalar value in a column). >> >> To do what you suggest I could have this: >> 1 (aaa,bbb,ccc) >> 2 (ddd,NULL,eee) >> but for this I would need to store a NULL for a person for all the >> questions >> he/she didn't answer. Now answers may come from all sorts of >> questionnaires >> so most people will only have responses on a subset, so this does >> not seem >> feasible. >> >> Or this: >> 1 (aaa,bbb,ccc) >> 2 (ddd,eee) >> but this would be loosing the purpose - there is no longer a >> meaningful way >> to compare the same info at different people. >> >> So directly storing the info in this structure does not seem to be >> the way >> for me. On the other hand a query may be able to generate the >> proper array >> without the usual problem of outputting unknown number of columns. First, please stop top-posting. It makes it difficult for both me and others to know to whom/what you are replying. Now on to the meat of the topic! When using arrays you do not need to manually store NULLS -- they are implied by gaps in array indices. Observe: CREATE TABLE questions ( question_id INTEGER PRIMARY KEY, question_text TEXT NOT NULL ); CREATE TABLE people ( person_id SERIAL PRIMARY KEY, answers TEXT[] ); INSERT INTO questions (question_id, question_text) VALUES (2, 'Will arrays work?'), (5, 'Can pigs fly?'); INSERT INTO people (person_id) VALUES (1), (2); UPDATE people SET answers[2] = 'yep!', answers[5] = 'nope!', answers[7] = 'this shouldn''t be here!' where person_id=1; UPDATE people SET answers[5]='if only they had wings' where person_id=2; SELECT * FROM people; person_id | answers --------------- +------------------------------------------------------------- 1 | [2:7]={yep!,NULL,NULL,nope!,NULL,"this shouldn't be here!"} 2 | [5:5]={"if only they had wings"} See how postgres handles filling the NULLs for you? What you'd really want to do with this would be to define some functions for setting and getting a person's answers to a given question or set of questions so that you could implement some kind of data integrity with regards to question ids and indices into the answers arrays such as in the example above you'd want to prevent an entry at index 7 when there is no entry in the questions table for question_id=7. This whole thing is still wide open for adding extra layers such as question groupings for separate questionnaires, etc. Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
Erik Jones wrote: > See how postgres handles filling the NULLs for you? What you'd really > want to do with this would be to define some functions for setting and > getting a person's answers to a given question or set of questions so > that you could implement some kind of data integrity with regards to > question ids and indices into the answers arrays such as in the example > above you'd want to prevent an entry at index 7 when there is no entry > in the questions table for question_id=7. It occurs to me that it shouldn't be terribly difficult to make an alternate version of crosstab() that returns an array rather than tuples (back when crosstab() was first written, Postgres didn't support NULL array elements). Is this worth considering for 8.4? Joe
Joe wrote > It occurs to me that it shouldn't be terribly difficult to make an > alternate version of crosstab() that returns an array rather than tuples > (back when crosstab() was first written, Postgres didn't support NULL > array elements). Is this worth considering for 8.4? I think there should be a generic way in Postgres to return from an EAV model. Although I have no evidence on that I keepthinking that the db must be more effective at that than the application would be. I was hoping that now with PG supporting plan invalidation it would be possible to return a recordset. If there is no genericway to return a recordset than being able to return an array is much better than nothing. B.
Erik Jones wrote: > First, please stop top-posting. It makes it difficult for both me > and others to know to whom/what you are replying. Sorry, I don't know much about mailing list customs - I had to look up what top-posting is. I will behave now ... I would prefer to keep the complications for when I retrieve the data rather then when I store it. I could imagine something like this though to create a crosstab as an array, but I am afraid that there is no assurance thatthe resulting array would contain the values in the same order for each focus: tbl(eID, aID, value) Select eID, array_accum(value) from ( (Select Distinct eID from tbl) e CROSS JOIN (Select Distinct aID from tbl) a ) ea LEFT OUTER JOIN tbl USING (eID, aID) GROUP BY eID B.
Balázs Klein wrote: > > I was hoping that now with PG supporting plan invalidation it would > be possible to return a recordset. Plan invalidation has nothing to do with it. In Postgres a returned recordset can be used as a row source in the FROM clause -- this requires data type information to be known at parse time. Joe
On Feb 15, 2008, at 6:29 AM, Balázs Klein wrote: > Erik Jones wrote: >> First, please stop top-posting. It makes it difficult for both me >> and others to know to whom/what you are replying. > > Sorry, I don't know much about mailing list customs - I had to look > up what top-posting is. I will behave now ... It's cool, now you know :) > I would prefer to keep the complications for when I retrieve the > data rather then when I store it. Really? When do you think users notice performance hits the most? I'd think, given that answers for a questionnaire are stored as a batch, people running reports on will be the ones to notice, i.e. at retrieval time. > > I could imagine something like this though to create a crosstab as > an array, but I am afraid that there is no assurance that the > resulting array would contain the values in the same order for each > focus: > > tbl(eID, aID, value) > > Select eID, array_accum(value) from > ( > (Select Distinct eID from tbl) e > CROSS JOIN > (Select Distinct aID from tbl) a > ) ea > LEFT OUTER JOIN > tbl USING (eID, aID) > GROUP BY eID That's cool. I still don't see why you're so set on an EAV, but it's your setup. Watch out, though, big questionnaires will turn into queries with an inordinate amount of joins and performance on those will suck. If you just used arrays directly you could pull all of the answers for a given person and/or questionnaire with pretty simple query. Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
> given that answers for a questionnaire are stored as a > batch Not in our setup - for all sorts of reasons (preserving responses on a connection failure or restart, monitoring responselatency in real time, creating adaptive/branching questionnaires) we send each response separately. > people running reports on will be the ones to notice, i.e. at > retrieval time. I am not sure - different responses are aggregated into different attributes in different ways - those properties need tobe retrieved during scoring/report generation, so being able to create a join directly on a response is a good thing forme. But report generation - in our case it must be a DTP quality PDF - is such a beast anyway that db times dwarf comparedto pdf generation. The problem comes when I need to present the responses themselves in a human-friendly way - as an export or display or report.Do you think there is a way to ensure that the order of the values in the array below is the same for each person? tbl(eID, aID, value) Select eID, array_accum(value) from ( (Select Distinct eID from tbl) e CROSS JOIN (Select Distinct aID from tbl) a ) ea LEFT OUTER JOIN tbl USING (eID, aID) GROUP BY eID Thx for the help. B.
On Feb 14, 2008, at 8:19 PM, Joe Conway wrote: > Erik Jones wrote: >> See how postgres handles filling the NULLs for you? What you'd >> really want to do with this would be to define some functions for >> setting and getting a person's answers to a given question or set >> of questions so that you could implement some kind of data >> integrity with regards to question ids and indices into the >> answers arrays such as in the example above you'd want to prevent >> an entry at index 7 when there is no entry in the questions table >> for question_id=7. > > It occurs to me that it shouldn't be terribly difficult to make an > alternate version of crosstab() that returns an array rather than > tuples (back when crosstab() was first written, Postgres didn't > support NULL array elements). Is this worth considering for 8.4? That's a great idea. At the very least someone (you? me?) could start work on it and if it doesn't go into the main contrib package it could be made available on pgfoundry. Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
> Balázs Klein wrote: > > > > I was hoping that now with PG supporting plan invalidation it would > > be possible to return a recordset. > > Plan invalidation has nothing to do with it. In Postgres a returned > recordset can be used as a row source in the FROM clause -- this > requires data type information to be known at parse time. > > Joe I thought that it includes that the return type can be changed/redefined at runtime. No luck there than. Thx. B.
On Feb 15, 2008, at 9:56 AM, Balázs Klein wrote: >> given that answers for a questionnaire are stored as a >> batch > > Not in our setup - for all sorts of reasons (preserving responses > on a connection failure or restart, monitoring response latency in > real time, creating adaptive/branching questionnaires) we send each > response separately. > >> people running reports on will be the ones to notice, i.e. at >> retrieval time. > > I am not sure - different responses are aggregated into different > attributes in different ways - those properties need to be > retrieved during scoring/report generation, so being able to create > a join directly on a response is a good thing for me. But report > generation - in our case it must be a DTP quality PDF - is such a > beast anyway that db times dwarf compared to pdf generation. > > The problem comes when I need to present the responses themselves > in a human-friendly way - as an export or display or report. Do you > think there is a way to ensure that the order of the values in the > array below is the same for each person? > > tbl(eID, aID, value) > > Select eID, array_accum(value) from > ( > (Select Distinct eID from tbl) e > CROSS JOIN > (Select Distinct aID from tbl) a > ) ea > LEFT OUTER JOIN > tbl USING (eID, aID) > GROUP BY eID The only way to ever guarantee a particular order is via an ORDER BY clause. Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
> -----Original Message----- > > Do youthink there is a way to ensure that the order of the values in the > > array below is the same for each person? > > > > tbl(eID, aID, value) > > > > Select eID, array_accum(value) from > > ( > > (Select Distinct eID from tbl) e > > CROSS JOIN > > (Select Distinct aID from tbl) a > > ) ea > > LEFT OUTER JOIN > > tbl USING (eID, aID) > > GROUP BY eID > > The only way to ever guarantee a particular order is via an ORDER BY > clause. Sure. I just didn’t know where to put it - most aggregates don't care about the row order, but for this one it is important.
On Fri, Feb 15, 2008 at 9:56 AM, Balázs Klein <Balazs.Klein@t-online.hu> wrote: > > given that answers for a questionnaire are stored as a > > batch > > Not in our setup - for all sorts of reasons (preserving responses on a connection failure or restart, monitoring responselatency in real time, creating adaptive/branching questionnaires) we send each response separately. > > > people running reports on will be the ones to notice, i.e. at > > retrieval time. > > I am not sure - different responses are aggregated into different attributes in different ways - those properties needto be retrieved during scoring/report generation, so being able to create a join directly on a response is a good thingfor me. But report generation - in our case it must be a DTP quality PDF - is such a beast anyway that db times dwarfcompared to pdf generation. Also, if you need to you can probably add a slony machine to your setup to run the reports on, and it doesn't matter how many reports you run, your production system will only have to run the user interfacing side. This allows for all kinds of optimizing indexing on the reporting server that you might not want to have on the production server.
Joe Conway wrote: > Erik Jones wrote: >> See how postgres handles filling the NULLs for you? What you'd really >> want to do with this would be to define some functions for setting and >> getting a person's answers to a given question or set of questions so >> that you could implement some kind of data integrity with regards to >> question ids and indices into the answers arrays such as in the example >> above you'd want to prevent an entry at index 7 when there is no entry >> in the questions table for question_id=7. > > It occurs to me that it shouldn't be terribly difficult to make an > alternate version of crosstab() that returns an array rather than tuples > (back when crosstab() was first written, Postgres didn't support NULL > array elements). Is this worth considering for 8.4? How about returning generic rows? Is that possible? It would be really neat if you didn't have to specify the return type in the query that invoked the crosstab. I keep wondering if there's a way to "pivot" (transpose) a result set defined by the standard. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Joe Conway wrote: >> It occurs to me that it shouldn't be terribly difficult to make an >> alternate version of crosstab() that returns an array rather than tuples >> (back when crosstab() was first written, Postgres didn't support NULL >> array elements). Is this worth considering for 8.4? > > How about returning generic rows? Is that possible? It would be really > neat if you didn't have to specify the return type in the query that > invoked the crosstab. Yeah, I was thinking about that as well. I'm not sure how difficult it would be. Hopefully I'll be able to find some time to play with it in the next month or so. > I keep wondering if there's a way to "pivot" (transpose) a result set > defined by the standard. I've looked at SQL2003 and couldn't find anything, but then again I could have easily missed it. Joe
On Tue, Feb 19, 2008 at 11:56:08AM -0300, Alvaro Herrera wrote: > Joe Conway wrote: > > Erik Jones wrote: > >> See how postgres handles filling the NULLs for you? What you'd > >> really want to do with this would be to define some functions > >> for setting and getting a person's answers to a given question > >> or set of questions so that you could implement some kind of > >> data integrity with regards to question ids and indices into the > >> answers arrays such as in the example above you'd want to prevent > >> an entry at index 7 when there is no entry in the questions > >> table for question_id=7. > > > > It occurs to me that it shouldn't be terribly difficult to make an > > alternate version of crosstab() that returns an array rather than > > tuples (back when crosstab() was first written, Postgres didn't > > support NULL array elements). Is this worth considering for 8.4? > > How about returning generic rows? Is that possible? One hack I've used in the past to get those is serializing the rows: XML, YAML and most recently JSON. > It would be really neat if you didn't have to specify the return > type in the query that invoked the crosstab. It would be handy :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
> > > It occurs to me that it shouldn't be terribly difficult to make an > > > alternate version of crosstab() that returns an array rather than > > > tuples (back when crosstab() was first written, Postgres didn't > > > support NULL array elements). Is this worth considering for 8.4? > > > > How about returning generic rows? Is that possible? > > One hack I've used in the past to get those is serializing the rows: > XML, YAML and most recently JSON. > > > It would be really neat if you didn't have to specify the return > > type in the query that invoked the crosstab. > > It would be handy :) +1 What about (for a 2 dim crosstab anyway) take a table and two column names to group by, and return the following results: an 1-d array with the column names, a 1-d with the rownames, and a 2-d array with the cell values; a function to take these three arrays and make csv readable text would be great; also a function to "explode" the arrays into a table (like an array_accum inverse), but this would take a type or something. Is this what every one means anyway?
I always hope that somebody might have something similar but > generic - eg. create those columns automatically and just treat them all > as text. I came up with this amateurish one based on http://www.ledscripts.com/tech/article/view/5.html. Maybe someone can use it: takes - a select statement - a name for the resulting view - the column name of the id - the column name of the attribute - the column name of the value - the aggregate function used It recreates the view of the given name as a crosstab of the sql specified. CREATE OR REPLACE FUNCTION "public"."create_crosstab_view" (eavsql_inarg varchar, resview varchar, rowid varchar, colid varchar,val varchar, agr varchar) RETURNS "pg_catalog"."void" AS $body$ DECLARE casesql varchar; dynsql varchar; r record; BEGIN dynsql=''; for r in select * from pg_views where lower(viewname) = lower(resview) loop execute 'DROP VIEW ' || resview; end loop; casesql='SELECT DISTINCT ' || colid || ' AS v from (' || eavsql_inarg || ') eav ORDER BY ' || colid; FOR r IN EXECUTE casesql Loop dynsql = dynsql || ', ' || agr || '(CASE WHEN ' || colid || '=' || r.v || ' THEN ' || val || ' ELSE NULL END) AS ' ||agr || '_' || r.v; END LOOP; dynsql = 'CREATE VIEW ' || resview || ' AS SELECT ' || rowid || dynsql || ' from (' || eavsql_inarg || ') eav GROUP BY '|| rowid; EXECUTE dynsql; END $body$ LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
Hello, Some time ago, it was written here: > ... > I think there should be a generic way in Postgres to return from an > EAV model. Although I have no evidence on that I keep thinking that > the db must be more effective at that than the application would be. > ... > > Yes, thanks. > The problem with those function is that they all have an AS (columname > type,...) part or equivalent. SWK, I fully understand your needs, and your (our) kind of frustration... I am in a similar situation, with an EAV table, and the need to do crosstab queries, without knowing in advance which columns the query should return, and how many columns. This is for chemical analysis results; laboratories store their results in an EAV way, and it is definitely a good choice, for a large number of reasons. On your side, have you found a decent solution? Has anyone got an answer? I am just about to try this one: http://www.ledscripts.com/tech/article/view/5.html <http://www.ledscripts.com/tech/article/view/5.ht=> But I don't like this style too much: the crosstab approach seems more "natural" to me. I may be totally wrong. So if there is a crosstab thing working in a generic way, that would be just super! A+ Pierre
2010/1/27 Pierre Chevalier <pierre.chevalier1967@free.fr>: > Hello, > > Some time ago, it was written here: >> >> ... >> I think there should be a generic way in Postgres to return from an EAV >> model. Although I have no evidence on that I keep thinking that the db must >> be more effective at that than the application would be. >> ... >> >> Yes, thanks. >> The problem with those function is that they all have an AS (columname >> type,...) part or equivalent. > > > SWK, I fully understand your needs, and your (our) kind of frustration... > > I am in a similar situation, with an EAV table, and the need to do crosstab > queries, without knowing in advance which columns the query should return, > and how many columns. > This is for chemical analysis results; laboratories store their results in > an EAV way, and it is definitely a good choice, for a large number of > reasons. > > On your side, have you found a decent solution? > Has anyone got an answer? > > I am just about to try this one: > http://www.ledscripts.com/tech/article/view/5.html > <http://www.ledscripts.com/tech/article/view/5.ht=> > > But I don't like this style too much: the crosstab approach seems more > "natural" to me. I may be totally wrong. > > So if there is a crosstab thing working in a generic way, that would be just > super! you cannot get crosstab via SELECT statement. There is workaround http://okbob.blogspot.com/2008/08/using-cursors-for-generating-cross.html Pavel > > A+ > Pierre > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
Hello, Pavel Stehule claviota: > ... > you cannot get crosstab via SELECT statement. There is workaround > http://okbob.blogspot.com/2008/08/using-cursors-for-generating-cross.html > All right, I've just tried it: it works just fine in my case! Thanks a lot! Except a few things, but I am not (yet) familiar at all with postgresql functions. I have tried to customize a bit your function, tu suit some of my needs: - when I have NULL values in my EAV (Entity-Attribute-Value) table, I don't want zeroes to appear in the output table; - the total at the right end does not make sense in my case; I replaced it with a COUNT; therefore, I did as follows (changes are *inside stars*, I hope the formatting will work!): BEGIN FOR r IN EXECUTE 'SELECT DISTINCT ' || dimx_name || '::text AS val ' || dimx_source LOOP col_list := array_append(col_list, 'SUM(CASE ' || dimx_name || ' WHEN ' || quote_literal(r.val) || ' THEN ' || expr || ' ELSE *NULL* END) AS ' || quote_ident(r.val) || ''); END LOOP; query := 'SELECT ' || dimy_name || ', ' || array_to_string(col_list, ',') * || ', COUNT(' || expr || ') AS Count '* || dimy_source || ' GROUP BY ' || dimy_name; OPEN result NO SCROLL FOR EXECUTE query; RETURN result; END; $$ LANGUAGE plpgsql STRICT; Now, I still have some issues: as far as I can remember, in m$ access (yes, I know...), a long time ago, I used to do PIVOT queries on EAV tables, where I could chose which operation was to be made on the variable: simply the value (without GROUPing), or a SUM, AVG, etc. I don't have any running acce$$ handy, so I can't check this, I'm afraid. In the case of your function, if I understand well, the line with the GROUP BY does the trick. I will try to play with it. Later on. Something else: I am quite familiar with strict SQL, I use postgreSQL a lot, but I am not familiar with functions and, also, cursors. So I am a bit surprised by the behaviour of the cursor: I am reading doc... But what I would like to do is to redirect the output of the function (that is, the 'result' cursor) to a view, which will be used in other places. I thought something like FETCH INTO would do the trick, but it doesn't. Also, I need, at some point, to export the output to some CSV file. I usually do a quick bash script as follows: echo "COPY (SELECT * FROM dh_litho ORDER BY id, depto) TO stdout WITH CSV HEADER;" | psql bdexplo > somefile.csv And then I can feed somefile.csv to whatever program I want. I tried to do this with the cursor and the FETCH ALL, but it didn't work out well, as I had guessed... pierre@duran:~$ pierre@duran:~/fix_bd_amc$ echo "COPY ( > > SELECT do_cross_cursor('lab_pjcsa_analytecode', 'FROM lab_ana_results','sample_id', > > 'FROM lab_ana_results_sel ', > > 'value_num'); > > FETCH ALL FROM result WITH CSV HEADER; > > ) TO stdout WITH CSV HEADER ;" | psql bdexplo bash: pierre@duran:~/fix_bd_amc$: Aucun fichier ou dossier de ce type pierre@duran:~$ ERREUR: erreur de syntaxe sur ou près de « ; » bash: ERREUR: : commande introuvable bash: » : commande introuvable pierre@duran:~$ LIGNE 4 : 'value_num'); bash: Erreur de syntaxe près du symbole inattendu « ) » pierre@duran:~$ ^ bash: ^ : commande introuvable (sorry about the French!) I could not do this trick: any idea of how I could do this? I guess I should wrap the whole transaction into a one-line statement to be fed to to psql, but I can't figure out how to do it... Some help? A+ Pierre PS: I am used to "normal" mailing lists, but I got quite confused by the approach from grokbase: I thought I was posting on the grokbase list (http://grokbase.com/), and I see that the list pgsql-general@postgresql.org was the one I was posting to... Sorry for the noise, I am RTFMing at the moment... -- ____________________________________________________________________________ Pierre Chevalier Mesté Duran 32100 Condom Tél+fax : 09 75 27 45 62 05 62 28 06 83 06 37 80 33 64 Émail : pierre.chevalier1967CHEZfree.fr icq# : 10432285 http://pierremariechevalier.free.fr/ Logiciels Libres dans le Gers: http://gnusquetaires.org/ ____________________________________________________________________________
2010/1/27 Pierre Chevalier <pierre.chevalier1967@free.fr>: > Hello, > Pavel Stehule claviota: >> >> ... >> you cannot get crosstab via SELECT statement. There is workaround >> http://okbob.blogspot.com/2008/08/using-cursors-for-generating-cross.html >> > > All right, I've just tried it: it works just fine in my case! Thanks a lot! > > Except a few things, but I am not (yet) familiar at all with postgresql > functions. > I have tried to customize a bit your function, tu suit some of my needs: > > - when I have NULL values in my EAV (Entity-Attribute-Value) table, I don't > want zeroes to appear in the output table; > - the total at the right end does not make sense in my case; I replaced it > with a COUNT; > > therefore, I did as follows (changes are *inside stars*, I hope the > formatting will work!): > > BEGIN > FOR r IN EXECUTE 'SELECT DISTINCT ' > || dimx_name || '::text AS val ' || dimx_source > LOOP > col_list := array_append(col_list, 'SUM(CASE ' || dimx_name > || ' WHEN ' || quote_literal(r.val) || ' THEN ' || expr > || ' ELSE *NULL* END) AS ' || quote_ident(r.val) || ''); > END LOOP; > query := 'SELECT ' || dimy_name || ', ' > || array_to_string(col_list, ',') > * || ', COUNT(' || expr || ') AS Count '* > || dimy_source || ' GROUP BY ' || dimy_name; > OPEN result NO SCROLL FOR EXECUTE query; > RETURN result; > END; > $$ LANGUAGE plpgsql STRICT; > > > Now, I still have some issues: as far as I can remember, in m$ access (yes, > I know...), a long time ago, I used to do PIVOT queries on EAV tables, where > I could chose which operation was to be made on the variable: simply the > value (without GROUPing), or a SUM, AVG, etc. I don't have any running > acce$$ handy, so I can't check this, I'm afraid. > In the case of your function, if I understand well, the line with the GROUP > BY does the trick. I will try to play with it. Later on. > > > Something else: I am quite familiar with strict SQL, I use postgreSQL a lot, > but I am not familiar with functions and, also, cursors. So I am a bit > surprised by the behaviour of the cursor: I am reading doc... > But what I would like to do is to redirect the output of the function (that > is, the 'result' cursor) to a view, which will be used in other places. I > thought something like FETCH INTO would do the trick, but it doesn't. > > > Also, I need, at some point, to export the output to some CSV file. I > usually do a quick bash script as follows: > > echo "COPY (SELECT * FROM dh_litho ORDER BY id, depto) TO stdout WITH CSV > HEADER;" | psql bdexplo > somefile.csv > > And then I can feed somefile.csv to whatever program I want. I tried to do > this with the cursor and the FETCH ALL, but it didn't work out well, as I > had guessed... > hmm ...it cannot work :(. You cannot forward FETCH ALL statement on server side - without programming in C in this case you need small application for reading cursor and transformation to CVS Pavel > pierre@duran:~$ pierre@duran:~/fix_bd_amc$ echo "COPY ( >> > SELECT do_cross_cursor('lab_pjcsa_analytecode', 'FROM >> > lab_ana_results','sample_id', >> > 'FROM lab_ana_results_sel ', >> > 'value_num'); >> > FETCH ALL FROM result WITH CSV HEADER; >> > ) TO stdout WITH CSV HEADER ;" | psql bdexplo > bash: pierre@duran:~/fix_bd_amc$: Aucun fichier ou dossier de ce type > pierre@duran:~$ ERREUR: erreur de syntaxe sur ou près de « ; » > bash: ERREUR: : commande introuvable > bash: » : commande introuvable > pierre@duran:~$ LIGNE 4 : 'value_num'); > bash: Erreur de syntaxe près du symbole inattendu « ) » > pierre@duran:~$ ^ > bash: ^ : commande introuvable > > (sorry about the French!) > > > I could not do this trick: any idea of how I could do this? I guess I should > wrap the whole transaction into a one-line statement to be fed to to psql, > but I can't figure out how to do it... Some help? > > A+ > Pierre > > PS: I am used to "normal" mailing lists, but I got quite confused by the > approach from grokbase: I thought I was posting on the grokbase list > (http://grokbase.com/), and I see that the list pgsql-general@postgresql.org > was the one I was posting to... > Sorry for the noise, I am RTFMing at the moment... > > -- > ____________________________________________________________________________ > Pierre Chevalier > Mesté Duran > 32100 Condom > Tél+fax : 09 75 27 45 62 > 05 62 28 06 83 > 06 37 80 33 64 > Émail : pierre.chevalier1967CHEZfree.fr > icq# : 10432285 > http://pierremariechevalier.free.fr/ > Logiciels Libres dans le Gers: http://gnusquetaires.org/ > ____________________________________________________________________________ > > > >
2010/1/27 Pierre Chevalier <pierre.chevalier1967@free.fr>: > Pavel Stehule claviota: >>> >>> ... >>> But what I would like to do is to redirect the output of the function >>> (that >>> is, the 'result' cursor) to a view, which will be used in other places. I >>> thought something like FETCH INTO would do the trick, but it doesn't. >>> >>> >>> Also, I need, at some point, to export the output to some CSV file. I >>> usually do a quick bash script as follows: >>> >>> echo "COPY (SELECT * FROM dh_litho ORDER BY id, depto) TO stdout WITH CSV >>> HEADER;" | psql bdexplo > somefile.csv >>> >>> ... >>> >> >> hmm ...it cannot work :(. You cannot forward FETCH ALL statement on >> server side - without programming in C >> > > Ach! Too bad... Oh but... I used to program in C, long time ago, on HP-UX... > >> in this case you need small application for reading cursor and >> transformation to CVS >> > > Actually, if the small application was reading cursor, and transforming it > to a VIEW, this would solve both problems at once: > something like: > > CREATE VIEW crosstabbed_thing AS > (cursor_to_dataset(SELECT do_cross_cursor(...))); no it isn't possible. VIEW have to have fixed numbers of columns. You can write function that reads a cursor, create temp table, store result and will do a copy from temp table. There is one significant rule - any SELECT based statement have to have known number of columns in planner time - so number of colums must not depend on the data. There are no any workaround for it. You can do only don't use fixed SELECT statemens (VIEWS too - it is stored SELECT). look on SPI interface http://www.postgresql.org/docs/8.4/interactive/spi.html http://www.postgresql.org/docs/8.4/interactive/spi-examples.html but you have to use cursor based interface. Pavel > > And then: > echo "COPY (SELECT * FROM crosstabbed_thing) TO stdout WITH CSV HEADER;" | > psql > youpi.csv > > And there we are! > What about this plan? The cursor_to_dataset() should be written, in C if I > understand well. > I have to dig out my old C book, and browse through postgresql APIs, code > examples,etc. I guess... > > A+ > Pierre > > -- > ____________________________________________________________________________ > Pierre Chevalier > Mesté Duran > 32100 Condom > Tél+fax : 09 75 27 45 62 > 05 62 28 06 83 > 06 37 80 33 64 > Émail : pierre.chevalier1967CHEZfree.fr > icq# : 10432285 > http://pierremariechevalier.free.fr/ > Logiciels Libres dans le Gers: http://gnusquetaires.org/ > ____________________________________________________________________________ > > > >
2010/1/27 Pierre Chevalier <pierre.chevalier1967@free.fr>: > Pavel Stehule claviota: >>> >>> ... >>> Actually, if the small application was reading cursor, and transforming >>> it >>> to a VIEW, this would solve both problems at once: >>> something like: >>> >>> CREATE VIEW crosstabbed_thing AS >>> (cursor_to_dataset(SELECT do_cross_cursor(...))); >>> >> >> no it isn't possible. VIEW have to have fixed numbers of columns. >> > > Ach, flute... ;-( > >> You can write function that reads a cursor, create temp table, store >> result and will do a copy from temp table. >> > > Well... Not extremely elegant (it reminds me when I was stuck with access > and I could not do nested queries...), but why not? > Actually, if the table is a real temporary one (CREATE TEMPORARY TABLE), it > should not induce too much mess in the database layout. > >> There is one significant rule - any SELECT based statement have to >> have known number of columns in planner time - so number of colums >> must not depend on the data. There are no any workaround for it. You >> can do only don't use fixed SELECT statemens (VIEWS too - it is stored >> SELECT). >> > > All right, it makes sense now... > Nut... Idea! (careful...) what about if we do, just like in a VIEW, a CREATE > OR REPLACE, systematically when we do this kind of function? The only > drawback I can think of is that we can't have anything dependant on the VIEW > we generate. no, you cannot do it. You cannot create view and same statements ask on this view on top level. if you would to understand it - you have to understand to process pipeline: parser, planner, optimizer, executor. If you understand to this stages, then you will understand what is possible and what not. > > Another idea (more danger...): what about setting a sort of flag which says > that this VIEW should *not* be included in the planner? And it will have > unexpected number of columns? Would this be *absolutely* impossible to > state? :) sorry - you can do it, but not in pg - or you have to rewrite 50% of low level code > >> look on SPI interface >> http://www.postgresql.org/docs/8.4/interactive/spi.html >> http://www.postgresql.org/docs/8.4/interactive/spi-examples.html >> but you have to use cursor based interface. >> > > I'll try to Read The French Manual, rather than the one in English! I'll > look for it... > > > But the whole point is: this need of a generic cross-tab is really annoying > for a large number of people, it has been there for a long time, and I know > some people who just walk away from postgreSQL only because this feature is > lacking, and they return happily to their m$-access, therefore ignoring the > pure wealth of postgreSQL: sad, isn't it?... ms access hasn't klient-server architecture. Generating of crosstab is client side task - more - iOLTP databases are not the best tool for it. Better are OLAP databases with GUI clients - like Excel or MS Access. Regards Pavel Stehule > > A+ > Pierre > > -- > ____________________________________________________________________________ > Pierre Chevalier > Mesté Duran > 32100 Condom > Tél+fax : 09 75 27 45 62 > 05 62 28 06 83 > 06 37 80 33 64 > Émail : pierre.chevalier1967CHEZfree.fr > icq# : 10432285 > http://pierremariechevalier.free.fr/ > Logiciels Libres dans le Gers: http://gnusquetaires.org/ > ____________________________________________________________________________ > > > >
Pavel Stehule claviota: >> ... >> Actually, if the small application was reading cursor, and transforming it >> to a VIEW, this would solve both problems at once: >> something like: >> >> CREATE VIEW crosstabbed_thing AS >> (cursor_to_dataset(SELECT do_cross_cursor(...))); >> > no it isn't possible. VIEW have to have fixed numbers of columns. > Ach, flute... ;-( > You can write function that reads a cursor, create temp table, store > result and will do a copy from temp table. > Well... Not extremely elegant (it reminds me when I was stuck with access and I could not do nested queries...), but why not? Actually, if the table is a real temporary one (CREATE TEMPORARY TABLE), it should not induce too much mess in the database layout. > There is one significant rule - any SELECT based statement have to > have known number of columns in planner time - so number of colums > must not depend on the data. There are no any workaround for it. You > can do only don't use fixed SELECT statemens (VIEWS too - it is stored > SELECT). > All right, it makes sense now... Nut... Idea! (careful...) what about if we do, just like in a VIEW, a CREATE OR REPLACE, systematically when we do this kind of function? The only drawback I can think of is that we can't have anything dependant on the VIEW we generate. Another idea (more danger...): what about setting a sort of flag which says that this VIEW should *not* be included in the planner? And it will have unexpected number of columns? Would this be *absolutely* impossible to state? > look on SPI interface http://www.postgresql.org/docs/8.4/interactive/spi.html > http://www.postgresql.org/docs/8.4/interactive/spi-examples.html > but you have to use cursor based interface. > I'll try to Read The French Manual, rather than the one in English! I'll look for it... But the whole point is: this need of a generic cross-tab is really annoying for a large number of people, it has been there for a long time, and I know some people who just walk away from postgreSQL only because this feature is lacking, and they return happily to their m$-access, therefore ignoring the pure wealth of postgreSQL: sad, isn't it?... A+ Pierre -- ____________________________________________________________________________ Pierre Chevalier Mesté Duran 32100 Condom Tél+fax : 09 75 27 45 62 05 62 28 06 83 06 37 80 33 64 Émail : pierre.chevalier1967CHEZfree.fr icq# : 10432285 http://pierremariechevalier.free.fr/ Logiciels Libres dans le Gers: http://gnusquetaires.org/ ____________________________________________________________________________
Pavel Stehule claviota: >> ... >> But what I would like to do is to redirect the output of the function (that >> is, the 'result' cursor) to a view, which will be used in other places. I >> thought something like FETCH INTO would do the trick, but it doesn't. >> >> >> Also, I need, at some point, to export the output to some CSV file. I >> usually do a quick bash script as follows: >> >> echo "COPY (SELECT * FROM dh_litho ORDER BY id, depto) TO stdout WITH CSV >> HEADER;" | psql bdexplo > somefile.csv >> >> ... >> > > hmm ...it cannot work :(. You cannot forward FETCH ALL statement on > server side - without programming in C > Ach! Too bad... Oh but... I used to program in C, long time ago, on HP-UX... > in this case you need small application for reading cursor and > transformation to CVS > Actually, if the small application was reading cursor, and transforming it to a VIEW, this would solve both problems at once: something like: CREATE VIEW crosstabbed_thing AS (cursor_to_dataset(SELECT do_cross_cursor(...))); And then: echo "COPY (SELECT * FROM crosstabbed_thing) TO stdout WITH CSV HEADER;" | psql > youpi.csv And there we are! What about this plan? The cursor_to_dataset() should be written, in C if I understand well. I have to dig out my old C book, and browse through postgresql APIs, code examples,etc. I guess... A+ Pierre -- ____________________________________________________________________________ Pierre Chevalier Mesté Duran 32100 Condom Tél+fax : 09 75 27 45 62 05 62 28 06 83 06 37 80 33 64 Émail : pierre.chevalier1967CHEZfree.fr icq# : 10432285 http://pierremariechevalier.free.fr/ Logiciels Libres dans le Gers: http://gnusquetaires.org/ ____________________________________________________________________________
On Wed, Jan 27, 2010 at 2:14 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > hmm ...it cannot work :(. You cannot forward FETCH ALL statement on > server side - without programming in C > > in this case you need small application for reading cursor and > transformation to CVS If I'm understanding what you're doing could you write a function to return a set of record then run the cursor inside the function?
On 1/27/2010 3:49 AM, Pierre Chevalier wrote: > Pavel Stehule claviota: >>> ... >>> But what I would like to do is to redirect the output of the function >>> (that >>> is, the 'result' cursor) to a view, which will be used in other >>> places. I >>> thought something like FETCH INTO would do the trick, but it doesn't. >>> >>> >>> Also, I need, at some point, to export the output to some CSV file. I >>> usually do a quick bash script as follows: >>> >>> echo "COPY (SELECT * FROM dh_litho ORDER BY id, depto) TO stdout WITH >>> CSV >>> HEADER;" | psql bdexplo > somefile.csv >>> >>> ... >> >> hmm ...it cannot work :(. You cannot forward FETCH ALL statement on >> server side - without programming in C > > Ach! Too bad... Oh but... I used to program in C, long time ago, on > HP-UX... > How do you feel about a little perl? It would be pretty simple, and could generate a csv based on any resultset (any number of columns). I'd be happy to post a little get you started code if you wanted. -Andy
On 1/28/2010 9:11 AM, Andy Colson wrote: > On 1/27/2010 3:49 AM, Pierre Chevalier wrote: >> Pavel Stehule claviota: >>>> ... >>>> But what I would like to do is to redirect the output of the function >>>> (that >>>> is, the 'result' cursor) to a view, which will be used in other >>>> places. I >>>> thought something like FETCH INTO would do the trick, but it doesn't. >>>> >>>> >>>> Also, I need, at some point, to export the output to some CSV file. I >>>> usually do a quick bash script as follows: >>>> >>>> echo "COPY (SELECT * FROM dh_litho ORDER BY id, depto) TO stdout WITH >>>> CSV >>>> HEADER;" | psql bdexplo > somefile.csv >>>> >>>> ... >>> >>> hmm ...it cannot work :(. You cannot forward FETCH ALL statement on >>> server side - without programming in C >> >> Ach! Too bad... Oh but... I used to program in C, long time ago, on >> HP-UX... >> > > > How do you feel about a little perl? It would be pretty simple, and > could generate a csv based on any resultset (any number of columns). I'd > be happy to post a little get you started code if you wanted. > > -Andy > Humm... a comma, or quotes, would have made that make sense: > be happy to post a little "get you started" code if you wanted. here's some code, its based on Pavel's example, and dumps csv to stdout: #!/usr/bin/perl use strict; use warnings; use DBI; my $db = DBI->connect("dbi:Pg:dbname=andy", 'andy', '', {AutoCommit => 0, RaiseError => 1}); $db->do(<<EOS); SELECT do_cross_cursor('shop', 'FROM shops','gender','FROM employees e JOIN shops s ON s.id = e.shop_id', 'salary') EOS my $get = $db->prepare('FETCH ALL FROM result'); $get->execute; my $names = $get->{'NAME'}; print join(',', @$names), "\n"; while ( my @list = $get->fetchrow_array) { print join(',', @list), "\n"; } $get = undef; $db->do('commit'); $db->disconnect;
On 01/28/2010 08:57 AM, Andy Colson wrote: >> >> How do you feel about a little perl? It would be pretty simple, and >> could generate a csv based on any resultset (any number of columns). I'd >> be happy to post a little get you started code if you wanted. If you're going to go through all that, I don't understand why you wouldn't just use crosstab from contrib/tablefunc, and wrap it with application code that dynamically executes the query with the needed column definitions. It is a simple two step process: Using crosstab(text source_sql, text category_sql), - first execute category_sql to get a list of columns - build the complete crosstab SQL including the columns - execute the crosstab SQL The fact is that, as has been stated, the parser/planner requires the column type information because the result is potentially filtered or joined with other relations. There is no way around this, at least not currently, and probably not ever in this form. If PostgreSQL ever supports true procedures (i.e. CALL sp_crosstab(...)), then it would be possible to forego the column definitions as joining and filtering are not possible. Joe
Attachment
Pavel Stehule claviota: >> Nut... Idea! (careful...) what about if we do, just like in a VIEW, a CREATE >> OR REPLACE, systematically when we do this kind of function? The only >> drawback I can think of is that we can't have anything dependant on the VIEW >> we generate. >> > no, you cannot do it. You cannot create view and same statements ask > on this view on top level. > > if you would to understand it - you have to understand to process > pipeline: parser, planner, optimizer, executor. If you understand to > this stages, then you will understand what is possible and what not. > Ok, ok. I just imagine, for now... I guess I have to swallow a big bunch of doc and code before I can really understand this, so I take your word. >> Another idea (more danger...): what about setting a sort of flag which says >> that this VIEW should *not* be included in the planner? And it will have >> unexpected number of columns? Would this be *absolutely* impossible to >> state? >> > :) sorry - you can do it, but not in pg - or you have to rewrite 50% > of low level code > Oh well, not tonight... ;) >>> look on SPI interface >>> http://www.postgresql.org/docs/8.4/interactive/spi.html >>> http://www.postgresql.org/docs/8.4/interactive/spi-examples.html >>> but you have to use cursor based interface. >>> >> I'll try to Read The French Manual, rather than the one in English! I'll >> look for it... >> >> But the whole point is: this need of a generic cross-tab is really annoying >> for a large number of people, it has been there for a long time, and I know >> some people who just walk away from postgreSQL only because this feature is >> lacking, and they return happily to their m$-access, therefore ignoring the >> pure wealth of postgreSQL: sad, isn't it?... >> > ms access hasn't klient-server architecture. Generating of crosstab is > client side task - more - iOLTP databases are not the best tool for > it. Better are OLAP databases with GUI clients - like Excel or MS > Access. > OK, I got your point. Thanks for explaining patiently! In fact, what I wanted to do within postgresql, I'd better try to do it somewhere else, with a client more in an OLAP style. I got so used to do *everything* inside postgresql (well, through psql), complex queries, nested things, with visual graphs generated by queries... that I thought it was just endless! Oh well, this is the limit. Now, talking about GUI clients: I've been looking for a long time for a decent tool that could replace an access, to interact with my postgresql database. So far, the best I found is knoda, for my needs. I battled a bit with oobase, not very convincing, or I missed something. Apparently, as I can google, quite a few projects have been abandoned. Does someone knwo if there something new, on this side? I'm also quite "nostalgique" (English?) of dBase IV, and its screens, forms and tables with the F4 key... I was wandering whether a project of an ncurses-based front-end for postgreql (actually, it could be for any FLOSS database, I guess) could be developed: again, does someone know if such a project exists? -- Pierre Chevalier Mesté Duran 32100 Condom Tél : 09 75 27 45 62 - 06 37 80 33 64 http://pierremariechevalier.free.fr/ Logiciels Libres dans le Gers: http://gnusquetaires.org/
Andy Colson claviota: > ... > > be happy to post a little "get you started" code if you wanted. > > here's some code, its based on Pavel's example, and dumps csv to stdout: Hmm, pretty cryptic to my eyes... Thanks for not writing everything on one line! > #!/usr/bin/perl > use strict; > use warnings; > use DBI; > > > my $db = DBI->connect("dbi:Pg:dbname=andy", 'andy', '', {AutoCommit => > 0, RaiseError => 1}); > > > $db->do(<<EOS); > SELECT do_cross_cursor('shop', 'FROM shops','gender','FROM employees e > JOIN shops s ON s.id = e.shop_id', > 'salary') > EOS > > > my $get = $db->prepare('FETCH ALL FROM result'); > $get->execute; > > my $names = $get->{'NAME'}; > > print join(',', @$names), "\n"; > > while ( my @list = $get->fetchrow_array) > { > print join(',', @list), "\n"; > } > $get = undef; > $db->do('commit'); > $db->disconnect; OK, I think I got the point: instead of working from psql, you just call the function from outside, and you walk through the resulting dataset, adding commas and returns when needed. I 've just tried it with my data: it works! ;) It throws some insulting messages, though: Use of uninitialized value $list[5] in join or string at ./crosstab_perl.pl line 24. Use of uninitialized value $list[6] in join or string at ./crosstab_perl.pl line 24. Use of uninitialized value $list[7] in join or string at ./crosstab_perl.pl line 24. ... But the .csv file is there, after a redirection, and it seems fine! I'm just worried about the messages: anything serious, or can I just ignore them? I'll do a diff with the csv I generated before (with psql, \a, and some sed...) Thanks a lot! A+ Pierre -- Pierre Chevalier Mesté Duran 32100 Condom Tél : 09 75 27 45 62 - 06 37 80 33 64 http://pierremariechevalier.free.fr/ Logiciels Libres dans le Gers: http://gnusquetaires.org/
Andy Colson claviota: >> Ach! Too bad... Oh but... I used to program in C, long time ago, on >> HP-UX... > How do you feel about a little perl? Hm, I am not too familiar with perl. That's the least I can say. But, after all, why not? > It would be pretty simple, and could generate a csv Actually, I would have liked to have something that I can reuse within postgresql, just like a view. Sorry, I'm an idealist... > based on any resultset (any number of columns). I'd be happy to post a > little get you started code if you wanted. Sure, why not? Thanks! -- Pierre Chevalier Mesté Duran 32100 Condom Tél : 09 75 27 45 62 - 06 37 80 33 64 http://pierremariechevalier.free.fr/ Logiciels Libres dans le Gers: http://gnusquetaires.org/
2010/1/29 Pierre Chevalier <pierre.chevalier1967@free.fr>: > Pavel Stehule claviota: >>> >>> Nut... Idea! (careful...) what about if we do, just like in a VIEW, a >>> CREATE >>> OR REPLACE, systematically when we do this kind of function? The only >>> drawback I can think of is that we can't have anything dependant on the >>> VIEW >>> we generate. >>> >> >> no, you cannot do it. You cannot create view and same statements ask >> on this view on top level. >> >> if you would to understand it - you have to understand to process >> pipeline: parser, planner, optimizer, executor. If you understand to >> this stages, then you will understand what is possible and what not. >> > > Ok, ok. I just imagine, for now... I guess I have to swallow a big bunch of > doc and code before I can really understand this, so I take your word. > > >>> Another idea (more danger...): what about setting a sort of flag which >>> says >>> that this VIEW should *not* be included in the planner? And it will have >>> unexpected number of columns? Would this be *absolutely* impossible to >>> state? >>> >> >> :) sorry - you can do it, but not in pg - or you have to rewrite 50% >> of low level code >> > > Oh well, not tonight... ;) > > >>>> look on SPI interface >>>> http://www.postgresql.org/docs/8.4/interactive/spi.html >>>> http://www.postgresql.org/docs/8.4/interactive/spi-examples.html >>>> but you have to use cursor based interface. >>>> >>> >>> I'll try to Read The French Manual, rather than the one in English! I'll >>> look for it... >>> >>> But the whole point is: this need of a generic cross-tab is really >>> annoying >>> for a large number of people, it has been there for a long time, and I >>> know >>> some people who just walk away from postgreSQL only because this feature >>> is >>> lacking, and they return happily to their m$-access, therefore ignoring >>> the >>> pure wealth of postgreSQL: sad, isn't it?... >>> >> >> ms access hasn't klient-server architecture. Generating of crosstab is >> client side task - more - iOLTP databases are not the best tool for >> it. Better are OLAP databases with GUI clients - like Excel or MS >> Access. >> > > OK, I got your point. Thanks for explaining patiently! > In fact, what I wanted to do within postgresql, I'd better try to do it > somewhere else, with a client more in an OLAP style. > > I got so used to do *everything* inside postgresql (well, through psql), > complex queries, nested things, with visual graphs generated by queries... > that I thought it was just endless! Oh well, this is the limit. > I like this strategy too. But I think - and it is important limit. PostgreSQL is OLTP database. It isn't OLAP database or ROLAP database. If you like do some interactive analyses (on larger dataset) - you need minimally different server (because slow query can shots performance), you need some specialised sw - there are lot of sw better than postgres for this - Mondrian, Pentaho, olap4j. > > Now, talking about GUI clients: I've been looking for a long time for a > decent tool that could replace an access, to interact with my postgresql > database. So far, the best I found is knoda, for my needs. I battled a bit > with oobase, not very convincing, or I missed something. Apparently, as I > can google, quite a few projects have been abandoned. Does someone knwo if > there something new, on this side? > > I'm also quite "nostalgique" (English?) of dBase IV, and its screens, forms > and tables with the F4 key... I was wandering whether a project of an > ncurses-based front-end for postgreql (actually, it could be for any FLOSS > database, I guess) could be developed: again, does someone know if such a > project exists? sorry. I don't know about any Regards Pavel Stehule > > -- > Pierre Chevalier Mesté Duran 32100 Condom > Tél : 09 75 27 45 62 - 06 37 80 33 64 > http://pierremariechevalier.free.fr/ > Logiciels Libres dans le Gers: http://gnusquetaires.org/ > > > >
On 1/28/2010 5:51 PM, Pierre Chevalier wrote: >> >> while ( my @list = $get->fetchrow_array) >> { >> print join(',', @list), "\n"; >> } > It throws some insulting messages, though: > > Use of uninitialized value $list[5] in join or string at > ./crosstab_perl.pl line 24. > Use of uninitialized value $list[6] in join or string at > ./crosstab_perl.pl line 24. > Use of uninitialized value $list[7] in join or string at > ./crosstab_perl.pl line 24. Yeah, you can ignore them. Fields that are null in the database will be converted to undef in perl, which when printed spits out that warning. Right before the print, we could test for undef and set them to empty string like: map { if (!defined($_)) {$_ = '';}} @list; so the while loop would look like: while ( my @list = $get->fetchrow_array) { map { if (!defined($_)) {$_ = '';}} @list; print join(',', @list), "\n"; } -Andy