Thread: Column oriented pgsql
Is it possible to (easily) tweak PostgreSQL so that the storage is column-oriented rather than row-oriented? We would like to improve read performance on our data, which is about 2 TB.
Mag Gam wrote:
> Is it possible to tweak (easily) Postgresql so the storage is column
> oriented versus row-oriented? We would like to increase read
> optimization on our data which is about 2TB.

you read your tables by column, rather than by row??

SQL queries are inherently row oriented, the fundamental unit of storage is a 'tuple', which is a representation of a row of a table.
On Fri, May 08, 2009 at 11:25:30AM -0700, John R Pierce wrote:
> Mag Gam wrote:
>> Is it possible to tweak (easily) Postgresql so the storage is column
>> oriented versus row-oriented? We would like to increase read
>> optimization on our data which is about 2TB.
>
> you read your tables by column, rather than by row??
>
> SQL queries are inherently row oriented, the fundamental unit of storage
> is a 'tuple', which is a representation of a row of a table.

http://en.wikipedia.org/wiki/Column_oriented_database

This has come up on the lists from time to time; the short answer is it's really hard.

- Josh / eggyknap
On May 8, 2009, at 11:25 AM, John R Pierce wrote:
> you read your tables by column, rather than by row??
> SQL queries are inherently row oriented, the fundamental unit of
> storage is a 'tuple', which is a representation of a row of a table.

I believe what is being referred to is the on-disk storage organization: clustering a single column from multiple rows together onto a page. For example, if your typical use of a table is to read one particular column from a large number of rows, this could (in theory) improve performance.

AFAIK, PostgreSQL doesn't support this.
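To make the distinction above concrete, here is a minimal sketch in Python. It is only a toy: plain lists stand in for on-disk pages, and the table contents and column names (id, name, amount) are made up for illustration. It says nothing about how PostgreSQL or any real column store lays out its pages.

```python
# Toy illustration only: Python lists stand in for on-disk pages.
# The rows and the column names (id, name, amount) are hypothetical.
rows = [
    (1, "alice", 30.5),
    (2, "bob", 12.0),
    (3, "carol", 7.25),
]

# Row-oriented layout: each stored unit holds every column of one row.
row_store = list(rows)

# Column-oriented layout: each stored unit holds one column of many rows.
column_names = ("id", "name", "amount")
column_store = {
    name: [row[i] for row in rows] for i, name in enumerate(column_names)
}


def sum_amount_row_store(store):
    # Has to walk over every complete row, although only one field is used.
    return sum(row[2] for row in store)


def sum_amount_column_store(store):
    # Touches only the values of the "amount" column.
    return sum(store["amount"])


total_from_rows = sum_amount_row_store(row_store)
total_from_columns = sum_amount_column_store(column_store)
```

Both functions return the same total; the difference is how much unrelated data has to be walked over to get there, which is where the theoretical read benefit for single-column scans comes from.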
Got it thanks!

On Fri, May 8, 2009 at 2:57 PM, Christophe <xof@thebuild.com> wrote:
>
> On May 8, 2009, at 11:25 AM, John R Pierce wrote:
>>
>> you read your tables by column, rather than by row??
>> SQL queries are inherently row oriented, the fundamental unit of storage
>> is a 'tuple', which is a representation of a row of a table.
>
> I believe what is referring to is the disk storage organization, clustering
> a single column from multiple rows together onto a page. For example, if
> your typical use of a table is to read one particular column from a large
> number of rows, this could (in theory) improve performance.
>
> AFAIK, PostgreSQL doesn't support this.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Joshua Tolley wrote:
> http://en.wikipedia.org/wiki/Column_oriented_database
> This has come up on the lists from time to time; the short answer is it's
> really hard.

Indeed. Among other issues: just what order should those columns be stored in? Database tables have no implicit order; they are abstractly unordered sets of rows. An index can impose an order, but a given table can have multiple indexes, and a given query can sort on almost any arbitrary thing it wants.

And, say you are storing the columns sorted by the primary key, how do you do inserts or updates that change this order?

Instead of one table with (key, v1, v2, v3), how about N tables: (k,v1), (k,v2), (k,v3)? Or at least one extra table with just the value that you want columnar access to?
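The "one extra table" idea above can be sketched as follows. This is an illustration under loud assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database purely so that it runs anywhere, not PostgreSQL, and the table names (wide, wide_v1) and sample values are hypothetical.

```python
import sqlite3

# SQLite (in memory) is a stand-in so the sketch is runnable; it is not PostgreSQL.
# Table names (wide, wide_v1) and the sample data are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Original layout: one table with (key, v1, v2, v3).
cur.execute(
    "CREATE TABLE wide (k INTEGER PRIMARY KEY, v1 REAL, v2 TEXT, v3 TEXT)"
)
cur.executemany(
    "INSERT INTO wide VALUES (?, ?, ?, ?)",
    [(1, 10.0, "a", "x"), (2, 20.0, "b", "y"), (3, 30.0, "c", "z")],
)

# Extra narrow table: only the key and the one value wanted for column-style scans.
cur.execute("CREATE TABLE wide_v1 (k INTEGER PRIMARY KEY, v1 REAL)")
cur.execute("INSERT INTO wide_v1 SELECT k, v1 FROM wide")

# A scan over the narrow table reads only k and v1.
total_v1 = cur.execute("SELECT SUM(v1) FROM wide_v1").fetchone()[0]

# A full row can still be reassembled by joining on the key.
full_row = cur.execute(
    "SELECT w.k, n.v1, w.v2, w.v3 "
    "FROM wide w JOIN wide_v1 n ON n.k = w.k WHERE w.k = 2"
).fetchone()
conn.commit()
```

The trade-off is that the narrow table duplicates data: every insert or update that touches v1 has to be applied in both places, by the application or by triggers, and reassembling whole rows costs a join.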
If you are looking for a column-based DBMS, you might want to check out MonetDB, which is a columnar database.
http://monetdb.cwi.nl/
For some applications, columnar databases can be much faster than a traditional RDBMS. However, column-based databases are not a 'one size fits all' answer.
Brent Friedman
Mag Gam wrote:
> Got it thanks!
>
> On Fri, May 8, 2009 at 2:57 PM, Christophe <xof@thebuild.com> wrote:
>> On May 8, 2009, at 11:25 AM, John R Pierce wrote:
>>> you read your tables by column, rather than by row??
>>> SQL queries are inherently row oriented, the fundamental unit of storage
>>> is a 'tuple', which is a representation of a row of a table.
>>
>> I believe what is referring to is the disk storage organization, clustering
>> a single column from multiple rows together onto a page. For example, if
>> your typical use of a table is to read one particular column from a large
>> number of rows, this could (in theory) improve performance.
>>
>> AFAIK, PostgreSQL doesn't support this.