Thread: Table Column Retrieval
Recently I read that one of the distinctions between a standard database and a columnar one, which led to an increase in the columnar one's efficiency, was, and I quote:

"Only relevant columns are retrieved (A row-wise database would pull all columns and typically discard 80-95% of them)"

Is this true of PostgreSQL? That is, even though my query does not call for a column, is it still pulled from the table row(s)? I know that the ResultSet my client receives via JDBC does not contain the data for that column, because of the packet monitoring I have done on queries.

danap
On Mon, Feb 22, 2010 at 07:23:09PM -0700, dmp wrote:
> Recently I read that one of the distinctions between a standard
> database and a columnar one, which led to an increase in its
> efficiency, was and I quote:
>
> "Only relevant columns are retrieved (A row-wise database would pull
> all columns and typically discard 80-95% of them)"
>
> Is this true of PostgreSQL? That even though my query does not call
> for a column it is still pulled from the table row(s). I know that my
> client via the JDBC does not contain the data in the ResultSet for the
> column, because of the packet monitoring I have done on queries.

PostgreSQL doesn't use columnar storage. Data are read from the disk in pages, and those pages contain not only the columns you're interested in but all the other columns in the table as well. The parts of the table you're not interested in aren't returned as part of the query, and thus don't show up in your result set, but they do get read from disk.

The disadvantage of a columnar system is that when you read multiple columns, you have to piece together the rows of the table using columns read from various parts of the disk, which is effectively identical to doing a bunch of joins. For some workloads columnar storage is a win, and for some workloads row-based storage is the best bet.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
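
To put rough numbers on the trade-off described above, here is a minimal sketch of a toy page-count model. It is not how PostgreSQL or any columnar engine actually lays out data: the column names and widths are made up for illustration, and page headers, tuple overhead, and compression are all ignored. The only real figure is the 8 kB page size, which is PostgreSQL's default block size.

```python
import math

PAGE_SIZE = 8192  # bytes; PostgreSQL's default block size


def pages_row_store(n_rows, col_widths):
    """Pages read by a full scan when rows are stored whole.

    Every page carries all columns, so the count does not depend on
    which columns the query actually asks for.
    """
    row_width = sum(col_widths.values())
    rows_per_page = PAGE_SIZE // row_width
    return math.ceil(n_rows / rows_per_page)


def pages_column_store(n_rows, col_widths, wanted):
    """Pages read when each column is stored in its own run of pages.

    Only the wanted columns are touched, but each one is a separate
    read stream that must be stitched back into rows.
    """
    total = 0
    for name in wanted:
        values_per_page = PAGE_SIZE // col_widths[name]
        total += math.ceil(n_rows / values_per_page)
    return total


# Hypothetical table: fixed-width columns, 256 bytes per row.
widths = {"id": 8, "price": 8, "name": 64, "notes": 176}
n = 1_000_000

print("row store, SELECT price:", pages_row_store(n, widths))
print("column store, SELECT price:", pages_column_store(n, widths, ["price"]))
print("column store, SELECT *:", pages_column_store(n, widths, list(widths)))
```

Under these assumptions the single-column query touches far fewer pages in the columnar layout, while reading every column costs about the same number of pages in both layouts, with the columnar side spread across four separate streams that have to be recombined.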
On Mon, Feb 22, 2010 at 8:59 PM, Joshua Tolley <eggyknap@gmail.com> wrote:
> On Mon, Feb 22, 2010 at 07:23:09PM -0700, dmp wrote:
>> Recently I read that one of the distinctions between a standard
>> database and a columnar one, which led to an increase in its
>> efficiency, was and I quote:
>>
>> "Only relevant columns are retrieved (A row-wise database would pull
>> all columns and typically discard 80-95% of them)"
>>
>> Is this true of PostgreSQL? That even though my query does not call
>> for a column it is still pulled from the table row(s). I know that my
>> client via the JDBC does not contain the data in the ResultSet for the
>> column, because of the packet monitoring I have done on queries.
>
> PostgreSQL doesn't use columnar storage. Data are read from the disk
> in pages, and those pages contain not only the columns you're
> interested in but all the other columns in the table as well. The
> parts of the table you're not interested in aren't returned as part of
> the query, and thus don't show up in your result set, but they do get
> read from disk.

Note that toasted data that aren't needed are not retrieved, so if you've got a lot of text columns that can help.

MySQL's InnoDB storage engine has an interesting compromise. Indexes ARE columnar storage; the non-indexed fields are stored together in the main table. So if you're just hitting indexed columns, it's column-oriented access.
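
As a rough illustration of why TOAST helps here, the following sketch estimates main-table pages for rows that carry one text column. It is a deliberately simplified model: it assumes a value is moved out of line whenever the whole row exceeds about 2 kB (the usual TOAST threshold) and is replaced by an 18-byte pointer, and it ignores compression, page headers, and tuple overhead. The function name and the row sizes are hypothetical.

```python
import math

PAGE_SIZE = 8192        # bytes; PostgreSQL's default block size
TOAST_THRESHOLD = 2000  # assumption: roughly the 2 kB row size that triggers TOAST
TOAST_POINTER = 18      # bytes; size of an on-disk TOAST pointer


def heap_pages(n_rows, fixed_bytes, text_bytes):
    """Estimate main-table pages for rows with one text column.

    Simplification: if the whole row is wider than the threshold, the
    text value is assumed to live out of line and only a pointer stays
    in the row.
    """
    row = fixed_bytes + text_bytes
    if row > TOAST_THRESHOLD:
        row = fixed_bytes + TOAST_POINTER
    return math.ceil(n_rows / (PAGE_SIZE // row))


# Short text stays inline, so a scan drags it along on every page.
print("1 kB text inline:", heap_pages(100_000, 32, 1_000))

# Long text is out of line, so a scan that skips it reads far fewer pages.
print("10 kB text toasted:", heap_pages(100_000, 32, 10_000))
```

In this model the table with the larger text values ends up with the smaller main heap, which is the effect described above: a query that never touches the toasted column does not have to read it.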