Thread: Table Column Retrieval

Table Column Retrieval

From: dmp
Recently I read that one of the distinctions between a standard database and
a columnar one, and a source of the latter's greater efficiency, is, and I
quote:

"Only relevant columns are retrieved (A row-wise database would pull
all columns and typically discard 80-95% of them)"

Is this true of PostgreSQL? That is, even though my query does not call for a
column, is it still pulled from the table row(s)? I know from packet monitoring
of my queries that the ResultSet my client receives via JDBC does not contain
the data for that column.

danap

Re: Table Column Retrieval

From: Joshua Tolley
On Mon, Feb 22, 2010 at 07:23:09PM -0700, dmp wrote:
>
> Recently I read that one of the distinctions between a standard database and
> a columnar one, and a source of the latter's greater efficiency, is, and I
> quote:
>
> "Only relevant columns are retrieved (A row-wise database would pull
> all columns and typically discard 80-95% of them)"
>
> Is this true of PostgreSQL? That is, even though my query does not call for a
> column, is it still pulled from the table row(s)? I know from packet monitoring
> of my queries that the ResultSet my client receives via JDBC does not contain
> the data for that column.

PostgreSQL doesn't use columnar storage. Data are read from the disk in pages,
and those pages contain not only the columns you're interested in but all the
other columns in the table as well. The parts of the table you're not
interested in aren't returned as part of the query, and thus don't show up in
your result set, but they do get read from disk.
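
To make the difference concrete, here is a toy sketch in Python. It is not how
PostgreSQL lays out tuples or pages; the fixed-width row format, the column
names (id, price, quantity, notes) and the helper functions are all assumptions
made up for illustration. It only shows that summing one column touches every
byte in a row-wise layout, but only that column's bytes in a columnar one.

```python
import struct

# Hypothetical fixed-width row layout: id (int32), price (float64),
# quantity (int32), notes (32 bytes). Real PostgreSQL tuples differ.
ROW = struct.Struct("<idi32s")
COLUMN_FORMATS = {"id": "<i", "price": "<d", "quantity": "<i", "notes": "<32s"}

def build_row_store(rows):
    """Row-wise layout: all columns of a row sit next to each other."""
    return b"".join(ROW.pack(*row) for row in rows)

def build_column_store(rows):
    """Columnar layout: each column is stored in its own contiguous buffer."""
    return {
        name: b"".join(struct.pack(fmt, row[i]) for row in rows)
        for i, (name, fmt) in enumerate(COLUMN_FORMATS.items())
    }

def sum_price_row_store(data):
    """Sum one column; every byte of every row still has to be read."""
    total = sum(price for _id, price, _qty, _notes in ROW.iter_unpack(data))
    return total, len(data)

def sum_price_column_store(store):
    """Sum one column; only that column's buffer is read."""
    buf = store["price"]
    total = sum(value for (value,) in struct.iter_unpack("<d", buf))
    return total, len(buf)

rows = [(i, float(i), i % 7, b"note") for i in range(1000)]
row_total, row_bytes = sum_price_row_store(build_row_store(rows))
col_total, col_bytes = sum_price_column_store(build_column_store(rows))
print(row_bytes, col_bytes)
```

In this made-up layout the price column is 8 of the 48 bytes in each row, so
the row-wise scan reads six times as many bytes to produce the same sum.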

The disadvantage of a columnar system is that when you read multiple columns,
you have to piece together the rows of the table using columns read from
various parts of the disk, which is effectively like doing a bunch of joins.
For some workloads columnar storage is a win, and for others row-based
storage is the best bet.
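
The row-reassembly cost can be sketched the same way. This is a simplified
illustration, not any particular engine's algorithm: the column_store dict and
the read_column and reconstruct_rows helpers are hypothetical, and each list
stands in for a column stored in its own place on disk. Rebuilding rows means
one separate read per requested column, then stitching the values together by
position, much like a join on row number.

```python
# Hypothetical columnar table: one list per column, aligned by position.
column_store = {
    "id": [1, 2, 3],
    "price": [9.99, 4.5, 12.0],
    "quantity": [3, 10, 1],
}

def read_column(store, name):
    """Stand-in for fetching one column from its own location on disk."""
    return store[name]

def reconstruct_rows(store, wanted):
    """Rebuild rows by reading each wanted column and zipping by position."""
    columns = [read_column(store, name) for name in wanted]
    return list(zip(*columns))

rows = reconstruct_rows(column_store, ["id", "price"])
print(rows)
```

The more columns a query asks for, the more separate reads and stitching are
needed, which is where a row-based layout tends to come out ahead.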

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

Re: Table Column Retrieval

From: Scott Marlowe
On Mon, Feb 22, 2010 at 8:59 PM, Joshua Tolley <eggyknap@gmail.com> wrote:
> On Mon, Feb 22, 2010 at 07:23:09PM -0700, dmp wrote:
>>
>> Recently I read that one of the distinctions between a standard database and
>> a columnar one, and a source of the latter's greater efficiency, is, and I
>> quote:
>>
>> "Only relevant columns are retrieved (A row-wise database would pull
>> all columns and typically discard 80-95% of them)"
>>
>> Is this true of PostgreSQL? That is, even though my query does not call for a
>> column, is it still pulled from the table row(s)? I know from packet monitoring
>> of my queries that the ResultSet my client receives via JDBC does not contain
>> the data for that column.
>
> PostgreSQL doesn't use columnar storage. Data are read from the disk in pages,
> and those pages contain not only the columns you're interested in but all the
> other columns in the table as well. The parts of the table you're not
> interested in aren't returned as part of the query, and thus don't show up in
> your result set, but they do get read from disk.

Note that TOASTed data that aren't needed are not retrieved, so if
you've got a lot of large text columns, that can help.

MySQL's InnoDB storage engine has an interesting compromise.  Secondary
indexes act much like columnar storage for the columns they cover, while
the non-indexed fields are stored together in the main table.  So if you're
just hitting indexed columns, it's effectively column-oriented access.