Re: query a table with lots of columns - Mailing list pgsql-performance

From: Josh Berkus
Subject: Re: query a table with lots of columns
Date:
Msg-id: 541CA2E1.4060604@agliodbs.com
In response to: query a table with lots of columns (Björn Wittich <Bjoern_Wittich@gmx.de>)
List: pgsql-performance
On 09/19/2014 04:51 AM, Björn Wittich wrote:
>
> I am relatively new to Postgres. I have a table with 500 columns and
> about 40 million rows. I call this a cache table: one column is a
> unique key (indexed), and the other 499 columns (type integer) hold
> values belonging to that key.
>
> Now I have a second (temporary) table (only 2 columns, one of them the
> key of my cache table), and I want to do an inner join between my
> temporary table and the large cache table and export all matching
> rows. I found out that performance increases when I split the join
> into lots of small parts.
> But it seems that the database needs a lot of disk IO to gather all
> 499 data columns.
> Is there a possibility to tell the database that all these columns
> are always treated as tuples and that I always want to get the whole
> row? Perhaps the disk organization could then be optimized?
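
For concreteness, the setup described above is roughly the following
(all table and column names here are hypothetical, sketched from the
description rather than taken from the actual schema):

    -- Wide "cache" table: one indexed key plus 499 integer values.
    CREATE TABLE cache_table (
        key  bigint PRIMARY KEY,
        v1   integer,
        v2   integer
        -- ... continuing through v499
    );

    -- Temporary table holding the keys to look up.
    CREATE TEMP TABLE lookup_keys (
        key   bigint,
        extra text
    );

    -- The inner join that exports all matching rows.
    SELECT c.*
    FROM lookup_keys l
    JOIN cache_table c USING (key);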

PostgreSQL is already a row store, which means by default you're getting
all of the columns, and the columns are stored physically adjacent to
each other.

If requesting only one or two columns is faster than requesting all of
them, that's almost certainly due to transmission time, not disk IO.
Otherwise, please post your schema (well, a truncated version) and
your queries.
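
As a side note, EXPLAIN (ANALYZE, BUFFERS) separates shared-buffer hits
from actual disk reads alongside the runtime, so it can show where the
time goes; the names below are the hypothetical ones from the sketch
above:

    -- Reports buffer hits vs. reads for each plan node.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT c.*
    FROM lookup_keys l
    JOIN cache_table c USING (key);

If "Buffers: shared read" is small but the query still feels slow from
the client, the time is going into transferring the wide rows, not into
fetching them from disk.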

BTW, in cases like yours I've used an INT array instead of 500 columns
to good effect; it works slightly better with PostgreSQL's compression.
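
A minimal sketch of that layout, assuming the 499 integers can live in
a single array column (names again hypothetical):

    -- Same cache, but the 499 values collapsed into one integer array.
    CREATE TABLE cache_table_arr (
        key   bigint PRIMARY KEY,
        vals  integer[]
    );

    SELECT c.key, c.vals
    FROM lookup_keys l
    JOIN cache_table_arr c USING (key);

The likely reason this helps: plain fixed-width integer columns are
never compressed, while a large array value is a single datum that
TOAST can compress as a whole.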

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

