Re: Partition table in 9.0.x? - Mailing list pgsql-performance

From AJ Weber
Subject Re: Partition table in 9.0.x?
Date
Msg-id 50EC4D3D.9060209@comcast.net
Whole thread Raw
In response to Re: Partition table in 9.0.x?  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Partition table in 9.0.x?
List pgsql-performance
>
> It probably does, but from psql command line, you can do \d+ and \di+
\d+ doesn't appear to display any size information.

>
> If you have little control over your storage and are already IO bound,
> and the tables are growing rapidly, you may need to rethink that
> "deletes are rare" bit.  So the inserts and updates do target a hot
> part, while the selects are evenly spread?
>
> In that case, it is very important to know if the slow part are the
> selects, or the insert and deletes.  If the selects are slow, and the
> hot rows for selects can't be gathered together into a hot partition,
> then after clustering they will still be slow as the disk will still
> have to seek all over the place (massive data-mining type selects
> might be an exception to that, but I wouldn't count on it).
Since order_num is sequential, I could partition on it in broad
(sequential) ranges.  That would put all recent/new rows in one
table-partition that would be a fraction of the size of the overall
(unpartitioned) table.  I guess that would require manual maintenance
over-time (to switch to another, new partition as each grows).

>
> I think it is a matter of semantics. A small table is poor candidate
> for partitioning even if it has an excellent key to use for
> partitioning.  A large table could be a good candidate up until you
> realize it doesn't have a good key to use, at which point it stops
> being a good candidate (in my opinion).
>
My first idea to evenly-partition the table was to use the order_num and
do a "mod" on it with the number of tables I wanted to use.  That would
yield a partition-table number of 0-mod, and all rows for the same order
would stay within the same partition-table.  However, you're right in
thinking that a search for orders could -- really WOULD -- require
retrieving details from multiple partitions, probably increasing IO.  So
maybe the sequential partitioning (if at all) is better, just more
maintenance down-the-road.
>
> Was the order_num (from the parent table) the leading field of the 4
> column PK?  If not, you might want to reorder the PK so that it is the
> leading field and cluster again.  Or if reordering the PK columns is
> not convenient, make a new index on the order_num and cluster on that
> (perhaps dropping the index after the cluster, if it no longer serves
> a purpose)
>
Yes, the order_num is the first column in the PK, and our main browse
queries use, at a minimum, the first 2-3 columns in that PK in their
where-clause.

Many thanks again for all the input!
-AJ



pgsql-performance by date:

Previous
From: Patrick Dung
Date:
Subject: Re: Sub optimal performance with default setting of Postgresql with FreeBSD 9.1 on ZFS
Next
From: Jeff Janes
Date:
Subject: Re: Partition table in 9.0.x?