Re: Partition table in 9.0.x? - Mailing list pgsql-performance
From | AJ Weber |
---|---|
Subject | Re: Partition table in 9.0.x? |
Date | |
Msg-id | 50EC5FC7.7000902@comcast.net Whole thread Raw |
In response to | Re: Partition table in 9.0.x? (Jeff Janes <jeff.janes@gmail.com>) |
Responses |
Re: Partition table in 9.0.x?
|
List | pgsql-performance |
> It does if you use it without an argument, to display all the tables > in the search path: > > jjanes=# \d+ > List of relations > Schema | Name | Type | Owner | Size | Description > --------+------------------+-------+--------+---------+------------- > public | pgbench_accounts | table | jjanes | 128 MB | > public | pgbench_branches | table | jjanes | 40 kB | > public | pgbench_history | table | jjanes | 0 bytes | > public | pgbench_tellers | table | jjanes | 40 kB | > (4 rows) > > It rather annoys me that you actually get less information (no size, > no owner) when you use \d+ on a named table. I don't know if there is > a reason for that feature, or if it was just an oversight. That is rather peculiar. Sorry for that. Table in question is 9284MB (Parent table is 621MB) > > The current constraint exclusion code is quite simple-minded and > doesn't know how to make use of check constraints that use the mod > function, so the indexes of all partitions would have to be searched > for each order_num-driven query, even though we know the data could > only exist in one of them. The constraint exclusion codes does > understand check constraints that involve ranges. Hmm. That's a bit of a limitation I didn't know about. I assume it doesn't understand the percent (mod operator) just the same as not understanding the MOD() function? Either way, I guess this strategy does not pan-out. > There could still be some benefit as the table data would be > concentrated, even if the index data is not. I'm reaching way, way back in my head, but I think _some_ RDBMS I worked with previously had a way to "cluster" the rows around a single one of the indexes on the table, thus putting the index and the row-data "together" and reducing the number of IO's to retrieve the row if that index was used. Am I understanding that PG's "cluster" is strictly to group like rows together logically -- table data only, not to coordinate the table row with the index upon which you clustered them? > >> and all rows for the same order would stay >> within the same partition-table. > But usually a given order_num would only be of interest for a fraction > of a second before moving on to some other order_num of interest, so > by the time the relevant partition become fully cached, it would no > longer be hot. Or, if the partitions were small enough, you could > assume that all rows would be dragged into memory when the first one > was requested because they lay so close to each other. But it is not > feasible to have a large enough number of partitions to make that > happen. But if the table is clustered, this is exactly what you would > get--the trouble would be keeping it clustered. If most of the > line-items are inserted at the same time as each other, they probably > should be fairly well clustered to start with. Does decreasing the fill to like 90 help keep it clustered in-between times that I could shutdown the app and perform a (re-) cluster on the overall table? Problem is, with a table that size, and the hardware I'm "blessed with", the cluster takes quite a bit of time. :(
pgsql-performance by date: