Thread: table partitioning and select max(id)
I implemented table partitioning, and it caused havoc with a "select max(id)" on the parent table - the query plan has changed from a lightningly fast backwards index scan to a deadly seq scan. Both partitions are set up with primary key index and draws new IDs from the same sequence ... "select max(id)" on both partitions are fast. Are there any tricks I can do to speed up this query? I can't add the ID to the table constraints, we may still get in "old" data causing rows with fresh IDs to get into the old table. (I decided to keep this short rather than include lots of details - but at least worth mentioning that we're using PG9)
This is a known limitation of partitioning. One solution is to use a recursive stored proc, which can use indexes. Such a solution is discussed here: http://archives.postgresql.org/pgsql-performance/2009-09/msg00036.php Regards, Ken http://archives.postgresql.org/pgsql-performance/2009-09/msg00036.php On Fri, Feb 4, 2011 at 6:24 PM, Tobias Brox <tobixen@gmail.com> wrote: > I implemented table partitioning, and it caused havoc with a "select > max(id)" on the parent table - the query plan has changed from a > lightningly fast backwards index scan to a deadly seq scan. Both > partitions are set up with primary key index and draws new IDs from > the same sequence ... "select max(id)" on both partitions are fast. > Are there any tricks I can do to speed up this query? I can't add the > ID to the table constraints, we may still get in "old" data causing > rows with fresh IDs to get into the old table. > > (I decided to keep this short rather than include lots of details - > but at least worth mentioning that we're using PG9) > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > -- -Ken
Tobias Brox wrote: > I implemented table partitioning, and it caused havoc with a "select > max(id)" on the parent table - the query plan has changed from a > lightningly fast backwards index scan to a deadly seq scan. This problem was fixed in the upcoming 9.1: http://archives.postgresql.org/pgsql-committers/2010-11/msg00028.php http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=034967bdcbb0c7be61d0500955226e1234ec5f04 Here's the comment from that describing the main technique used to fix it: "This module tries to replace MIN/MAX aggregate functions by subqueries of the form (SELECT col FROM tab WHERE ... ORDER BY col ASC/DESC LIMIT 1) Given a suitable index on tab.col, this can be much faster than the generic scan-all-the-rows aggregation plan. We can handle multiple MIN/MAX aggregates by generating multiple subqueries, and their orderings can be different. However, if the query contains any non-optimizable aggregates, there's no point since we'll have to scan all the rows anyway." Unfortunately that change ends a series of 6 commits of optimizer refactoring in this area, so it's not the case that you just apply this one commit as a bug-fix to a 9.0 system. I have a project in process to do the full backport needed I might be able to share with you if that works out, and you're willing to run with a customer patched server process. Using one of the user-space ideas Ken suggested may very well be easier for you. I'm stuck with an app I can't rewrite to do that. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
[Greg Smith] > Here's the comment from that describing the main technique used to fix it: > > "This module tries to replace MIN/MAX aggregate functions by subqueries of > the form > > (SELECT col FROM tab WHERE ... ORDER BY col ASC/DESC LIMIT 1) Huh ... that sounds a bit like pg 8.0 to me ;-) I remember on 7.x one had to write "select id from table order by id desc limit 1" to force through a quick index scan. This was fixed in 8.0 IIRC. I did test "select id from table order by id desc limit 1" on my parent table yesterday, it would still do the seq-scan. Even adding a where-restriction to make sure only one partition was queried I still got the seq-scan. > Unfortunately that change ends a series of 6 commits of optimizer > refactoring in this area, so it's not the case that you just apply this one > commit as a bug-fix to a 9.0 system. I have a project in process to do the > full backport needed I might be able to share with you if that works out, > and you're willing to run with a customer patched server process. In this particular case, "wait for 9.1" seems to be the best option :-)
Tobias Brox wrote: > I did test "select id from table order by id desc limit 1" on my parent table > yesterday, it would still do the seq-scan. Even adding a > where-restriction to make sure only one partition was queried I still > got the seq-scan. > Right; you actually have to direct the query toward the specific partition by name, nothing run against the parent table will work. The new logic for 9.1 essentially splits the query into this alternate form, runs it against every partition individually, then combines the results. If you can afford to wait for 9.1, that is certainly the easy path here. It just works out of the box in that version. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books