Thread: Performance with sorting and LIMIT on partitioned table
We have performance problem with query on partitioned table when query use order by and we want to use first/last rows from result set. More detail description: We have big table where each row is one telephone call (CDR). Definitnion of this table look like this: CREATE TABLE accounting.cdr_full_partitioned (it is parrent table) ( cdr_id bigint NOT NULL, id_crx_group_from bigint, -- identifier of user start_time_invite timestamp with time zone, -- start call time call_status VARCHAR -- FINF-call finished, FINC-call unfinished ..some extra data.. ) We creating 12 partitions using 'start_time_invite' column, simply we create one partition for each month. We create costraints like this: ALTER TABLE accounting.cdr_y2009_m09 ADD CONSTRAINT y2009m09 CHECK (start_time_invite >= '2009-09-01 00:00:00+02'::timestamp with time zone AND start_time_invite < '2009-10-01 00:00:00+02'::timestamp with time zone); and we define necessery indexes of course CREATE INDEX cdr_full_partitioned_y2009_m09_id_crx_group_to_key1 ON accounting.cdr_full_partitioned_y2009_m09 USING btree (id_crx_group_from, start_time_invite, call_status); The problem appears when we want to select calls for specified user with specified call_Status e.g: SELECT * FROM accounting.cdr_full_partitioned WHERE id_crx_group_from='522921' AND call_status='FINS' AND start_time_invite>='2009-09-28 00:00:00+02' AND start_time_invite<'2009-10-12 23:59:59+02' AND ORDER BY start_time_invite LIMIT '100' OFFSET 0 you can see execution plan http://szymanskich.net/pub/postgres/full.jpg as you see 20000 rows were selected and after were sorted what take very long about 30-40s and after sorting it limit result to 100 rows. Using table without partition SELECT * FROM accounting.cdr_full WHERE (id_crx_group_from='522921') AND ( call_status='FINS' ) AND (start_time_invite>='2009-01-28 00:00:00+02') AND (start_time_invite<'2009-10-12 23:59:59+02') ORDER BY start_time_invite LIMIT '100' OFFSET 0 execution plan is very simple "Limit (cost=0.00..406.40 rows=100 width=456)" " -> Index Scan using cdr_full_crx_group_from_start_time_invite_status_ind on cdr_full (cost=0.00..18275.76 rows=4497 width=456)" " Index Cond: ((id_crx_group_from = 522921::bigint) AND (start_time_invite >= '2009-01-27 23:00:00+01'::timestamp with time zone) AND (start_time_invite < '2009-10-12 23:59:59+02'::timestamp with time zone) AND ((call_status)::text = 'FINS'::text))" it use index to fetch first 100 rows and it is super fast and take less than 0.5s. There is no rows sorting! I've tried to execute the same query on one partition: SELECT * FROM accounting.cdr_full_partitioned_y2009_m09 WHERE (id_crx_group_from='509498') AND ( call_status='FINS' ) AND (start_time_invite>='2009-09-01 00:00:00+02') AND (start_time_invite<'2009-10-12 23:59:59+02') You can see execution plan http://szymanskich.net/pub/postgres/ononeprtition.jpg and query is superfast because there is no sorting. The question is how to speed up query when we use partitioning? So far I have not found solution. I'm wonder how do you solve problems when result from partition must be sorted and after we want to display only first/last 100 rows? We can use own partitioning mechanism and partitioning data using id_crx_group_from and create dynamic query (depending on id_crx_group_from we can execute query on one partition) but it is not most beautiful solution. Michal Szymanski http://blog.szymanskich.net http://techblog.freeconet.pl
On Mon, Oct 12, 2009 at 10:14 AM, Michal Szymanski <mich20061@gmail.com> wrote: > We have performance problem with query on partitioned table when query > use order by and we want to use first/last rows from result set. > More detail description: > We have big table where each row is one telephone call (CDR). > Definitnion of this table look like this: > CREATE TABLE accounting.cdr_full_partitioned (it is parrent table) > ( > cdr_id bigint NOT NULL, > id_crx_group_from bigint, -- identifier of user > start_time_invite timestamp with time zone, -- start call time > call_status VARCHAR -- FINF-call finished, FINC-call > unfinished > ..some extra data.. > ) > > We creating 12 partitions using 'start_time_invite' column, simply we > create one partition for each month. We create costraints like this: > ALTER TABLE accounting.cdr_y2009_m09 > ADD CONSTRAINT y2009m09 CHECK (start_time_invite >= '2009-09-01 > 00:00:00+02'::timestamp with time zone AND start_time_invite < > '2009-10-01 00:00:00+02'::timestamp with time zone); > > and we define necessery indexes of course > > CREATE INDEX cdr_full_partitioned_y2009_m09_id_crx_group_to_key1 > ON accounting.cdr_full_partitioned_y2009_m09 > USING btree > (id_crx_group_from, start_time_invite, call_status); > > > The problem appears when we want to select calls for specified user > with specified call_Status e.g: > SELECT * FROM accounting.cdr_full_partitioned > WHERE > id_crx_group_from='522921' AND > call_status='FINS' AND > start_time_invite>='2009-09-28 00:00:00+02' AND > start_time_invite<'2009-10-12 23:59:59+02' AND > ORDER BY start_time_invite LIMIT '100' OFFSET 0 > > you can see execution plan http://szymanskich.net/pub/postgres/full.jpg > as you see 20000 rows were selected and after were sorted what take > very long about 30-40s and after sorting it limit > result to 100 rows. > > Using table without partition > > SELECT * FROM accounting.cdr_full WHERE > (id_crx_group_from='522921') AND ( > call_status='FINS' ) AND (start_time_invite>='2009-01-28 > 00:00:00+02') > AND (start_time_invite<'2009-10-12 23:59:59+02') ORDER BY > start_time_invite LIMIT '100' OFFSET 0 > > execution plan is very simple > "Limit (cost=0.00..406.40 rows=100 width=456)" > " -> Index Scan using > cdr_full_crx_group_from_start_time_invite_status_ind on cdr_full > (cost=0.00..18275.76 rows=4497 width=456)" > " Index Cond: ((id_crx_group_from = 522921::bigint) AND > (start_time_invite >= '2009-01-27 23:00:00+01'::timestamp with time > zone) AND (start_time_invite < '2009-10-12 23:59:59+02'::timestamp > with time zone) AND ((call_status)::text = 'FINS'::text))" > > it use index to fetch first 100 rows and it is super fast and take > less than 0.5s. There is no rows sorting! > I've tried to execute the same query on one partition: > SELECT * FROM accounting.cdr_full_partitioned_y2009_m09 > WHERE (id_crx_group_from='509498') AND ( > call_status='FINS' ) AND (start_time_invite>='2009-09-01 > 00:00:00+02') > AND (start_time_invite<'2009-10-12 23:59:59+02') > > You can see execution plan http://szymanskich.net/pub/postgres/ononeprtition.jpg > and query is superfast because there is no sorting. The question is > how to speed up query when we use partitioning? So far I have not > found solution. I'm wonder how do you solve problems > when result from partition must be sorted and after we want to display > only first/last 100 rows? > We can use own partitioning mechanism and partitioning data using > id_crx_group_from and create dynamic query (depending on > id_crx_group_from we can execute query on one partition) but it is not > most beautiful solution. Yeah - unfortunately the query planner is not real smart about partitioned tables yet. I can't make anything of the JPG link you posted. Can you post the EXPLAIN ANALYZE output for the case that is slow? What PG version is this? ...Robert
I have a similar, recent thread titled Partitioned Tables and ORDER BY with a decent break down. I think I am hitting the same issue Michal is.On Mon, Oct 12, 2009 at 10:14 AM, Michal Szymanski <mich20061@gmail.com> wrote:We have performance problem with query on partitioned table when query use order by and we want to use first/last rows from result set. More detail description: We have big table where each row is one telephone call (CDR). Definitnion of this table look like this: CREATE TABLE accounting.cdr_full_partitioned (it is parrent table) ( cdr_id bigint NOT NULL, id_crx_group_from bigint, -- identifier of user start_time_invite timestamp with time zone, -- start call time call_status VARCHAR -- FINF-call finished, FINC-call unfinished ..some extra data.. ) We creating 12 partitions using 'start_time_invite' column, simply we create one partition for each month. We create costraints like this: ALTER TABLE accounting.cdr_y2009_m09 ADD CONSTRAINT y2009m09 CHECK (start_time_invite >= '2009-09-01 00:00:00+02'::timestamp with time zone AND start_time_invite < '2009-10-01 00:00:00+02'::timestamp with time zone); and we define necessery indexes of course CREATE INDEX cdr_full_partitioned_y2009_m09_id_crx_group_to_key1 ON accounting.cdr_full_partitioned_y2009_m09 USING btree (id_crx_group_from, start_time_invite, call_status); The problem appears when we want to select calls for specified user with specified call_Status e.g: SELECT * FROM accounting.cdr_full_partitioned WHERE id_crx_group_from='522921' AND call_status='FINS' AND start_time_invite>='2009-09-28 00:00:00+02' AND start_time_invite<'2009-10-12 23:59:59+02' AND ORDER BY start_time_invite LIMIT '100' OFFSET 0 you can see execution plan http://szymanskich.net/pub/postgres/full.jpg as you see 20000 rows were selected and after were sorted what take very long about 30-40s and after sorting it limit result to 100 rows. Using table without partition SELECT * FROM accounting.cdr_full WHERE (id_crx_group_from='522921') AND ( call_status='FINS' ) AND (start_time_invite>='2009-01-28 00:00:00+02') AND (start_time_invite<'2009-10-12 23:59:59+02') ORDER BY start_time_invite LIMIT '100' OFFSET 0 execution plan is very simple "Limit (cost=0.00..406.40 rows=100 width=456)" " -> Index Scan using cdr_full_crx_group_from_start_time_invite_status_ind on cdr_full (cost=0.00..18275.76 rows=4497 width=456)" " Index Cond: ((id_crx_group_from = 522921::bigint) AND (start_time_invite >= '2009-01-27 23:00:00+01'::timestamp with time zone) AND (start_time_invite < '2009-10-12 23:59:59+02'::timestamp with time zone) AND ((call_status)::text = 'FINS'::text))" it use index to fetch first 100 rows and it is super fast and take less than 0.5s. There is no rows sorting! I've tried to execute the same query on one partition: SELECT * FROM accounting.cdr_full_partitioned_y2009_m09 WHERE (id_crx_group_from='509498') AND ( call_status='FINS' ) AND (start_time_invite>='2009-09-01 00:00:00+02') AND (start_time_invite<'2009-10-12 23:59:59+02') You can see execution plan http://szymanskich.net/pub/postgres/ononeprtition.jpg and query is superfast because there is no sorting. The question is how to speed up query when we use partitioning? So far I have not found solution. I'm wonder how do you solve problems when result from partition must be sorted and after we want to display only first/last 100 rows? We can use own partitioning mechanism and partitioning data using id_crx_group_from and create dynamic query (depending on id_crx_group_from we can execute query on one partition) but it is not most beautiful solution.Yeah - unfortunately the query planner is not real smart about partitioned tables yet. I can't make anything of the JPG link you posted. Can you post the EXPLAIN ANALYZE output for the case that is slow? What PG version is this? ...Robert
Essentially doing a SELECT against the parent with appropriate constraint columns in the WHERE clause is very fast (uses index scans against correct child table only) but the moment you add an ORDER BY it seems to be merging the parent (an empty table) and the child, sorting the results, and sequential scanning. So it does still scan only the appropriate child table in the end but indexes are useless.
Unfortunately the only workaround I can come up with is to query the partitioned child tables directly. In my case the partitions are rather large so the timing difference is 522ms versus 149865ms.
On Mon, Oct 19, 2009 at 6:58 AM, Joe Uhl <joeuhl@gmail.com> wrote: > I have a similar, recent thread titled Partitioned Tables and ORDER BY with > a decent break down. I think I am hitting the same issue Michal is. > > Essentially doing a SELECT against the parent with appropriate constraint > columns in the WHERE clause is very fast (uses index scans against correct > child table only) but the moment you add an ORDER BY it seems to be merging > the parent (an empty table) and the child, sorting the results, and > sequential scanning. So it does still scan only the appropriate child table > in the end but indexes are useless. > > Unfortunately the only workaround I can come up with is to query the > partitioned child tables directly. In my case the partitions are rather > large so the timing difference is 522ms versus 149865ms. These questions are all solvable depending on what you define 'solution' as. I would at this point be thinking in terms of wrapping the query in a function using dynamic sql in plpgsql...using some ad hoc method of determining which children to hit and awkwardly looping them and enforcing limit, ordering, etc at that level. Yes, it sucks, but it only has to be done for classes of queries constraint exclusion can't handle and you will only handle a couple of cases most likely. For this reason, when I set up my partitioning strategies, I always try to divide the data such that you rarely if ever, have to fire queries that have to touch multiple partitions simultaneously. merlin
On Mon, Oct 19, 2009 at 6:58 AM, Joe Uhl <joeuhl@gmail.com> wrote:
For anyone else hitting this issue and using hibernate my solution on the select side ended up being session-specific hibernate interceptors that rewrite the from clause after hibernate prepares the statement. This seems to be working alright especially since in our case the code, while not aware of DB partitioning, has the context necessary to select the right partition under the hood.
Thankfully we haven't yet had queries that need to hit multiple partitions so this works okay without too much logic for now. I suppose if I needed to go multi-partition on single queries and wanted to continue down the hibernate interceptor path I could get more creative with the from clause rewriting and start using UNIONs, or switch to a Postgres-level solution like you are describing.
This definitely sounds like a workable approach. I am doing something a little similar on the insert/update side to trick hibernate into writing data correctly into partitioned tables when it only knows about the parent.I have a similar, recent thread titled Partitioned Tables and ORDER BY with a decent break down. I think I am hitting the same issue Michal is. Essentially doing a SELECT against the parent with appropriate constraint columns in the WHERE clause is very fast (uses index scans against correct child table only) but the moment you add an ORDER BY it seems to be merging the parent (an empty table) and the child, sorting the results, and sequential scanning. So it does still scan only the appropriate child table in the end but indexes are useless. Unfortunately the only workaround I can come up with is to query the partitioned child tables directly. In my case the partitions are rather large so the timing difference is 522ms versus 149865ms.These questions are all solvable depending on what you define 'solution' as. I would at this point be thinking in terms of wrapping the query in a function using dynamic sql in plpgsql...using some ad hoc method of determining which children to hit and awkwardly looping them and enforcing limit, ordering, etc at that level. Yes, it sucks, but it only has to be done for classes of queries constraint exclusion can't handle and you will only handle a couple of cases most likely. For this reason, when I set up my partitioning strategies, I always try to divide the data such that you rarely if ever, have to fire queries that have to touch multiple partitions simultaneously. merlin
For anyone else hitting this issue and using hibernate my solution on the select side ended up being session-specific hibernate interceptors that rewrite the from clause after hibernate prepares the statement. This seems to be working alright especially since in our case the code, while not aware of DB partitioning, has the context necessary to select the right partition under the hood.
Thankfully we haven't yet had queries that need to hit multiple partitions so this works okay without too much logic for now. I suppose if I needed to go multi-partition on single queries and wanted to continue down the hibernate interceptor path I could get more creative with the from clause rewriting and start using UNIONs, or switch to a Postgres-level solution like you are describing.