Partitioned Tables and ORDER BY - Mailing list pgsql-performance

From Joe Uhl
Subject Partitioned Tables and ORDER BY
Date
Msg-id 4ACE062C.4010209@gmail.com
Whole thread Raw
List pgsql-performance
We have been using partitioning for some time with great success.  Up
until now our usage has not included ordering and now that we are trying
to use an ORDER BY against an indexed column a rather significant
shortcoming seems to be kicking in.

Parent table (have cut all but 4 columns to make it easier to post about)
CREATE TABLE people
(
   person_id character varying(36) NOT NULL,
   list_id character varying(36) NOT NULL,
   first_name character varying(255),
   last_name character varying(255),
   CONSTRAINT people_pkey (person_id, list_id)
);

A partition looks like this:
CREATE TABLE people_list1
(
   -- inherited columns omitted
   CONSTRAINT people_list1_list_id_check CHECK (list_id::text =
'the_unique_list_id'::text)
)
INHERITS (people);

Both the parent and the children have indexes on all 4 columns mentioned
above.  The parent table is completely empty.

If I run this query, directly against the partition, performance is
excellent:
select * from people_list1 order by first_name asc limit 50;

The explain analyze output:
  Limit  (cost=0.00..4.97 rows=50 width=34315) (actual
time=49.616..522.464 rows=50 loops=1)
    ->  Index Scan using idx_people_first_name_list1 on people_list1
(cost=0.00..849746.98 rows=8544854 width=34315) (actual
time=49.614..522.424 rows=50 loops=1)
  Total runtime: 522.773 ms

If I run this query, against the parent, performance is terrible:
select * from people where list_id = 'the_unique_list_id' order by
first_name asc limit 50;

The explain analyze output:
  Limit  (cost=726844.88..726845.01 rows=50 width=37739) (actual
time=149864.869..149864.884 rows=50 loops=1)
    ->  Sort  (cost=726844.88..748207.02 rows=8544855 width=37739)
(actual time=149864.868..149864.876 rows=50 loops=1)
          Sort Key: public.people.first_name
          Sort Method:  top-N heapsort  Memory: 50kB
          ->  Result  (cost=0.00..442990.94 rows=8544855 width=37739)
(actual time=0.081..125837.332 rows=8545138 loops=1)
                ->  Append  (cost=0.00..442990.94 rows=8544855
width=37739) (actual time=0.079..111103.743 rows=8545138 loops=1)
                      ->  Index Scan using people_pkey on people
(cost=0.00..4.27 rows=1 width=37739) (actual time=0.008..0.008 rows=0
loops=1)
                            Index Cond: ((list_id)::text =
'the_unique_list_id'::text)
                      ->  Seq Scan on people_list1 people
(cost=0.00..442986.67 rows=8544854 width=34315) (actual
time=0.068..109781.308 rows=8545138 loops=1)
                            Filter: ((list_id)::text =
'the_unique_list_id'::text)
  Total runtime: 149865.411 ms

Just to show that partitions are setup correctly, this query also has
excellent performance:
select * from people where list_id = 'the_unique_list_id' and first_name
= 'JOE';

Here is the explain analyze for that:
  Result  (cost=0.00..963.76 rows=482 width=37739) (actual
time=6.031..25.394 rows=2319 loops=1)
    ->  Append  (cost=0.00..963.76 rows=482 width=37739) (actual
time=6.029..21.340 rows=2319 loops=1)
          ->  Index Scan using idx_people_first_name on people
(cost=0.00..4.27 rows=1 width=37739) (actual time=0.010..0.010 rows=0
loops=1)
                Index Cond: ((first_name)::text = 'JOE'::text)
                Filter: ((list_id)::text = 'the_unique_list_id'::text)
          ->  Bitmap Heap Scan on people_list1 people
(cost=8.47..959.49 rows=481 width=34315) (actual time=6.018..20.968
rows=2319 loops=1)
                Recheck Cond: ((first_name)::text = 'JOE'::text)
                Filter: ((list_id)::text = 'the_unique_list_id'::text)
                ->  Bitmap Index Scan on idx_people_first_name_list1
(cost=0.00..8.35 rows=481 width=0) (actual time=5.566..5.566 rows=2319
loops=1)
                      Index Cond: ((first_name)::text = 'JOE'::text)
  Total runtime: 25.991 ms


This is Postgres 8.3.7 on the 2.6.28 kernel with constraint_exclusion
on.  Our partitions are in the 8 - 15 million row range.

I realize one option is to hit the partition directly instead of hitting
the parent table with the check constraint in the WHERE clause, but up
until now we have been able to avoid needing partition-awareness in our
code.  Perhaps we have hit upon something that will require breaking
that cleanliness but wanted to see if there were any workarounds.

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: position in DDL of columns used in indexes
Next
From: Tory M Blue
Date:
Subject: concurrent reindex issues