Thread: partitioned table performance

partitioned table performance

From
Andreas Kostyrka
Date:
Hi!

I'm just wondering, I've got a table that is partitioned into monthly
tables:

media_downloads -> media_downloads_YYYYMM
 I\- id (primary key)
  \- created_on (timestamp criteria for the monthly table split)

There are constraints upon the created_on column, all needed insert
instead rules are defined too.
One additional hardship is that id are not monotone against created_on,
id1 < id2 does not imply created_on1 <= created_on2 :(
The table contains basically almost 100M rows, and the number is
growing. (the table will be about a 12GB pg_dump.)
All relevant indexes (primary key id, index on created_on) are defined
too.

The good thing is, queries like all rows in the last 7 days work
reasonable fast, the optimizer just checks the 1-2 last month tables.

Using postgres 8.1.4-0ubuntu1, I've got to implement the following
queries in a reasonable fast way:

-- sequential reading of rows
SELECT * FROM media_downloads WHERE id > 1000000 ORDER BY id LIMIT 100;

Against the same monolithic table with about 16.5M rows, I'm getting a
cost of 20.6 pages. (Index scan)

Against the partitioned tables, I'm getting a cost of 5406822 pages.
Now I understand, that without any additional conditions, postgresql
needs to do the query for all subtables first, but explain against the
subtables show costs of 4-5 pages.
events=#  explain select * from media_downloads where id >90000000 order
by id limit 100;

QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=5406822.39..5406822.64 rows=100 width=1764)
   ->  Sort  (cost=5406822.39..5413639.50 rows=2726843 width=1764)
         Sort Key: public.media_downloads.id
         ->  Result  (cost=0.00..115960.71 rows=2726843 width=1764)
               ->  Append  (cost=0.00..115960.71 rows=2726843
width=1764)
                     ->  Seq Scan on media_downloads  (cost=0.00..10.50
rows=13 width=1764)
                           Filter: (id > 90000000)
                     ->  Index Scan using media_downloads_200510_pkey on
media_downloads_200510 media_downloads  (cost=0.00..3.75 rows=14
width=243)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200511_pkey on
media_downloads_200511 media_downloads  (cost=0.00..72.19 rows=172
width=239)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200512_pkey on
media_downloads_200512 media_downloads  (cost=0.00..603.64 rows=172
width=240)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200601_pkey on
media_downloads_200601 media_downloads  (cost=0.00..19.33 rows=232
width=239)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200602_pkey on
media_downloads_200602 media_downloads  (cost=0.00..56.82 rows=316
width=240)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200603_pkey on
media_downloads_200603 media_downloads  (cost=0.00..18.88 rows=270
width=243)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200604_pkey on
media_downloads_200604 media_downloads  (cost=0.00..1194.16 rows=939
width=298)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200605_pkey on
media_downloads_200605 media_downloads  (cost=0.00..79.28 rows=672
width=326)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200606_pkey on
media_downloads_200606 media_downloads  (cost=0.00..75.26 rows=1190
width=314)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200607_pkey on
media_downloads_200607 media_downloads  (cost=0.00..55.29 rows=1238
width=319)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200608_pkey on
media_downloads_200608 media_downloads  (cost=0.00..73.95 rows=1305
width=319)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200609_pkey on
media_downloads_200609 media_downloads  (cost=0.00..144.10 rows=1575
width=324)
                           Index Cond: (id > 90000000)
                     ->  Index Scan using media_downloads_200610_pkey on
media_downloads_200610 media_downloads  (cost=0.00..113532.57
rows=2718709 width=337)
                           Index Cond: (id > 90000000)
                     ->  Seq Scan on media_downloads_200611
media_downloads  (cost=0.00..10.50 rows=13 width=1764)
                           Filter: (id > 90000000)
                     ->  Seq Scan on media_downloads_200612
media_downloads  (cost=0.00..10.50 rows=13 width=1764)
                           Filter: (id > 90000000)
(37 rows)

events=#  explain select * from media_downloads_200610  where id
>90000000 order by id limit 100;
                                                         QUERY
PLAN

-----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..4.18 rows=100 width=337)
   ->  Index Scan using media_downloads_200610_pkey on
media_downloads_200610  (cost=0.00..113582.70 rows=2719904 width=337)
         Index Cond: (id > 90000000)
(3 rows)

Interestingly, if one reformulates the query like that:

SELECT * FROM media_downloads WHERE id > 90000000 AND id < 90001000
ORDER BY id LIMIT 100;

results in a reasonable cost of 161.5 pages.

Now the above query is basically acceptable, as one iterate all rows
this way, but now I need to know max(id) to know when to stop my loop:

events=# explain select max(id) from media_downloads;
                                                  QUERY
PLAN
--------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=3676914.56..3676914.58 rows=1 width=4)
   ->  Append  (cost=0.00..3444211.85 rows=93081085 width=4)
         ->  Seq Scan on media_downloads  (cost=0.00..10.40 rows=40
width=4)
         ->  Seq Scan on media_downloads_200510 media_downloads
(cost=0.00..5615.84 rows=139884 width=4)
         ->  Seq Scan on media_downloads_200511 media_downloads
(cost=0.00..67446.56 rows=1724356 width=4)
         ->  Seq Scan on media_downloads_200512 media_downloads
(cost=0.00..66727.02 rows=1718302 width=4)
         ->  Seq Scan on media_downloads_200601 media_downloads
(cost=0.00..88799.91 rows=2321991 width=4)
         ->  Seq Scan on media_downloads_200602 media_downloads
(cost=0.00..121525.71 rows=3159571 width=4)
         ->  Seq Scan on media_downloads_200603 media_downloads
(cost=0.00..104205.40 rows=2701240 width=4)
         ->  Seq Scan on media_downloads_200604 media_downloads
(cost=0.00..342511.42 rows=9391242 width=4)
         ->  Seq Scan on media_downloads_200605 media_downloads
(cost=0.00..245167.39 rows=6724039 width=4)
         ->  Seq Scan on media_downloads_200606 media_downloads
(cost=0.00..430186.99 rows=11901499 width=4)
         ->  Seq Scan on media_downloads_200607 media_downloads
(cost=0.00..451313.72 rows=12380172 width=4)
         ->  Seq Scan on media_downloads_200608 media_downloads
(cost=0.00..474743.72 rows=13048372 width=4)
         ->  Seq Scan on media_downloads_200609 media_downloads
(cost=0.00..619711.52 rows=15754452 width=4)
         ->  Seq Scan on media_downloads_200610 media_downloads
(cost=0.00..426225.45 rows=12115845 width=4)
         ->  Seq Scan on media_downloads_200611 media_downloads
(cost=0.00..10.40 rows=40 width=4)
         ->  Seq Scan on media_downloads_200612 media_downloads
(cost=0.00..10.40 rows=40 width=4)
(18 rows)

events=# explain select max(id) from media_downloads_200610;
                                                                 QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.04..0.05 rows=1 width=0)
   InitPlan
     ->  Limit  (cost=0.00..0.04 rows=1 width=4)
           ->  Index Scan Backward using media_downloads_200610_pkey on
media_downloads_200610  (cost=0.00..475660.29 rows=12115845 width=4)
                 Filter: (id IS NOT NULL)
(5 rows)

For me as a human, it's obvious, that max(media_downloads) ==
max(media_downloads_200612..media_downloads_200510).

Any ideas how to make the optimizer handle partitioned tables more
sensible?

Andreas



Attachment

Re: partitioned table performance

From
"Simon Riggs"
Date:
On Sun, 2006-10-29 at 00:28 +0200, Andreas Kostyrka wrote:

> Any ideas how to make the optimizer handle partitioned tables more
> sensible?

Yes, those are known inefficiencies in the current implementation which
we expect to address for 8.3.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: partitioned table performance

From
Andreas Kostyrka
Date:
Am Montag, den 30.10.2006, 08:18 +0000 schrieb Simon Riggs:
> On Sun, 2006-10-29 at 00:28 +0200, Andreas Kostyrka wrote:
>
> > Any ideas how to make the optimizer handle partitioned tables more
> > sensible?
>
> Yes, those are known inefficiencies in the current implementation which
> we expect to address for 8.3.

Any ideas to force the current optimizer to do something sensible?

Andreas

>

Attachment

Re: partitioned table performance

From
"Simon Riggs"
Date:
On Mon, 2006-10-30 at 22:58 +0100, Andreas Kostyrka wrote:
> Am Montag, den 30.10.2006, 08:18 +0000 schrieb Simon Riggs:
> > On Sun, 2006-10-29 at 00:28 +0200, Andreas Kostyrka wrote:
> >
> > > Any ideas how to make the optimizer handle partitioned tables more
> > > sensible?
> >
> > Yes, those are known inefficiencies in the current implementation which
> > we expect to address for 8.3.
>
> Any ideas to force the current optimizer to do something sensible?

Brute force & ignorance: PL/pgSQL

Perhaps some other domain knowledge might help you shorten the search?

Thats all for now. It's not a minor fixup and nobody had time to fix
that for 8.2 since other fish were bigger.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com