Re: Partitioned table performance - Mailing list pgsql-performance
From | Stacy White |
---|---|
Subject | Re: Partitioned table performance |
Date | |
Msg-id | 001b01c4dcde$fc27be50$0200a8c0@grownups Whole thread Raw |
In response to | Partitioned table performance ("Stacy White" <harsh@computer.org>) |
Responses |
Re: Partitioned table performance
|
List | pgsql-performance |
Thanks for the quick reply, Josh. Here are some more, with wider tables and 'EXPLAIN ANALYZE' output. These tests use the same basic structure as before, but with 20 data columns rather than just the one: CREATE TABLE one_big_foo ( partition INTEGER, bar1 INTEGER, ... bar20 INTEGER ) Each set of test tables holds 1,000,000 tuples with a partition value of '1', and 1,000,000 with a partition value of '2'. The bar* columns are all set to non-null values. The 'one_big_foo' table stores all 2M rows in one table. 'super_foo' and 'union_foo' split the data into two tables, and use inheritance and union views (respectively) to tie them together, as described in my previous message. Query timings and 'EXPLAIN ANALYZE' results for full table scans and for partition scans follow: vod=# SELECT COUNT(*), MAX(bar1) FROM one_big_foo ; Time: 3695.274 ms vod=# SELECT COUNT(*), MAX(bar1) FROM super_foo ; Time: 4641.992 ms vod=# SELECT COUNT(*), MAX(bar1) FROM union_foo ; Time: 16035.025 ms vod=# SELECT COUNT(*), MAX(bar1) FROM one_big_foo WHERE partition = 1 ; Time: 4395.274 ms vod=# SELECT COUNT(*), MAX(bar1) FROM super_foo WHERE partition = 1 ; Time: 3050.920 ms vod=# SELECT COUNT(*), MAX(bar1) FROM union_foo WHERE partition = 1 ; Time: 7468.664 ms vod=# EXPLAIN ANALYZE SELECT COUNT(*), MAX(bar1) FROM one_big_foo ; QUERY PLAN ---------------------------------------------------------------------------- --------------------------------------------------- Aggregate (cost=61747.92..61747.92 rows=1 width=4) (actual time=18412.471..18412.474 rows=1 loops=1) -> Seq Scan on one_big_foo (cost=0.00..51747.61 rows=2000061 width=4) (actual time=0.097..10079.192 rows=2000000 loops=1) Total runtime: 18412.597 ms (3 rows) Time: 18413.919 ms vod=# EXPLAIN ANALYZE SELECT COUNT(*), MAX(bar1) FROM super_foo ; QUERY PLAN ---------------------------------------------------------------------------- --------------------------------------------------------------- Aggregate (cost=61749.87..61749.87 rows=1 width=4) (actual time=30267.913..30267.916 rows=1 loops=1) -> Append (cost=0.00..51749.24 rows=2000125 width=4) (actual time=0.127..22830.610 rows=2000000 loops=1) -> Seq Scan on super_foo (cost=0.00..0.00 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=1) -> Seq Scan on sub_foo1 super_foo (cost=0.00..25874.62 rows=1000062 width=4) (actual time=0.113..5808.899 rows=1000000 loops=1) -> Seq Scan on sub_foo2 super_foo (cost=0.00..25874.62 rows=1000062 width=4) (actual time=0.075..5829.095 rows=1000000 loops=1) Total runtime: 30268.061 ms (6 rows) Time: 30303.271 ms vod=# EXPLAIN ANALYZE SELECT COUNT(*), MAX(bar1) FROM union_foo ; QUERY PLAN ---------------------------------------------------------------------------- -------------------------------------------------------------------- Aggregate (cost=98573.40..98573.40 rows=1 width=4) (actual time=62542.849..62542.852 rows=1 loops=1) -> Subquery Scan union_foo (cost=0.00..88573.20 rows=2000040 width=4) (actual time=0.130..55536.040 rows=2000000 loops=1) -> Append (cost=0.00..68572.80 rows=2000040 width=80) (actual time=0.122..43210.763 rows=2000000 loops=1) -> Subquery Scan "*SELECT* 1" (cost=0.00..34286.40 rows=1000020 width=80) (actual time=0.118..16312.708 rows=1000000 loops=1) -> Seq Scan on union_foo1 (cost=0.00..24286.20 rows=1000020 width=80) (actual time=0.107..7763.460 rows=1000000 loops=1) -> Subquery Scan "*SELECT* 2" (cost=0.00..34286.40 rows=1000020 width=80) (actual time=0.116..16610.387 rows=1000000 loops=1) -> Seq Scan on union_foo2 (cost=0.00..24286.20 rows=1000020 width=80) (actual time=0.095..7549.522 rows=1000000 loops=1) Total runtime: 62543.098 ms vod=# EXPLAIN ANALYZE SELECT COUNT(*), MAX(bar1) FROM one_big_foo WHERE partition = 1 ; QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------- Aggregate (cost=61711.25..61711.25 rows=1 width=4) (actual time=11592.135..11592.139 rows=1 loops=1) -> Seq Scan on one_big_foo (cost=0.00..56747.76 rows=992697 width=4) (actual time=0.106..7627.170 rows=1000000 loops=1) Filter: (partition = 1::numeric) Total runtime: 11592.264 ms (4 rows) Time: 11593.749 ms vod=# EXPLAIN ANALYZE SELECT COUNT(*), MAX(bar1) FROM super_foo WHERE partition = 1 ; QUERY PLAN ---------------------------------------------------------------------------- --------------------------------------------------------------------------- Aggregate (cost=33377.11..33377.11 rows=1 width=4) (actual time=15670.309..15670.312 rows=1 loops=1) -> Append (cost=0.00..28376.79 rows=1000064 width=4) (actual time=6.699..12072.483 rows=1000000 loops=1) -> Seq Scan on super_foo (cost=0.00..0.00 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=1) Filter: (partition = 1::numeric) -> Seq Scan on sub_foo1 super_foo (cost=0.00..28374.78 rows=1000062 width=4) (actual time=0.106..6688.812 rows=1000000 loops=1) Filter: (partition = 1::numeric) -> Index Scan using idx_sub_foo2_partition on sub_foo2 super_foo (cost=0.00..2.01 rows=1 width=4) (actual time=0.221..0.221 rows=0 loops=1) Index Cond: (partition = 1::numeric) Total runtime: 15670.463 ms (9 rows) Time: 15672.235 ms vod=# EXPLAIN ANALYZE SELECT COUNT(*), MAX(bar1) FROM union_foo WHERE partition = 1 ; QUERY PLAN ---------------------------------------------------------------------------- -------------------------------------------------------------------- Aggregate (cost=98573.40..98573.40 rows=1 width=4) (actual time=31897.629..31897.632 rows=1 loops=1) -> Subquery Scan union_foo (cost=0.00..88573.20 rows=2000040 width=4) (actual time=0.134..28323.692 rows=1000000 loops=1) -> Append (cost=0.00..68572.80 rows=2000040 width=80) (actual time=0.125..21969.522 rows=1000000 loops=1) -> Subquery Scan "*SELECT* 1" (cost=0.00..34286.40 rows=1000020 width=80) (actual time=0.120..16867.005 rows=1000000 loops=1) -> Seq Scan on union_foo1 (cost=0.00..24286.20 rows=1000020 width=80) (actual time=0.108..8017.931 rows=1000000 loops=1) -> Subquery Scan "*SELECT* 2" (cost=0.00..34286.40 rows=1000020 width=80) (actual time=0.011..0.011 rows=0 loops=1) -> Result (cost=0.00..24286.20 rows=1000020 width=80) (actual time=0.004..0.004 rows=0 loops=1) One-Time Filter: false -> Seq Scan on union_foo2 (cost=0.00..24286.20 rows=1000020 width=80) (never executed) Total runtime: 31897.897 ms (10 rows) Time: 31900.204 ms ----- Original Message ----- From: "Josh Berkus" <josh@agliodbs.com> To: <pgsql-performance@postgresql.org> Cc: "Stacy White" <harsh@computer.org> Sent: Sunday, December 05, 2004 3:06 PM Subject: Re: [PERFORM] Partitioned table performance Stacy, Thanks for the stats! > In some cases we've seen some increased performance in tests by splitting > the table into several smaller tables. Both 'UNION ALL' views, and the > superclass/subclass scheme work well at pruning down the set of rows a > query uses, but they seem to introduce a large performance hit to the time > to process each row (~50% for superclass/subclass, and ~150% for union > views). This seems reasonable, actually, given your test. Really, what you should be comparing it against is not against selecting from an individual partition, but selecting from the whole business as one large table. I also suspect that wider rows results in less overhead proportionally; note that your test contains *only* the indexed rows. I should soon have a test to prove this, hopefully. However, I would be interested in seeing EXPLAIN ANALYZE from your tests rather than just EXPLAIN. -- Josh Berkus Aglio Database Solutions San Francisco
pgsql-performance by date: