Re: Partitioned table performance - Mailing list pgsql-performance
From | Tom Lane |
---|---|
Subject | Re: Partitioned table performance |
Date | |
Msg-id | 18421.1103160608@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Partitioned table performance (Greg Stark <gsstark@mit.edu>) |
Responses |
Re: Partitioned table performance
|
List | pgsql-performance |
Greg Stark <gsstark@mit.edu> writes: > But I'm a bit puzzled. Why would Append have any significant cost? It's just > taking the tuples from one plan node and returning them until they run out, > then taking the tuples from another plan node. It should have no i/o cost and > hardly any cpu cost. Where is the time going? As best I can tell by profiling, the cost of the Append node per se is indeed negligible --- no more than a couple percent of the runtime in CVS tip for a test case similar to Stacy White's example. It looks bad in EXPLAIN ANALYZE, but you have to realize that passing the tuples up through the Append node doubles the instrumentation overhead of EXPLAIN ANALYZE, which is pretty sizable already. (If you turn on \timing in psql and try the query itself vs. EXPLAIN ANALYZE, the actual elapsed time is about double, at least for me.) The other effect, which I hadn't expected, is that the seqscans themselves actually slow down. I get regression=# explain analyze SELECT COUNT(*), MAX(bar1) FROM super_foo ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=16414.32..16414.32 rows=1 width=4) (actual time=32313.980..32313.988 rows=1 loops=1) -> Append (cost=0.00..13631.54 rows=556555 width=4) (actual time=0.232..21848.401 rows=524289 loops=1) -> Seq Scan on super_foo (cost=0.00..0.00 rows=1 width=4) (actual time=0.020..0.020 rows=0 loops=1) -> Seq Scan on sub_foo1 super_foo (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.187..6926.395 rows=262144loops=1) -> Seq Scan on sub_foo2 super_foo (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.168..7026.953 rows=262145loops=1) Total runtime: 32314.993 ms (6 rows) regression=# explain analyze SELECT COUNT(*), MAX(bar1) FROM sub_foo1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=8207.16..8207.16 rows=1 width=4) (actual time=9850.420..9850.428 rows=1 loops=1) -> Seq Scan on sub_foo1 (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.202..4642.401 rows=262144 loops=1) Total runtime: 9851.423 ms (3 rows) Notice the actual times for the sub_foo1 seqscans. That increase (when counted for both input tables) almost exactly accounts for the difference in non-EXPLAIN ANALYZE runtime. After digging around, I find that the reason for the difference is that the optimization to avoid a projection step (ExecProject) isn't applied for scans of inheritance unions: /* * Can't do it with inheritance cases either (mainly because Append * doesn't project). */ if (rel->reloptkind != RELOPT_BASEREL) return false; So if you were to try the example in a pre-7.4 PG, which didn't have that optimization, you'd probably find that the speeds were just about the same. (I'm too lazy to verify this though.) I looked briefly at what it would take to cover this case, and decided that it's a nontrivial change, so it's too late to do something about it for 8.0. I think it's probably possible to fix it though, at least for cases where the child tables have rowtypes identical to the parent. regards, tom lane
pgsql-performance by date: