Re: Partitioned table performance - Mailing list pgsql-performance

From Tom Lane
Subject Re: Partitioned table performance
Date
Msg-id 18421.1103160608@sss.pgh.pa.us
Whole thread Raw
In response to Re: Partitioned table performance  (Greg Stark <gsstark@mit.edu>)
Responses Re: Partitioned table performance  ("Jim C. Nasby" <decibel@decibel.org>)
List pgsql-performance
Greg Stark <gsstark@mit.edu> writes:
> But I'm a bit puzzled. Why would Append have any significant cost? It's just
> taking the tuples from one plan node and returning them until they run out,
> then taking the tuples from another plan node. It should have no i/o cost and
> hardly any cpu cost. Where is the time going?

As best I can tell by profiling, the cost of the Append node per se is
indeed negligible --- no more than a couple percent of the runtime in
CVS tip for a test case similar to Stacy White's example.

It looks bad in EXPLAIN ANALYZE, but you have to realize that passing
the tuples up through the Append node doubles the instrumentation
overhead of EXPLAIN ANALYZE, which is pretty sizable already.  (If you
turn on \timing in psql and try the query itself vs. EXPLAIN ANALYZE,
the actual elapsed time is about double, at least for me.)

The other effect, which I hadn't expected, is that the seqscans
themselves actually slow down.  I get

regression=# explain analyze SELECT COUNT(*), MAX(bar1) FROM super_foo ;
                                                               QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=16414.32..16414.32 rows=1 width=4) (actual time=32313.980..32313.988 rows=1 loops=1)
   ->  Append  (cost=0.00..13631.54 rows=556555 width=4) (actual time=0.232..21848.401 rows=524289 loops=1)
         ->  Seq Scan on super_foo  (cost=0.00..0.00 rows=1 width=4) (actual time=0.020..0.020 rows=0 loops=1)
         ->  Seq Scan on sub_foo1 super_foo  (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.187..6926.395
rows=262144loops=1) 
         ->  Seq Scan on sub_foo2 super_foo  (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.168..7026.953
rows=262145loops=1) 
 Total runtime: 32314.993 ms
(6 rows)

regression=# explain analyze SELECT COUNT(*), MAX(bar1) FROM sub_foo1;
                                                       QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=8207.16..8207.16 rows=1 width=4) (actual time=9850.420..9850.428 rows=1 loops=1)
   ->  Seq Scan on sub_foo1  (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.202..4642.401 rows=262144 loops=1)
 Total runtime: 9851.423 ms
(3 rows)

Notice the actual times for the sub_foo1 seqscans.  That increase (when
counted for both input tables) almost exactly accounts for the
difference in non-EXPLAIN ANALYZE runtime.

After digging around, I find that the reason for the difference is that
the optimization to avoid a projection step (ExecProject) isn't applied
for scans of inheritance unions:

    /*
     * Can't do it with inheritance cases either (mainly because Append
     * doesn't project).
     */
    if (rel->reloptkind != RELOPT_BASEREL)
        return false;

So if you were to try the example in a pre-7.4 PG, which didn't have
that optimization, you'd probably find that the speeds were just about
the same.  (I'm too lazy to verify this though.)

I looked briefly at what it would take to cover this case, and decided
that it's a nontrivial change, so it's too late to do something about it
for 8.0.  I think it's probably possible to fix it though, at least for
cases where the child tables have rowtypes identical to the parent.

            regards, tom lane

pgsql-performance by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: indentifying the database in a Postgres log file.
Next
From: Christopher Browne
Date:
Subject: Re: Trying to create multi db query in one large queries