Re: Partitioned table performance - Mailing list pgsql-performance

From Jim C. Nasby
Subject Re: Partitioned table performance
Date
Msg-id 20041221225643.GU18180@decibel.org
Whole thread Raw
In response to Re: Partitioned table performance  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Sorry for the late reply, so I included the whole thread. Should this be
a TODO?

On Wed, Dec 15, 2004 at 08:30:08PM -0500, Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
> > But I'm a bit puzzled. Why would Append have any significant cost? It's just
> > taking the tuples from one plan node and returning them until they run out,
> > then taking the tuples from another plan node. It should have no i/o cost and
> > hardly any cpu cost. Where is the time going?
>
> As best I can tell by profiling, the cost of the Append node per se is
> indeed negligible --- no more than a couple percent of the runtime in
> CVS tip for a test case similar to Stacy White's example.
>
> It looks bad in EXPLAIN ANALYZE, but you have to realize that passing
> the tuples up through the Append node doubles the instrumentation
> overhead of EXPLAIN ANALYZE, which is pretty sizable already.  (If you
> turn on \timing in psql and try the query itself vs. EXPLAIN ANALYZE,
> the actual elapsed time is about double, at least for me.)
>
> The other effect, which I hadn't expected, is that the seqscans
> themselves actually slow down.  I get
>
> regression=# explain analyze SELECT COUNT(*), MAX(bar1) FROM super_foo ;
>                                                                QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=16414.32..16414.32 rows=1 width=4) (actual time=32313.980..32313.988 rows=1 loops=1)
>    ->  Append  (cost=0.00..13631.54 rows=556555 width=4) (actual time=0.232..21848.401 rows=524289 loops=1)
>          ->  Seq Scan on super_foo  (cost=0.00..0.00 rows=1 width=4) (actual time=0.020..0.020 rows=0 loops=1)
>          ->  Seq Scan on sub_foo1 super_foo  (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.187..6926.395
rows=262144loops=1) 
>          ->  Seq Scan on sub_foo2 super_foo  (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.168..7026.953
rows=262145loops=1) 
>  Total runtime: 32314.993 ms
> (6 rows)
>
> regression=# explain analyze SELECT COUNT(*), MAX(bar1) FROM sub_foo1;
>                                                        QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=8207.16..8207.16 rows=1 width=4) (actual time=9850.420..9850.428 rows=1 loops=1)
>    ->  Seq Scan on sub_foo1  (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.202..4642.401 rows=262144
loops=1)
>  Total runtime: 9851.423 ms
> (3 rows)
>
> Notice the actual times for the sub_foo1 seqscans.  That increase (when
> counted for both input tables) almost exactly accounts for the
> difference in non-EXPLAIN ANALYZE runtime.
>
> After digging around, I find that the reason for the difference is that
> the optimization to avoid a projection step (ExecProject) isn't applied
> for scans of inheritance unions:
>
>     /*
>      * Can't do it with inheritance cases either (mainly because Append
>      * doesn't project).
>      */
>     if (rel->reloptkind != RELOPT_BASEREL)
>         return false;
>
> So if you were to try the example in a pre-7.4 PG, which didn't have
> that optimization, you'd probably find that the speeds were just about
> the same.  (I'm too lazy to verify this though.)
>
> I looked briefly at what it would take to cover this case, and decided
> that it's a nontrivial change, so it's too late to do something about it
> for 8.0.  I think it's probably possible to fix it though, at least for
> cases where the child tables have rowtypes identical to the parent.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>

--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

pgsql-performance by date:

Previous
From: "Merlin Moncure"
Date:
Subject: Re: Tips for a system with _extremely_ slow IO?
Next
From: "Jim C. Nasby"
Date:
Subject: Re: Partitioned table performance