Re: Sorted union - Mailing list pgsql-performance

From Scott Lamb
Subject Re: Sorted union
Date
Msg-id 436A2F9A.4050202@slamb.org
Whole thread Raw
In response to Re: Sorted union  ("Merlin Moncure" <merlin.moncure@rcsonline.com>)
List pgsql-performance
Merlin Moncure wrote:
> hmm, try pushing the union into a subquery...this is better style
> because it's kind of ambiguous if the ordering will apply before/after
> the union.

Seems to be a little slower. There's a new "subquery scan" step.

     explain analyze
     select    q.when_happened from (
     select    when_stopped as when_happened,
               1 as order_hint
     from      transaction t
     where     '2005-10-25 15:00:00' <= when_stopped
       and     when_stopped <= '2005-10-26 10:00:00'
     union all
     select    when_stopped as when_happened,
               2 as order_hint
     from      transaction t
     where     '2005-10-25 15:00:00' <= when_stopped
       and     when_stopped <= '2005-10-26 10:00:00'
     ) q order by  when_happened, order_hint;


                QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Sort  (cost=713013.96..721751.25 rows=3494916 width=12) (actual
time=34392.264..37237.148 rows=3364006 loops=1)
    Sort Key: when_happened, order_hint
    ->  Subquery Scan q  (cost=0.00..229474.11 rows=3494916 width=12)
(actual time=0.194..20283.452 rows=3364006 loops=1)
          ->  Append  (cost=0.00..194524.95 rows=3494916 width=8)
(actual time=0.191..14967.632 rows=3364006 loops=1)
                ->  Subquery Scan "*SELECT* 1"  (cost=0.00..97262.48
rows=1747458 width=8) (actual time=0.189..5535.139 rows=1682003 loops=1)
                      ->  Index Scan using transaction_stopped on
"transaction" t  (cost=0.00..79787.90 rows=1747458 width=8) (actual
time=0.186..3097.268 rows=1682003 loops=1)
                            Index Cond: (('2005-10-25
15:00:00'::timestamp without time zone <= when_stopped) AND
(when_stopped <= '2005-10-26 10:00:00'::timestamp without time zone))
                ->  Subquery Scan "*SELECT* 2"  (cost=0.00..97262.48
rows=1747458 width=8) (actual time=0.173..5625.155 rows=1682003 loops=1)
                      ->  Index Scan using transaction_stopped on
"transaction" t  (cost=0.00..79787.90 rows=1747458 width=8) (actual
time=0.169..3146.714 rows=1682003 loops=1)
                            Index Cond: (('2005-10-25
15:00:00'::timestamp without time zone <= when_stopped) AND
(when_stopped <= '2005-10-26 10:00:00'::timestamp without time zone))
  Total runtime: 39775.225 ms
(11 rows)

> question: why do you want to flatten the table...is it not easier to
> work with as records?

For most things, yes. But I'm making a bunch of different graphs from
these data, and a few of them are much easier with events. The best
example is my concurrency graph. Whenever there's a start event, it goes
up one. Whenever there's a stop event, it goes down one. It's completely
trivial once you have it separated into events.

Thanks,
Scott

pgsql-performance by date:

Previous
From: "Merlin Moncure"
Date:
Subject: Re: Function with table%ROWTYPE globbing
Next
From: "Merlin Moncure"
Date:
Subject: Re: Sorted union