Home > mailing lists

Re: [SQL] can i make this sql query more efficiant? - Mailing list pgsql-performance

From	Manfred Koizar
Subject	Re: [SQL] can i make this sql query more efficiant?
Date	April 4, 2003 14:03:18
Msg-id	d0ir8vgn2sp6uh2h61nkm4tdasqqn4o18l@4ax.com Whole thread Raw
Responses	Re: [SQL] can i make this sql query more efficiant?
List	pgsql-performance

Tree view

On Fri, 4 Apr 2003 08:16:01 -0800, Josh Berkus <josh@agliodbs.com>
wrote:
>That version is only more efficient for small data sets.    I've generally
>found that case statements are slower than subselects for large data sets.

I'd be honestly interested in the circumstances where you made that
observation.

>YMMV.

Yes, it does :-)  Out of curiosity I did a few tests with PG 7.2 on my
old notebook:

CREATE TABLE baz (event int, level int);
INSERT INTO baz SELECT (100*random()+0.5), (3*random()+0.5);
INSERT INTO baz SELECT (100*random()+0.5), (3*random()+0.5) FROM baz;
...
INSERT INTO baz SELECT (100*random()+0.5), (3*random()+0.5) FROM baz;
CREATE INDEX baz_event ON baz(event);
ANALYSE baz;

SELECT event,
       SUM (CASE level WHEN 1 THEN 1 ELSE 0 END) AS ones,
       SUM (CASE level WHEN 2 THEN 1 ELSE 0 END) AS twos,
       SUM (CASE level WHEN 3 THEN 1 ELSE 0 END) AS threes
  FROM baz GROUP BY event;

SELECT event,
  (SELECT count(*) FROM baz a
    WHERE level = 1 AND a.event=baz.event) AS ones,
  (SELECT count(*) FROM baz a
    WHERE level = 2 and a.event=baz.event) AS twos,
  (SELECT count(*) FROM baz a
    WHERE level = 3 and a.event=baz.event) AS threes
  FROM baz GROUP BY event;

tuples   case         subselect
  8K    718.48 msec   16199.88 msec
 32K   6168.18 msec   74742.85 msec
128K  25072.34 msec  304585.61 msec

CLUSTER baz_event ON baz; ANALYSE baz;
This changes the subselect plan from seq scan to index scan.

128K  12116.07 msec   17530.85 msec

Add 128K more tuples, so that only the first half of the relation is
clustered.

256K  45663.35 msec  117748.23 msec

CLUSTER baz_event ON baz; ANALYSE baz;

256K  23691.81 msec   35138.26 msec

Maybe it is just the data distribution (100 events, 3 levels,
thousands of tuples) that makes CASE look faster than subselects ...

Servus
 Manfred

pgsql-performance by date:

From: Josh Berkus
Date: 04 April 2003, 13:08:00
Subject: Re: OSS database needed for testing

From: Josh Berkus
Date: 04 April 2003, 14:26:21
Subject: Re: [SQL] can i make this sql query more efficiant?

Re: [SQL] can i make this sql query more efficiant? - Mailing list pgsql-performance

Previous

Next