Re: Query performance discontinuity - Mailing list pgsql-performance
From | Rod Taylor |
---|---|
Subject | Re: Query performance discontinuity |
Date | |
Msg-id | 1037076680.66615.21.camel@jester Whole thread Raw |
In response to | Re: Query performance discontinuity (Mike Nielsen <miken@bigpond.net.au>) |
Responses |
Re: Query performance discontinuity
v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION |
List | pgsql-performance |
Given the estimated costs, PostgreSQL is doing the right things. However, in your case, it doesn't appear that the estimations are realistic. Index scans are much cheaper than advertised. Try setting your random_page_cost lower (1.5 to 2 rather than 4). Bumping sortmem to 32 or 64MB (if plenty of ram is available) will help most situations. Might see the 'pg_autotune' project for assistance in picking good values. http://gborg.postgresql.org/project/pgautotune/projdisplay.php On Mon, 2002-11-11 at 23:10, Mike Nielsen wrote: > Stephan, Tom & Josh: > > Here's the result I get from changing the <> to a > in the tstart > condition (no improvement): > > pganalysis=> explain analyze select * from ps2 where tstart> '2000-1-1 > pganalysis'> 00:00:00' and time_stamp > '2000-1-1 00:00:00' order by > pganalysis-> tstart,time_stamp limit 59628; > NOTICE: QUERY PLAN: > > Limit (cost=160313.27..160313.27 rows=59628 width=179) (actual > time=42269.87..42709.82 rows=59628 loops=1) > -> Sort (cost=160313.27..160313.27 rows=327895 width=179) (actual > time=42269.86..42643.74 rows=59629 loops=1) > -> Seq Scan on ps2 (cost=0.00..13783.40 rows=327895 width=179) > (actual time=0.15..15211.49 rows=327960 loops=1) > Total runtime: 43232.53 msec > > EXPLAIN > > Setting enable_seqscan=off produced a good result: > > > pganalysis=> explain analyze select * from ps2 where tstart> '2000-1-1 > pganalysis'> 00:00:00' and time_stamp > '2000-1-1 00:00:00' order by > pganalysis-> tstart,time_stamp limit 59628; > NOTICE: QUERY PLAN: > > Limit (cost=0.00..160322.87 rows=59628 width=179) (actual > time=40.39..2222.06 rows=59628 loops=1) > -> Index Scan using ps2_idx on ps2 (cost=0.00..881616.45 rows=327895 > width=179) (actual time=40.38..2151.38 rows=59629 loops=1) > Total runtime: 2262.23 msec > > EXPLAIN > > The ps2 table is in time_stamp order, but the tstarts aren't quite as > good -- they're mostly there, but they're computed by subtracting a > (stochastic) value from time_stamp. > > I haven't tinkered with sort_mem yet, but will once I've got this little > project wrapped up (1 or 2 days to go!). > > This, by the way, is pg log data that I've parsed up so I can do some > performance number-crunching for a client of mine. Is there a better > way to get comprehensive, per-query read, write and cache hit data? As > you can imagine, with millions of records, my client-side perl script > for the parsing is slow. I've tried my hand at writing an aggregate > function on the server side using lex and yacc, but decided that, at > least for this project, I'd rather let the machine do the head-banging > -- I can tokenize the raw syslog data (loaded into another pg table) > into an intermediate result in an aggregate function, and my parser > works on the token strings, but the perl script finished before I could > go any further... > > In the off chance, however, that I get invited to more of this kind of > work, it would be really handy to be able to get the data without all > this hassle! Any clues would be gratefully received. > > Regards, > > Mike > > > On Tue, 2002-11-12 at 04:44, Mike Nielsen wrote: > pgsql-performers, > > Just out of curiosity, anybody with any ideas on what happens to this > query when the limit is 59626? It's as though 59626 = infinity? > > pganalysis=> explain analyze select * from ps2 where tstart<> '2000-1-1 > 00:00:00' and time_stamp > '2000-1-1 00:00:00' order by > tstart,time_stamp limit 59624; > NOTICE: QUERY PLAN: > > Limit (cost=0.00..160328.37 rows=59624 width=179) (actual > time=14.52..2225.16 rows=59624 loops=1) > -> Index Scan using ps2_idx on ps2 (cost=0.00..881812.85 rows=327935 > width=179) (actual time=14.51..2154.59 rows=59625 loops=1) > Total runtime: 2265.93 msec > > EXPLAIN > pganalysis=> explain analyze select * from ps2 where tstart<> '2000-1-1 > 00:00:00' and time_stamp > '2000-1-1 00:00:00' order by > tstart,time_stamp limit 59625; > NOTICE: QUERY PLAN: > > Limit (cost=0.00..160331.06 rows=59625 width=179) (actual > time=0.45..2212.19 rows=59625 loops=1) > -> Index Scan using ps2_idx on ps2 (cost=0.00..881812.85 rows=327935 > width=179) (actual time=0.45..2140.87 rows=59626 loops=1) > Total runtime: 2254.50 msec > > EXPLAIN > pganalysis=> explain analyze select * from ps2 where tstart<> '2000-1-1 > 00:00:00' and time_stamp > '2000-1-1 00:00:00' order by > tstart,time_stamp limit 59626; > NOTICE: QUERY PLAN: > > Limit (cost=160332.32..160332.32 rows=59626 width=179) (actual > time=37359.41..37535.85 rows=59626 loops=1) > -> Sort (cost=160332.32..160332.32 rows=327935 width=179) (actual > time=37359.40..37471.07 rows=59627 loops=1) > -> Seq Scan on ps2 (cost=0.00..13783.52 rows=327935 width=179) > (actual time=0.26..12433.00 rows=327960 loops=1) > Total runtime: 38477.39 msec > > EXPLAIN > pganalysis=> explain analyze select * from ps2 where tstart<> '2000-1-1 > 00:00:00' and time_stamp > '2000-1-1 00:00:00' order by > tstart,time_stamp limit 59627; > NOTICE: QUERY PLAN: > > Limit (cost=160332.32..160332.32 rows=59627 width=179) (actual > time=38084.85..38260.88 rows=59627 loops=1) > -> Sort (cost=160332.32..160332.32 rows=327935 width=179) (actual > time=38084.83..38194.63 rows=59628 loops=1) > -> Seq Scan on ps2 (cost=0.00..13783.52 rows=327935 width=179) > (actual time=0.15..12174.74 rows=327960 loops=1) > Total runtime: 38611.83 msec > > EXPLAIN > > pganalysis=> \d ps2 > Table "ps2" > Column | Type | Modifiers > -------------+--------------------------+----------- > host | character varying(255) | > pid | integer | > line | integer | > time_stamp | timestamp with time zone | > seq | integer | > cpu_sys | real | > cpu_elapsed | real | > cpu_user | real | > cpu_syst | real | > cpu_usert | real | > mssp | integer | > sigp | integer | > msrt | integer | > msst | integer | > sigt | integer | > msrp | integer | > swap | integer | > swat | integer | > recp | integer | > rect | integer | > pgfp | integer | > pgft | integer | > icsp | integer | > vcst | integer | > icst | integer | > vcsp | integer | > fsbop | integer | > fsbos | integer | > fsbip | integer | > fsbis | integer | > dread | integer | > dwrit | integer | > sbhr | real | > sread | integer | > swrit | integer | > lbhr | real | > lread | integer | > lwrit | integer | > dbuser | character(8) | > tstart | timestamp with time zone | > Indexes: ps2_idx > > pganalysis=> \d ps2_idx > Index "ps2_idx" > Column | Type > ------------+-------------------------- > tstart | timestamp with time zone > time_stamp | timestamp with time zone > btree > > pganalysis=> > > psql (PostgreSQL) 7.2 > contains support for: readline, history, multibyte > > > Platform: Celeron 1.3GHz, 512MB 40GB IDE hard disk, Linux 2.4.8-26mdk > kernel > > Regards, > > Mike > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Rod Taylor
pgsql-performance by date: