Thread: Query tuning
Hello All, This query runs forever and ever. Nature of this table being lots of inserts/deletes/query, I vacuum it every half hour to keep the holes reusable and nightly once vacuum analyze to update the optimizer. We've got index on eventtime only. Running it for current day uses index range scan and it runs within acceptable time. Below is the explain of the query. Is the order by sequencenum desc prevents from applying limit optimization? explain SELECT * FROM EVENTLOG WHERE EVENTTIME>'07/23/06 16:00:00' AND EVENTTIME<'08/22/06 16:00:00' AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA' OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA' OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA') ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 0; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------------------------------------- Limit (cost=15546930.29..15546931.54 rows=500 width=327) -> Sort (cost=15546930.29..15581924.84 rows=13997819 width=327) Sort Key: eventtime, sequencenum -> Seq Scan on eventlog (cost=0.00..2332700.25 rows=13997819 width=327) Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp without time zone) AND (((objdomainid)::text = 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text = 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text = 'tzRh39d0d91luNGT1weIUjLvFIcA'::text))) (5 rows) Thanks, Stalin Pg version 8.0.1, suse 64bit.
Subbiah, Stalin wrote: > Hello All, > > This query runs forever and ever. Nature of this table being lots of > inserts/deletes/query, I vacuum it every half hour to keep the holes > reusable and nightly once vacuum analyze to update the optimizer. We've > got index on eventtime only. Running it for current day uses index range > scan and it runs within acceptable time. Below is the explain of the > query. Is the order by sequencenum desc prevents from applying limit > optimization? > > explain SELECT * > FROM EVENTLOG > WHERE EVENTTIME>'07/23/06 16:00:00' > AND EVENTTIME<'08/22/06 16:00:00' > AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA' > OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA' > OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA') > ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 0; > > QUERY PLAN > > ------------------------------------------------------------------------ > ------------------------------------------------------------------------ > ------------------------------------------------------------------------ > ------------------------------------------------------------------------ > ------------------------------------------------------------- > Limit (cost=15546930.29..15546931.54 rows=500 width=327) > -> Sort (cost=15546930.29..15581924.84 rows=13997819 width=327) > Sort Key: eventtime, sequencenum > -> Seq Scan on eventlog (cost=0.00..2332700.25 rows=13997819 > width=327) > Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp > without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp > without time zone) AND (((objdomainid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text))) > (5 rows) > > Thanks, > Stalin > Pg version 8.0.1, suse 64bit. Firstly you should update to 8.0.8 - because it's in the same stream you won't need to do a dump/initdb/reload like a major version change, it should be a simple upgrade. Can you send explain analyze instead of just explain? It sounds like you're not analyz'ing enough - if you're doing lots of updates/deletes/inserts then the statistics postgresql uses to choose whether to do an index scan or something else will quickly be outdated and so it'll have to go back to a full table scan every time.. Can you set up autovacuum to handle that for you more regularly? -- Postgresql & php tutorials http://www.designmagick.com/
Actually these servers will be upgraded to 8.1.4 in couple of months. Here you go with explain analyze. # explain analyze SELECT * FROM EVENTLOG WHERE EVENTTIME>'07/23/06 16:00:00' AND EVENTTIME<'08/22/06 16:00:00' AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA' OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA' OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA') ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------------------------------------- Limit (cost=15583110.14..15583111.39 rows=500 width=327) (actual time=427771.568..427772.904 rows=500 loops=1) -> Sort (cost=15583108.89..15618188.88 rows=14031998 width=327) (actual time=427770.504..427771.894 rows=1000 loops=1) Sort Key: eventtime, sequencenum -> Seq Scan on eventlog (cost=0.00..2334535.17 rows=14031998 width=327) (actual time=10.370..190038.764 rows=7699388 loops=1) Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp without time zone) AND (((objdomainid)::text = 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text = 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text = 'tzRh39d0d91luNGT1weIUjLvFIcA'::text))) Total runtime: 437884.134 ms (6 rows) -----Original Message----- From: Chris [mailto:dmagick@gmail.com] Sent: Tuesday, August 22, 2006 6:37 PM To: Subbiah, Stalin Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Query tuning Subbiah, Stalin wrote: > Hello All, > > This query runs forever and ever. Nature of this table being lots of > inserts/deletes/query, I vacuum it every half hour to keep the holes > reusable and nightly once vacuum analyze to update the optimizer. > We've got index on eventtime only. Running it for current day uses > index range scan and it runs within acceptable time. Below is the > explain of the query. Is the order by sequencenum desc prevents from > applying limit optimization? > > explain SELECT * > FROM EVENTLOG > WHERE EVENTTIME>'07/23/06 16:00:00' > AND EVENTTIME<'08/22/06 16:00:00' > AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA' > OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA' > OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA') > ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 0; > > QUERY PLAN > > ---------------------------------------------------------------------- > -- > ---------------------------------------------------------------------- > -- > ---------------------------------------------------------------------- > -- > ---------------------------------------------------------------------- > -- > ------------------------------------------------------------- > Limit (cost=15546930.29..15546931.54 rows=500 width=327) > -> Sort (cost=15546930.29..15581924.84 rows=13997819 width=327) > Sort Key: eventtime, sequencenum > -> Seq Scan on eventlog (cost=0.00..2332700.25 > rows=13997819 > width=327) > Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp > without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp > without time zone) AND (((objdomainid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text))) > (5 rows) > > Thanks, > Stalin > Pg version 8.0.1, suse 64bit. Firstly you should update to 8.0.8 - because it's in the same stream you won't need to do a dump/initdb/reload like a major version change, it should be a simple upgrade. Can you send explain analyze instead of just explain? It sounds like you're not analyz'ing enough - if you're doing lots of updates/deletes/inserts then the statistics postgresql uses to choose whether to do an index scan or something else will quickly be outdated and so it'll have to go back to a full table scan every time.. Can you set up autovacuum to handle that for you more regularly? -- Postgresql & php tutorials http://www.designmagick.com/
Subbiah, Stalin wrote: > Actually these servers will be upgraded to 8.1.4 in couple of months. even so, you could get some bad data in there. http://www.postgresql.org/docs/8.0/static/release.html . Go through the old release notes and you'll find various race conditions, crashes etc. > Here you go with explain analyze. > > # explain analyze SELECT * > FROM EVENTLOG > WHERE EVENTTIME>'07/23/06 16:00:00' AND EVENTTIME<'08/22/06 16:00:00' > AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA' > OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA' > OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA') > ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500; > > QUERY PLAN > > ------------------------------------------------------------------------ > ------------------------------------------------------------------------ > ------------------------------------------------------------------------ > ------------------------------------------------------------------------ > ------------------------------------------------------------- > Limit (cost=15583110.14..15583111.39 rows=500 width=327) (actual > time=427771.568..427772.904 rows=500 loops=1) > -> Sort (cost=15583108.89..15618188.88 rows=14031998 width=327) > (actual time=427770.504..427771.894 rows=1000 loops=1) > Sort Key: eventtime, sequencenum > -> Seq Scan on eventlog (cost=0.00..2334535.17 rows=14031998 > width=327) (actual time=10.370..190038.764 rows=7699388 loops=1) > Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp > without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp > without time zone) AND (((objdomainid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text))) > Total runtime: 437884.134 ms > (6 rows) If you analyze the table then run this again what plan does it come back with? I can't read explain output properly but I suspect (and I'm sure I'll be corrected if need be) that the sort step is way out of whack and so is the seq scan because the stats aren't up to date enough. Do you have an index on objdomainid, objid and userdomainid (one index per field) ? I wonder if that will help much. -- Postgresql & php tutorials http://www.designmagick.com/
I get the same plan after running vacuum analyze. Nope, I don't have index on objdomainid, objid and userdomainid. Only eventime has it. -----Original Message----- From: Chris [mailto:dmagick@gmail.com] Sent: Tuesday, August 22, 2006 8:06 PM To: Subbiah, Stalin Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Query tuning Subbiah, Stalin wrote: > Actually these servers will be upgraded to 8.1.4 in couple of months. even so, you could get some bad data in there. http://www.postgresql.org/docs/8.0/static/release.html . Go through the old release notes and you'll find various race conditions, crashes etc. > Here you go with explain analyze. > > # explain analyze SELECT * > FROM EVENTLOG > WHERE EVENTTIME>'07/23/06 16:00:00' AND EVENTTIME<'08/22/06 16:00:00' > AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA' > OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA' > OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA') > ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500; > > QUERY PLAN > > ---------------------------------------------------------------------- > -- > ---------------------------------------------------------------------- > -- > ---------------------------------------------------------------------- > -- > ---------------------------------------------------------------------- > -- > ------------------------------------------------------------- > Limit (cost=15583110.14..15583111.39 rows=500 width=327) (actual > time=427771.568..427772.904 rows=500 loops=1) > -> Sort (cost=15583108.89..15618188.88 rows=14031998 width=327) > (actual time=427770.504..427771.894 rows=1000 loops=1) > Sort Key: eventtime, sequencenum > -> Seq Scan on eventlog (cost=0.00..2334535.17 > rows=14031998 > width=327) (actual time=10.370..190038.764 rows=7699388 loops=1) > Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp > without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp > without time zone) AND (((objdomainid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text = > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text))) > Total runtime: 437884.134 ms > (6 rows) If you analyze the table then run this again what plan does it come back with? I can't read explain output properly but I suspect (and I'm sure I'll be corrected if need be) that the sort step is way out of whack and so is the seq scan because the stats aren't up to date enough. Do you have an index on objdomainid, objid and userdomainid (one index per field) ? I wonder if that will help much. -- Postgresql & php tutorials http://www.designmagick.com/
It seems to me that what would work best is an index scan backward on the eventtime index. I don't see why that wouldn't work for you, maybe the planner is just esitmating the seq scan and sort is faster for some reason. What does EXPLAIN say if you use a small limit and offset like 10? Or what does EXPLAIN say if you first run "set enable_seqscan=false;" (If you get the same plan, then I wouldn't bother running EXPLAIN ANALYZE, but if you get a different plan I would run EXPLAIN ANALYZE to see if the new plan is any faster.) > -----Original Message----- > From: pgsql-performance-owner@postgresql.org > [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of > Subbiah, Stalin > Sent: Wednesday, August 23, 2006 1:03 PM > To: Chris > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Query tuning > > > I get the same plan after running vacuum analyze. Nope, I don't have > index on objdomainid, objid and userdomainid. Only eventime has it. > > -----Original Message----- > From: Chris [mailto:dmagick@gmail.com] > Sent: Tuesday, August 22, 2006 8:06 PM > To: Subbiah, Stalin > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Query tuning > > Subbiah, Stalin wrote: > > Actually these servers will be upgraded to 8.1.4 in couple > of months. > > even so, you could get some bad data in there. > http://www.postgresql.org/docs/8.0/static/release.html . Go > through the > old release notes and you'll find various race conditions, > crashes etc. > > > Here you go with explain analyze. > > > > # explain analyze SELECT * > > FROM EVENTLOG > > WHERE EVENTTIME>'07/23/06 16:00:00' AND > EVENTTIME<'08/22/06 16:00:00' > > > AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA' > > OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA' > > OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA') > > ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500; > > > > QUERY PLAN > > > > > ---------------------------------------------------------------------- > > -- > > > ---------------------------------------------------------------------- > > -- > > > ---------------------------------------------------------------------- > > -- > > > ---------------------------------------------------------------------- > > -- > > ------------------------------------------------------------- > > Limit (cost=15583110.14..15583111.39 rows=500 width=327) (actual > > time=427771.568..427772.904 rows=500 loops=1) > > -> Sort (cost=15583108.89..15618188.88 rows=14031998 > width=327) > > (actual time=427770.504..427771.894 rows=1000 loops=1) > > Sort Key: eventtime, sequencenum > > -> Seq Scan on eventlog (cost=0.00..2334535.17 > > rows=14031998 > > width=327) (actual time=10.370..190038.764 rows=7699388 loops=1) > > Filter: ((eventtime > '2006-07-23 > 16:00:00'::timestamp > > without time zone) AND (eventtime < '2006-08-22 > 16:00:00'::timestamp > > without time zone) AND (((objdomainid)::text = > > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text = > > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text = > > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text))) > > Total runtime: 437884.134 ms > > (6 rows) > > If you analyze the table then run this again what plan does > it come back > with? > > I can't read explain output properly but I suspect (and I'm > sure I'll be > corrected if need be) that the sort step is way out of whack and so is > the seq scan because the stats aren't up to date enough. > > Do you have an index on objdomainid, objid and userdomainid (one index > per field) ? I wonder if that will help much. > > -- > Postgresql & php tutorials > http://www.designmagick.com/ > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq >
Changing limit or offset to a small number doesn't have any change in plans. Likewise enable_seqscan to false. They still take 8-10 mins to runs. -----Original Message----- From: Dave Dutcher [mailto:dave@tridecap.com] Sent: Wednesday, August 23, 2006 4:20 PM To: Subbiah, Stalin Cc: pgsql-performance@postgresql.org Subject: RE: [PERFORM] Query tuning It seems to me that what would work best is an index scan backward on the eventtime index. I don't see why that wouldn't work for you, maybe the planner is just esitmating the seq scan and sort is faster for some reason. What does EXPLAIN say if you use a small limit and offset like 10? Or what does EXPLAIN say if you first run "set enable_seqscan=false;" (If you get the same plan, then I wouldn't bother running EXPLAIN ANALYZE, but if you get a different plan I would run EXPLAIN ANALYZE to see if the new plan is any faster.) > -----Original Message----- > From: pgsql-performance-owner@postgresql.org > [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Subbiah, > Stalin > Sent: Wednesday, August 23, 2006 1:03 PM > To: Chris > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Query tuning > > > I get the same plan after running vacuum analyze. Nope, I don't have > index on objdomainid, objid and userdomainid. Only eventime has it. > > -----Original Message----- > From: Chris [mailto:dmagick@gmail.com] > Sent: Tuesday, August 22, 2006 8:06 PM > To: Subbiah, Stalin > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Query tuning > > Subbiah, Stalin wrote: > > Actually these servers will be upgraded to 8.1.4 in couple > of months. > > even so, you could get some bad data in there. > http://www.postgresql.org/docs/8.0/static/release.html . Go through > the old release notes and you'll find various race conditions, crashes > etc. > > > Here you go with explain analyze. > > > > # explain analyze SELECT * > > FROM EVENTLOG > > WHERE EVENTTIME>'07/23/06 16:00:00' AND > EVENTTIME<'08/22/06 16:00:00' > > > AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA' > > OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA' > > OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA') > > ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500; > > > > QUERY PLAN > > > > > ---------------------------------------------------------------------- > > -- > > > ---------------------------------------------------------------------- > > -- > > > ---------------------------------------------------------------------- > > -- > > > ---------------------------------------------------------------------- > > -- > > ------------------------------------------------------------- > > Limit (cost=15583110.14..15583111.39 rows=500 width=327) (actual > > time=427771.568..427772.904 rows=500 loops=1) > > -> Sort (cost=15583108.89..15618188.88 rows=14031998 > width=327) > > (actual time=427770.504..427771.894 rows=1000 loops=1) > > Sort Key: eventtime, sequencenum > > -> Seq Scan on eventlog (cost=0.00..2334535.17 > > rows=14031998 > > width=327) (actual time=10.370..190038.764 rows=7699388 loops=1) > > Filter: ((eventtime > '2006-07-23 > 16:00:00'::timestamp > > without time zone) AND (eventtime < '2006-08-22 > 16:00:00'::timestamp > > without time zone) AND (((objdomainid)::text = > > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text = > > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text = > > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text))) > > Total runtime: 437884.134 ms > > (6 rows) > > If you analyze the table then run this again what plan does it come > back with? > > I can't read explain output properly but I suspect (and I'm sure I'll > be corrected if need be) that the sort step is way out of whack and so > is the seq scan because the stats aren't up to date enough. > > Do you have an index on objdomainid, objid and userdomainid (one index > per field) ? I wonder if that will help much. > > -- > Postgresql & php tutorials > http://www.designmagick.com/ > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq >